Current Members

This page lists the current STARLab members.

Prof. Dr. Meersman, Robert

 

  Ph.D. in Mathematics at the Vrije Universiteit Brussel (VUB) in 1976. Appointed Full Professor at VUB in 1995. Earlier positions include the University of Antwerp (UIA, 1975-78) and Control Data Corp. (Data Management Research Lab, Brussels, Belgium, 1978-83), where he worked on the definition of the NIAM (now ORM) method, on its query and constraint language RIDL, and on the first tools for this methodology. Founded the first InfoLab at the University of Hasselt (Belgium, 1983-86) and its second incarnation at the University of Tilburg (The Netherlands, 1986-95). His current research, called DOGMA, focuses on ontologies and their relationship with and use in databases, the Semantic Web, and social, process-driven semantic design methodologies and tools.

Member and Past Chairperson (1983-92) of IFIP WG 2.6 on Databases, and member of WG 12.7 on Social Semantics and Collective Intelligence (2010-12). Past Chairperson of IFIP TC 12 (Artificial Intelligence, 1987-92) and of TC 2 (Software Theory and Practice, 2003-08). Co-founder of the International Foundation for Cooperative Information Systems (IFCIS, now CoopIS, since 1994) and current president of the Distributed Objects Applications Institute (DOAI, since 2001). General co-chair of the annual OnTheMove federated conferences and workshops, which cover many aspects of distributed and ubiquitous computing.

Founded the Semantics Technology and Applications Research Laboratory (STARLab) at VUB in 1995 and has been its Director since. Current scientific interests include ontologies, database semantics, domain and database modeling, interoperability, and the meaningful use of databases in applications such as enterprise knowledge management, the Semantic Web, and community-driven computing in general.


 

Courses

Database Theory (MSc) (syllabus material available on Pointcarré)

Informatiesystemen (Information Systems; MSc) (slides available on Pointcarré)

Open Information Systems (new 2012; MSc; slides and material available on Pointcarré)

Seminar on Advanced Databases & Applications (slides and reader available via the courses link above)

Conferences

OnTheMove '12: Rome, Italy, September 10 - 14, 2012

Contact Information
Office 10G-730A
E-Mail meersman [at] vub.ac.be
Telephone +32 (0)2 629 1237
Telefax +32 (0)2 629 3819

Debruyne, Christophe


 

Current activities

I am currently a researcher at the Semantics Technology & Applications Research Lab of the Vrije Universiteit Brussel, which is part of the Computer Science Department of the Faculty of Science and Bio-engineering Sciences. My research covers ontology engineering, ontology reuse, and reasoning about ontological commitments.

Contact and other information

Agile adaptation and validation of business rules

A company typically has business concepts that are in fact defined by business rules over other business concepts. For example, in a bank, the concept of a "Gold Credit Card Customer" is a customer who is entitled to a Gold Credit Card. A formal definition of this concept might be "a Gold Credit Card Customer is a Customer having a Retail Checking Account with an Account Balance > € 50 000", where "Retail Checking Account" is a type of account in the bank. Using semantics, it is possible to capture such rules and relate them to the actual data.

The goal of this thesis is to investigate how easily a user can adjust such a definition (e.g., from Retail Checking Account to Savings Account) and validate its impact (e.g., under the first definition there were x Gold Credit Card Customers, under the updated definition there are y; which one is best suited?).
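To make the intended agility concrete, here is a minimal, hypothetical Java sketch (the Customer/Account classes and the threshold are illustrative, not an actual STARLab or Collibra data model): the business rule is represented as a predicate over the data, so switching the account type and re-counting the rule's extension is a one-line change.

// Hypothetical sketch: a business rule as a predicate over customer data.
import java.util.List;
import java.util.function.Predicate;

record Account(String type, double balance) {}
record Customer(String name, List<Account> accounts) {}

public class GoldCardRuleDemo {
    // "Gold Credit Card Customer": has an account of the given type with balance > 50 000
    static Predicate<Customer> goldRule(String accountType) {
        return c -> c.accounts().stream()
                     .anyMatch(a -> a.type().equals(accountType) && a.balance() > 50_000);
    }

    public static void main(String[] args) {
        List<Customer> customers = List.of(
            new Customer("Alice", List.of(new Account("RetailChecking", 60_000))),
            new Customer("Bob",   List.of(new Account("Savings",        80_000))));

        long x = customers.stream().filter(goldRule("RetailChecking")).count(); // original rule
        long y = customers.stream().filter(goldRule("Savings")).count();        // adjusted rule
        System.out.println("Retail Checking variant: " + x + ", Savings variant: " + y);
    }
}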

Applying semantics to improve data quality

Because companies depend on their data to make the right decisions, Data Quality is a very active domain in the enterprise world. It is about making sure that data is complete, valid, consistent, timely and accurate for a specific use (e.g., determining the creditworthiness of a bank customer asking for a loan). Companies have a wide variety of tools to measure and improve their data quality (e.g., data profiling, data cleansing, data validation, ...). However, there is currently a gap between the actual business value (e.g., as governed via business rules) and the checks that data quality tools perform.

In this thesis you will study the state of the art in data quality (and its possible impact, e.g., on Business Intelligence), and build a prototype that shows how business semantics can be used to improve data quality tooling.
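As a very rough illustration of the intended direction, the sketch below (hypothetical record layout and rule names) derives simple quality metrics, completeness and validity percentages, from declaratively stated rules rather than from hard-coded checks:

// Minimal sketch: data quality metrics driven by named, declarative rules.
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class QualityCheckDemo {
    record LoanApplication(String customerId, Double income, Integer age) {}

    // In the envisaged approach these rules would be generated from the
    // governed business semantics; here they are written by hand.
    static final Map<String, Predicate<LoanApplication>> RULES = Map.of(
        "income is present",         a -> a.income() != null,
        "income is positive",        a -> a.income() != null && a.income() > 0,
        "applicant is of legal age", a -> a.age() != null && a.age() >= 18);

    public static void main(String[] args) {
        List<LoanApplication> data = List.of(
            new LoanApplication("C1", 2500.0, 34),
            new LoanApplication("C2", null,   17),
            new LoanApplication("C3", -10.0,  45));

        RULES.forEach((name, rule) -> {
            long passed = data.stream().filter(rule).count();
            System.out.printf("%-26s %.0f%%%n", name, 100.0 * passed / data.size());
        });
    }
}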

Away with doubles: object identification in data

Duplicates in data can occur for many reasons. For example, both "A. Turing 8b Manchester Oxford Road" and "Alan Turing Oxford Rd Manchester 8b MI39PL" can be found in a database: two different entries that most likely refer to the same person and the same address. Such duplicates can sometimes be found by measuring textual agreement, but not always.

Consider a database of publications (e.g., DBLP, http://www.informatik.uni-trier.de/~ley/db/index.html). Suppose there are two authors with similar names, for example Alon Levy and Alon Halevy. Although these names are textually very similar, that alone is not sufficient to conclude that they refer to the same person. Looking at the lists of co-authors of both, however, can help establish that they very likely are the same person; it may turn out, for instance, that Alon Levy changed his surname to Halevy in 1999. So here it is structural information, such as links to co-authors, that helps to identify individuals. The research topic of recognizing objects that are the same in the real world goes by many names: record linkage, merge/purge, de-duplication, reference matching, object identification, object reconciliation, identity uncertainty, ...
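A minimal sketch of this idea, with illustrative data and a naively weighted score (real approaches use learned weights or probabilistic models):

// Hypothetical sketch: combining name similarity with co-author overlap.
import java.util.HashSet;
import java.util.Set;

public class AuthorLinkageDemo {
    // Jaccard similarity between two sets: |A ∩ B| / |A ∪ B|
    static double jaccard(Set<String> a, Set<String> b) {
        Set<String> inter = new HashSet<>(a); inter.retainAll(b);
        Set<String> union = new HashSet<>(a); union.addAll(b);
        return union.isEmpty() ? 0.0 : (double) inter.size() / union.size();
    }

    public static void main(String[] args) {
        // Illustrative co-author sets for the two name variants.
        Set<String> coauthorsLevy   = Set.of("Dan Suciu", "Anhai Doan", "Zachary Ives");
        Set<String> coauthorsHalevy = Set.of("Anhai Doan", "Zachary Ives", "Luna Dong");

        double nameSim     = jaccard(Set.of("alon", "levy"), Set.of("alon", "halevy"));
        double coauthorSim = jaccard(coauthorsLevy, coauthorsHalevy);

        // Naive combined score: structural evidence weighs more than the name.
        double score = 0.4 * nameSim + 0.6 * coauthorSim;
        System.out.printf("name sim = %.2f, co-author sim = %.2f, combined = %.2f%n",
                          nameSim, coauthorSim, score);
    }
}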

This thesis consists of two parts. In the first part, the scientific literature on object identification is studied (this includes techniques from data mining, machine learning and probability theory). In the second part, these techniques are applied in a case where the data is provided by Collibra (www.collibra.com), a company specializing in data governance that has been named in the press as the most promising start-up in Belgium.

Bachelor Project: A conceptual query language for RDF using RIDL*

At STARLab we conduct research into methods and applications for collaboratively creating ontologies. Ontologies are shared descriptions of (a part of) the world in a particular formalism, which allow data exchange between autonomously developed and managed information systems. The DOGMA [1] ontology framework developed at STARLab is partly based on Object Role Modeling [2] (ORM). A method and application were developed on top of that framework which also take social interactions and natural-language definitions into account [3]. Individual applications are then annotated with the ontologies, which is called an “ontological commitment”. Such a commitment describes how application symbols relate to the ontology, and which (extra) constraints on the relations in the ontology hold for the application.
 
The commitments make it possible to distil RDF from the annotated applications. In the 1980s a query language was developed for NIAM (a method comparable to ORM) that allowed users to formulate queries in terms of the fact types. This language is called RIDL* [4] and was recently reused for querying RDF [5] data, a way of describing “things” with a simple model (subject-predicate-object) and URIs to identify those “things”. In this project the student is expected to take over the existing work in order to improve the prototype on the one hand, and to complete it on the other.
The basic requirements for this project are as follows:
 
1) working out the RIDL* grammar for RDF (using ANTLR [6])
2) improving and completing the prototype
3) developing a client for querying data (a hedged sketch of such a client follows below)
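As an indication of requirement 3, the following sketch assumes Apache Jena as the RDF toolkit and shows a client executing the SPARQL query that a conceptual (RIDL*-style) query could be translated into; the RIDL*-like sentence in the comment and the example vocabulary are purely illustrative.

import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.query.ResultSetFormatter;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

public class ConceptualQueryClient {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        model.read("data.ttl", "TURTLE");   // any RDF distilled via the commitments

        // Indicative conceptual query: LIST Person working-for Company with name 'VUB'
        String sparql =
            "PREFIX ex: <http://example.org/ontology#>\n" +
            "SELECT ?person WHERE {\n" +
            "  ?person ex:worksFor ?company .\n" +
            "  ?company ex:name \"VUB\" .\n" +
            "}";

        QueryExecution qe = QueryExecutionFactory.create(QueryFactory.create(sparql), model);
        try {
            ResultSetFormatter.out(qe.execSelect());   // print the bindings as a table
        } finally {
            qe.close();
        }
    }
}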
 
Extras within this project may include:
 
1) the possibility of consulting several sources at the same time
2) integration of the prototype into the collaborative platform
 
References
 
[1] http://starlab.vub.ac.be/website/research
[2] Halpin & Morgan. Information Modeling and Relational Databases 
[3] Christophe Debruyne, Robert Meersman: GOSPL: A Method and Tool for Fact-Oriented Hybrid Ontology Engineering. ADBIS 2012: 153-166 (paper available at http://starlab.vub.ac.be/website/node/773)
[4] http://starlab.vub.ac.be/website/files/RIDL_UserGuide_1.pdf
[5] http://www.w3.org/TR/rdf-primer/
[6] http://www.antlr.org/

Bachelor Project: A graphical editor for binary ORM diagrams

At STARLab we conduct research into methods and applications for collaboratively creating ontologies. Ontologies are shared descriptions of (a part of) the world in a particular formalism, which allow data exchange between autonomously developed and managed information systems. The DOGMA [1] ontology framework developed at STARLab is partly based on Object Role Modeling [2] (ORM). A method and application were developed on top of that framework which also take social interactions and natural-language definitions into account [3]. Individual applications are then annotated with the ontologies, which is called an “ontological commitment”. Such a commitment describes how application symbols relate to the ontology, and which (extra) constraints on the relations in the ontology hold for the application.
 
In this project the student is expected to develop a graphical editor for such commitments using the Eclipse Modeling Framework Project [4]. This framework makes it easy to develop tools for viewing, manipulating, etc. artifacts based on a structured model. The basic requirements for this project are:
 
1) a study of the EMF framework and of Object Role Modeling (if needed);
2) developing a graphical editor for binary ORM diagrams (a hedged metamodel sketch follows below);
3) developing a graphical editor for ontological commitments (this includes: referring to an ontology and loading that model, allowing extra fact types and constraints to be modelled, and managing the mappings).
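The following is a minimal sketch, under the assumption that the metamodel behind such an editor is defined with EMF's Ecore API; the class and feature names (ObjectType, BinaryFactType, role/coRole, head/tail term) are illustrative and not STARLab's actual schema.

import org.eclipse.emf.ecore.*;

public class BinaryOrmMetamodel {
    public static EPackage build() {
        EcoreFactory f = EcoreFactory.eINSTANCE;

        EPackage pkg = f.createEPackage();
        pkg.setName("orm");
        pkg.setNsPrefix("orm");
        pkg.setNsURI("http://example.org/orm");   // placeholder namespace

        // Object types simply carry a name.
        EClass objectType = f.createEClass();
        objectType.setName("ObjectType");
        EAttribute name = f.createEAttribute();
        name.setName("name");
        name.setEType(EcorePackage.Literals.ESTRING);
        objectType.getEStructuralFeatures().add(name);

        // A binary fact type connects two object types via two labelled roles.
        EClass factType = f.createEClass();
        factType.setName("BinaryFactType");
        EAttribute role = f.createEAttribute();
        role.setName("role");
        role.setEType(EcorePackage.Literals.ESTRING);
        EAttribute coRole = f.createEAttribute();
        coRole.setName("coRole");
        coRole.setEType(EcorePackage.Literals.ESTRING);
        EReference head = f.createEReference();
        head.setName("headTerm");
        head.setEType(objectType);
        EReference tail = f.createEReference();
        tail.setName("tailTerm");
        tail.setEType(objectType);
        factType.getEStructuralFeatures().add(role);
        factType.getEStructuralFeatures().add(coRole);
        factType.getEStructuralFeatures().add(head);
        factType.getEStructuralFeatures().add(tail);

        pkg.getEClassifiers().add(objectType);
        pkg.getEClassifiers().add(factType);
        return pkg;
    }
}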
 
Extras for this project may include:
1) modelling unary or n-ary fact types
2) analysis of RM-referability
3) nested fact types
 
References
 
[1] http://starlab.vub.ac.be/website/research
[2] Halpin & Morgan. Information Modeling and Relational Databases 
[3] Christophe Debruyne, Robert Meersman: GOSPL: A Method and Tool for Fact-Oriented Hybrid Ontology Engineering. ADBIS 2012: 153-166 (paper available at http://starlab.vub.ac.be/website/node/773)
[4] http://www.eclipse.org/modeling/emf/
 
Contact person: Christophe Debruyne

Database Atomization for Linked Data

Database Atomization
 
In this thesis, the student is expected to develop a method and tool for annotating databases and publishing their content as Linked Data [1] on the Web.
 
Off-the-shelf solutions such as D2R Server [2] provide a great way to transform database content into triples, but those triples are not annotated with an ontology resulting from the collaboration between the members of a community.
STARLab has developed a controlled natural language for annotating databases, called Ω-RIDL [3,4], which allows users to describe their databases (amongst others) in terms of sentences; such a description is called a commitment.
 
The student is expected to study how Ω-RIDL can be deployed to provide the necessary annotations for the triples generated with such off-the-shelf solutions, and to develop a prototype demonstrating these principles. The commitments, which contain constraints capturing the intended semantics of an application, can furthermore be used to validate the data (as the described constraints do not necessarily correspond with the stored data).
This thesis is suited both for the 1-year Master of Applied Computer Science and for the 2-year Master programmes. In the latter case, the problem statement will be extended in collaboration with the student.
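As a rough sketch of the atomization step (assuming a JDBC-accessible database and Apache Jena; the table, namespace and connection string are placeholders), each row becomes a resource and each column value a triple. In the envisaged approach the predicate URIs would come from an Ω-RIDL commitment rather than being hard-coded:

import java.sql.*;
import org.apache.jena.rdf.model.*;

public class DatabaseToTriples {
    public static void main(String[] args) throws SQLException {
        String ns = "http://example.org/customer#";          // placeholder namespace
        Model model = ModelFactory.createDefaultModel();
        Property hasName = model.createProperty(ns, "hasName");

        try (Connection con = DriverManager.getConnection(
                 "jdbc:postgresql://localhost/sales", "user", "password"); // placeholder DSN
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT id, name FROM customer")) {
            while (rs.next()) {
                Resource r = model.createResource(ns + rs.getInt("id"));
                r.addProperty(hasName, rs.getString("name"));
            }
        }
        model.write(System.out, "TURTLE");   // the Linked Data view of the table
    }
}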
 
Contact: Christophe Debruyne, chrdebru@vub.ac.be 
 
[1] http://www.linkeddata.org/
[2] http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/
[3] Pieter Verheyden, Jan De Bo, Robert Meersman: Semantically Unlocking Database Content Through Ontology-Based Mediation. SWDB 2004: 109-126
[4] Damien Trog, Yan Tang, Robert Meersman: Towards Ontological Commitments with Omega -RIDL Markup Language. RuleML 2007: 92-106

Dipping the Web of Data for Support for Statements


Before communities can use information and interoperability between information systems can be established, a consensus on an ontology needs to be reached among the different stakeholders. Once the community reaches an agreement, application symbols are mapped onto concepts in that ontology; such a mapping is called an application commitment. While working on an ontology, members of a community might enter an observation (hypothesis) that holds for their own application, but not for the applications of other stakeholders.

Counterexamples to such an observation result in the rejection of that observation, in a refinement of the ontology, or in the detection of mistakes in the data sets. A hypothesis can also lead to an inconsistent schema, which needs to be communicated to the community. Even the dialogue/interaction between community members can be used to determine which actions should be taken (e.g., argumentation theory in multi-agent settings).

The goal of this thesis is to develop a method and tool to test statements (e.g., via a query) and validate the results via the commitment. The output of this will then be used to trigger various ontology engineering processes.
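As an illustration of what "dipping the Web of Data" for counterexamples might look like, the sketch below assumes Apache Jena and a public SPARQL endpoint (DBpedia as a stand-in for the Web of Data); the hypothesis and vocabulary are illustrative only.

import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;

public class HypothesisCheck {
    public static void main(String[] args) {
        // Hypothesis "every capital is a city", rewritten as a search for a
        // counterexample: a capital that is not typed as a city in this source.
        String ask =
            "PREFIX dbo: <http://dbpedia.org/ontology/>\n" +
            "ASK { ?country dbo:capital ?c .\n" +
            "      FILTER NOT EXISTS { ?c a dbo:City } }";

        QueryExecution qe = QueryExecutionFactory.sparqlService(
                "https://dbpedia.org/sparql", QueryFactory.create(ask));
        try {
            boolean counterexampleFound = qe.execAsk();
            System.out.println(counterexampleFound
                ? "Counterexample found: refine the hypothesis or inspect the data."
                : "No counterexample in this source.");
        } finally {
            qe.close();
        }
    }
}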

Contact: Christophe Debruyne (chrdebru@vub.ac.be) 

  1. Debruyne, C., (2010) On the Social Dynamics of Ontological Commitments. In Proc. of On the Move to Meaningful Internet Systems 2010: OTM Workshops - OTMA (OTMA 2010), LNCS, Springer
  2. Meersman, R. and Debruyne, C. (2010) Hybrid Ontologies and Social Semantics. In Proc. of 4th IEEE International Conference on Digital Ecosystems and Technologies (DEST 2010), IEEE Press
  3. Martin Hepp, Pieter De Leenheer, Aldo de Moor, York Sure (Eds.) (2008) Ontology Management, Semantic Web, Semantic Web Services, and Business Applications. Semantic Web And Beyond Computing for Human Experience Vol. 7 Springer 2008, ISBN 978-0-387-69900-4

 

Grounding Business Processes with Social Processes and Natural Language


The goal of this research is to explore the adoption of collaborative knowledge engineering applied to business processes. The specific method would be GOSPL [3], which takes natural-language definitions of concepts into account. The research involves the following steps:

  1. Requirements analysis
  2. State-of-the-art on collaborative process modeling
  3. Adopting a suitable workflow language (and engine), e.g., BPEL [1], YAWL [2], etc.
  4. Extending the GOSPL [3] prototype
  5. Developing a demonstrator

Implementation of the prototype: Java, jQuery, etc.

References

  1. http://docs.oasis-open.org/wsbpel/2.0/wsbpel-v2.0.pdf
  2. http://www.yawlfoundation.org/
  3. Debruyne, C. and Meersman, R. (2012) GOSPL: a Method and Tool for Fact-oriented Hybrid Ontology Engineering. In Proc. of Advances in Databases and Information Systems 2012 (ADBIS 2012)

Contact person: Christophe Debruyne

 

Knowledge Elicitation from the Crowd

In this thesis, the student will develop a method and tool for eliciting knowledge from the crowd via a semantically enabled portal or wiki. The knowledge elicited from the crowd needs to be connected with a method for ontology engineering, which necessitates a layered approach. Provenance and traceability are therefore key in this thesis.
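A minimal sketch of the provenance aspect, assuming Apache Jena and SPARQL 1.1 Update [1]: an elicited statement is stored together with an attribution triple (here using the W3C PROV vocabulary) so it can be traced back to the contributing crowd member. All other URIs are placeholders.

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.update.UpdateAction;
import org.apache.jena.update.UpdateFactory;

public class CrowdContribution {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();

        String update =
            "PREFIX ex:   <http://example.org/vocab#>\n" +
            "PREFIX prov: <http://www.w3.org/ns/prov#>\n" +
            "INSERT DATA {\n" +
            "  ex:statement42 ex:subjectTerm   ex:GoldCustomer ;\n" +
            "                 ex:assertedText  \"A gold customer holds a gold card\" ;\n" +
            "                 prov:wasAttributedTo ex:contributor007 .\n" +
            "}";

        UpdateAction.execute(UpdateFactory.create(update), model);
        model.write(System.out, "TURTLE");   // the traceable, attributed contribution
    }
}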

Steps to be taken for this thesis are:

  1. Requirements analysis and state-of-the-art
  2. Development of the wiki or portal
  3. Connection with a collaborative knowledge engineering method
  4. Developing a case study

References

  1. http://www.w3.org/TR/sparql11-update/
  2. http://trdf.sourceforge.net/provenance/ns.html

Mining Social Processes and Actions for a Reputation Framework


Members of a community interact and perform actions to achieve a common goal (e.g., building an information source such as a wiki around a television show, Wikipedia, or a platform supporting a software engineering method). One such goal is reaching a common understanding of a shared reality, called an ontology.

On a collaborative ontology-engineering platform there are two types of interactions: between the system and the user (e.g., to manipulate the ontology) and between users (e.g., negotiation, requests for review, etc.). Users naturally evolve towards a set of actions they feel most comfortable with, and build a certain expertise and the trust of others around it. Mining those interactions allows different types of users to be clustered. Depending on the type of user, an action may or may not have to follow a different path (e.g., skipping an approval stage). Of course, a temporal aspect has to be taken into account to allow for evolving expertise.

The goal of this thesis is to define a model for these processes (starting from earlier work on this subject [1]), a method to cluster types of users, and a reputation framework for ontology engineering. This framework will then be used to define different processes for the same action, depending on the type or expertise of the user.

Contact: Christophe Debruyne (chrdebru@vub.ac.be) 

  1. De Leenheer, P., Debruyne, C., and Peeters, J. (2009) Towards Social Performance Indicators for Community-based Ontology Evolution. In Proc. of Workshop on Collaborative Construction, Management and Linking of Structured Knowledge (CK2009), collocated with the 8th International Semantic Web Conference (ISWC 2009), CEUR-WS
  2. Debruyne, C., (2010) On the Social Dynamics of Ontological Commitments. In Proc. of On the Move to Meaningful Internet Systems 2010: OTM Workshops - OTMA (OTMA 2010), LNCS, Springer
  3. Meersman, R. and Debruyne, C. (2010) Hybrid Ontologies and Social Semantics. In Proc. of 4th IEEE International Conference on Digital Ecosystems and Technologies (DEST 2010), IEEE Press
  4. Martin Hepp, Pieter De Leenheer, Aldo de Moor, York Sure (Eds.) (2008) Ontology Management, Semantic Web, Semantic Web Services, and Business Applications. Semantic Web And Beyond Computing for Human Experience Vol. 7 Springer 2008, ISBN 978-0-387-69900-4

Mining the "right" semantics

Even if a company keeps good diagrams and definitions, the problem is often that there are too many of them. All too often, an organization is faced with multiple different definitions of a business concept (e.g., Customer in a "Customer Support" context versus Customer in an "Accounting" context, ...). For the organization, it is a nightmare to find out exactly where the differences lie and how to arrive at more harmonized business semantics.

With appropriate technological support, this can be greatly improved. By loading and analyzing existing material (e.g., data, schemas, definitions, ...), the user can be assisted in identifying similarities and differences using appropriate algorithms and visualizations. The aim of this thesis is to examine both the mining techniques and the visualization techniques that support analysis by business users.

Semantic dependency management

A company typically has many different ways of managing its business concepts, as well as the related taxonomy and business rules. To manage these correctly, proper dependency management is needed. For example, there may be a central concept "Party" (managed in a core business unit of the company), which further down branches into other types of parties (e.g., "Customer", "Supplier", ...), each managed in another department. The relationship between these concepts implies a certain dependency (e.g., you cannot change "Party" unless the change is acceptable with respect to "Customer"). This continues down to the level of the actual data (e.g., "Alan Turing" is an instance of a "Customer": what does a change in the meaning of "Party" mean for poor old Alan?).
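A minimal sketch of the underlying mechanism (the concept names are the illustrative ones from above): dependencies form a graph, and the impact of a change is the set of transitive dependents of the changed concept.

import java.util.*;

public class ConceptDependencyGraph {
    // dependents.get(c) = concepts whose meaning depends on c
    private final Map<String, Set<String>> dependents = new HashMap<>();

    void addDependency(String dependent, String dependsOn) {
        dependents.computeIfAbsent(dependsOn, k -> new HashSet<>()).add(dependent);
    }

    // Breadth-first search over the dependency edges: everything reachable
    // from the changed concept must be re-validated (or repaired).
    Set<String> impactOfChange(String changed) {
        Set<String> impacted = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>(dependents.getOrDefault(changed, Set.of()));
        while (!queue.isEmpty()) {
            String c = queue.poll();
            if (impacted.add(c)) {
                queue.addAll(dependents.getOrDefault(c, Set.of()));
            }
        }
        return impacted;
    }

    public static void main(String[] args) {
        ConceptDependencyGraph g = new ConceptDependencyGraph();
        g.addDependency("Customer", "Party");
        g.addDependency("Supplier", "Party");
        g.addDependency("Gold Credit Card Customer", "Customer");

        // Prints the three dependent concepts (iteration order may vary).
        System.out.println("Changing 'Party' impacts: " + g.impactOfChange("Party"));
    }
}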

The goal of this thesis is to examine the various possible dependencies, how impact analysis (impact and repair) can be performed, and how users can best be supported in staying within the rules of dependency management.

Using Social Network Analysis for Social Performance Indicators in Business Semantics Management

Social performance indicators (SPIs) monitor the performance of community members who have a role in Business Semantics Management (BSM). The quality of a semantic pattern depends on who defines it; this can be assessed by analyzing which individual has edited which semantic element, and when. The BSM-related communication in the community can also be observed using social network analysis (SNA) (Welser et al., 2008). SNA measures such as the "degree" or "centrality" of a person or document can provide valuable information for the SPIs. Some semantic patterns are created or commented on by highly positioned or strategically connected people, while others remain peripheral or at least confined to a department or group. Monitoring SPIs can contribute to the descriptive quality, and thus the validity, of the business semantics under development.
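As a small illustration, degree (one of the simplest SNA measures) can be computed directly from an interaction log; the members and edges below are made up.

import java.util.Map;
import java.util.TreeMap;

public class DegreeCentrality {
    public static void main(String[] args) {
        // Undirected interaction edges between members (hypothetical data),
        // e.g. "commented on an element edited by".
        String[][] interactions = {
            {"alice", "bob"}, {"alice", "carol"}, {"bob", "carol"}, {"alice", "dave"}
        };

        Map<String, Integer> degree = new TreeMap<>();
        for (String[] edge : interactions) {
            degree.merge(edge[0], 1, Integer::sum);
            degree.merge(edge[1], 1, Integer::sum);
        }
        // alice has the highest degree and is the most central member here.
        degree.forEach((member, d) -> System.out.println(member + ": " + d));
    }
}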

In this thesis, the existing literature on social performance indicators and social network analysis is surveyed, and roles and responsibilities are validated by building a prototype.

Hallot, Frédéric

Hello! My name is Frédéric Hallot.

 

Cristian Vasquez Paulus

Second-year PhD student, engineer, Linked Data specialist.

Topic: using the 'semantics' of our close communities in conjunction with our personal 'context'.

Application: improving our personal data retrieval capabilities in the long term.

Interests: 

  • Structured data in the Web.
  • Community Semantics.
  • Agreement/Disagreement technologies. 
  • Interpretivist/Positivist positions in IT.

Publications (at STARLab)

Contact Information: 

Office  
E-Mail cvasquez [at] vub.ac.be
Telephone +32 (0)2 629 2081
   

Dr. Pieter De Leenheer


Dr. Pieter GM De Leenheer is assistant professor in Business, Web and Media at VU University Amsterdam. He is also co-founder and research director of Collibra, a Brussels-based semantic software company that spun off from the Vrije Universiteit Brussel (VUB). From 2002 to 2009, Pieter worked as a scientist at VUB STARLab, and he was a lecturer at the same university. Pieter holds a PhD in computer science, as well as a BSc and MSc in computer science, all from VUB. His main interest lies in the social aspects of collaborative business semantics management and its applications.

Pieter has authored more than 30 publications in various books, international journals and conferences, and co-edited the Springer book "Ontology Management for the Semantic Web". He gives Master-level lectures on Database Theory, (Web) Information Systems, Requirements Engineering, and Semantic Web languages. Having been active in EU initiatives for many years, he has extensive experience in acquiring and participating in projects. In addition, he is engaged by the European Commission as an FPx expert for evaluating proposals and reviewing projects. He is a member of ACM and IEEE, and a peer reviewer for several international conferences and journals.

Find out more about his activities on his personal website: http://www.pieterdeleenheer.be


Sven Van Laere

Bio

Sven Van Laere, born in Waregem (Belgium), obtained a Professional Bachelor in Applied Informatics in 2009. He then joined the VUB for an academic Master in Engineering Science: Computer Science, where he followed the Web & Information Systems Engineering profile and wrote his thesis at Prof. Meersman's STARLab.