Keywords
Computer Science and Digital Science
- A3.1.1. Modeling, representation
- A3.2.1. Knowledge bases
- A3.2.3. Inference
- A3.2.5. Ontologies
- A7.2. Logic in Computer Science
- A9.1. Knowledge
- A9.6. Decision support
- A9.7. AI algorithmics
- A9.8. Reasoning
Other Research Topics and Application Domains
- B3.1. Sustainable development
- B9.5.6. Data science
- B9.7.2. Open data
1 Team members, visitors, external collaborators
Research Scientists
- Jean-François Baget [Inria, Researcher]
- Pierre Bisquert [Institut national de recherche pour l'agriculture, l'alimentation et l'environnement, Researcher]
- David Carral Martinez [Inria, Researcher]
Faculty Members
- Marie-Laure Mugnier [Team leader, Univ de Montpellier, Professor, HDR]
- Michel Chein [Univ de Montpellier, Emeritus, HDR]
- Madalina Croitoru [Univ de Montpellier, Professor, HDR]
- Michel Leclère [Univ de Montpellier, Associate Professor]
- Federico Ulliana [Univ de Montpellier, Associate Professor, HDR]
PhD Students
- Martin Jedwabny [Univ de Montpellier]
- Elie Najm [Inria]
- Guillaume Perution Kihli [Inria]
- Olivier Rodriguez [Inria]
Technical Staff
- Florent Tornil [Inria, Engineer]
Interns and Apprentices
- Mael Abily [École Normale Supérieure de Lyon, from May 2021 until Jul 2021]
- Riadh Guemache [Inria, from Feb 2021 until Jul 2021]
- Lucas Larroque [Ecole normale supérieure Paris-Saclay, from Jun 2021 until Jul 2021]
- Tom Salembien [Telecom SudParis, from Jun 2021 until Jul 2021]
- Quentin Yeche [Inria, from Jun 2021 until Jul 2021]
Administrative Assistant
- Annie Aliaga [Inria]
External Collaborators
- Meghyn Bienvenu [CNRS, HDR]
- Patrice Buche [Institut national de recherche pour l'agriculture, l'alimentation et l'environnement, HDR]
- Alain Gutierrez [CNRS]
- Rallou Thomopoulos [Institut national de recherche pour l'agriculture, l'alimentation et l'environnement, HDR]
2 Overall objectives
2.1 Logic and graph-based KR
The main research domain of GraphIK is Knowledge Representation and Reasoning (KR), which studies paradigms and formalisms for representing knowledge and reasoning on these representations. A large part of our work is strongly related to data management and database theory.
We develop logical languages, which mainly correspond to fragments of first-order logic. However, we also use graphs and hypergraphs (in the graph-theoretic sense) as basic objects. Indeed, we view labelled graphs as an abstract representation of knowledge that can be expressed in many KR languages: different kinds of conceptual graphs—historically our main focus—the Semantic Web language RDFS, expressive rules equivalent to so-called tuple-generating-dependencies in databases, some description logics dedicated to query answering, etc. For these languages, reasoning can be based on the structure of objects (thus on graph-theoretic notions) while being sound and complete with respect to entailment in the associated logical fragments. An important issue is to study trade-offs between the expressivity and the computational tractability of (sound and complete) reasoning in these languages.
2.2 From theory to applications, and vice-versa
We study logic- and graph-based KR formalisms from three perspectives:
- theoretical (structural properties, expressiveness, translations between languages, problem complexity, algorithm design);
- software (developing tools to implement theoretical results);
- applications (formalizing practical issues and solving them with our techniques, which also feeds back into theoretical work).
2.3 Main challenges
GraphIK focuses on some of the main challenges in KR:
- ontological query answering: querying large, complex or heterogeneous datasets, provided with an ontological layer;
- reasoning with rule-based languages;
- reasoning in the presence of inconsistencies;
- decision making.
2.4 Scientific directions
Our research work is currently organized into two research lines, both with theoretical and applied aspects:
- Ontology-mediated query answering (OMQA). Modern information systems are often structured around an ontology, which provides a high-level vocabulary, as well as knowledge relevant to the target domain, and enables a uniform access to possibly heterogeneous data sources. As many complex tasks can be recast in terms of query answering, the question of querying data while taking into account inferences enabled by ontological knowledge has become a fundamental issue. This gives rise to the notion of a knowledge base, composed of an ontology and a factbase, both described using a KR language. The factbase can be seen as an abstraction of several data sources, and may actually remain virtual. The topical ontology-mediated query answering (OMQA) problem asks for all answers to queries that are logically entailed by the given knowledge base.
- Reasoning with imperfect knowledge and decision support. To solve real-world problems we often need to consider features that cannot be expressed purely (or naturally) in classical logic. Indeed, information is often “imperfect”: it can be partially contradictory, vague or uncertain, etc. In recent years, we have mostly considered reasoning in the presence of conflicts, where contradictory information may come from the data or from the ontology. This requires defining appropriate semantics that provide meaningful answers to queries while taming the increase in computational complexity. Reasoning becomes more complex from a conceptual viewpoint as well, hence how to explain results to an end-user is also an important issue. Such questions are natural extensions of those studied in the first axis. On the other hand, the work of this axis is also motivated by applications provided by our INRAE partners, where the knowledge to be represented intrinsically features several viewpoints and involves different stakeholders with divergent priorities, while a decision has to be made. Beyond the representation of conflicting knowledge itself, this raises arbitration issues. The aim here is to support decision making with tools that help elicit and represent relevant knowledge, including the stakeholders' preferences and motivations, compute syntheses of compatible options, and propose justified decisions.
3 Research program
3.1 Logic-based knowledge representation and reasoning
We follow the mainstream logic-based approach to knowledge representation (KR). First-order logic (FOL) is the reference logic in KR and most formalisms in this area can be translated into fragments (i.e., particular subsets) of FOL. This is in particular the case for description logics and existential rules, two well-known KR formalisms studied in the team.
A large part of research in this domain can be seen as studying trade-offs between the expressivity of languages and the complexity of (sound and complete) reasoning in these languages. The fundamental problem in KR languages is entailment checking: is a given piece of knowledge entailed by other pieces of knowledge, for instance from a knowledge base (KB)? Another important problem is consistency checking: is a set of knowledge pieces (for instance the knowledge base itself) consistent, i.e., is it sure that nothing absurd can be entailed from it? The ontology-mediated query answering problem is a topical problem (see Section 3.3). It asks for the set of answers to a query in the KB. In the case of Boolean queries (i.e., queries with a yes/no answer), it can be recast as entailment checking.
3.2 Graph-based knowledge representation and reasoning
Besides logical foundations, we are interested in KR formalisms that comply, or aim at complying with the following requirements: to have good computational properties and to allow users of knowledge-based systems to have a maximal understanding and control over each step of the knowledge base building process and use.
These two requirements are the core motivations for our graph-based approach to KR. We view labelled graphs as an abstract representation of knowledge that can be expressed in many KR languages (different kinds of conceptual graphs —historically our main focus— the Semantic Web language RDF (Resource Description Framework), its extension RDFS (RDF Schema), expressive rules equivalent to the so-called tuple-generating-dependencies in databases, some description logics dedicated to query answering, etc.). For these languages, reasoning can be based on the structure of objects, thus based on graph-theoretic notions, while staying logically founded.
More precisely, our basic objects are labelled graphs (or hypergraphs) representing entities and relationships between these entities. These graphs have a natural translation in first-order logic. Our basic reasoning tool is graph homomorphism. The fundamental property is that graph homomorphism is sound and complete with respect to logical entailment, i.e., given two (labelled) graphs G and H, there is a homomorphism from G to H if and only if the formula assigned to G is entailed by the formula assigned to H. In other words, logical reasoning on these graphs can be performed by graph mechanisms. These knowledge constructs and the associated reasoning mechanisms can be extended (to represent rules for instance) while keeping this fundamental correspondence between graphs and logics.
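As an illustration of homomorphism-based reasoning, here is a minimal Python sketch of a backtracking homomorphism search between two sets of atoms. The encoding is an assumption made for this example only (atoms as `(predicate, terms)` pairs, variables written with a leading `?`), as are the function names; it is not the team's implementation.

```python
def is_variable(term):
    # Hypothetical convention for this sketch: variables start with "?".
    return term.startswith("?")


def find_homomorphism(source, target, mapping=None):
    """Return a substitution of the variables of `source` that sends every
    atom of `source` onto an atom of `target`, or None if there is none."""
    mapping = dict(mapping or {})
    source = list(source)
    if not source:
        return mapping
    (pred, args), rest = source[0], source[1:]
    for t_pred, t_args in target:
        if t_pred != pred or len(t_args) != len(args):
            continue
        extended = dict(mapping)
        compatible = True
        for s, t in zip(args, t_args):
            if is_variable(s):
                # Bind the variable, or check an earlier binding.
                if extended.setdefault(s, t) != t:
                    compatible = False
                    break
            elif s != t:
                # Constants must be mapped to themselves.
                compatible = False
                break
        if compatible:
            result = find_homomorphism(rest, target, extended)
            if result is not None:
                return result
    return None


# Toy data: H is a set of facts, G a conjunctive pattern with variables.
H = [("p", ("a", "b")), ("p", ("b", "c"))]
G = [("p", ("?x", "?y")), ("p", ("?y", "?z"))]
G_loop = [("p", ("?x", "?x"))]

print(find_homomorphism(G, H))       # a homomorphism exists: G is entailed by H
print(find_homomorphism(G_loop, H))  # None: no atom of H has the form p(t, t)
```

The search is exponential in the worst case, which is expected since the underlying decision problem is NP-complete; practical systems rely on indexing and join ordering instead of this naive enumeration.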
3.3 Ontology-mediated query answering
Querying knowledge bases has become a central problem in knowledge representation and in databases. A knowledge base is classically composed of a terminological part (metadata, ontology) and an assertional part (facts, data). Queries are supposed to be at least as expressive as the basic queries in databases, i.e., conjunctive queries, which can be seen as existentially closed conjunctions of atoms or as labelled graphs. The challenge is to define good trade-offs between the expressivity of the ontological language and the complexity of querying data in presence of ontological knowledge. Description logics have been so far the prominent family of formalisms for representing and reasoning with ontological knowledge. However, classical description logics were not designed for efficient data querying. On the other hand, database languages are able to process complex queries on huge databases, but without taking the ontology into account. There is thus a need for new languages and mechanisms, able to cope with the ever growing size of knowledge bases in the Semantic Web or in scientific domains.
This problem is related to two other problems identified as fundamental in KR:
- Query answering with incomplete information. Incomplete information means that it might be unknown whether a given assertion is true or false. Databases classically make the so-called closed-world assumption: every fact that cannot be retrieved or inferred from the base is assumed to be false. Knowledge bases classically make the open-world assumption: if something cannot be inferred from the base, and neither can its negation, then its truth status is unknown. The need to cope with incomplete information is a distinctive feature of querying knowledge bases with respect to querying classical databases (however, as explained above, this distinction tends to disappear). The presence of incomplete information makes the query answering task much more difficult.
- Reasoning with rules. Researching types of rules and adequate ways to process them is a mainstream topic in the Semantic Web and, more generally, a crucial issue for knowledge-based systems. For several years, we have been studying rules, both in their logical and their graph form, which are syntactically very simple but also very expressive. These rules, known as existential rules or Datalog+, can be seen as an abstraction of ontological knowledge expressed in the main languages used in the context of KB querying.
3.4 Inconsistency and decision making
While classical FOL is the kernel of many KR languages, to solve real-world problems we often need to consider features that cannot be expressed purely (or not naturally) in classical logic. The logic and graph-based formalisms used for previous points have thus to be extended with such features. The following requirements have been identified from scenarios in decision making, privileging the agronomy domain:
- to cope with inconsistency;
- to cope with defeasible knowledge;
- to take into account different and potentially conflicting viewpoints;
- to integrate decision notions (priorities, gravity, risk, benefit).
Although the solutions we develop need to be validated on the applications that motivated them, we also want them to be sufficiently generic to be applied in other contexts. One approach (but not the only possible one) consists in increasing the expressivity of our core languages, while trying to preserve their essential combinatorial properties, so that algorithmic optimizations can be transferred to these extensions.
4 Application domains
4.1 Agronomy
Agronomy is a strong expertise domain in the area of Montpellier. Some members of GraphIK are INRAE researchers (computer scientists). We closely collaborate with the Montpellier research laboratory IATE, a joint unit of INRAE and other organizations. A major issue for INRAE and more specifically IATE applications is modeling agrifood chains (i.e., the chain of all processes leading from the plants to the final products, including waste treatment). This modeling has several objectives. It provides a better understanding of the processes from beginning to end, which aids in decision making, with the aim of improving the quality of the products and decreasing the environmental impact. It also facilitates knowledge sharing between researchers, as well as the capitalization of expert knowledge and “know how”. This last point is particularly important in areas strongly related to local know how (as in cheese or wine making), where knowledge is transmitted by experience, with the risk that specific skills are lost. An agrifood chain analysis is a highly complex procedure since it relies on numerous criteria of various types: environmental, economic, functional, sanitary, etc. Quality objectives involve different stakeholders: technicians, managers, professional organizations, end-users, public organizations, etc. Since the goals of the stakeholders involved may be divergent, dedicated knowledge and representation techniques have to be employed.
4.2 Data journalism
One of today’s major issues in data science is to design techniques and algorithms that allow analysts to efficiently infer useful information and knowledge by inspecting heterogeneous information sources, from structured data to unstructured content. We take data journalism as an emblematic use-case, which stands at the crossroads of multiple research fields: content analysis, data management, knowledge representation and reasoning, visualization and human-machine interaction. We are particularly interested in issues raised by the design of data and knowledge management systems that will support data journalism. These systems include an ontology (which typically expresses domain knowledge), heterogeneous data sources (provided with their own vocabulary and querying capabilities), and mappings that relate these data sources to the ontological vocabulary. Ontologies play a central role as they act both as a mediation layer that glues together pieces of knowledge extracted from data sources, and as an inference layer that allows new knowledge to be derived.
Besides pure knowledge representation and reasoning issues, querying such systems raises issues at the crossroads of data and knowledge management. In particular, although mappings have been widely investigated in databases, they need to be revisited in the light of the reasoning capabilities enabled by the ontology. More generally, the consistency and the efficiency of the system cannot be ensured by considering the components of the system in isolation (i.e., the ontology, data sources and mappings), but require studying the interactions between these components and considering the system as a whole.
5 Social and environmental responsibility
Since January 2020, Pierre Bisquert has been a member of the national INRAE DigigrAL thinking group. This group aims at providing reflections in the form of reports about the technological, societal and ethical impacts of digital technologies in agriculture. Some questions of interest are, among others: In what way might digitalization redefine power relations between citizens, consumers and industries? Where does responsibility lie when using a decision support tool? How can massive data production be sustained? This group meets monthly and is composed of 13 researchers, each representing a department of the INRAE institute.
6 Highlights of the year
The team has built a new research team project with a twelve-year horizon (BOREAL) under the leadership of Federico Ulliana, who defended his habilitation (HDR) in November 2021 [25]. The creation of BOREAL on 01/01/2022 was recommended by the SAM Comité des Projets in October 2021.
6.1 Awards
Ray Reiter Best paper award at the conference KR 2021 (ranked A*): “Capturing Homomorphism-Closed Decidable Queries with Existential Rules”, Camille Bourgaux, David Carral, Markus Krötzsch, Sebastian Rudolph, and Michaël Thomazo, 18th International Conference on Principles of Knowledge Representation and Reasoning [17].
7 New software and platforms
Let us describe new/updated software.
7.1 New software
7.1.1 GRAAL
- Keywords: Knowledge database, Ontologies, Querying, Data management
- Scientific Description: Graal is a Java toolkit dedicated to querying knowledge bases within the framework of existential rules, aka Datalog+/-.
- Functional Description: Graal has been designed in a modular way, in order to facilitate software reuse and extension. It should make it easy to test new scenarios and techniques, in particular by combining algorithms. The main features of Graal are currently the following: (1) a data layer that provides generic interfaces to store various kinds of data and query them with (unions of) conjunctive queries, currently: MySQL, PostgreSQL, SQLite, in-memory graph and linked-list structures, (2) an ontological layer, where an ontology is a set of existential rules, (3) a knowledge base layer, where a knowledge base is composed of a fact base (abstraction of the data via generic interfaces) and an ontology, (4) algorithms to process ontology-mediated queries, based on query rewriting and/or forward chaining (or chase), (5) a rule analyzer, which performs a syntactic and structural analysis of an existential rule set, (6) several IO formats, including imports from OWL 2.
- Release Contributions:
2020: Beta version, with improved chase algorithms. Available for internal use on gite.lirmm.fr
2018: Version 1.3.1, with small bug fixes and minor improvements.
2017: New stable version 1.3.0. Moreover, the Graal website has been deeply restructured and enriched with new tools, available online or for download, and documentation including tutorials, examples of use, and technical documentation about all Graal modules. Website: http://graphik-team.github.io/graal/
- News of the Year: 2021: Design and development of a major improved version of the tool (ongoing work). Refactoring of the API, and of several modules for knowledge base representation, data storage, query answering and forward-chaining reasoning (chase). Development of new modules for handling heterogeneous data: mappings and federations. Code deposited on https://gitlab.inria.fr/rules/graal-v2. This work will lead to the release of a novel library, which is planned for 2022.
- URL:
- Publications:
- Contact: Federico Ulliana
- Participants: Marie-Laure Mugnier, Jean-François Baget, Michel Leclère, Federico Ulliana, Guillaume Perution Kihli, Olivier Rodriguez, Florent Tornil
8 New results
In this section, we present this year's results according to our main research lines, namely (1) ontology-mediated query answering and (2) reasoning with conflicts and decision support.
Note that we do not recall some results formally published in 2021 but already contained in last year's report (namely, [19], a paper at the conference IJCAI 2020 that was actually postponed to January 2021, and [13], a paper in the TPLP journal, which was online in 2020).
8.1 Ontology-mediated query answering
Participants: Jean-François Baget, David Carral, Michel Leclère, Marie-Laure Mugnier, Elie Najm, Guillaume Pérution-Kihli, Olivier Rodriguez, Federico Ulliana.
Ontology-mediated query answering is the issue of querying data through a conceptual layer formalized by knowledge (often called an ontology). From an abstract viewpoint, this gives rise to knowledge bases, composed of an ontology and a factbase (which can be seen as a database under an incomplete-data assumption). Answers to queries are logically entailed from the knowledge base, therefore taking into account inferences enabled by the ontology. In practice, the factbase may not be given but instead computed from (several) data sources. This gives rise to a three-level architecture comprising the ontology, the data sources and the mapping between the two, known as OBDA (Ontology-Based Data Access). The key idea in OBDA is that a user expresses queries at a conceptual level, thereby abstracting from the actual data storage. The factbase may be materialised by triggering the mappings or it may remain virtual. In the latter case, the system rewrites these queries into queries on the data via the mapping, while integrating ontological reasoning.
To represent knowledge and reason with it, we study an expressive rule language, known as existential rules (or datalog+, as this framework generalizes the deductive database language datalog). Existential rules are also able to express powerful relational mappings (i.e., Global-Local-As-View mappings), which provide great flexibility to integrate heterogeneous data sources. Hence, a uniform language can be used to express both knowledge and mappings.
A fundamental tool for reasoning with existential rules is a forward chaining process, called the chase: the rules are repeatedly applied to enrich the factbase, and query answering can then be solved by evaluating the query against the saturated factbase (as in a classical database system, i.e., forgetting the ontological knowledge). In an OBDA context the factbase may remain virtual, in which case query-rewriting-based techniques are used; however, the chase remains relevant as a reference tool to prove the correctness of the developed techniques. Indeed, the set of answers to a (conjunctive) query evaluated on the saturated factbase is exactly the set of answers to its rewriting evaluated on the data sources.
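The chase described above can be sketched in a few lines of Python. This is a toy, hedged illustration of a restricted-style chase and not Graal's implementation: the atom encoding, the `?` prefix for variables, the `_:n` naming of fresh nulls and the `max_rounds` bound (added because the chase need not terminate in general) are all assumptions of this example.

```python
from itertools import count


def is_variable(term):
    # Hypothetical convention for this sketch: variables start with "?".
    return term.startswith("?")


def homomorphisms(atoms, facts, mapping):
    """Yield every extension of `mapping` sending all `atoms` into `facts`."""
    if not atoms:
        yield dict(mapping)
        return
    (pred, args), rest = atoms[0], atoms[1:]
    for f_pred, f_args in facts:
        if f_pred != pred or len(f_args) != len(args):
            continue
        extended = dict(mapping)
        if all(extended.setdefault(a, v) == v if is_variable(a) else a == v
               for a, v in zip(args, f_args)):
            yield from homomorphisms(rest, facts, extended)


def chase(facts, rules, max_rounds=10):
    """Saturate `facts` with `rules`, given as (body, head) pairs of atom lists."""
    facts = set(facts)
    fresh = count()
    for _ in range(max_rounds):
        changed = False
        for body, head in rules:
            # Materialise the triggers first, since `facts` grows below.
            for trigger in list(homomorphisms(body, facts, {})):
                # Skip the trigger if its head is already satisfied.
                if next(homomorphisms(head, facts, trigger), None) is not None:
                    continue
                assignment = dict(trigger)
                for _, args in head:
                    for a in args:
                        if is_variable(a) and a not in assignment:
                            # Existential variable: invent a fresh null.
                            assignment[a] = f"_:n{next(fresh)}"
                for pred, args in head:
                    facts.add((pred, tuple(assignment.get(a, a) for a in args)))
                changed = True
        if not changed:
            break
    return facts


rules = [
    # researcher(x) -> exists t. memberOf(x, t) and team(t)
    ([("researcher", ("?x",))], [("memberOf", ("?x", "?t")), ("team", ("?t",))]),
    # memberOf(x, t) -> person(x)
    ([("memberOf", ("?x", "?t"))], [("person", ("?x",))]),
]
saturated = chase({("researcher", ("alice",))}, rules)

# The query is then evaluated on the saturated factbase, ignoring the rules.
answers = {h["?x"] for h in homomorphisms([("person", ("?x",))], saturated, {})}
print(answers)
```

On this toy knowledge base the query `person(?x)` has no answer on the initial factbase, while `alice` is an answer once the factbase is saturated.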
This year, we investigated three main issues:
- Optimisation of the chase by a new technique avoiding redundant computation;
- Optimisation of query answering in the presence of mappings by compiling reasoning into the mapping;
- Expressivity of existential rules as a query language.
8.1.1 Optimization of the chase
A general problem of the chase algorithm is that it usually performs many redundant computations since entailed formulas can often be obtained in many different ways from an input ontology. To counter this problem, we study Trigger Graphs (TGs), which are directed structures that guide the application of the rules during the computation of the chase with the goal of avoiding redundant computations. In summary, trigger graphs are used to efficiently compute the chase in the same manner that query plans are employed to efficiently execute SQL queries.
Our research naturally gives rise to an extensive theoretical and empirical study that seeks to answer when and how TGs can be computed and what the benefits of TGs are when applied to real-world knowledge bases. Our results include the development of algorithms that compute minimal TGs. Moreover, we implemented our approach in a new engine (a fork of Vlog), and our experiments show that it can be significantly more efficient than the chase, enabling us to materialize KBs with 17B facts in less than 40 min on commodity machines.
- Published at VLDB 2021. With Efthymia Tsamoura (Samsung AI Research, UK), Enrico Malizia (University of Bologna, Italy) and Jacopo Urbani (Vrije Universiteit Amsterdam, The Netherlands) [16].
8.1.2 Compilation of reasoning into the mapping
We consider an OBDA framework where the factbase is not materialized but virtually defined by the mapping and the data. An incoming query, expressed at the conceptual level, has to be rewritten, first with the ontology (to take reasoning into account), then with the mapping, which yields a query directly asked on the data. As rewriting is performed at query time, speeding up this process is a crucial issue. A practically efficient approach consists in compiling (part of, or all) the reasoning into the mapping. However, this approach has been implemented so far in very restricted cases of ontologies (basically, hierarchies of classes and properties) and of mappings (so-called Global-As-View mappings). Whether it can be developed in more expressive settings like existential rules is an open issue.
Our first aim is to characterize the sets of rules for which reasoning can be totally compiled into the mapping, which leads us to explore the following question: under which conditions on the rules can the chase be simulated in a single (breadth-first) step? We introduce the notion of “parallelisable sets” of existential rules, for which the chase can be computed in a single step from any factbase. Then, we characterize parallelisable rule sets in two different ways. One characterization relies on the behavior of rules during the chase: we prove that parallelisable rule sets are exactly those that are both bounded for the chase and belong to a novel class of rules, called pieceful; the pieceful class includes expressive, already known classes (in particular, frontier-guarded existential rules and datalog). Another characterization relies on the behavior of rules during rewriting, which leads us to study rule composition. These first results pave the way for further development of query answering techniques exploiting parallelisation and the specificities of existential rule dialects.
- Published at KR 2021. With Maxime Buron (University of Oxford, UK) and Michael Thomazo (VALDA Inria team) [18].
8.1.3 Expressivity of existential rules as a query language
At the core of contemporary logic-based knowledge representation is the concept of querying databases. The classical decision problem related to such knowledge-aware querying is Boolean query entailment. From an abstract point of view, a Boolean query identifies a class of databases D – those that satisfy the query, i.e., to which the query “matches”. This view allows us to define and investigate properties of (abstract) queries independently from the syntax used to specify them.
A very popular querying formalism is that of existential rules, also referred to as tuple-generating dependencies. In our work, we study which queries can be expressed using existential rule sets that terminate with respect to the restricted (aka standard) chase. In fact, we show that this fragment can express a Boolean query q if and only if q is decidable and closed under homomorphisms:
- The query q is decidable if there is a sound, complete, and terminating algorithm that determines if an input database D “matches” q.
- The query q is closed under homomorphisms if the following implication holds: if q “matches” a database D and D can be homomorphically embedded into another database D', then q also “matches” D'.
For instance, the Boolean query that detects if there is a cycle in a graph database is closed under homomorphisms: if a database D features a cycle and this database can be homomorphically embedded in another database D', then D' must also feature a cycle.
- Published at KR 2021 (Best paper award). With Camille Bourgaux and Michaël Thomazo (VALDA Inria team), and Markus Krötzsch and Sebastian Rudolph (TU Dresden) [17].
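The cycle example can be checked concretely with a small Python sketch. The databases `D` and `D_prime`, the mapping `h` and the function names are hypothetical and serve only to illustrate closure under homomorphisms on one instance; they prove nothing in general.

```python
def has_cycle(edges):
    """Boolean query: does the directed graph given by `edges` contain a cycle?"""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    state = {}  # node -> "open" (on the DFS stack) or "done"

    def visit(node):
        state[node] = "open"
        for nxt in graph[node]:
            if state.get(nxt) == "open":
                return True  # back edge: a cycle is closed
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in graph)


def is_homomorphism(mapping, source_edges, target_edges):
    """Check that `mapping` sends every edge of the source onto a target edge."""
    return all((mapping[u], mapping[v]) in target_edges for u, v in source_edges)


# D is a directed 4-cycle; D_prime contains a 2-cycle and one extra edge.
D = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")}
D_prime = {("x", "y"), ("y", "x"), ("y", "z")}
h = {"a": "x", "b": "y", "c": "x", "d": "y"}

print(has_cycle(D), is_homomorphism(h, D, D_prime), has_cycle(D_prime))

# By contrast, the query "the database is acyclic" is not closed under
# homomorphisms: a single edge is acyclic and maps into D_prime, which is not.
single_edge = {("a", "b")}
print(not has_cycle(single_edge),
      is_homomorphism({"a": "x", "b": "y"}, single_edge, D_prime))
```

Note that `h` is not injective: the 4-cycle of `D` is folded onto the 2-cycle of `D_prime`, yet the image still contains a cycle.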
8.2 Reasoning with conflicts and decision support
Participants: Pierre Bisquert, Patrice Buche, Madalina Croitoru, Martin Jedwabny, Rallou Thomopoulos.
In real-world applications, data is likely to generate inconsistencies in the presence of ontological knowledge, especially when it comes from several independent sources. In particular, data coming from different stakeholders, such as preferences and opinions, is generally conflicting. In order to use this data, for instance in a decision support setting, it is thus necessary to be able to reason in the presence of inconsistencies. In such a context, classical logical reasoning fails because any statement can be derived from a contradiction. Argumentation is one approach to this problem, where inference steps are represented as possibly conflicting arguments. A set of arguments is naturally associated with a graph in which arguments are nodes and conflicts are edges.
One interest of the argumentation framework is that it allows a variety of semantics to be defined for reasoning in the presence of inconsistencies, some of which have been shown to be semantically equivalent to repair-based approaches. In addition, this framework naturally benefits from the explanatory potential of graphs, which is particularly useful to help users better understand the results of the reasoning.
This year, we have continued our investigations of the management of knowledge bases in presence of conflicts. Our work has been centered around two main topics: the management of conflicts among factual information and the management of conflicts between user preferences.
- In the first context, we have investigated whether the source of information (in our case the agent that is stating the facts) makes a difference in the management techniques. In order to do this, we have proposed a framework that makes it possible to determine the agent that has contributed most to an exchange. Furthermore, we have also investigated whether the query itself might induce some changes in the management.
- Regarding the second context, we have placed ourselves in an application context that is gaining more and more traction: ethical scenarios. In these scenarios the ethical preferences of the user play a major role but it is not yet clear how to elicit these preferences. To handle this issue, we have studied the interest of case-based reasoning (CBR) and probabilistic inductive logic programming (PILP).
8.2.1 Conflicts among factual information
Regarding this context, we first investigated what goes around argumentation in itself: the speakers that enunciate arguments, and how their impact can be measured with regards to the importance of their arguments. Indeed, in the context of several agents exchanging arguments, understanding which agent is the most influencial may be a crucial information about which arguments should be assessed in priority. In order to study this often overlooked aspect of argumentation, we introduce the notion of authorship in the context of abstract argumentation framework and define new gradual semantics to account for the impact of the agents on the arguments. More precisely, we first define the notion of indirect impact of an argument on another argument: in a nutshell, we observe the difference in strength a particular argument A might undergo when another argument B is removed from the framework. This defines the impact of B on A. The impact of an agent on an argument C is then an aggregation of the impact of all her arguments, i.e. the argument of which she is the author. To explore this notion formally, we propose a set of desirable and intuitive principles that such a semantics of agent impact should satisfy in order to be rational, for instance that an agent without argument should not have an impact, and studied to which extent these principles are satisfied by two semantics instantiated from two popular gradual argumentation semantics.
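To make the notion of impact concrete, the following Python sketch computes it on a three-argument framework. Several choices are assumptions of this example and are not taken from the paper: the h-categoriser is used as the gradual semantics, impact is signed as "strength with B minus strength without B", and the aggregation over an agent's arguments is a plain sum.

```python
def h_categoriser(arguments, attacks, iterations=200):
    """Gradual semantics (assumed here): strength(a) = 1 / (1 + sum of the
    strengths of the attackers of a), approximated by fixed-point iteration."""
    strength = {a: 1.0 for a in arguments}
    for _ in range(iterations):
        strength = {
            a: 1.0 / (1.0 + sum(strength[b] for (b, t) in attacks if t == a))
            for a in arguments
        }
    return strength


def impact(b, a, arguments, attacks):
    """Strength of `a` in the full framework minus its strength once `b`
    (and every attack involving `b`) has been removed."""
    remaining = arguments - {b}
    reduced = {(x, y) for (x, y) in attacks if b not in (x, y)}
    return h_categoriser(arguments, attacks)[a] - h_categoriser(remaining, reduced)[a]


def agent_impact(agent, a, authors, arguments, attacks):
    """Aggregate (here: sum) the impact on `a` of all arguments authored by `agent`."""
    return sum(impact(b, a, arguments, attacks)
               for b in arguments if authors[b] == agent and b != a)


# Hypothetical exchange: C attacks B, and B attacks A.
arguments = {"A", "B", "C"}
attacks = {("B", "A"), ("C", "B")}
authors = {"A": "ann", "B": "bob", "C": "ann"}

print(impact("B", "A", arguments, attacks))  # negative: B weakens A
print(impact("C", "A", arguments, attacks))  # positive: C defends A
print(agent_impact("carol", "A", authors, arguments, attacks))  # no arguments, no impact
```

With these choices, an agent who authored no argument trivially has zero impact, which matches the example principle mentioned above.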
- Published at SGAI 2021. With Bruno Yun (University of Aberdeen) 20.
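The notion of impact described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report does not name the two gradual semantics that were studied, so the h-categoriser is used here as one well-known example; the sign convention (strength with B minus strength without B), the use of a sum as the aggregation, and all function names are hypothetical choices made for this sketch.

```python
from typing import Dict, Set, Tuple

Attack = Tuple[str, str]  # (attacker, target)


def h_categoriser(args: Set[str], attacks: Set[Attack], iters: int = 200) -> Dict[str, float]:
    """Approximate h-categoriser strengths by fixed-point iteration (assumed semantics)."""
    deg = {a: 1.0 for a in args}
    for _ in range(iters):
        deg = {
            a: 1.0 / (1.0 + sum(deg[b] for (b, t) in attacks if t == a))
            for a in args
        }
    return deg


def argument_impact(args: Set[str], attacks: Set[Attack], b: str, a: str) -> float:
    """Strength of `a` in the full framework minus its strength once `b` is removed."""
    if a == b:
        raise ValueError("impact of an argument on itself is not defined in this sketch")
    reduced_args = args - {b}
    reduced_attacks = {(x, y) for (x, y) in attacks if b not in (x, y)}
    full = h_categoriser(args, attacks)[a]
    reduced = h_categoriser(reduced_args, reduced_attacks)[a]
    return full - reduced


def agent_impact(args: Set[str], attacks: Set[Attack], authored: Set[str], target: str) -> float:
    """Aggregate (here: sum, an assumption) the impacts of an agent's arguments on `target`."""
    return sum(argument_impact(args, attacks, b, target) for b in authored if b != target)


# Toy framework: c attacks b, b attacks a.
args = {"a", "b", "c"}
attacks = {("b", "a"), ("c", "b")}
authors = {"alice": {"a"}, "bob": {"b"}, "carol": {"c"}, "dave": set()}
impacts = {name: agent_impact(args, attacks, owned, "a") for name, owned in authors.items()}
```

In this toy framework, bob's argument attacks `a` and therefore has a negative impact on it, carol's argument defends `a` indirectly and has a positive impact, and dave, who authored no argument, has no impact, in line with the principle mentioned above.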
On another note, we studied a novel way of handling inconsistency in a knowledge base using a notion of dynamic compartmentalisation. More precisely, given a query and a potentially inconsistent knowledge base, our system only loads into working memory the consistent knowledge that is most related to the query. In other words, when a query arrives, the system looks for the formulas in the knowledge base that share the most variables with it while avoiding inconsistencies, adds them to the working memory, and repeats the process with the newly added formulas until nothing more can be added or the predefined size limit of the working memory is reached. This approach has interesting complexity and experimental results compared to more classical ways of handling inconsistency, such as maxi-consistent sub-bases, while keeping an acceptable level of accuracy.
- Published at NMR 2021. With Florence Dupin de Saint Cyr - Bannay (University of Toulouse) 22.
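The loading loop described above can be illustrated with a small Python sketch. It is a simplification under several assumptions that are not taken from the published system: formulas are propositional clauses written as sets of string literals (`"p"`, `"-p"`), consistency is checked by brute-force enumeration of truth assignments, and formulas are added greedily one at a time, with ties broken by position in the knowledge base. All names are hypothetical.

```python
from itertools import product
from typing import FrozenSet, List, Set

Clause = FrozenSet[str]  # e.g. frozenset({"-p", "q"}) stands for (not p) or q


def variables(clause: Clause) -> Set[str]:
    """Variables occurring in a clause, with negation signs stripped."""
    return {lit.lstrip("-") for lit in clause}


def consistent(clauses: List[Clause]) -> bool:
    """Brute-force satisfiability test; only suitable for tiny examples."""
    names = sorted(set().union(*(variables(c) for c in clauses)))
    for values in product([False, True], repeat=len(names)):
        model = dict(zip(names, values))
        # A literal holds when the variable's value differs from "is negated".
        if all(any(model[l.lstrip("-")] != l.startswith("-") for l in c) for c in clauses):
            return True
    return False


def load_compartment(query_vars: Set[str], kb: List[Clause], max_size: int) -> List[Clause]:
    """Greedily load the formulas most related to the query, keeping memory consistent."""
    memory: List[Clause] = []
    known_vars = set(query_vars)
    remaining = list(kb)
    while len(memory) < max_size:
        best, best_overlap = None, 0
        for clause in remaining:
            overlap = len(variables(clause) & known_vars)
            # Strict ">" keeps the earliest clause of the knowledge base on ties.
            if overlap > best_overlap and consistent(memory + [clause]):
                best, best_overlap = clause, overlap
        if best is None:  # nothing related and consistent is left
            break
        memory.append(best)
        remaining.remove(best)
        known_vars |= variables(best)
    return memory


kb = [frozenset({"p"}), frozenset({"-p", "q"}), frozenset({"-q"}), frozenset({"r"})]
memory = load_compartment({"q"}, kb, max_size=3)
```

With this toy knowledge base and a query about `q`, the working memory ends up containing `(not p) or q` and `p`; the formula `not q` is left out because it would make the memory inconsistent, and `r` is left out because it shares no variable with the loaded formulas.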
8.2.2 Conflicts between user preferences
The rules and principles that govern human agents' decisions are often complex and hard to model. Consequently, it is difficult to design a system for an autonomous agent that is versatile enough to be ethical in several distinct contexts. We tried to circumvent this issue by introducing a mechanism based on CBR that chooses the most relevant ethical decision for a situation given similar past experiences. More precisely, situations and decisions are decomposed into sets of ethical features (benevolent, honest, etc.), which are used in a similarity function. A decision that has been chosen in past situations sharing the most ethical features with the current situation is then a good candidate. In that sense, modeling ethical behavior becomes a matter of providing enough relevant past situations, rather than giving a list of "dos and don'ts". This consequently requires large enough data sets of situations, and a classical solution is to crowd-source this data, essentially by asking many people for their preferences on what should be done in particular ethical situations. Obviously, such preferences are bound to be inconsistent, which cannot be handled in classical CBR. In order to solve this problem, we used PILP to induce probabilistic rules expressing statements such as "there is an 80% chance that it is valid to choose an action that is benevolent and just". This model has been implemented and run on a real-world inspired scenario in the context of ethical dilemmas.
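As a rough illustration of the feature-based retrieval step, the following Python sketch picks the decision of the most similar past case. The Jaccard coefficient is only an assumed similarity function, and the case base, feature names and decision labels are invented for the example. The second function is a plain frequency count over conflicting crowd-sourced judgments; it is not PILP and is only meant to show what a statement such as "80% chance" summarizes.

```python
from typing import FrozenSet, List, Optional, Tuple

Case = Tuple[FrozenSet[str], str]   # (ethical features of a past situation, decision taken)
Vote = Tuple[FrozenSet[str], bool]  # (ethical features of an action, judged valid?)


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Share of common features among all features (assumed similarity function)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def retrieve_decision(case_base: List[Case], situation: FrozenSet[str]) -> str:
    """Return the decision of the past case most similar to the current situation."""
    best_case = max(case_base, key=lambda case: jaccard(case[0], situation))
    return best_case[1]


def validity_rate(votes: List[Vote], features: FrozenSet[str]) -> Optional[float]:
    """Fraction of votes judging valid an action that has all the given features."""
    relevant = [ok for feats, ok in votes if features <= feats]
    return sum(relevant) / len(relevant) if relevant else None


# Hypothetical case base and current situation.
case_base = [
    (frozenset({"benevolent", "honest"}), "tell_truth_gently"),
    (frozenset({"loyal"}), "keep_secret"),
]
situation = frozenset({"benevolent", "honest", "just"})
decision = retrieve_decision(case_base, situation)

# Hypothetical, mutually inconsistent crowd-sourced judgments.
votes = [(frozenset({"benevolent", "just"}), True)] * 4 + [(frozenset({"benevolent", "just"}), False)]
rate = validity_rate(votes, frozenset({"benevolent", "just"}))
```

Here the current situation shares two of its three features with the first case and none with the second, so the first decision is retrieved; four of the five conflicting votes accept a benevolent and just action, which gives a rate of 0.8.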
Complementary work on decision support was carried out by our associate collaborators at INRAE, see 12, 23, 15.
9 Bilateral contracts and grants with industry
Participants: Madalina Croitoru.
CIFRE PhD grant with EDF Paris, to begin in January 2022 on the automatic generation of testing scenarios for the verification of complex systems - under the direction of Madalina Croitoru.
10 Partnerships and cooperations
10.1 International initiatives
R4Agri
Participants: Pierre Bisquert, David Carral, Marie-Laure Mugnier, Federico Ulliana.
- Title: Reasoning on Agricultural Data: Integrating metrics and qualitative perspectives
- Partner Institution(s): Inria; DFKI, Germany
- Date/Duration: 36 months, beginning postponed to 01/01/2022
- Additional info: AI tools supporting competitive and sustainable agriculture need to exploit highly diverse kinds of data and knowledge, from raw data provided by sensors to high-level expert knowledge. Taking digital agriculture as the targeted application domain, the overall goal of the R4Agri project is to provide a framework for reasoning about knowledge based on heterogeneous data, with a focus on multi-modal and multi-scale sensor data. The main challenges include context-dependent interpretation of sensor data, which involves reasoning about prior knowledge, and query answering techniques that exploit domain knowledge and accommodate the specificities of data sources in a flexible manner. The application potential in this field of world-wide societal and ecological impact will be demonstrated in realistic use cases.
10.2 International research visitors
10.2.1 Visits of international scientists
Other international visits to the team
- Visitor: Nofar Carmeli
- Status: Postdoctoral Researcher
- Institution of origin: ENS Paris; Valda Inria Research Team
- Country: France
- Dates: three days, November 2021
- Context of the visit: Presenter at “Decidable fragments of Horn First-Order Logic” seminar, application for a CRCN researcher position
- Visitor: Lucía Gómez Álvarez
- Status: Postdoctoral Researcher
- Institution of origin: TU Dresden
- Country: Germany
- Dates: two months, December 2021 & January 2022
- Context of the visit: Seminar, research stay
- Visitor: Tim Lyon
- Status: Postdoctoral Researcher
- Institution of origin: TU Dresden
- Country: Germany
- Dates: one week, November 2021
- Context of the visit: Presenter at “Decidable fragments of Horn First-Order Logic” seminar, research collaboration.
- Visitor: Piotr Ostropolski-Nalewaja
- Status: Postdoctoral Researcher
- Institution of origin: TU Dresden
- Country: Germany
- Dates: one week, November 2021
- Context of the visit: Presenter at “Decidable fragments of Horn First-Order Logic” seminar, research collaboration.
- Visitor: Sebastian Rudolph
- Status: Professor
- Institution of origin: TU Dresden
- Country: Germany
- Dates: two weeks, November 2021
- Context of the visit: Seminar, member of Federico Ulliana's HDR jury, research collaboration
- Visitor: Michaël Thomazo
- Status: CRCN Researcher
- Institution of origin: ENS Paris; Valda Inria Research Team
- Country: France
- Dates: three days, November 2021
- Context of the visit: Presenter at “Decidable fragments of Horn First-Order Logic” seminar, research collaboration.
- Visitor: Jacopo Urbani
- Status: Professor
- Institution of origin: Vrije Universiteit Amsterdam
- Country: Netherlands
- Dates: two days, November 2021
- Context of the visit: Presenter at “Decidable fragments of Horn First-Order Logic” seminar, research collaboration.
10.3 European initiatives
10.3.1 FP7 & H2020 projects
GLOPACK (H2020, June 2018 - July 2022)
Participants: Pierre Bisquert, Patrice Buche, Madalina Croitoru.
GLOPACK is led by the University of Montpellier (IATE laboratory). It proposes a cutting-edge strategy for addressing the technical and societal barriers to the spread of innovative eco-efficient packaging able to reduce the environmental footprint of food. Focusing on accelerating the transition to a circular economy, GLOPACK aims to support users’ and consumers’ access to innovative packaging solutions that enable the reduction and circular management of agro-food waste, including packaging waste. Validation of the solutions, including compliance with legal requirements, economic feasibility and environmental impact, will push the tested technologies and the related decision-making tool to TRL 7 for rapid and easy market uptake, thereby strengthening European companies’ competitiveness in an increasingly globalised and connected world.
10.4 National initiatives
CQFD (ANR PRC, Jan. 2019-Dec. 2024)
Participants: Jean-François Baget, David Carral, Michel Leclère, Marie-Laure Mugnier, Guillaume Pérution-Kihli, Olivier Rodriguez, Florent Tornil, Federico Ulliana.
CQFD (Complex ontological Queries over Federated heterogeneous Data), coordinated by Federico Ulliana (GraphIK), involves participants from Inria Saclay (CEDAR team), Inria Paris (VALDA team), Inria Nord Europe (SPIRALS team), IRISA, LIG, LTCI, and LaBRI. The aim of this project is to tackle two crucial challenges in OMQA (Ontology-Mediated Query Answering): heterogeneity, that is, the possibility of dealing with multiple types of data sources and database management systems, and federation, that is, the possibility of cross-querying a collection of heterogeneous data sources. By bringing together 8 different partners in France, this project aims at consolidating a national community of researchers around the OMQA issue.
Convergence institute #DigitAg (2017-2023)
Participants: Jean-François Baget, Patrice Buche, Madalina Croitoru, Marie-Laure Mugnier, Elie Najm, Rallou Thomopoulos, Federico Ulliana.
Located in Montpellier, #DigitAg (for Digital Agriculture) gathers 17 founding members: research institutes, including Inria, the University of Montpellier and higher-education institutes in agronomy, transfer structures and companies. Its objective is to support the development of digital agriculture. GraphIK is involved in this project on the issues of designing data and knowledge management systems adapted to agricultural information systems, and of developing methods for integrating different types of information and knowledge (generated from data, experts, models). A PhD thesis (Elie Najm) investigates knowledge representation and reasoning for the design of new agroecological systems, in collaboration with the research laboratory ABSys - Biodiversified Agrosystems (formerly UMR SYSTEM).
iCODA (Inria Project Lab, 2017-2021)
Participants: Jean-François Baget, Michel Chein, Alain Gutierrez, Marie-Laure Mugnier.
The iCODA project (Knowledge-mediated Content and Data Interactive Analytics—The case of data journalism), coordinated by Guillaume Gravier and Laurent Amsaleg (LINKMEDIA), brings together four Inria teams (LINKMEDIA, CEDAR, ILDA and GraphIK) as well as three press partners: Ouest France, Le Monde (les décodeurs) and AFP.
Taking data journalism as an emblematic use-case, the goal of the project is to develop the scientific and technological foundations for knowledge-mediated user-in-the-loop big data analytics jointly exploiting data and content, and to demonstrate the effectiveness of the approach in realistic, high-visibility use-cases.
11 Dissemination
11.1 Promoting scientific activities
11.1.1 Scientific events: organisation
An international workshop has been co-organized by a member of the team:
- GKR 2020 (Graph Structures for Knowledge Representation and Reasoning), co-located with the virtual ECAI 2020, 5 September 2020 - co-organized by Madalina Croitoru.
Two scientific events have been co-organized within LIRMM cross-cutting axes:
11.1.2 Scientific events: paper selection
Chair of conference program committees
Marie-Laure Mugnier is program co-chair of the 35th International Workshop on Description Logics (DL 2022), part of the Federated Logic Conference (FLoC 2022). DL is the main international event of the description logic research community.
Scientific chair
Team members have acted as area chairs in two major conferences:
- KR 2021 (18th International Conference on Principles of Knowledge Representation and Reasoning): Area chair - Marie-Laure Mugnier
- AAMAS 2022 (21st International Conference on Autonomous Agents and Multiagent Systems): Area chair - Madalina Croitoru
Member of the conference program committees
- International
  - AAAI 2021 (35th AAAI Conference on Artificial Intelligence): PC member - David Carral
  - IJCAI 2021 (30th International Joint Conference on Artificial Intelligence): senior PC members - David Carral, Marie-Laure Mugnier
  - KR 2021 (18th International Conference on Principles of Knowledge Representation and Reasoning): PC members - Pierre Bisquert, David Carral
  - ICCS 2021 (26th International Conference on Conceptual Structures): PC member - Pierre Bisquert
  - DL 2021 (34th International Workshop on Description Logics): PC member - David Carral
- National
  - EGC 2021 - Conférence sur l’Extraction et Gestion des Connaissances: PC member - Federico Ulliana
  - JIAF 2021 (Journées d'Intelligence Artificielle Fondamentale): PC member - Marie-Laure Mugnier
11.1.3 Journal
Reviewer for Artificial Intelligence Journal (AIJ) - Jean-François Baget
Reviewer for Transactions on Computational Logic (TOCL) - Marie-Laure Mugnier
11.1.4 Invitations
- Pierre Bisquert - Invited talk Argumentation et décision dans le contexte agronomique - Journée D2K - Labex Digicosme, 23/11/2021
- David Carral - Invited participant to the Dagstuhl seminar Extending the Synergies Between SAT and Description Logics, September 2021
- Michel Chein - Invited talk L’Intelligence Artificielle symbolique - LIRMM AI day, 07/05/2021
- Elie Najm - Invited talk Représentation des connaissances et des raisonnements en agronomie systémique pour l'innovation en agroécologie - INRAE In-Ovive Seminar, 21/09/2021
- Elie Najm - Invited poster Reasoning on data for innovation in agroecology - the DFKI-Inria-INRAE workshop on digital agriculture, 6-8/09/2021
- Federico Ulliana - Invited talk Knowledge-based Data-Management with Graal V2 - INRAE Semantic Linked Data Seminar, 12/10/2021.
11.1.5 Scientific expertise
- Michel Leclère was a member of the jury for the awarding of PhD grants in Computer Science within the University of Montpellier Doctoral School.
- Marie-Laure Mugnier and Federico Ulliana were members of a recruitment committee for an Associate Professor position (section 27) at Montpellier IUT / LIRMM.
- Marie-Laure Mugnier was a member of a recruitment committee for two Associate Professor positions (section 27) at the University of Lens / CRIL.
- David Carral reviewed a project proposal for the Austrian Science Fund (FWF).
- Jean-François Baget reviewed a project proposal for the ANR Generic Call for Proposals 2021.
11.1.6 National and local administrative responsibilities
- Madalina Croitoru has been deputy member of the CNU section 27 (Computer Science) since September 2019.
- Madalina Croitoru has been deputy director of the Computer Science Department at the Faculty of Science, University of Montpellier since September 2021.
- Madalina Croitoru has been in charge of international relations for the Computer Science department of the Science Faculty since September 2019.
- Marie-Laure Mugnier has been the president of the Section Committee 27 (Computer Science) of the University of Montpellier since July 2021.
- Marie-Laure Mugnier has been a member of the Council of the Scientific Department MIPS (Mathematics Informatics Physics and Systems) of the University of Montpellier since 2016.
- Federico Ulliana was the head of the curriculum “Data, Knowledge and Natural Language Processing” (DECOL, about 30 students) in the Master of Computer Science, from July 2017 to July 2021.
11.2 Teaching - Supervision - Juries
11.2.1 Teaching
The five faculty members do an average of 200 teaching hours per year at the Computer Science department of the Science Faculty. They are in charge of courses in Logics (Licence), Databases (Master), Artificial Intelligence (M), Knowledge Representation and Reasoning (M), Theory of Data and Knowledge Bases (M), Social and Semantic Web (M) and Multi-Agent Systems (M). Concerning full-time researchers in 2021, Jean-François Baget and David Carral both teach in the Computer Science Master (for respectively about 30h and 15h).
11.2.2 Supervision
The following PhD theses are in progress:
- Olivier Rodriguez, “Querying key-value store under semantic constraints”. Supervisors: Federico Ulliana and Marie-Laure Mugnier. Started February 2019.
- Martin Jedwabny, “Argumentation and ethical decision making”. Supervisors: Madalina Croitoru and Pierre Bisquert. Started October 2019.
- Elie Najm, “Knowledge Representation and Reasoning for innovating agroecological systems”. Supervisors: Marie-Laure Mugnier, Christian Gary (INRAE, UMR ABSys), Jean-François Baget and Raphaël Metral (Supagro, UMR ABSys). Started October 2019.
- Guillaume Pérution-Kihli, “Des données aux connaissances : un cadre unifié pour l’intégration sémantique de données hétérogènes et l’amélioration de leur qualité”. Supervisors: Michel Leclère and Marie-Laure Mugnier. Started September 2020.
This year, the team has welcomed five interns, and supervised a research initiation for a sixth student:
- Mael Abily (École Normale Supérieure de Lyon, 2 months) developed an efficient alternative algorithm to compute the core chase for Description Logic ontologies. Supervisors: David Carral and Jean-François Baget.
- Riadh Guemache (Master 2 U. Montpellier, 5 months) proposed a declarative way to add user-defined functions and predicates in existential rules. A prototype was designed for the current version of the Graal software. Supervisors: Jean-François Baget and Federico Ulliana.
- Lucas Larroque (Ecole normale supérieure Paris-Saclay, 2 months) studied normalization procedures for existential rules that preserve chase termination. Supervisors: David Carral and Marie-Laure Mugnier. This work continues during the university year 2021/22 in the framework of a module at ENS Paris dedicated to research, with the co-supervision of Michaël Thomazo.
- Tom Salembien (Telecom SudParis, 2 months) used Py4J to build a Python programming interface to the Graal platform, allowing Python programmers to access Graal in a simple and intuitive way. Supervisor: Jean-François Baget.
- Quentin Yeche (Master 1 U. Montpellier, 2 months) worked on the development of a Java library for automating the experimental evaluation of reasoning systems. Supervisors: Federico Ulliana and Florent Tornil.
- Sebastien Bonduelle (Master 1 ENS Rennes, 1 semester) continued his work on rewriting disjunctive queries. Supervisor: Jean-François Baget.
11.2.3 Juries
Federico Ulliana was an examiner for Muideen Lawal's PhD defense (April 2021, Grenoble University) and Pawel Guzewic's PhD defense (October 2021, École Polytechnique).
11.2.4 Popularization
Michel Chein made two interventions aimed at a non-specialist audience:
- Radio program at RCF (Radio Chrétienne de France): “L'intelligence artificielle participe-t-elle à la construction de l'homme dans sa plénitude, peut-elle rendre l’homme meilleur ?” (28 April 2021).
- “Depuis la dispute au XVIIIe siècle entre Jean-Charles de Borda et Nicolas de Condorcet, les méthodes concernant le choix social ont-elles évolué ?” Talk at Académie des Sciences et Lettres de Montpellier (21 June 2021); to appear in Bull. Acad. Sc. Lett. Montp., vol. 52 (2021).
12 Scientific production
12.1 Major publications
- 1 (inproceedings) Answering Conjunctive Regular Path Queries over Guarded Existential Rules. IJCAI: International Joint Conference on Artificial Intelligence, Melbourne, Australia, August 2017
- 2 (article) On Rules with Existential Variables: Walking the Decidability Line. Artificial Intelligence 175(9-10), March 2011, 1620-1654. URL: http://hal.inria.fr/lirmm-00587012/en
- 3 (inproceedings) Ontology-Mediated Query Answering for Key-Value Stores. IJCAI: International Joint Conference on Artificial Intelligence, Melbourne, Australia, August 2017
- 4 (article) Ontology-Mediated Queries: Combined Complexity and Succinctness of Rewritings via Circuit Complexity. Journal of the ACM (JACM) 65(5), September 2018, 1-51
- 5 (inproceedings) Capturing Homomorphism-Closed Decidable Queries with Existential Rules. KR 2021 - 18th International Conference on Principles of Knowledge Representation and Reasoning, Virtual, Vietnam, November 2021, 141-150
- 6 (inproceedings) Oblivious and Semi-Oblivious Boundedness for Existential Rules. IJCAI 2019 - International Joint Conference on Artificial Intelligence, Macao, China, August 2019
- 7 (inproceedings) Ontology-Based RDF Integration of Heterogeneous Data. EDBT/ICDT 2020 - 23rd International Conference on Extending Database Technology, Copenhagen, Denmark, March 2020
- 8 (inproceedings) On a Flexible Representation for Defeasible Reasoning Variants. AAMAS: Autonomous Agents and MultiAgent Systems, Stockholm, Sweden, July 2018, 1123-1131
- 9 (article) Sound, Complete and Minimal UCQ-Rewriting for Existential Rules. Semantic Web journal 6(5), 2015, 451-475
- 10 (article) Choice of environment-friendly food packagings through argumentation systems and preferences. Ecological Informatics 48, November 2018, 24-36
- 11 (inproceedings) Inconsistency Measures for Repair Semantics in OBDA. IJCAI: International Joint Conference on Artificial Intelligence, Stockholm, Sweden, July 2018, 1977-1983
12.2 Publications of the year
International journals
International peer-reviewed conferences
Conferences without proceedings
Scientific book chapters
Edition (books, proceedings, special issue of a journal)
Doctoral dissertations and habilitation theses