

Section: Overall Objectives

Context and overall goal of the project

Building upon the expertise in machine learning (ML) and optimization of the Tao team, the Tau project will tackle some under-specified challenges behind the New Artificial Intelligence wave. The simultaneous advent of massive data and massive computational power, blurring the boundaries between data, structure, knowledge and common sense, seemingly makes it possible to fulfill all promises of the good old AI, now or soon.

This makes NewAI under-specified in three respects. A first dimension regards the relationships between AIs and human beings. The necessary conditions for AIs to be accepted by mankind and/or contribute to the common good are yet to be formally defined; it is hard to believe that a general and computable definition of "ethical behavior" can be set once and for all. Some of these necessary conditions (explainable and causal modeling; unbiased data and models; model certification) can nevertheless be cast as ambitious and realistic goals for public research.

A second dimension regards the relationships between AI, data and knowledge. In closed worlds, AIs can manage and acquire sufficient data to reach human-level performance from scratch [81]. In open worlds, however, prior knowledge is used in various ways to overcome the lack of direct interactions with the world, e.g. through i) exploiting domain-dependent data invariances in intension or in extension (ranging from convolution to domain augmentation); ii) taking advantage of the low-rank structure (generative learning) or known properties (equivariant learning) of the observed data; iii) leveraging diverse domains and datasets that are presumed to be related to each other (domain adaptation; multi-task learning). A general and open question is how available prior knowledge can best be leveraged by an AI, all the more so when domains with small to medium-size data are considered.

A third dimension regards the intrinsic limitations of AI in terms of information theory. Long-established theories, e.g. rooted in Occam's razor, currently hardly account for the practical leaps of deep learning, where the solution dimension exceeds the input dimension. Beyond trial and error, a long-term goal is to characterize the learning landscape w.r.t. order parameters to be defined, and to estimate a priori the regions of problem instances where learning accurate models is likely, possible or unlikely.

The above under-specified AI issues define three core research pillars (Section 3), examining three interdependent aspects of AI:

I. The first pillar aims to answer the question of what it means for an AI to be good, and how to build such AIs. More specifically, our goal is to advance the state of the art concerning robust learning (with respect to adversarial attacks), causal modeling (aimed at supporting explanations and prescriptions), and unbiased models in the sense of prescribed neutrality constraints (including the assessment and repair of the data).

II. The second pillar tackles the "innate vs acquired" question: how to best combine available human knowledge and agnostic machine learning. Tau will examine this question focusing on domains with spatial and temporal multi-scale structure, as is pervasive in natural sciences (where domain knowledge is expressed using PDEs, or through powerful compact representations as in signal processing), taking advantage of the multidisciplinary expertise and scientific collaborations of the Tau members.

III. The third pillar aims to understand the learning landscape. In the short term, it tackles the so-called Auto- issue of automatically selecting and configuring an algorithm portfolio for a problem instance. This issue governs the knowledge transfer from research labs to industry [92], [91], all the more so as massive computational resources are at stake. In the medium term, our goal is to integrate the hyper-parameters and model structure in the learning criteria, using information theory and/or bilevel programming [93]. In the long term, our goal is to establish a phase diagram of the learning landscape, by i) determining order parameters; ii) relating the different regions defined along these order parameters to the quality of the optimal solution and to the probability of finding a good approximation thereof. These goals are aligned with the unique scientific expertise of Tau in statistical physics and in information theory, and benefit from our decade-long expertise in Auto-.

The above research pillars will draw inspiration from, and be validated on, three application topics (Section 4):

1. Energy management encompasses a variety of scientific problems related to research pillars I (fair learning, privacy-compliant modelling, safety-related guarantees) and II (spatio-temporal multi-scale modelling, distributional learning). It is also a strategic application for the planet, where Tau benefits from the Tao expertise and the long-established relationships with Artelys (ILab Metis) and RTE.

2. Computational Social Sciences offer questions, and methodological lessons about how to address these questions in a spirit of common decency, along research pillar I. Ongoing studies at Tau include the learning and randomized assessment of prescriptive models for Human Resources (hiring and vocational studies; quality of life at work and economic performance) and nutrition habits (in relation with social networks and health), where i) learned models must be unbiased although data are undoubtedly biased; ii) prior knowledge must be accounted for and the interpretation of the learned models is mandatory; iii) causal modelling is key as models are deployed for prescription and self-fulfilling prophecies must be avoided at all costs [131].

3. Optimal data-driven design considers several physical/simulated phenomena, ranging from high-energy physics to space weather, from population biology to medical imaging, from signal processing to certification of autonomous vehicle controllers, with: i) medium-size data; ii) extensive prior knowledge, notably concerning the symmetries and properties of the sought models; iii) computationally expensive simulators. All three characteristics are relevant to pillars II and III.