2025 Activity report, Project-Team DATASHAPE
RNSR: 201622050C
- Research centers: Inria Saclay Centre at Université Paris-Saclay, Inria Centre at Université Côte d'Azur
- In partnership with: Université Paris-Saclay, CNRS
- Team name: Understanding the shape of data
- In collaboration with: Laboratoire de mathématiques d'Orsay de l'Université de Paris-Saclay (LMO)
Creation of the Project-Team: 2020 October 01
Keywords
Computer Science and Digital Science
- A3. Data and knowledge
- A3.4. Machine learning and statistics
- A7.1. Algorithms
- A8. Mathematics of computing
- A8.1. Discrete mathematics, combinatorics
- A8.3. Geometry, Topology
- A9. Artificial intelligence
Other Research Topics and Application Domains
- B1. Life sciences
- B2. Digital health
- B5. Industry of the future
- B9. Society and Knowledge
- B9.5. Sciences
1 Team members, visitors, external collaborators
Research Scientists
- Frederic Chazal [Team leader, INRIA, Senior Researcher, HDR]
- Jean-Daniel Boissonnat [INRIA, Emeritus, HDR]
- Mathieu Carrière [INRIA, Researcher]
- David Cohen-Steiner [INRIA, Researcher]
- Marc Glisse [INRIA, Researcher]
- Clément Maria [INRIA, Researcher]
- Nina Lisann Otter [INRIA, ISFP]
- Mathijs Wintraecken [INRIA, ISFP]
Faculty Members
- Gilles Blanchard [UNIV PARIS SACLAY, Associate Professor]
- Charly Boricaud [UNIV PARIS SACLAY, Professor, until Sep 2025]
- Blanche Buet [UNIV PARIS SACLAY, Associate Professor]
- Remi Leclercq [UNIV PARIS SACLAY, Professor]
- Pierre Pansu [UNIV PARIS SACLAY, Emeritus]
Post-Doctoral Fellows
- Daniele Cannarsa [INRIA, Post-Doctoral Fellow, until Aug 2025]
- Francesco Conti [INRIA, Post-Doctoral Fellow]
- Ondrej Draganov [INRIA, Post-Doctoral Fellow, from Apr 2025]
- Corentin Lunel [INRIA, Post-Doctoral Fellow, until Sep 2025]
- Renata Turkes [INRIA, Post-Doctoral Fellow, until Aug 2025]
PhD Students
- Myriam Frikha [ERICSSON]
- Hugo Henneuse [UNIV PARIS SACLAY, until Oct 2025]
- Anna Hollands [UNIV PARIS SACLAY, from Oct 2025]
- Antonio Lage De Sousa Leitao [Scuola Normale Superiore di Pisa, Italy, from Nov 2024]
- Henrique Lovisi Ennes [UNIV COTE D'AZUR, from May 2025]
- Rohit Roy [INRIA, from Nov 2025]
- Alejandro Saldarriaga [DMA-ENS, from Oct 2025]
- Jérôme Taupin [Université Paris-Saclay, from Sep 2025]
Technical Staff
- Vincent Rouvreau [INRIA, Engineer]
- Hannah Schreiber [INRIA, Engineer]
Interns and Apprentices
- Ludo Andrianirina Mamisoa [UNIV COTE D'AZUR, Intern, from Mar 2025 until Aug 2025]
- Nestor Antunano Cabrera [INRIA, Intern, from Apr 2025 until Aug 2025]
- Madhav Cherupilil Sajeev [UNIV COTE D'AZUR, Intern, from Apr 2025 until Aug 2025]
- Alberto Conforti [INRIA, Intern, from Mar 2025 until Aug 2025]
- Beatriz Evelbauer [ENSTA, from May 2025 until Aug 2025]
- Anna Hollands [INRIA, Intern, from Apr 2025 until Sep 2025]
- Ilian Riveiro [Université Paris-Saclay, Intern, from Mar 2025 until Aug 2025]
- Aurora Rivet [UNIV COTE D'AZUR, Intern, from Jun 2025 until Aug 2025]
- Jérôme Taupin [ENS Paris, until Aug 2025]
Administrative Assistants
- Sophie Honnorat [INRIA]
- Laetitia Jubely [INRIA, from May 2025]
Visiting Scientists
- Marzieh Eidi [MPI MiS, Germany, from Apr 2025 until May 2025]
- Yuri Gardinazzi [UNIV TRIESTE, from Nov 2025]
- Clément Levrard [UNIV PARIS, from May 2025 until May 2025]
- Javier Perera Lago [Univ Séville, from May 2025 until Jun 2025]
External Collaborators
- Bertrand Michel [CENTRALE NANTES]
- Martin Royer [SYSTEMX]
2 Overall objectives
During the last two decades, building on solid theoretical and algorithmic bases, geometric inference and computational topology have experienced important developments towards data analysis. New mathematically well-founded theories gave birth to the field of Topological Data Analysis (tda), which is now arousing interest from both academia and industry. Although one can trace geometric approaches to data analysis quite far back, tda really started as a field with the pioneering works of H. Edelsbrunner et al. and G. Carlsson et al. in persistent homology at the beginning of the century. tda is mainly motivated by the idea that topology and geometry provide a powerful approach to infer robust qualitative, and sometimes quantitative, information about the structure of data. It aims at providing mathematical results and methods to infer, analyze and exploit complex data (point clouds, graphs, images, 3D shapes, time series...). It also intends to give access to robust and efficient data structures and algorithms that represent these data and are amenable to precise analysis.
The overall objective of DataShape is three-fold:
- to settle the mathematical, statistical and algorithmic foundations of tda, and, more generally to contribute to the development of topological and geometric approaches in Machine Learning and AI;
- to develop a new family of well-founded and efficient data structures, algorithms and methods to uncover and exploit the geometry of data through the development of a state-of-the-art and easy-to-use open source software;
- to disseminate and promote tda research and outcomes among the data science community through collaborations with other domains of science and with industry.
The approach of DataShape relies on the conviction that, to reach these objectives, combining statistical, topological/geometric and computational approaches in a common framework is mandatory. For that purpose, DataShape became a joint team with the Laboratoire de Mathématiques d'Orsay in 2020 and now gathers a wide variety of expertise, ranging from fundamental mathematics to software development and industrial applications. The team also considers that tda needs to be combined with other data science approaches and tools, in particular statistical learning, to lead to successful real applications. Significant efforts have been made during the evaluation period to develop several long-term industrial research collaborations in data science and AI.
The research program of DataShape is organized around four strongly correlated axes reflecting our will to address tda challenges in a global and unified framework.
The first axis focuses on the algorithmic aspects of tda and geometric inference as well as the mathematical foundations of the fields. Fundamental problems are the construction, processing and analysis of discrete representations of complex and possibly high dimensional shapes.
The second axis is dedicated to the statistical aspects of tda. It studies, from a statistical perspective, the properties of topological information inferred from data, and intends to propose new models and approaches for the development of tda in well-founded probabilistic and statistical settings. This axis also includes the analysis and development of general-purpose statistical learning approaches and tools that are currently active in the community and relevant to DataShape's scientific goals.
The third axis is driven by the problems raised by the use of topological and geometric approaches in machine learning. It aims at better understanding the role of topological and geometric structures in machine learning problems and at applying tda tools to develop specialized topological approaches to be used in combination with other machine learning methods.
The fourth axis is dedicated to software development and experimental research, mainly through the GUDHI platform. GUDHI is intended to provide a high quality state-of-the-art implementation of data structures and algorithms dedicated to tda through an easy-to-use open source software.
Each DataShape member is involved in several research axes ensuring strong connections and interactions between them. Last, although the above 4 axes concentrate the main research activities of the team, DataShape always remains open and encourages its members to explore new directions and approaches related to geometric and topological methods in data analysis and machine learning. The past experience of the team has shown that such a strategy is often very fruitful and may lead to innovative and new research directions.
3 Research program
3.1 Algorithmic aspects and new mathematical directions for topological and geometric data analysis
tda requires constructing and manipulating appropriate representations of complex and high dimensional shapes. A major difficulty comes from the fact that the complexity of data structures and algorithms used to approximate shapes grows rapidly as the dimensionality increases, which makes them intractable in high dimensions. We focus our research on simplicial complexes, which offer a convenient representation of general shapes and generalize graphs and triangulations. Our work includes the study of simplicial complexes with good approximation properties and the design of compact data structures to represent them.
In low dimensions, effective shape reconstruction techniques exist that can provide precise geometric approximations very efficiently and under reasonable sampling conditions. Extending those techniques to higher dimensions, as is required in the context of tda, is problematic since almost all methods in low dimensions rely on the computation of a subdivision of the ambient space. A direct extension of those methods would immediately lead to algorithms whose complexities depend exponentially on the ambient dimension, which is prohibitive in most applications. A first direction to bypass the curse of dimensionality is to develop algorithms whose complexities depend on the intrinsic dimension of the data (which most of the time is small although unknown) rather than on the dimension of the ambient space. Another direction is to resort to cruder approximations that only capture the homotopy type or the homology of the sampled shape. The recent theory of persistent homology provides a powerful and robust tool to study the homology of sampled spaces in a stable way.
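As a minimal illustration of how persistent homology summarizes a sampled shape, the sketch below computes only the degree-0 part (connected components) of a Rips-type filtration on a point cloud, using a union-find structure. The function name `h0_persistence`, the toy point cloud, and the convention that an edge enters the filtration at its length are assumptions made for this example; it is not the implementation used in the team's software.

```python
from itertools import combinations
from math import dist


def h0_persistence(points):
    """Return the scales at which connected components merge.

    Every point is a component born at scale 0. Processing edges by
    increasing length, each edge that joins two distinct components
    kills one of them at that length (assumed convention: an edge
    appears in the filtration at scale equal to its length).
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


# Toy point cloud: two well-separated clusters.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 0.0), (11.0, 0.0)]
deaths = h0_persistence(points)
print(deaths)
```

On this toy input, three components die at scale 1 and one at scale 9; the large gap between these values is what signals the two clusters, and one component never dies.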
3.2 Statistical aspects of topological and geometric data analysis
The wide variety of ever larger available data sets, often corrupted by noise and outliers, requires considering the statistical properties of their topological and geometric features and proposing new relevant statistical models for their study.
There exist various statistical and machine learning methods intended to uncover the geometric structure of data. Manifold learning and dimensionality reduction approaches generally do not allow one to assert the relevance of the inferred topological and geometric features and are not well suited to the analysis of complex topological structures. Beyond these, set estimation methods aim to estimate, from random samples, a set around which the data is concentrated. In these methods, which include support and manifold estimation, principal curves/manifolds and their various generalizations, to name a few, the estimation problems are usually considered under losses, such as the Hausdorff distance or the symmetric difference, that are not sensitive to the topology of the estimated sets, which prevents these tools from directly inferring topological or geometric information.
Regarding purely topological features, the statistical estimation of the homology or homotopy type of compact subsets of Euclidean spaces has only been considered recently, most of the time under the quite restrictive assumption that the data are randomly sampled from smooth manifolds.
In a more general setting, with the emergence of new geometric inference tools based on the study of distance functions and algebraic topology tools such as persistent homology, computational topology has recently seen an important development offering a new set of methods to infer relevant topological and geometric features of data sampled in general metric spaces. The use of these tools remains widely heuristic and until recently there were only a few preliminary results establishing connections between geometric inference, persistent homology and statistics. However, this direction has attracted a lot of attention over the last three years. In particular, stability properties and new representations of persistent homology information have led to very promising results to which the DataShape members have significantly contributed. These preliminary results open many perspectives and research directions that need to be explored.
Our goal is to build on our first statistical results in tda to develop the mathematical foundations of Statistical Topological and Geometric Data Analysis. Combined with the other objectives, our ultimate goal is to provide a well-founded and effective statistical toolbox for the understanding of topology and geometry of data.
3.3 Topological and geometric approaches for machine learning
This objective is driven by the problems raised by the use of topological and geometric approaches in machine learning. The goal is both to use our techniques to better understand the role of topological and geometric structures in machine learning problems and to apply our tda tools to develop specialized topological approaches to be used in combination with other machine learning methods.
3.4 Experimental research and software development
We develop a high quality open source software platform called gudhi which is becoming a reference in geometric and topological data analysis in high dimensions. The goal is not to provide code tailored to the numerous potential applications but rather to provide the central data structures and algorithms that underlie applications in geometric and topological data analysis.
The development of the gudhi platform also serves to benchmark and optimize new algorithmic solutions resulting from our theoretical work. Such development necessitates a whole line of research on software architecture and interface design, heuristics and fine-tuning optimization, robustness and arithmetic issues, and visualization. We aim at providing a full programming environment following the same recipes that made up the success story of the cgal library, the reference library in computational geometry.
Some of the algorithms implemented on the platform will also be interfaced to other software platforms, such as the R software for statistical computing, and languages such as Python in order to make them usable in combination with other data analysis and machine learning tools. A first attempt in this direction has been done with the creation of an R package called TDA in collaboration with the group of Larry Wasserman at Carnegie Mellon University (Inria Associated team CATS) that already includes some functionalities of the gudhi library and implements some joint results between our team and the CMU team. A similar interface with the Python language is also considered a priority. To go even further towards helping users, we will provide utilities that perform the most common tasks without requiring any programming at all.
4 Application domains
Our work is mostly of a fundamental mathematical and algorithmic nature but finds a variety of applications in data analysis, e.g., in material science, biology, sensor networks, 3D shape analysis and processing, to name a few.
More specifically, DataShape has developed and is still developing a strong expertise on new TDA methods for Machine Learning and Artificial Intelligence for complex data and (complex) time-dependent data. This includes, for example:
- domain adaptation problems for time series (PhD of Myriam Frikha with Ericsson),
- robust climate modelling (collaboration with the University of Oxford),
- anomaly detection (with IRT SystemX and the Confiance.AI program),
- the statistical significance of biological phenomena (cell cycle, stem cell differentiation, immune system responses) that occur in large scale single-cell RNAseq and spatial transcriptomics data sets (collaboration with Rizvi Lab, University of Wisconsin),
- the analysis of gene regulatory networks for plant-pathogen interactions (collaboration with INRAE): internship of Ludo Andrianirina,
- the analysis of satellite imaging and cartography data sets (collaboration with Thalès Alenia Space).
5 Social and environmental responsibility
5.1 Footprint of research activities
The weekly research seminar of DataShape now takes place in hybrid mode. Travel by team members has decreased substantially in recent years to limit the environmental footprint of the team. Travelling by train rather than by plane is strongly encouraged whenever possible.
6 Highlights of the year
6.1 Awards
- Test of time award of the Symposium on Computational Geometry (SoCG) for: David Cohen-Steiner, Herbert Edelsbrunner, John Harer: Stability of persistence diagrams, SoCG 2005. Online announcement
- One of the Prix d'excellence de l'Université Côte d'Azur awarded to David Cohen-Steiner.
- Best figure award (over the last 5 years) at the Symposium on Computational Geometry (SoCG) for: Tight Bounds for the Learning of Homotopy à la Niyogi, Smale, and Weinberger for Subsets of Euclidean Spaces and of Riemannian Manifolds. Dominique Attali, Hana Dal Poz Kouřimská, Christopher Fillmore, Ishika Ghosh, André Lieutier, Elizabeth Stephenson, Mathijs Wintraecken, SoCG 2024.
- Best paper award (honorable mention) at SIGGRAPH ASIA 2025 for: Efficient and scalable spatial regularization of optimal transport by Lucas Brifault, David Cohen-Steiner and Mathieu Desbrun.
6.2 PhD defenses
- Hugo Henneuse, supervised by F. Chazal and P. Massart. June 13, 2025.
- Charly Boricaud, supervised by B. Buet and S. Masnou. December 9, 2025.
7 Latest software developments, platforms, open data
In 2025 we developed new software, which we discuss below.
7.1 Latest software developments
7.1.1 GUDHI
- Name: Geometric Understanding in Higher Dimensions
- Keywords: Computational geometry, Topology, Clustering
- Scientific Description: The Gudhi library is an open source library for Computational Topology and Topological Data Analysis (TDA). It offers state-of-the-art algorithms to construct various types of simplicial complexes, data structures to represent them, and algorithms to compute geometric approximations of shapes and persistent homology.
  The GUDHI library offers the following interoperable modules:
  - Complexes:
    - Cubical
    - Simplicial: Rips, Witness, Alpha and Čech complexes
    - Cover: Nerve and Graph induced complexes
  - Data structures and basic operations:
    - Simplex tree, Skeleton blockers and Toplex map
    - Construction, update, filtration and simplification
  - Topological descriptors computation
  - Manifold reconstruction
  - Topological descriptors tools:
    - Bottleneck and Wasserstein distance
    - Statistical tools
    - Persistence diagram and barcode
- Functional Description: The GUDHI open source library will provide the central data structures and algorithms that underlie applications in geometry understanding in higher dimensions. It is intended both to help the development of new algorithmic solutions inside and outside the project and to facilitate the transfer of results to applied fields.
- News of the Year: Below is a list of changes made since GUDHI 3.10.1:
  - Delaunay complex
    - The Delaunay complex can be equipped with different filtrations:
      - Delaunay complex (no filtration values computed)
      - Delaunay-Čech complex (using minimal enclosing ball)
      - Alpha complex (moved to this new section)
    - The Delaunay-Čech and Alpha complexes can output squared or non-squared filtration values
    - An incremental version of the Delaunay complex (only in C++)
  - Rips complex persistence scikit-learn like interface
    - A binding to Ripser when it accelerates the computation
  - Persistence graphical tools
    - Can now handle the outputs of scikit-learn like interfaces as inputs
  - Simplex tree
    - Can now store additional data on each simplex (only in C++)
    - Can be const
- URL:
- Publication:
- Contact: Marc Glisse
- Participants: Marc Glisse, Hannah Schreiber, 17 anonymous participants
- Partners: Université Côte d'Azur (UCA), Fujitsu
7.1.2 Multipers
- Name: Multiparameter Persistence for Machine Learning
- Keywords: Topology, Machine learning
- Functional Description: multipers is a Python library for Topological Data Analysis, focused on Multiparameter Persistence computation and visualizations for Machine Learning. It features several efficient computational and visualization tools, with integrated, easy to use, auto-differentiable Machine Learning pipelines, that can be seamlessly interfaced with scikit-learn and PyTorch. This library is meant to be usable for non-experts in Topological or Geometrical Machine Learning. Performance-critical functions are implemented in C++ or in Cython, are parallelizable with TBB, and have Python bindings and interface. It can handle a very diverse range of datasets that can be framed into a (finite) multi-filtered simplicial or cell complex, including, e.g., point clouds, graphs, time series, images, etc.
- URL:
- Publication:
- Contact: David Loiseaux
- Participants: Hannah Schreiber, an anonymous participant
8 New results
8.1 Algorithmic aspects and new mathematical directions for topological and geometric data analysis
8.1.1 Efficient and Scalable Spatial Regularization of Optimal Transport
Participants: David Cohen-Steiner.
In collaboration with Lucas Brifault (Dassault Systèmes) and Mathieu Desbrun (GEOMERIX)
In this paper (18), we introduce a novel approach to spatial regularization of optimal transport problems. Based on the notion of forward and backward "mean maps" of a transport plan, we introduce a convex formulation of optimal transport problems that incorporates regularization of these mean maps to promote spatial continuity of the resulting optimal plan. Unlike previous regularization approaches that required the optimization of all the transport plan coefficients, our formulation translates into an ADMM-based solver combined with Sinkhorn type algorithms, which drastically reduces the number of variables and scales up to large problems. We demonstrate the usefulness and efficiency of this new computational tool for various applications and for different regularizations.
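To make the two ingredients named above more concrete, here is a small pure-Python sketch of a plain entropic Sinkhorn iteration followed by a "forward mean map" of the resulting plan. It assumes that the forward mean map sends each source point to the plan-weighted average of the target positions (a barycentric projection); this reading, the function names, and the parameters `eps` and `iters` are illustrative assumptions. The sketch does not include the regularization of the mean maps or the ADMM solver described in the paper.

```python
from math import exp


def sinkhorn(mu, nu, cost, eps=0.1, iters=500):
    """Entropic optimal transport between histograms mu and nu.

    Alternately rescales rows and columns of the Gibbs kernel
    K = exp(-cost / eps) until the plan has marginals mu and nu.
    """
    n, m = len(mu), len(nu)
    kernel = [[exp(-c / eps) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        u = [mu[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [nu[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def forward_mean_map(plan, targets):
    """Assumed definition: average target position seen from each source."""
    means = []
    for row in plan:
        mass = sum(row)
        means.append(sum(p * y for p, y in zip(row, targets)) / mass)
    return means


# Toy problem on the line: two sources, two targets, squared-distance cost.
xs = [0.0, 1.0]
ys = [0.0, 1.0]
mu = [0.5, 0.5]
nu = [0.5, 0.5]
cost = [[(x - y) ** 2 for y in ys] for x in xs]

plan = sinkhorn(mu, nu, cost)
means = forward_mean_map(plan, ys)
print(means)
```

The mean map has one value per source point, whereas the plan has one coefficient per source-target pair, which is the kind of reduction in the number of variables that the abstract refers to.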
8.1.2 Burning or Collapsing the Medial Axis is Unstable
Participants: Mathijs Wintraecken.
In collaboration with Erin Chambers (University of Notre Dame, USA), Christopher Fillmore (Institute of Science and Technology Austria), Elizabeth Stephenson (Orteliu, Oslo, Norway)
The medial axis of a set consists of the points in the ambient space without a unique closest point in the original set. Since its introduction, the medial axis has been used extensively in many applications as a method of computing a skeleton topologically equivalent to the original set. Unfortunately, one limiting factor in the use of the medial axis of a smooth manifold is that it is not necessarily topologically stable under small perturbations of the manifold. To counter these instabilities, various prunings of the medial axis have been proposed in the computational geometry community. In this paper (12), we examine one type of pruning, called burning. Because of the good experimental results it was hoped that the burning method of simplifying the medial axis would be stable. In this work, we show a simple example that dashes such hopes. Based on Bing's house with two rooms, we demonstrate an isotopy of a shape where the medial axis goes from collapsible to non-collapsible. More precisely, we consider the standard deformation retract from the closed ball to Bing's house with two rooms, but stop just short of the point where Bing's house becomes two dimensional. This way we obtain an isotopy from the 3-ball to a thickened version of Bing's house. Under this isotopy, the medial axis goes from collapsible to non-collapsible. We stress that this isotopy can be made generic, in the sense of singularity theory, as developed by Arnold and Thom.
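The definition at the start of this paragraph can be tested directly when the set is a finite point sample: a query point lies on the medial axis when at least two sample points realize its minimal distance (for a finite set this is the union of the Voronoi cell boundaries). The sketch below is only meant to illustrate that definition; the helper names and the tolerance `tol` are assumptions, and it says nothing about burning or collapsing.

```python
from math import dist


def closest_points(query, sample, tol=1e-9):
    """Return all sample points whose distance to query is minimal (up to tol)."""
    distances = [dist(query, p) for p in sample]
    d_min = min(distances)
    return [p for p, d in zip(sample, distances) if d - d_min <= tol]


def on_medial_axis(query, sample, tol=1e-9):
    """True when the closest point of the sample is not unique."""
    return len(closest_points(query, sample, tol)) > 1


# Corners of the unit square, used as a toy finite set.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

print(on_medial_axis((0.5, 0.5), square))  # centre: four closest corners
print(on_medial_axis((0.5, 0.2), square))  # equidistant from two corners
print(on_medial_axis((0.2, 0.1), square))  # unique closest corner
```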
8.1.3 Sparsification of the generalized persistence diagrams for scalability through gradient descent
Participant: Mathieu Carrière.
In collaboration with Seunghyun Kim, Woojin Kim (KAIST, South Korea)
The generalized persistence diagram (GPD) is a natural extension of the classical persistence barcode to the setting of multi-parameter persistence and beyond. The GPD is defined as an integer-valued function whose domain is the set of intervals in the indexing poset of a persistence module, and is known to be able to capture richer topological information than its single-parameter counterpart. However, computing the GPD is computationally prohibitive due to the sheer size of the interval set. Restricting the GPD to a subset of intervals provides a way to manage this complexity, compromising discriminating power to some extent. However, identifying and computing an effective restriction of the domain that minimizes the loss of discriminating power remains an open challenge. In this work, we introduce a novel method for optimizing the domain of the GPD through gradient descent optimization. To achieve this, we introduce a loss function tailored to optimize the selection of intervals, balancing computational efficiency and discriminative accuracy. The design of the loss function is based on the known erosion stability property of the GPD. We showcase the efficiency of our sparsification method for dataset classification in supervised machine learning. Experimental results demonstrate that our sparsification method significantly reduces the time required for computing the GPDs associated to several datasets, while maintaining classification accuracies comparable to those achieved using full GPDs. Our method thus opens the way for the use of GPD-based methods to applications at an unprecedented scale.
8.1.4 Multi-parameter Module Approximation: An Efficient and Interpretable Invariant for Multi-Parameter Persistence Modules with Guarantees
Participant: Mathieu Carrière.
In collaboration with David Loiseaux (Inria Saclay) and Andrew J. Blumberg (Columbia University, USA)
Topological data analysis (TDA) is a rapidly growing area of data science, whose most common descriptor is persistent homology, which tracks the topological changes in growing families of subsets of the data set itself, called filtrations, and encodes them in an algebraic object, called a persistence module. The algorithmic and theoretical properties of persistence modules are now well understood in the single-parameter case, that is, when there is only one filtration (e.g., feature scale) to study. In contrast, much less is known in the multi-parameter case, where several filtrations (e.g., scale and density) are used simultaneously. Since multi-parameter persistence modules usually encode information that is invisible to their single-parameter counterparts, it is critical to build tractable proxies for them, ideally with some theoretical robustness guarantees. In this article, we introduce a new parameterized family of topological descriptors, taking the form of candidate decompositions, for multi-parameter persistence modules, and we identify a subfamily of these descriptors, that we call approximate decompositions, that are controllable approximations, in the sense that they preserve diagonal barcodes. Then, we introduce MMA (Multipersistence Module Approximation): an algorithm based on matching functions for computing instances of candidate decompositions with some precision parameter . By design, MMA can handle an arbitrary number of filtrations, and has bounded complexity and running time. Moreover, we prove the robustness of MMA: when computed with so-called compatible matching functions, we show that MMA produces approximate decompositions (and we prove that such matching functions exist for filtrations). Next, we restrict the focus to modules that can be decomposed into interval summands. In that case, compatible matching functions always exist, and we show that, for small enough , the approximate decompositions obtained with such compatible matching functions by MMA have an approximation error (in terms of the standard interleaving and bottleneck distances) that is bounded by , and that reaches zero for an even smaller, positive precision . Finally, we present empirical evidence validating that MMA has state-of-the-art performance and running time on several data sets.
8.1.5 A fast algorithm for the Hecke representation of the braid group, and applications to the computation of the HOMFLY-PT polynomial and the search for interesting braids
Participant: Clément Maria.
In collaboration with Hoel Queffelec (CNRS - Institut Montpelliérain Alexander Grothendieck).
Knot theory is an active field of mathematics, in which combinatorial and computational methods play an important role. One side of computational knot theory, that has gained interest in recent years, both for complexity analysis and practical algorithms, is quantum topology and the computation of topological invariants issued from the theory. In this article 40, we leverage the rigidity brought by the representation-theoretic origins of the quantum invariants for algorithmic purposes. We do so by exploiting braids and the algebraic properties of the braid group to describe, analyze, and implement a fast algorithm to compute the Hecke representation of the braid group. We apply this construction to design a parameterized algorithm to compute the HOMFLY-PT polynomial of knots, and demonstrate its interest experimentally. Finally, we combine our fast Hecke representation algorithm with Garside theory, to implement a reservoir sampling search and find non-trivial braids with trivial Hecke representations with coefficients in . We find several such braids, in particular proving that the Hecke representation of with coefficients is non-faithful.
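For readers less familiar with the objects involved, the standard presentations behind the Hecke representation can be recalled as follows. The normalization of the quadratic relation below is one common convention and is an assumption here; the article may use a different but equivalent one, and the coefficient ring is left unspecified.

```latex
% Braid group B_n on generators \sigma_1, \dots, \sigma_{n-1}:
\sigma_i \sigma_{i+1} \sigma_i = \sigma_{i+1} \sigma_i \sigma_{i+1},
\qquad
\sigma_i \sigma_j = \sigma_j \sigma_i \quad \text{for } |i - j| \ge 2 .

% Hecke algebra H_n(q): generators T_1, \dots, T_{n-1} satisfying the same
% braid relations together with a quadratic relation (assumed normalization):
(T_i - q)(T_i + 1) = 0 .

% Hecke representation: the algebra map induced by
\sigma_i \longmapsto T_i .
```

Because each generator satisfies a quadratic relation, the algebra is finite dimensional (of dimension n!), which is what makes explicit computation with braid words possible in principle.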
8.1.6 On Sparse Representations of 3-Manifolds
Participant: Clément Maria.
In collaboration with Kristóf Huszár (TU Graz).
3-manifolds are commonly represented as triangulations, consisting of abstract tetrahedra whose triangular faces are identified in pairs. The combinatorial sparsity of a triangulation, as measured by the treewidth of its dual graph, plays a fundamental role in the design of parameterized algorithms. In this work 36, we investigate algorithmic procedures that transform or modify a given triangulation while controlling specific sparsity parameters. First, we describe a linear-time algorithm that converts a given triangulation into a Heegaard diagram of the underlying 3-manifold, showing that the construction preserves treewidth. We apply this construction to exhibit a fixed-parameter tractable framework for computing Kuperberg's quantum invariants of 3-manifolds. Second, we present a quasi-linear-time algorithm that retriangulates a given triangulation into one with maximum edge valence of at most nine, while only moderately increasing the treewidth of the dual graph. Combining these two algorithms yields a quasi-linear-time algorithm that produces, from a given triangulation, a Heegaard diagram in which every attaching curve intersects at most nine others.
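The sparsity parameter discussed here can be made concrete with a small sketch: build the dual graph of a triangulation from its list of face gluings, then bound its treewidth from above with a greedy min-degree elimination ordering. Computing treewidth exactly is hard in general, so this heuristic bound is a swapped-in illustration and not one of the algorithms of the paper; the gluing encoding `((tet_a, face_a), (tet_b, face_b))` and the toy inputs (which leave some faces unglued) are assumptions made for the example.

```python
def dual_graph(num_tetrahedra, gluings):
    """One node per tetrahedron, one arc per pair of glued faces.

    Loops and parallel arcs are dropped since they do not affect treewidth.
    """
    adjacency = {t: set() for t in range(num_tetrahedra)}
    for (tet_a, _face_a), (tet_b, _face_b) in gluings:
        if tet_a != tet_b:
            adjacency[tet_a].add(tet_b)
            adjacency[tet_b].add(tet_a)
    return adjacency


def treewidth_upper_bound(adjacency):
    """Width of a greedy min-degree elimination ordering (an upper bound)."""
    graph = {v: set(nbrs) for v, nbrs in adjacency.items()}
    width = 0
    while graph:
        v = min(graph, key=lambda x: len(graph[x]))
        neighbours = graph.pop(v)
        width = max(width, len(neighbours))
        # Eliminating v turns its neighbourhood into a clique.
        for a in neighbours:
            graph[a].discard(v)
            graph[a] |= neighbours - {a}
    return width


# Hypothetical gluing patterns on four tetrahedra: a chain and a cycle.
chain = [((0, 0), (1, 1)), ((1, 0), (2, 1)), ((2, 0), (3, 1))]
cycle = chain + [((3, 0), (0, 1))]

chain_width = treewidth_upper_bound(dual_graph(4, chain))
cycle_width = treewidth_upper_bound(dual_graph(4, cycle))
print(chain_width, cycle_width)
```

Since every tetrahedron has four faces, the dual graph has maximum degree at most four, but its treewidth can still be large, which is why it is treated as a separate parameter.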
8.1.7 Compressed data structures for Heegaard splittings
Participants: Henrique Ennes, Clément Maria.
Heegaard splittings provide a natural representation of closed 3-manifolds by gluing two handlebodies along a common surface. These splittings can be equivalently given by two finite sets of meridians lying on the surface, which define a Heegaard diagram. In this work 34, we present a data structure to effectively represent Heegaard diagrams as normal curves with respect to triangulations of a surface, where the complexity is measured by the space required to express the normal coordinates' vectors in binary. This structure can be significantly more compact than triangulations of 3-manifolds, yielding exponential gains for certain families. Even with this succinct definition of complexity, we establish polynomial-time algorithms for comparing and manipulating diagrams, performing stabilizations, detecting trivial stabilizations and reductions, and computing topological invariants of the underlying manifolds, such as their fundamental and homology groups. We also contrast early implementations of our techniques with standard software programs for 3-manifolds, achieving faster algorithms for the average cases and exponential gains in speed for some particular presentations of the inputs.
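To make the complexity measure concrete, here is a minimal sketch of how the binary size of a vector of normal coordinates can be far smaller than the number of intersections it encodes. The function name and the family of coordinate vectors are hypothetical and chosen only for illustration; they are not taken from 34.

```python
def binary_size(normal_coordinates):
    """Total number of bits needed to write non-negative integer coordinates in binary."""
    return sum(max(1, x.bit_length()) for x in normal_coordinates)

# Hypothetical family: a curve meeting three edges 2**k, 2**k and 2**(k+1) times.
for k in (4, 16, 64):
    coords = [2 ** k, 2 ** k, 2 ** (k + 1)]
    total_intersections = sum(coords)
    print(k, total_intersections, binary_size(coords))
```

In this toy family the number of intersections grows like 2^k while the encoding grows linearly in k, which is the kind of gap that makes polynomial-time algorithms in the binary size non-trivial.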
8.1.8 Hardness of computation of quantum invariants on 3-manifolds with restricted topology
Participants: Henrique Ennes, Clément Maria.
Quantum invariants in low dimensional topology offer a wide variety of valuable invariants of knots and 3-manifolds, presented by explicit formulas that are readily computable. Their computational complexity has been actively studied and is tightly connected to topological quantum computing. In this article 21, we prove that for any 3-manifold quantum invariant in the Reshetikhin-Turaev model, there is a deterministic polynomial time algorithm that, given as input an arbitrary closed 3-manifold , outputs a closed 3-manifold with same quantum invariant, such that is hyperbolic, contains no low genus embedded incompressible surface, and is presented by a strongly irreducible Heegaard diagram. Our construction relies on properties of Heegaard splittings and the Hempel distance. At the level of computational complexity, this proves that the hardness of computing a given quantum invariant of 3-manifolds is preserved even when severely restricting the topology and the combinatorics of the input. This positively answers a question raised by Samperton.
8.1.9 Well-quasi-orders on embedded planar graphs
Participants: Clément Maria, Corentin Lunel.
The central theorem of topological graph theory states that the graph minor relation is a well-quasi-order on graphs. It has far-reaching consequences, in particular in the study of graph structures and the design of (parameterized) algorithms. In this article 39, we study two embedded versions of classical minor relations from structural graph theory and prove that they are also well-quasi-orders on general or restricted classes of embedded planar graphs. These embedded minor relations appear naturally for intrinsically embedded objects, such as knot diagrams and surfaces in . Handling the extra topological constraints of the embeddings requires careful analysis and extensions of classical methods for the more constrained embedded minor relations. We prove that the embedded version of immersion induces a well-quasi-order on bounded carving-width plane graphs by exhibiting particularly well-structured tree-decompositions and leveraging a classical argument on well-quasi-orders on forests. We deduce that the embedded graph minor relation defines a well-quasi-order on plane graphs via their directed medial graphs, when their branch-width is bounded. We conclude that the embedded graph minor relation is a well-quasi-order on all plane graphs, using classical grids theorems in the unbounded branch-width case.
8.1.10 Geometric characterisation of structural and regular equivalences in undirected (hyper)graphs
Participant: Nina Otter.
In collaboration with Marzieh Eidi (MPI MiS).
Similarity notions between vertices in a graph, such as structural and regular equivalence, are one of the main ingredients in clustering tools in complex network science. In this article 33 we generalise structural and regular equivalences for undirected hypergraphs and provide a characterisation of structural and regular equivalences of undirected graphs and hypergraphs through neighbourhood graphs and Ollivier-Ricci curvature. Our characterisation sheds new light on these similarity notions opening a new avenue for their exploration. These characterisations also enable the construction of a possibly wide family of regular partitions, thereby offering a new route to a task that has so far been computationally challenging.
8.2 Statistical aspects of topological and geometric data analysis
8.2.1 Gromov-Wasserstein Bound between Reeb and Mapper Graphs
Participant: Mathieu Carrière.
In collaboration with Ziyad Oulhaj and Bertrand Michel (École Centrale de Nantes, France)
Since its introduction as a computable approximation of the Reeb graph, the Mapper graph has become one of the most popular tools from topological data analysis for performing data visualization and inference. However, finding an appropriate metric (that is, a tractable metric with theoretical guarantees) for comparing Reeb and Mapper graphs, in order to, e.g., quantify the rate of convergence of the Mapper graph to the Reeb graph, is a difficult problem. While several metrics have been proposed in the literature, none is able to incorporate measure information, when data points are sampled according to an underlying probability measure. The resulting Reeb and Mapper graphs are therefore purely deterministic and combinatorial, and substantial effort is thus required to ensure their statistical validity. In this article, we handle this issue by treating Reeb and Mapper graphs as metric measure spaces. This allows us to use Gromov-Wasserstein metrics to compare these graphs directly in order to better incorporate the probability measures that data points are sampled from. Then, we describe the geometry that arises from this perspective, and we derive rates of convergence of the Mapper graph to the Reeb graph in this context. Finally, we showcase the usefulness of such metrics for Reeb and Mapper graphs in a few numerical experiments.
8.3 Topological and geometric approaches for machine learning
8.3.1 A Knowledge Graph and Topological Data Analysis Framework to Disentangle the Tomato-Multi Pathogens Complex Gene Regulatory Network
Participant: Mathieu Carrière.
In collaboration with Maxime Multari, Xavier Amorós-Gabarrón, Alexina Damy, Stéphanie Jaubert, Silvia Bottini (INRAE, France), Sebastian Lobentanzer, Julio Saez-Rodriguez and Aurélien Dugourd (Heidelberg University, Germany)
The global population is rapidly increasing, representing a major challenge for food supply, exacerbated by climate change and environmental degradation. Despite the pivotal role of agriculture, plant health and survival are threatened by various biotic stressors. Although how plants respond to each of these individual stresses is well studied, little is known about how they respond to a combination of many of these bio-aggressors occurring together. To tackle this question, we first built TomTom, a knowledge graph gathering molecular interactions from nine publicly available databases, including transcription factor and microRNA targets, protein-protein interactions, and functional terms. Then, we selected transcriptomics data of tomato subjected to six distinct pathogens and performed an integrative analysis. We found 5561 candidate genes involved in the multi-stress response of tomato. To study how the response is orchestrated, we mapped those genes in TomTom and extracted a comprehensive gene regulatory network (GRN) composed of 71 transcription factors (TF) and 1618 target genes. By estimating the TF activity, we identified 43 TFs responding specifically to either one or multiple bio-aggressors. GRN analyses with a topological data analysis approach allowed us to identify 18 clusters of TFs with similar properties, yielding four main configurations localized in specific regions of the GRN. Finally, we found four ERF hubs which cooperatively coordinate the tomato response to multiple pathogens. Our findings allowed us to study the complex molecular reprogramming in tomato upon interaction with different biotic agents, providing tools scalable to other questions involving tomato molecular interactions and beyond.
8.3.2 Enhancer Dynamics and Spatial Organization Drive Anatomically Restricted Cellular States in the Human Spinal Cord
Participant: Mathieu Carrière.
In collaboration with Elena K. Kandror, Alexis Peterson, Andreas Tjärnberg, Yuchen Xu, Abbas H. Rizvi (University of Wisconsin, USA), Anqi Wang, Jun Hou Fung, William Pangburn, Raul Rabadan, Tom Maniatis (Columbia University, USA), Jackson Loper (University of Michigan, USA), Will Liao (NY Genome Center, USA), Krishnaa T. Mahbubani and Kourosh Saeb-Parsy (University of Cambridge, UK)
Here, we report the spatial organization of RNA transcription and associated enhancer dynamics in the human spinal cord at single-cell and single-molecule resolution. We expand traditional multiomic measurements to reveal epigenetically poised and bivalent active transcriptional enhancer states that define cell type specification. Simultaneous detection of chromatin accessibility and histone modifications in spinal cord nuclei reveals previously unobserved cell-type specific cryptic enhancer activity, in which transcriptional activation is uncoupled from chromatin accessibility. Such cryptic enhancers define both stable cell type identity and transitions between cells undergoing differentiation. We also define glial cell gene regulatory networks that reorganize along the rostrocaudal axis, revealing anatomical differences in gene regulation. Finally, we identify the spatial organization of cells into distinct cellular organizations and address the functional significance of this observation in the context of paracrine signaling. We conclude that cellular diversity is best captured through the lens of enhancer state and intercellular interactions that drive transitions in cellular state. This study provides fundamental insights into the cellular organization of the healthy human spinal cord.
8.3.3 Fermat Distance-to-Measure: a robust Fermat-like metric
Participants: Frédéric Chazal, Jérôme Taupin.
Given a probability measure with density, Fermat distances and density-driven metrics are conformal transformations of the Euclidean metric that shrink distances in high-density areas and enlarge distances in low-density areas. Although they have been widely studied and have been shown to be useful in various machine learning tasks, they are limited to measures with density (with respect to the Lebesgue measure, or the volume form on a manifold). In 45, by replacing the density with the Distance-to-Measure, we introduce a new metric, the Fermat Distance-to-Measure, defined for any probability measure in . We derive strong stability properties for the Fermat Distance-to-Measure with respect to the measure and propose an estimator from random sampling of the measure, featuring an explicit bound on its convergence speed.
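For readers unfamiliar with the Distance-to-Measure, the following sketch computes its usual empirical version on a finite sample: the root mean squared distance from a query point to its k nearest sample points, with k determined by a mass parameter m. This only illustrates the ingredient that replaces the density; the Fermat Distance-to-Measure itself is defined in 45 and is not reproduced here. The function name, the rounding k = ceil(m n) and the toy data are assumptions made for this example.

```python
import math

def empirical_dtm(query, sample, m):
    """Empirical distance-to-measure of `query` to the uniform measure on `sample`.

    m is the mass parameter in (0, 1]; k = ceil(m * n) nearest neighbours are used
    (a simplifying assumption when m * n is not an integer).
    """
    n = len(sample)
    k = max(1, math.ceil(m * n))
    squared = sorted(
        sum((q - p) ** 2 for q, p in zip(query, point)) for point in sample
    )
    return math.sqrt(sum(squared[:k]) / k)

# Toy sample: three clustered points and one outlier.
sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
inside = empirical_dtm((0.0, 0.0), sample, m=0.5)
at_outlier = empirical_dtm((10.0, 10.0), sample, m=0.5)
print(inside, at_outlier)
```

The value at the outlier is large even though the outlier is itself a sample point, which is the robustness property that distinguishes the Distance-to-Measure from the plain distance to the sample.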
8.4 Miscellaneous
8.4.1 Curvature-Guided Optimal Transport for Rigid Point Cloud Registration
Participant: Mathijs Wintraecken.
In collaboration with Roberto M Dyke (TITANE), Marie-Aurélie Chanut (Cerema - Centre d'Etudes et d'Expertise sur les Risques, l'Environnement, la Mobilité et l'Aménagement), Pierre Alliez (TITANE)
The rigid registration of pairs of point sets is a fundamental step for many downstream tasks including shape analysis, reconstruction and localization. There has been a growing interest in the use of Optimal Transport (OT) for point cloud registration problems. However, these techniques face limited adoption because of scalability issues, which render them impractical, and their sensitivity to missing data commonly encountered in real-world scans. We consider how geometric information may be incorporated into an OT registration framework for improved accuracy and scalability. In this work, we guide mini-batch selection by binning shape features based on local curvature estimates. We demonstrate that our method achieves better results than other OT-based methods and is comparable to the state-of-the-art in terms of successful registrations.
8.4.2 Supervised Contamination Detection, with Flow Cytometry Application
Participants: Gilles Blanchard, Frédéric Chazal, Solenne Gaucher.
In 15, we study the contamination detection problem, which aims to determine whether a set of observations has been contaminated, i.e. whether it contains points drawn from a distribution different from the reference distribution. Here, we consider a supervised problem, where labeled samples drawn from both the reference distribution and the contamination distribution are available at training time. This problem is motivated by the detection of rare cells in flow cytometry. Compared to novelty detection problems or two-sample testing, where only samples from the reference distribution are available, the challenge lies in efficiently leveraging the observations from the contamination distribution to design more powerful tests. In this article, we introduce a test for the supervised contamination detection problem. We provide non-asymptotic guarantees on its Type I error, and characterize its detection rate. The test relies on estimating reference and contamination densities using histograms, and its power depends strongly on the choice of the corresponding partition. We present an algorithm for judiciously choosing the partition that results in a powerful test. Simulations illustrate the good empirical performance of our partition selection algorithm and the efficiency of our test. Finally, we showcase our method and apply it to a real flow cytometry dataset.
8.4.3 Transductive Conformal Inference for Full Ranking
Participant: Gilles Blanchard.
In collaboration with J-B. Fermanian (U. Montpellier, IMAG, and Inria team IROKO), and P. Humbert (CNRS and Sorbonne Université)
In 22, we introduce a method based on Conformal Prediction (CP) to quantify the uncertainty of full ranking algorithms. We focus on a specific scenario where items are to be ranked by some “black box” algorithm. It is assumed that the relative (ground truth) ranking of of them is known. The objective is then to quantify the error made by the algorithm on the ranks of the new items among the total . In such a setting, the true ranks of the original items in the total depend on the (unknown) true ranks of the new ones. Consequently, we have no direct access to a calibration set to apply a classical CP method. To address this challenge, we propose to construct distribution-free bounds of the unknown conformity scores using recent results on the distribution of conformal p-values. Using these upper bounds on the scores, we provide valid prediction sets for the rank of any item. We also control the false coverage proportion, a crucial quantity when dealing with multiple prediction sets. Finally, we empirically show on both synthetic and real data the efficiency of our CP method for state-of-the-art algorithms such as RankNet or LambdaMart.
8.4.4 Supervised aggregation of anomaly score functions for active anomaly detection
Participant: Martin Royer.
In collaboration with Kevin Bleakley (Inria Celest), Mouhcine Mendil (IRT Saint Exupéry), Benjamin Auder (Laboratoire de Mathématiques d'Orsay).
Detecting rare anomalies in batches of multidimensional data is challenging. In 11, we propose a supervised active-learning framework that sends a small number of data points from each batch to an expert for labeling as 'anomaly' or 'nominal', via two mechanisms: (i) points most likely to be an anomaly in the eyes of a supervised classifier trained on previously labeled data; and (ii) points suggested by an active learner. However, instead of training the supervised classifier directly on the current set of labeled raw data points, we treat the scores calculated by an ensemble of M unsupervised anomaly detectors on each data point as if they were the learner's input features. This approach generalizes earlier attempts to linearly aggregate unsupervised anomaly detector scores, and broadens the scope of such methods to ordered data like time series. Results suggest that this method usually outperforms linear strategies, often significantly. The Python library acanag provides an implementation of the proposed method.
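The central idea, using unsupervised anomaly scores as input features of a supervised learner, can be sketched in a few lines. The code below is a toy illustration under strong assumptions and does not use or reproduce the acanag library: the two detectors (distance to the batch mean, distance to the nearest other point), the 1-nearest-neighbour classifier and all function names are hypothetical stand-ins, and the active-learning query step is omitted.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples or lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def score_centroid(batch):
    """Toy detector 1: distance of each point to the batch mean."""
    d = len(batch[0])
    mean = [sum(p[j] for p in batch) / len(batch) for j in range(d)]
    return [dist(p, mean) for p in batch]

def score_nearest_neighbor(batch):
    """Toy detector 2: distance of each point to its nearest other point."""
    return [
        min(dist(p, q) for j, q in enumerate(batch) if j != i)
        for i, p in enumerate(batch)
    ]

DETECTORS = [score_centroid, score_nearest_neighbor]  # M = 2 in this sketch

def score_features(batch):
    """Map each point of the batch to its vector of M detector scores."""
    columns = [detector(batch) for detector in DETECTORS]
    return [list(row) for row in zip(*columns)]

def predict_1nn(train_features, train_labels, feature):
    """Supervised step: 1-nearest-neighbour classification in score space."""
    j = min(range(len(train_features)),
            key=lambda i: dist(train_features[i], feature))
    return train_labels[j]

# A previously labeled batch (labels would come from the expert).
labeled_batch = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
labels = ["nominal"] * 4 + ["anomaly"]
train_features = score_features(labeled_batch)

# A new batch located elsewhere in raw space but with a similar score profile.
new_batch = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1), (-4.0, 6.0)]
predictions = [predict_1nn(train_features, labels, f)
               for f in score_features(new_batch)]
print(predictions)
```

Because the classifier operates on scores rather than raw coordinates, the anomaly in the new batch is recognized even though it lies far from the labeled anomaly in the original space; any non-linear learner could replace the nearest-neighbour rule used here.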
8.4.5 Curvature penalization of strongly anisotropic interfaces models and their phase-field approximation
Participant: Blanche Buet.
In collaboration with Jean-François Babadjian (LMO, Université Paris-Saclay) and Michael Goldman (CMAP, Ecole Polytechnique).
In 25, we study the effect of anisotropy on sharp or diffuse interface models. When the surface tension is a convex function of the normal to the interface, the anisotropy is said to be weak. This usually ensures the lower semicontinuity of the associated energy. If, however, the surface tension depends on the normal in a nonconvex way, this so-called strong anisotropy may lead to instabilities related to the lack of lower semicontinuity of the functional. We investigate the regularizing effects of adding a higher-order term of Willmore type to the energy. We consider two types of problems. The first one is an anisotropic nonconvex generalization of the perimeter, and the second one is an anisotropic nonconvex Mumford-Shah functional. In both cases, lower semicontinuity properties of the energies with respect to a natural mode of convergence are established, as well as Γ-convergence type results by means of a phase-field approximation. In comparison with related results for curvature-dependent energies, one of the original aspects of our work is that, in the context of free discontinuity problems, we are able to consider singular structures such as crack-tips or multiple junctions.
8.4.6 Approximate mean curvature flows of a general varifold, and their limit spacetime Brakke flow
Participant: Blanche Buet.
In collaboration with Gian Paolo Leonardi (University of Trento), Simon Masnou (Université Lyon 1) and Abdelmouksit Sagueni.
In 28, we propose a construction of mean curvature flows by approximation for very general initial data, in the spirit of the works of Brakke and of Kim & Tonegawa based on the theory of varifolds. Given a general varifold, we construct by iterated push-forwards an approximate time-discrete mean curvature flow depending on both a given time step and an approximation parameter. We show that, as the time step tends to 0, this time-discrete flow converges to a unique limit flow, which we call the approximate mean curvature flow. An interesting feature of our approach is its generality, as it provides an approximate notion of mean curvature flow for very general structures of any dimension and codimension, ranging from continuous surfaces to discrete point clouds. We prove that our approximate mean curvature flow satisfies several properties: stability, uniqueness, Brakke-type equality, mass decay. By coupling this approximate flow with the canonical time measure, we prove convergence, as the approximation parameter tends to 0, to a spacetime limit measure whose generalized mean curvature is bounded. Under an additional rectifiability assumption, we further prove that this limit measure is a spacetime Brakke flow.
8.4.7 Théorie de l'homotopie quantitative (Quantitative homotopy theory)
Participant: Pierre Pansu.
The aim of homotopy theory, in topology, is to simplify continuous maps between topological spaces by continuous deformation. What prevents this are homotopy invariants. This raises quantitative questions:
- Is the computation of the invariants possible (decidable)? If so, at what cost?
- Is it possible to construct low-complexity representatives whose invariants take prescribed values? If so, at what cost?
- What is the complexity of the required deformations?
The answers, many of them recent, are highly diverse. Moreover, many questions remain open, showing that topology has not said its last word, even in low dimensions.
9 Bilateral contracts and grants with industry
9.1 Bilateral contracts with industry
-
Participants: David Cohen-Steiner.
Collaboration with Dassault Systèmes and Inria team Geomerix (Saclay) on the applications of methods from geometric measure theory to the modelling and processing of complex 3D shapes (PhD of Lucas Brifault, started in May 2022).
-
Participants: Frédéric Chazal, Myriam Frikha.
Research collaboration with Ericsson on transfer learning for temporal data with applications in telecommunications (PhD of Myriam Frikha, started in November 2024).
-
Participants: Frédéric Chazal, Martin Royer.
Collaboration with Thales on TDA-based anomaly detection for satellite telemetry data (started in Dec. 2025).
-
Participants: Frédéric Chazal, Mathieu Carrière.
Research collaboration with Thales on topological approaches for the analysis and certification of AI-based critical systems through the Master internship of Louise Méric that will continue through a CIFRE PhD at the very beginning of 2026.
10 Partnerships and cooperations
10.1 International initiatives
10.1.1 Associate Teams in the framework of an Inria International Lab or in the framework of an Inria International Program
Equipe Associée TopTime
Participants: Nina Otter.
-
Title:
Topological and statistical methods for time series data
-
Partner Institution(s):
- Australian National University (ANU), Australia
-
Date/Duration:
2024-2026
-
Additional information:
Katharine Turner (ANU) is the co-PI of the EA.
10.1.2 Participation in other International Programs
KTH Royal Institute of Technology Seed Funding: Strengthening French – Swedish AI Collaboration
Participants: Frédéric Chazal, Mathieu Carrière.
-
Title:
Geometry-informed AI in wireless communications
-
Funding Institution(s):
- KTH Stockholm, Sweden
-
Date/Duration:
2025-2026
- Joint project between the SCI school, department of mathematics (PI: Martina Scolamiero), the DataShape team and Ericsson (industrial PI: Francesco Davide Calabrese).
SALTO exchange program between MPG and CNRS
Participants: Nina Otter.
-
Title:
Higher-order interactions at the crossroads of geometry and topology
-
Partner Institution(s):
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
-
Date/Duration:
2024-2026
-
Additional information:
SALTO exchange programme between the Max Planck Gesellschaft and the CNRS. Marzieh Eidi (MPI MiS) is the co-PI.
10.2 International research visitors
10.2.1 Visits of international scientists
- Wolfgang Polonik (UC Davis). September-October 2025 (1 month).
- Marzieh Eidi (MPI MiS) April-May 2025 (2 months).
10.3 National initiatives
Extended visit
Participants: Corentin Lunel, Clément Maria.
- Duration : 2024-2025
- Coordinator : Clément Maria
- Location : Institut Montpelliérain Alexandre Grothendieck (IMAG) - Université de Montpellier
The visit consists of federating mathematicians from IMAG working on low dimensional and quantum topology together with computer scientists from Datashape, to work at the interface of the two fields.
10.3.1 ANR
ANR Chair in AI
Participants: Frédéric Chazal, Marc Glisse.
- Acronym : TopAI
- Type : ANR Chair in AI.
- Title : Topological Data Analysis for Machine Learning and AI
- Coordinator : Frédéric Chazal
- Duration : 2020-2026.
- Others Partners: Two industrial partners, the French SME Sysnav and the French start-up MetaFora.
- Abstract:
The TopAI project aims at developing a world-leading research activity on topological and geometric approaches in Machine Learning (ML) and AI with a double academic and industrial/societal objective. First, building on the strong expertise of the candidate and his team in TDA, TopAI aims at designing new mathematically well-founded topological and geometric methods and tools for Data Analysis and ML and to make them available to the data science and AI community through state-of-the-art software tools. Second, thanks to already established close collaborations and the strong involvement of French industrial partners, TopAI aims at exploiting its expertise and tools to address a set of challenging problems with high societal and economic impact in personalized medicine and AI-assisted medical diagnosis.
ANR ALGOKNOT
Participants: Clément Maria.
- Acronym : ALGOKNOT.
- Type : ANR Jeune Chercheuse Jeune Chercheur.
- Title : Algorithmic and Combinatorial Aspects of Knot Theory.
- Coordinator : Clément Maria.
- Duration : 2020 – 2026
- Abstract: The project AlgoKnot aims at strengthening our understanding of the computational and combinatorial complexity of the diverse facets of knot theory, as well as designing efficient algorithms and software to study their interconnections.
- See also: Clément Maria and ANR AlgoKnot.
ANR GeMfaceT
Participants: Blanche Buet.
- Acronym: GeMfaceT.
- Type: ANR JCJC -CES 40 – Mathématiques
- Title: A bridge between Geometric Measure and Discrete Surface Theories
- Coordinator: Blanche Buet.
- Duration: 2021–2026
- Abstract: This project is positioned at the interface between geometric measure and discrete surface theories. There has recently been a growing interest in non-smooth structures, both from a theoretical point of view, where singularities occur in famous optimization problems such as the Plateau problem or geometric flows such as mean curvature flow, and from an applied point of view, where complex high-dimensional data are no longer assumed to lie on a smooth manifold but are more singular and allow crossings, tree structures and dimension variations. We propose in this project to strengthen and expand the use of geometric measure concepts in discrete surface study and complex data modelling, and also to use those possibly singular discrete surfaces to compute numerical solutions to the aforementioned problems.
ANR StratMesh
Participants: Jean-Daniel Boissonnat, Mathijs Wintraecken.
- Acronym: StratMesh.
- Type: ANR PRC
- Title: A bridge between Geometric Measure and Discrete Surface Theories
- Coordinator: Mathijs Wintraecken (local), Guillaume Moroz (Gamble, Centre Inria de l'Université de Lorraine).
- Duration: 2025–2029
- Abstract: StratMesh aims to develop provably-correct triangulation algorithms for stratified spaces. Our focus is on stratified spaces that are the projection of smooth manifolds, which arise in many applications such as robotics, control theory, and medial axis computation for learning from geometric data.
ANR TopModel
Participants: Mathieu Carrière.
- Acronym: TopModel.
- Type: ANR JCJC
- Title: TopModel
- Coordinator: Mathieu Carrière
- Duration: 2024–2027
- Abstract: The central tenet of this project is the use of multiparameter topological data analysis for machine learning models, for both regularizing and monitoring these models, and for the automatic generation of new features and descriptors to feed these models with. On the theoretical front, considerable effort will be devoted to the development, implementation and generalization of standard topological data analysis techniques, which (for the most part) can only study the topological variations of at most one parameter (such as the data scale), so as to make them suitable for the study of the topological variations of several parameters jointly (such as density and scale, marker genes). Then, the focus will be on specific applications of these new multiparameter topological data analysis methods for machine learning models, in areas where topological data analysis is known to be relevant and efficient. More precisely, we will emphasize the usefulness of our new tools on data sets from cosmology (large-scale structures of the Universe) and biology (single-cell sequencing, mass cytometry).
PEPR SN
Participants: Mathieu Carrière.
- Acronym: AI4scMED.
- Type: Work package in PEPR SN
- Title: Multiscale AI for single-cell based precision medicine
- Coordinator: Mathieu Carrière
- Duration: 2023–2027
- Abstract: Cell-based precision medicine holds revolutionary potential for healthcare, but realizing its full potential demands a deep understanding of disease variability and multiscale aspects. Single-cell (sc) multi-omics offers a unique way to obtain molecular profiles of individual cells and predict disease trajectories. To harness this complexity, new AI breakthroughs are needed. Our consortium will tackle methodological challenges to bridge the gap between sc data and personalized treatments, resolving cell type differences and integrating sc-multi-omics with imaging for spatial insights.
Addressing the complexity of the human body and combining genomics with other assays, we will develop AI-based methods to handle, integrate, analyze, and visualize multiscale complexity in diseases. Our developments will leverage cutting-edge AI for sc-genomic data analysis. To infer causal mechanisms at different levels, we will use causal/logical/stochastic modeling to integrate heterogeneous data and account for temporal scales and biophysical priors.
We will create network inference methods to understand molecular mechanisms in clinical samples, identifying key genes and predicting therapeutic impacts. Precision medicine must also integrate variability across different cell decision levels. We aim to build predictive models, digital twins, to enable data-driven personalized treatments by connecting intracellular dynamics, biochemical processes, cell populations, and tissue-level organization.
10.3.2 Collaboration with other national research institutes
Confiance.ai / IRT SystemX
Participants: Frédéric Chazal.
Research collaboration on anomaly detection for multivariate time series using TDA and ML approaches.
11 Dissemination
11.1 Promoting scientific activities
11.1.1 Scientific events: organisation
- Clément Maria was co-organizer of the QuantAzur Days, Nice, November 2025.
- Nina Otter was co-organiser of the conference “Topological methods for time-varying data: theory and applications (TopTime)”, at the Australian National University, Canberra, Australia, November 2025.
- Nina Otter was co-organiser of the workshop “Higher-order interactions at the crossroads of geometry and topology”, Laboratoire de Mathématiques d'Orsay, December 2025.
11.1.2 Scientific events: selection
Member of the conference program committees
- Clément Maria was a member of the program committee of the 43rd International Symposium on Theoretical Aspects of Computer Science (STACS) 2026.
- Nina Otter was a member of the program committee of the 11th ATMCS conference (2025).
- Nina Otter was a member of the program committee of the Applied Category Theory (ACT) 2025 Conference.
11.1.3 Journal
Member of the editorial boards
- Frédéric Chazal is a member of the following journal editorial boards: Discrete and Computational Geometry (Springer), Journal of Applied and Computational Topology (Springer).
- Frédéric Chazal is the Editor-in-Chief of the Journal of Applied and Computational Topology (Springer).
11.1.4 Leadership within the scientific community
- Frédéric Chazal is a member of the Scientific Advisory Board of the Centre for Topological Data Analysis of the Mathematical Institute at Oxford.
- Frédéric Chazal is a member of the Scientific Council of EMAp (Escola de Matemática Aplicada da Fundação Getulio Vargas), Rio de Janeiro, Brazil.
- Mathieu Carrière is a chair holder of the 3IA Institute at Université Côte d'Azur.
11.1.5 Scientific expertise
- Frédéric Chazal is a member of the “commission prospective de l’I2M” (Institut de Mathématiques de Marseille).
- Clément Maria was a jury member for the UCA-DS4H PhD grant allocation scheme for 2025.
- Nina Otter is a member of the executive committee of the DataIA institute.
11.1.6 Research administration
- Marc Glisse is president of the CDT at Inria Saclay.
- Frédéric Chazal was jointly responsible for the “programme Mathématiques et IA” of the Fondation Mathématique Jacques Hadamard, Paris-Saclay University (until Oct. 2025).
- Frédéric Chazal is a member of the council of the Graduate School in Mathematics, Paris-Saclay Univ.
- Clément Maria is jointly responsible for the CNRS working group (Groupe de Travail) GeoAlgo.
- Clément Maria is a member of the steering committee of the QuantAzur federative institute.
11.2 Teaching - Supervision - Juries - Educational and pedagogical outreach
11.2.1 Teaching
- Master: Mathijs Wintraecken, Introduction to Scientific Research, 2h eq-TD, mineure DS4H (Master and PhD)
- Master: Frédéric Chazal, Analyse Topologique des Données, 30h eq-TD, Université Paris-Saclay, France.
- Master: Frédéric Chazal and Julien Tierny, Topological Data Analysis, 38h eq-TD, M2, Mathématiques, Vision, Apprentissage (MVA), ENS Paris-Saclay, France.
- Master: Mathieu Carrière, Basic Algebra for Data Analysis, 18h eq-TD, MSc DSAI, Université Côte d'Azur
- Master: Mathieu Carrière and Frédéric Cazals, Geometric and Topological Methods in Data Analysis, with Applications in Biology and Medicine, 15h eq-TD, MSc DSAI, Université Côte d'Azur
- Master: Mathieu Carrière, Statistical Learning Theory, 15h eq-TD, MSc DSAI, Université Côte d'Azur
- PSL doctoral course: Eddie Aamari, Frédéric Chazal, Alejandro Saldarriaga, 1 week, Topological Data Analysis.
- Mini-course at the Young Topologists Meeting 2025: Frédéric Chazal, Persistent homology for machine learning: a measure perspective, 6h, KTH Stockholm.
- Master: Marc Glisse, Conception et analyse d'algorithmes, 44h eq-TD, M1, École Polytechnique, France.
11.2.2 Supervision
- PhD in progress: Rohit Roy, Triangulating stratified spaces. Started in November 2025. Mathijs Wintraecken and Pierre Alliez (TITANE).
- PhD in progress: Myriam Frikha, Domain adaptation for temporal data. Started in Nov. 2024. Frédéric Chazal.
- PhD in progress: Jérôme Taupin, Density-based metric learning and applications in Topological Data Analysis. Started in Sept. 2025. Frédéric Chazal.
- PhD in progress: Anna Hollands, Persistent path-homology for directed-graph analysis: Statistical aspects and applications to machine learning. Started in Oct. 2025. Frédéric Chazal and Bertrand Michel.
- PhD in progress: Alejandro Saldarriaga, Topological Deep Learning. Started in Nov. 2025. Eddie Aamari (ENS Paris) and Frédéric Chazal.
- PhD in progress: Henrique Ennes, Topological approach to quantum complexity. Started in Oct. 2023. Clément Maria and Nicolas Nisse (Inria).
- PhD in progress: António Leitao, Persistent homology of cover refinements and applications to XAI. Started in November 2024. Nina Otter and Fosca Giannotti (Scuola Normale Superiore di Pisa).
11.2.3 Juries
- Marc Glisse was the external reviewer for the PhD defense of Dominic Desjardins Côté, Université de Sherbrooke, Canada.
- Blanche Buet was a member of the PhD Defense jury of Rémi Mougenot (12/2025), Université de Lorraine.
- Mathieu Carrière was a member of the PhD defense juries of Mohamed Kissi (10/2025), Université Paris-Sorbonne, and Rayna Andreeva (05/2025), University of Edinburgh.
- Nina Otter was a member of the PhD defense jury of Andreas Abildtrup Hansen (09/2025), The Technical University of Denmark.
11.2.4 Productions (articles, videos, podcasts, serious games, ...)
- Clément Maria: researcher portrait, Street Science exhibition in Nice.
- Clément Maria: article “Mapping the algorithmic complexity of topological quantum computing” in INSIGHTS, the magazine of the Université Côte d'Azur IdEx.
11.2.5 Participation in Live events
- Blanche Buet participated in the Fête de la Science (at IMO, 10/2025) and in RJMI (at ENS Paris-Saclay, 11/2025). Blanche Buet gave a popularization seminar to L3 students (at IMO, 12/2025). Blanche Buet is part of the organizing committee of the FMJH Welcome days for masters (at IMO, 09/2025).
11.2.6 Other relevant science outreach activities
- Frédéric Chazal gave a general audience introductory presentation on Artificial Intelligence at Université pour Tous de Bourgogne (March 2025).
- Frédéric Chazal participated in round tables and gave talks on different aspects of AI at the Académie du Renseignement.
12 Scientific production
12.1 Major publications
- 1 article. Domain Generalization by Marginal Transfer Learning. Journal of Machine Learning Research 22(2), 2021, 1-55.
- 2 article. The Topological Correctness of PL Approximations of Isomanifolds. Foundations of Computational Mathematics 22, July 2021, 967-1012.
- 3 article. Mean curvature motion of point cloud varifolds. ESAIM: Mathematical Modelling and Numerical Analysis 56(5), 2022, 1773-1808.
- 4 inproceedings. Multiparameter Persistence Images for Topological Machine Learning. NeurIPS 2020 - 33rd Conference on Neural Information Processing Systems, Vancouver / Virtual, Canada, December 2020.
- 5 inproceedings. Optimizing persistent homology based functions. ICML 2021 - 38th International Conference on Machine Learning, PMLR 139, Proceedings of the 38th International Conference on Machine Learning, ICML 2021, Virtual conference, United States, July 2021, 1294-1303.
- 6 article. Lexicographic Optimal Homologous Chains and Applications to Point Cloud Triangulations. Discrete and Computational Geometry 68, September 2022.
- 7 article. Compressive Statistical Learning with Random Feature Moments. Mathematical Statistics and Learning 3(2), August 2021, 113–164.
- 8 article. A Polynomial-Time Algorithm to Compute Turaev–Viro Invariants of 3-Manifolds with Bounded First Betti Number. Foundations of Computational Mathematics 20(5), November 2019, 1013-1034.
- 9 article. Alpha magnitude. Journal of Pure and Applied Algebra 227(11), November 2023, 107396.
12.2 Publications of the year
International journals
International peer-reviewed conferences
Reports & preprints
Other scientific publications