DATASHAPE

DATASHAPE - 2025

2025 Activity Report, Project-Team DATASHAPE

RNSR: 201622050C

Research centers: Inria Saclay Centre at Université Paris-Saclay; Inria Centre at Université Côte d'Azur
In partnership with: Université Paris-Saclay, CNRS
Team name: Understanding the shape of data
In collaboration with: Laboratoire de mathématiques d'Orsay de l'Université de Paris-Saclay (LMO)

Creation of‌ the Project-Team: 2020 October 01

Each year, Inria‌ research teams publish an Activity Report presenting their‌ work and results over the reporting period. These‌ reports follow a common structure, with some optional‌ sections depending on the specific team. They typically‌ begin by outlining the overall objectives and research‌ programme, including the main research themes, goals, and‌ methodological approaches. They also describe the application domains‌ targeted by the team, highlighting the scientific or‌ societal contexts in which their work is situated.‌

The reports then present the highlights of the‌ year, covering major scientific achievements, software developments, or‌ teaching contributions. When relevant, they include sections on‌ software, platforms, and open data, detailing the tools‌ developed and how they are shared. A substantial‌ part is dedicated to new results, where scientific‌ contributions are described in detail, often with subsections‌ specifying participants and associated keywords.

Finally, the Activity‌ Report addresses funding, contracts, partnerships, and collaborations at‌ various levels, from industrial agreements to international cooperations.‌ It also covers dissemination and teaching activities, such‌ as participation in scientific events, outreach, and supervision.‌ The document concludes with a presentation of scientific‌ production, including major publications and those produced during‌ the year.

Keywords

Computer Science and Digital Science‌

A3. Data and knowledge
A3.4. Machine learning and‌ statistics
A7.1. Algorithms
A8. Mathematics of computing
A8.1.‌ Discrete mathematics, combinatorics
A8.3. Geometry, Topology
A9. Artificial‌ intelligence

1 Team members, visitors, external collaborators

Research Scientists‌

Frederic Chazal [Team leader, INRIA,‌ Senior Researcher, HDR]
Jean-Daniel Boissonnat [‌INRIA, Emeritus, HDR]
Mathieu Carrière‌ [INRIA, Researcher]
David Cohen-Steiner [‌INRIA, Researcher]
Marc Glisse [INRIA‌, Researcher]
Clément Maria [INRIA,‌ Researcher]
Nina Lisann Otter [INRIA,‌ ISFP]
Mathijs Wintraecken [INRIA, ISFP‌]

Faculty Members

Gilles Blanchard [UNIV PARIS‌ SACLAY, Associate Professor]
Charly Boricaud [‌UNIV PARIS SACLAY, Professor, until Sep‌ 2025]
Blanche Buet [UNIV PARIS SACLAY‌, Associate Professor]
Remi Leclercq [UNIV PARIS SACLAY, Professor‌]
Pierre Pansu [‌UNIV PARIS SACLAY,‌‌ Emeritus]

Post-Doctoral Fellows

Daniele Cannarsa [INRIA‌, Post-Doctoral Fellow,‌ until Aug 2025]‌‌
Francesco Conti [INRIA, Post-Doctoral Fellow]‌
Ondrej Draganov [INRIA‌, Post-Doctoral Fellow,‌‌ from Apr 2025]
Corentin Lunel [INRIA‌, Post-Doctoral Fellow,‌ until Sep 2025]‌‌
Renata Turkes [INRIA, Post-Doctoral Fellow,‌ until Aug 2025]‌

PhD Students

Myriam Frikha‌‌ [ERICSSON]
Hugo Henneuse [UNIV PARIS‌ SACLAY, until Oct‌ 2025]
Anna Hollands‌‌ [UNIV PARIS SACLAY, from Oct 2025‌]
Antonio Lage De‌ Sousa Leitao [Scuola‌‌ Normale Superiore di Pisa, Italy, from Nov 2024‌]
Henrique Lovisi Ennes‌ [UNIV COTE D'AZUR‌‌, from May 2025]
Rohit Roy [‌INRIA, from Nov‌ 2025]
Alejandro Saldarriaga‌‌ [DMA-ENS, from Oct 2025]
Jérôme‌ Taupin [Université Paris-Saclay‌, from Sep 2025‌‌]

Technical Staff

Vincent Rouvreau [INRIA,‌ Engineer]
Hannah Schreiber‌ [INRIA, Engineer‌‌]

Interns and Apprentices

Ludo Andrianirina Mamisoa [‌UNIV COTE D'AZUR,‌ Intern, from Mar‌‌ 2025 until Aug 2025]
Nestor Antunano Cabrera‌ [INRIA, Intern‌, from Apr 2025‌‌ until Aug 2025]
Madhav Cherupilil Sajeev [‌UNIV COTE D'AZUR,‌ Intern, from Apr‌‌ 2025 until Aug 2025]
Alberto Conforti [‌INRIA, Intern,‌ from Mar 2025 until‌‌ Aug 2025]
Beatriz Evelbauer [ENSTA,‌ from May 2025 until‌ Aug 2025]
Anna‌‌ Hollands [INRIA, Intern, from Apr‌ 2025 until Sep 2025‌]
Ilian Riveiro [Université Paris-Saclay, Intern, from Mar 2025 until Aug 2025]
Aurora‌‌ Rivet [UNIV COTE D'AZUR, Intern,‌ from Jun 2025 until‌ Aug 2025]
Jérôme Taupin [ENS Paris, until Aug 2025]

Administrative Assistants

Sophie‌ Honnorat [INRIA]‌‌
Laetitia Jubely [INRIA, from May 2025‌]

Visiting Scientists

Marzieh‌ Eidi [MPI MiS,‌‌ Germany, from Apr 2025 until May 2025‌]
Yuri Gardinazzi [‌UNIV TRIESTE, from‌‌ Nov 2025]
Clément Levrard [UNIV PARIS‌, from May 2025‌ until May 2025]‌‌
Javier Perera Lago [Univ Séville, from‌ May 2025 until Jun‌ 2025]

External Collaborators‌‌

Bertrand Michel [CENTRALE NANTES]
Martin Royer‌ [SYSTEMX]

2‌ Overall objectives

Over the last two decades, building on solid theoretical and algorithmic foundations, geometric inference and computational topology have developed significantly towards data analysis. New mathematically well-founded theories gave birth to the field of Topological Data Analysis (tda), which now attracts interest from both academia and industry. Although geometric approaches to data analysis can be traced back quite far, tda really started as a field with the pioneering works of H. Edelsbrunner et al. and G. Carlsson et al. on persistent homology at the beginning of the century. tda is mainly motivated by the idea that topology and geometry provide a powerful approach to infer robust qualitative, and sometimes quantitative, information about the structure of data. It aims at providing mathematical results and methods to infer, analyze and exploit the structure of complex data (point clouds, graphs, images, 3D shapes, time series, etc.). It also intends to give access to robust and efficient data structures and algorithms that represent these data and are amenable to precise analysis.

The overall objective of‌ DataShape is three-fold:

to settle the mathematical, statistical‌ and algorithmic foundations of tda, and, more‌ generally to contribute to the development of topological‌ and geometric approaches in Machine Learning and AI;‌
to develop a new family of well-founded and‌ efficient data structures, algorithms and methods to uncover‌ and exploit the geometry of data through the‌ development of a state-of-the-art and easy-to-use open source‌ software;
to disseminate and promote tda research and outcomes among the data science community through collaborations with other domains of science and with industry.

The approach of DataShape relies on the conviction that, to reach these objectives, combining statistical, topological/geometric and computational approaches in a common framework is mandatory. For that purpose, DataShape became a joint team with the Laboratoire de Mathématiques d'Orsay in 2020 and now gathers a wide variety of expertise, ranging from fundamental mathematics to software development and industrial applications. The team also considers that tda needs to be combined with other data science approaches and tools, in particular statistical learning, to lead to successful real applications. Significant efforts have been made during the evaluation period to develop several long-term industrial research collaborations in data science and AI.

The research program‌ of DataShape is organized around four strongly correlated‌ axes reflecting our will to address tda challenges‌ in a global and unified framework.

The first‌ axis focuses on the algorithmic aspects of tda‌ and geometric inference as well as the mathematical‌ foundations of the fields. Fundamental problems are the‌ construction, processing and analysis of discrete representations of‌ complex and possibly high dimensional shapes.

The second axis is dedicated to the statistical aspects of tda. It studies the properties of topological information inferred from data from a statistical perspective and intends to propose new models and approaches for the development of tda in well-founded probabilistic and statistical settings. This axis also includes the analysis and development of general-purpose statistical learning approaches and tools that are currently active in the community and of relevance for DataShape's scientific goals.

The third axis‌ is driven by the problems raised by the‌ use of topological and geometric approaches in machine‌ learning. It aims at better understanding the‌ role of topological and geometric structures in machine‌ learning problems and at applying tda tools to‌ develop specialized topological approaches to be used in combination with other machine‌ learning methods.

The fourth axis is dedicated to software development and experimental research, mainly through the GUDHI platform. GUDHI is intended to provide a high-quality, state-of-the-art implementation of data structures and algorithms dedicated to tda through easy-to-use open source software.

Each DataShape member is involved in several research axes, ensuring strong connections and interactions between them. Finally, although the four axes above concentrate the main research activities of the team, DataShape always remains open and encourages its members to explore new directions and approaches related to geometric and topological methods in data analysis and machine learning. Past experience has shown that such a strategy is often very fruitful and may lead to innovative research directions.

3 Research program‌

3.1 Algorithmic aspects and‌ new mathematical directions for‌‌ topological and geometric data analysis

tda requires constructing and manipulating appropriate representations of complex and high-dimensional shapes. A major difficulty comes from the fact that the complexity of data structures and algorithms used to approximate shapes grows rapidly as the dimensionality increases, which makes them intractable in high dimensions. We focus our research on simplicial complexes, which offer a convenient representation of general shapes and generalize graphs and triangulations. Our work includes the study of simplicial complexes with good approximation properties and the design of compact data structures to represent them.

In low dimensions, effective shape reconstruction techniques exist that can provide precise geometric approximations very efficiently and under reasonable sampling conditions. Extending those techniques to higher dimensions, as is required in the context of tda, is problematic since almost all methods in low dimensions rely on the computation of a subdivision of the ambient space. A direct extension of those methods would immediately lead to algorithms whose complexities depend exponentially on the ambient dimension, which is prohibitive in most applications. A first direction to bypass the curse of dimensionality is to develop algorithms whose complexities depend on the intrinsic dimension of the data (which most of the time is small although unknown) rather than on the dimension of the ambient space. Another direction is to resort to cruder approximations that only capture the homotopy type or the homology of the sampled shape. The recent theory of persistent homology provides a powerful and robust tool to study the homology of sampled spaces in a stable way.
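To make the idea of studying the homology of a sampled space concrete, here is a minimal sketch of zero-dimensional persistence (connected components) for the Rips filtration of a finite point cloud, computed with a union-find structure. It is an illustration under simple assumptions and not the team's implementation: the function name `zero_dim_persistence` and the sample point cloud are hypothetical, and only degree 0 is handled.

```python
import itertools
import math


def zero_dim_persistence(points):
    """Return the finite (birth, death) pairs of H0 for the Rips filtration.

    Every point is born at scale 0. A connected component dies at the length
    of the edge that merges it into another component (Kruskal-style sweep).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length (their filtration value).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )

    pairs = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj              # two components merge here
            pairs.append((0.0, length))  # one of them dies at this scale
    return pairs


# Two well-separated clusters of two points each.
cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
diagram = zero_dim_persistence(cloud)
```

In this example the two short-lived pairs correspond to points merging inside each cluster, while the long-lived pair reflects the separation between the two clusters; the component that never dies is not listed.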

3.2 Statistical aspects of topological and‌ geometric data analysis

The growing variety and size of available data, often corrupted by noise and outliers, requires considering the statistical properties of their topological and geometric features and proposing new relevant statistical models for their study.

There exist various statistical and machine learning methods intending to uncover the geometric structure of data. Manifold learning and dimensionality reduction approaches generally do not make it possible to assess the relevance of the inferred topological and geometric features and are not well-suited for the analysis of complex topological structures. Beyond these, set estimation methods intend to estimate, from random samples, a set around which the data is concentrated. In these methods, which include support and manifold estimation, principal curves/manifolds and their various generalizations, to name a few, the estimation problems are usually considered under losses, such as the Hausdorff distance or the symmetric difference, that are not sensitive to the topology of the estimated sets, preventing these tools from directly inferring topological or geometric information.
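The remark that the Hausdorff distance is not sensitive to topology can be illustrated with a small sketch. The code below computes the Hausdorff distance between a sample of the full unit circle and a sample of the same circle with a small gap; the helper names (`hausdorff`, `circle_sample`) and the sample sizes are illustrative assumptions. The underlying shapes differ topologically (a closed loop versus an arc), yet the distance between the samples is small.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point of `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def circle_sample(n, skip=0):
    """n evenly spaced points on the unit circle, dropping the first `skip`."""
    return [
        (math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
        for k in range(skip, n)
    ]


full_circle = circle_sample(200)       # sample of a closed loop
open_arc = circle_sample(200, skip=3)  # same sample with a small gap

gap = hausdorff(full_circle, open_arc)
```

Here `gap` is of the order of the sampling step, so a loss based on this distance alone would treat the two samples as nearly identical.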

Regarding purely topological features, the statistical estimation of‌ homology or homotopy type of compact subsets of‌ Euclidean spaces, has only been considered recently, most‌ of the time under the quite restrictive assumption‌ that the data are randomly sampled from smooth‌ manifolds.

In a more general setting, with the emergence of new geometric inference tools based on the study of distance functions and of algebraic topology tools such as persistent homology, computational topology has recently undergone an important development, offering a new set of methods to infer relevant topological and geometric features of data sampled in general metric spaces. The use of these tools remains largely heuristic, and until recently there were only a few preliminary results establishing connections between geometric inference, persistent homology and statistics. However, this direction has attracted a lot of attention over the last three years. In particular, stability properties and new representations of persistent homology information have led to very promising results to which the DataShape members have significantly contributed. These preliminary results open many perspectives and research directions that need to be explored.

Our goal is to build on our‌ first statistical results in tda to develop the‌ mathematical foundations of Statistical Topological and Geometric Data‌ Analysis. Combined with the other objectives, our ultimate‌ goal is to provide a well-founded and effective‌ statistical toolbox for the understanding of topology and‌ geometry of data.

3.3 Topological and geometric approaches‌ for machine learning

This objective is driven by‌ the problems raised by the use of topological‌ and geometric approaches in machine learning. The goal‌ is both to use our techniques to better‌ understand the role of topological and geometric structures‌ in machine learning problems and to apply our‌ tda tools to develop specialized topological approaches to‌ be used in combination with other machine learning‌ methods.

3.4 Experimental research and software development

We‌ develop a high quality open source software platform‌ called gudhi which is becoming a reference in‌ geometric and topological data analysis in high dimensions.‌ The goal is not to provide code tailored‌ to the numerous potential applications but rather to‌ provide the central data structures and algorithms that‌ underlie applications in geometric and topological data analysis.‌

The development of the gudhi platform also serves‌ to benchmark and optimize new algorithmic solutions resulting‌ from our theoretical work. Such development necessitates a‌ whole line of research on software architecture and interface design, heuristics and‌ fine-tuning optimization, robustness and‌ arithmetic issues, and visualization.‌‌ We aim at providing a full programming environment‌ following the same recipes‌ that made up the‌‌ success story of the cgal library, the reference‌ library in computational geometry.‌

Some of the algorithms implemented on the platform will also be interfaced to other software platforms, such as the R software for statistical computing, and languages such as Python, in order to make them usable in combination with other data analysis and machine learning tools. A first attempt in this direction has been made with the creation of an R package called TDA, in collaboration with the group of Larry Wasserman at Carnegie Mellon University (Inria Associated team CATS), which already includes some functionalities of the gudhi library and implements some joint results between our team and the CMU team. A similar interface with the Python language is also considered a priority. To go even further towards helping users, we will provide utilities that perform the most common tasks without requiring any programming at all.

4 Application domains

Our work‌ is mostly of a‌ fundamental mathematical and algorithmic‌‌ nature but finds a variety of applications in‌ data analysis, e.g., in‌ material science, biology, sensor‌‌ networks, 3D shape analysis and processing, to name‌ a few.

More specifically,‌ DataShape has developed and‌‌ is still developing a strong expertise on new‌ TDA methods for Machine‌ Learning and Artificial Intelligence‌‌ for complex data and (complex) time-dependent data. This‌ includes, for example:

domain‌ adaptation problems for time‌‌ series (PhD of Myriam Frikha with Ericsson),
robust climate modelling (collaboration with University of Oxford),
anomaly detection (with IRT SystemX and the Confiance.AI program),
the‌ statistical significance of biological‌ phenomena (cell cycle, stem‌‌ cell differentiation, immune system responses) that occur in‌ large scale single-cell RNAseq‌ and spatial transcriptomics data‌‌ sets (collaboration with Rizvi Lab, University of Wisconsin),‌
the analysis of gene‌ regulatory networks for plant-pathogen‌‌ interactions (collaboration with INRAE): internship of Ludo Andrianirina,‌
the analysis of satellite imaging and cartography data sets (collaboration with Thales Alenia Space).

5 Social‌ and environmental responsibility

5.1‌ Footprint of research activities‌‌

The weekly research seminar of DataShape now takes place in hybrid mode. Travel by team members has decreased substantially in recent years in order to reduce the environmental footprint of the team. Travelling by train rather than by plane is strongly encouraged whenever possible.

6‌‌ Highlights of the year

6.1 Awards

Test of Time Award of the Symposium on Computational Geometry (SoCG) for: David Cohen-Steiner, Herbert Edelsbrunner, John Harer: Stability of persistence diagrams, SoCG 2005.
One of the Prix d'excellence de l'Université Côte d'Azur awarded to David Cohen-Steiner.
Best‌‌ figure award (over the last 5 years) at‌ the Symposium on Computational‌ Geometry (SoCG) for: Tight‌‌ Bounds for the Learning of Homotopy à la‌ Niyogi, Smale, and Weinberger‌ for Subsets of Euclidean‌‌ Spaces and of Riemannian‌ Manifolds. Dominique Attali, Hana Dal Poz Kouřimská, Christopher‌ Fillmore, Ishika Ghosh, André Lieutier, Elizabeth Stephenson, Mathijs‌ Wintraecken, SoCG 2024.
Best paper award (honorable mention)‌ at SIGGRAPH ASIA 2025 for: Efficient and scalable‌ spatial regularization of optimal transport by Lucas Brifault,‌ David Cohen-Steiner and Mathieu Desbrun.

6.2 PhD defenses‌

Hugo Henneuse, supervised by F. Chazal and P.‌ Massart. June 13, 2025.
Charly Boricaud, supervised by‌ B. Buet and S. Masnou. December 9, 2025.‌

7 Latest software developments, platforms, open data

In‌ 2025 we developed new software, which we discuss‌ below.

7.1 Latest software developments

7.1.1 GUDHI

Name:‌
Geometric Understanding in Higher Dimensions
Keywords:
Computational geometry,‌ Topology, Clustering
Scientific Description:

The Gudhi library is‌ an open source library for Computational Topology and‌ Topological Data Analysis (TDA). It offers state-of-the-art algorithms‌ to construct various types of simplicial complexes, data‌ structures to represent them, and algorithms to compute‌ geometric approximations of shapes and persistent homology.

The‌ GUDHI library offers the following interoperable modules:

. Complexes:
  + Cubical
  + Simplicial: Rips, Witness, Alpha and Čech complexes
  + Cover: Nerve and Graph induced complexes
. Data structures and basic operations:
  + Simplex tree, Skeleton blockers and Toplex map
  + Construction, update, filtration and simplification
. Topological descriptors computation
. Manifold reconstruction
. Topological descriptors tools:
  + Bottleneck and Wasserstein distance
  + Statistical tools
  + Persistence diagram and barcode
Functional Description:‌
The GUDHI open source library provides the central data structures and algorithms that underlie applications in geometry understanding in higher dimensions. It is intended both to help the development of new algorithmic solutions inside and outside the project, and to facilitate the transfer of results to applied fields.
News of the Year:

Below is a‌ list of changes made since GUDHI 3.10.1:

- Delaunay complex
  . The Delaunay complex can be equipped with different filtrations:
    * Delaunay complex (no filtration values computed)
    * Delaunay-Čech complex (using minimal enclosing ball)
    * Alpha complex (moved to this new section)
  . The Delaunay-Čech and Alpha complexes can output squared or non-squared filtration values
  . An incremental version of the Delaunay complex (only in C++)

- Rips complex persistence scikit-learn like interface
  . A binding to Ripser when it accelerates the computation

- Persistence graphical tools
  . Can now handle the outputs of scikit-learn like interfaces as inputs

- Simplex tree
  . Can now store additional data on each simplex (only in C++)
  . Can be const
URL:
https://gudhi.inria.fr/
Publication:‌
hal-01108461
Contact:
Marc Glisse
Participants:
Marc Glisse, Hannah‌ Schreiber, 17 anonymous participants
Partners:
Université Côte d'Azur‌ (UCA), Fujitsu
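To give a feel for the simplex tree data structure listed among the GUDHI modules, here is a toy sketch in pure Python. It is not GUDHI code and does not reproduce its API: the class `ToySimplexTree` and its methods are hypothetical names. The idea shown is that a filtered simplicial complex can be stored as a trie in which each root-to-node path spells out the sorted vertices of one simplex.

```python
import itertools


class ToySimplexTree:
    """Toy trie of simplices; each root-to-node path is a sorted simplex."""

    def __init__(self):
        # Each level maps a vertex label to [filtration_value, children].
        self.root = {}

    def insert(self, simplex, filtration=0.0):
        """Insert a simplex together with all of its faces.

        A new face receives `filtration`; an existing face keeps the smaller
        of its current value and `filtration`.
        """
        vertices = sorted(simplex)
        # Faces are visited by increasing size, so prefixes always exist.
        for k in range(1, len(vertices) + 1):
            for face in itertools.combinations(vertices, k):
                node = self.root
                for v in face[:-1]:
                    node = node[v][1]
                entry = node.setdefault(face[-1], [filtration, {}])
                entry[0] = min(entry[0], filtration)

    def filtration(self, simplex):
        """Filtration value of a simplex, or None if it is absent."""
        node = self.root
        value = None
        for v in sorted(simplex):
            if v not in node:
                return None
            value, node = node[v]
        return value

    def simplices(self):
        """Yield (simplex, filtration_value) pairs."""
        stack = [((), self.root)]
        while stack:
            prefix, node = stack.pop()
            for v, (value, children) in node.items():
                yield prefix + (v,), value
                stack.append((prefix + (v,), children))


st = ToySimplexTree()
st.insert([0, 1, 2], filtration=2.0)  # a filled triangle and its faces
st.insert([0, 1], filtration=1.0)     # lowers the value of one edge
st.insert([2, 3], filtration=1.5)     # adds a new vertex and edge
num_simplices = sum(1 for _ in st.simplices())
```

The trie layout shares common prefixes between simplices, which is what makes this kind of structure compact in practice; the real library adds many operations (cofaces, expansion, persistence) that this sketch omits.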

7.1.2 Multipers

Name:
Multiparameter Persistence for‌ Machine Learning
Keywords:
Topology, Machine learning
Functional Description:‌
multipers is a Python library for Topological Data‌ Analysis, focused on Multiparameter Persistence computation and visualizations‌ for Machine Learning. It features several efficient computational‌ and visualization tools, with integrated, easy to use,‌ auto-differentiable Machine Learning pipelines, that can be seamlessly‌ interfaced with scikit-learn and PyTorch. This library is meant to be usable‌ for non-experts in Topological‌ or Geometrical Machine Learning.‌‌ Performance-critical functions are implemented in C++ or in‌ Cython, are parallelizable with‌ TBB, and have Python‌‌ bindings and interface. It can handle a very‌ diverse range of datasets‌ that can be framed‌‌ into a (finite) multi-filtered simplicial or cell complex,‌ including, e.g., point clouds,‌ graphs, time series, images,‌‌ etc.
URL:
https://davidlapous.github.io/multipers/
Publication:
hal-04801544
Contact:
David Loiseaux‌
Participants:
Hannah Schreiber, an‌ anonymous participant

8 New‌‌ results

8.1 Algorithmic aspects and new mathematical directions‌ for topological and geometric‌ data analysis

8.1.1 Efficient‌‌ and Scalable Spatial Regularization of Optimal Transport

Participants:‌ David Cohen-Steiner.

In‌ collaboration with Lucas Brifault‌‌ (Dassault Systèmes) and Mathieu Desbrun (GEOMERIX)

In this‌ paper (18),‌ we introduce a novel‌‌ approach to spatial regularization of optimal transport problems.‌ Based on the notion‌ of forward and backward‌‌ "mean maps" of a transport plan, we introduce‌ a convex formulation of‌ optimal transport problems that‌‌ incorporates regularization of these mean maps to promote‌ spatial continuity of the‌ resulting optimal plan. Unlike‌‌ previous regularization approaches that required the optimization of‌ all the transport plan‌ coefficients, our formulation translates‌‌ into an ADMM-based solver combined with Sinkhorn type‌ algorithms, which drastically reduces‌ the number of variables‌‌ and scales up to large problems. We demonstrate‌ the usefulness and efficiency‌ of this new computational‌‌ tool for various applications and for different regularizations.‌
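For readers unfamiliar with the Sinkhorn-type iterations mentioned above, the following is a minimal sketch of the classical Sinkhorn scaling for entropy-regularized optimal transport between two small histograms. It does not reproduce the paper's ADMM solver or its mean-map regularization; the function name `sinkhorn`, the value of `epsilon` and the toy histograms are assumptions chosen for illustration.

```python
import math


def sinkhorn(a, b, cost, epsilon=1.0, iterations=500):
    """Entropic optimal transport between histograms `a` and `b`.

    Returns the plan P = diag(u) K diag(v), with K = exp(-cost / epsilon),
    obtained by alternately rescaling rows and columns.
    """
    n, m = len(a), len(b)
    kernel = [
        [math.exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)
    ]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iterations):
        # Rescale rows to match a, then columns to match b.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Histograms supported on the points 0, 1, 2 with squared-distance cost.
source = [0.6, 0.3, 0.1]
target = [0.1, 0.3, 0.6]
cost = [[float((i - j) ** 2) for j in range(3)] for i in range(3)]

plan = sinkhorn(source, target, cost)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(3)) for j in range(3)]
```

Note that this baseline still manipulates every coefficient of the plan, which is the scaling issue the formulation described above is designed to avoid.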

8.1.2 Burning or Collapsing‌ the Medial Axis is‌‌ Unstable

Participants: Mathijs Wintraecken.

In collaboration with‌ Erin Chambers (University of‌ Notre Dame, USA), Christopher‌‌ Fillmore (Institute of Science and Technology Austria), Elizabeth‌ Stephenson (Orteliu, Oslo, Norway)‌

The medial axis of‌‌ a set consists of the points in the‌ ambient space without a‌ unique closest point in‌‌ the original set. Since its introduction, the medial‌ axis has been used‌ extensively in many applications‌‌ as a method of computing a skeleton topologically‌ equivalent to the original‌ set. Unfortunately, one limiting‌‌ factor in the use of the medial axis‌ of a smooth manifold‌ is that it is‌‌ not necessarily topologically stable under small perturbations of‌ the manifold. To counter‌ these instabilities, various prunings‌‌ of the medial axis have been proposed in‌ the computational geometry community.‌ In this paper (‌‌12), we examine one type of pruning,‌ called burning. Because of‌ the good experimental results‌‌ it was hoped that the burning method of‌ simplifying the medial axis‌ would be stable. In‌‌ this work, we show a simple example that‌ dashes such hopes. Based‌ on Bing's house with‌‌ two rooms, we demonstrate an isotopy of a‌ shape where the medial‌ axis goes from collapsible‌‌ to non-collapsible. More precisely, we consider the standard‌ deformation retract from the‌ closed ball to Bing's‌‌ house with two rooms, but stop just short‌ of the point where‌ Bing's house becomes two‌‌ dimensional. This way we obtain an isotopy from‌ the 3-ball to a‌ thickened version of Bing's‌‌ house. Under this isotopy,‌ the medial axis goes from collapsible to non-collapsible.‌ We stress that this isotopy can be made‌ generic, in the sense of singularity theory, as‌ developed by Arnold and Thom.

8.1.3 Sparsification of‌ the generalized persistence diagrams for scalability through gradient‌ descent

Participant: Mathieu Carrière.

In collaboration with‌ Seunghyun Kim, Woojin Kim (KAIST, South Korea)

The‌ generalized persistence diagram (GPD) is a natural extension‌ of the classical persistence barcode to the setting‌ of multi-parameter persistence and beyond. The GPD is‌ defined as an integer-valued function whose domain is‌ the set of intervals in the indexing poset‌ of a persistence module, and is known to‌ be able to capture richer topological information than‌ its single-parameter counterpart. However, computing the GPD is‌ computationally prohibitive due to the sheer size of‌ the interval set. Restricting the GPD to a‌ subset of intervals provides a way to manage‌ this complexity, compromising discriminating power to some extent.‌ However, identifying and computing an effective restriction of‌ the domain that minimizes the loss of discriminating‌ power remains an open challenge.

In this work,‌ we introduce a novel method for optimizing the‌ domain of the GPD through gradient descent optimization.‌ To achieve this, we introduce a loss function‌ tailored to optimize the selection of intervals, balancing‌ computational efficiency and discriminative accuracy. The design of‌ the loss function is based on the known‌ erosion stability property of the GPD. We showcase‌ the efficiency of our sparsification method for dataset‌ classification in supervised machine learning. Experimental results demonstrate‌ that our sparsification method significantly reduces the time‌ required for computing the GPDs associated to several‌ datasets, while maintaining classification accuracies comparable to those‌ achieved using full GPDs. Our method thus opens‌ the way for the use of GPD-based methods‌ to applications at an unprecedented scale.

8.1.4 Multi-parameter‌ Module Approximation: An Efficient and Interpretable Invariant for‌ Multi-Parameter Persistence Modules with Guarantees

Participant: Mathieu Carrière‌.

In collaboration with David Loiseaux (Inria Saclay)‌ and Andrew J. Blumberg (Columbia University, USA)

Topological data analysis (TDA) is a rapidly growing area of data science, whose most common descriptor is persistent homology, which tracks the topological changes in growing families of subsets of the data set itself, called filtrations, and encodes them in an algebraic object, called a persistence module. The algorithmic and theoretical properties of persistence modules are now well understood in the single-parameter case, that is, when there is only one filtration (e.g., feature scale) to study. In contrast, much less is known in the multi-parameter case, where several filtrations (e.g., scale and density) are used simultaneously. Since multi-parameter persistence modules usually encode information that is invisible to their single-parameter counterparts, it is critical to build tractable proxies for them, ideally with some theoretical robustness guarantees. In this article, we introduce a new parameterized family of topological descriptors, taking the form of candidate decompositions, for multi-parameter persistence modules, and we identify a subfamily of these descriptors, that we call approximate decompositions, that are controllable approximations, in the sense that they preserve diagonal barcodes. Then, we introduce MMA (Multipersistence Module Approximation): an algorithm based on matching functions for computing instances of candidate decompositions with some precision parameter $\delta > 0$. By design, MMA can handle an arbitrary number of filtrations, and has bounded complexity and running time. Moreover, we prove the robustness of MMA: when computed with so-called compatible matching functions, we show that MMA produces approximate decompositions (and we prove that such matching functions exist for $n = 2$ filtrations). Next, we restrict our focus to modules that can be decomposed into interval summands. In that case, compatible matching functions always exist, and we show that, for small enough $\delta$, the approximate decompositions obtained with such compatible matching functions by MMA have an approximation error (in terms of the standard interleaving and bottleneck distances) that is bounded by $\delta$, and that reaches zero for an even smaller, positive precision $\delta_{\mathrm{exact}}$. Finally, we present empirical evidence validating that MMA has state-of-the-art performance and running time on several data sets.

8.1.5 A fast algorithm for the‌ Hecke representation of the‌ braid group, and applications‌‌ to the computation of the HOMFLY-PT polynomial and‌ the search for interesting‌ braids

Participant: Clément Maria‌‌.

In collaboration with Hoel Queffelec (CNRS -‌ Institut Montpelliérain Alexander Grothendieck).‌

Knot theory is an active field of mathematics, in which combinatorial and computational methods play an important role. One side of computational knot theory that has gained interest in recent years, both for complexity analysis and practical algorithms, is quantum topology and the computation of the topological invariants arising from the theory. In this article (40), we leverage the rigidity brought by the representation-theoretic origins of the quantum invariants for algorithmic purposes. We do so by exploiting braids and the algebraic properties of the braid group to describe, analyze, and implement a fast algorithm to compute the Hecke representation of the braid group. We apply this construction to design a parameterized algorithm to compute the HOMFLY-PT polynomial of knots, and demonstrate its interest experimentally. Finally, we combine our fast Hecke representation algorithm with Garside theory to implement a reservoir sampling search and find non-trivial braids with trivial Hecke representations with coefficients in $\mathbb{Z}/p\mathbb{Z}$. We find several such braids, in particular proving that the Hecke representation of $B_5$ with $\mathbb{Z}/2\mathbb{Z}$ coefficients is non-faithful.
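As a small illustration of braid group representations with coefficients in $\mathbb{Z}/p\mathbb{Z}$, the sketch below builds the unreduced Burau matrices of the generators of $B_4$, evaluated at a fixed value of the parameter $t$ modulo $p$. The Burau representation is used here only as a simple stand-in that factors through the Hecke algebra; it is not the full Hecke representation, and the code is not the algorithm of the article. The helper names and the choice $p = 5$, $t = 2$ are assumptions.

```python
def burau_generator(i, n, t, p):
    """Unreduced Burau matrix of sigma_i in B_n, evaluated at t modulo p."""
    m = [[int(r == c) for c in range(n)] for r in range(n)]
    # 2x2 block [[1 - t, t], [1, 0]] on rows/columns i-1 and i (i is 1-based).
    m[i - 1][i - 1] = (1 - t) % p
    m[i - 1][i] = t % p
    m[i][i - 1] = 1
    m[i][i] = 0
    return m


def matmul(x, y, p):
    """Matrix product modulo p."""
    n = len(x)
    return [
        [sum(x[r][k] * y[k][c] for k in range(n)) % p for c in range(n)]
        for r in range(n)
    ]


def add_scalar(x, s, p):
    """Return x + s * I modulo p."""
    n = len(x)
    return [
        [(x[r][c] + (s if r == c else 0)) % p for c in range(n)]
        for r in range(n)
    ]


p, t, n = 5, 2, 4
s1, s2, s3 = (burau_generator(i, n, t, p) for i in (1, 2, 3))
zero = [[0] * n for _ in range(n)]

# Braid relation: s1 s2 s1 = s2 s1 s2.
braid_ok = matmul(matmul(s1, s2, p), s1, p) == matmul(matmul(s2, s1, p), s2, p)
# Far commutativity: s1 s3 = s3 s1.
commute_ok = matmul(s1, s3, p) == matmul(s3, s1, p)
# Quadratic relation of Hecke type: (s1 - 1)(s1 + t) = 0.
quadratic_ok = matmul(add_scalar(s1, -1, p), add_scalar(s1, t, p), p) == zero
```

The three checks confirm that these matrices satisfy the braid relations together with a quadratic relation, which is what it means for the representation to factor through a Hecke algebra. Searching for non-trivial braids with trivial image, as described above, would amount to multiplying such matrices along braid words and comparing the result with the identity.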

8.1.6 On Sparse Representations of 3-Manifolds

Participant: Clément‌ Maria.

In collaboration‌ with Kristóf Huszár (TU‌‌ Graz).

3-manifolds are commonly represented as triangulations, consisting‌ of abstract tetrahedra whose‌ triangular faces are identified‌‌ in pairs. The combinatorial sparsity of a triangulation,‌ as measured by the‌ treewidth of its dual‌‌ graph, plays a fundamental role in the design‌ of parameterized algorithms. In‌ this work 36,‌‌ we investigate algorithmic procedures‌ that transform or modify a given triangulation while‌ controlling specific sparsity parameters. First, we describe a‌ linear-time algorithm that converts a given triangulation into‌ a Heegaard diagram of the underlying 3-manifold, showing‌ that the construction preserves treewidth. We apply this‌ construction to exhibit a fixed-parameter tractable framework for‌ computing Kuperberg's quantum invariants of 3-manifolds. Second, we‌ present a quasi-linear-time algorithm that retriangulates a given‌ triangulation into one with maximum edge valence of‌ at most nine, while only moderately increasing the‌ treewidth of the dual graph. Combining these two‌ algorithms yields a quasi-linear-time algorithm that produces, from‌ a given triangulation, a Heegaard diagram in which‌ every attaching curve intersects at most nine others.‌

8.1.7 Compressed data structures for Heegaard splittings

Participants: Henrique Ennes, Clément Maria.

Heegaard splittings‌ provide a natural representation of closed 3-manifolds by‌ gluing two handlebodies along a common surface. These‌ splittings can be equivalently given by two finite‌ sets of meridians lying on the surface, which‌ define a Heegaard diagram. In this work 34‌, we present a data structure to effectively‌ represent Heegaard diagrams as normal curves with respect‌ to triangulations of a surface, where the complexity‌ is measured by the space required to express‌ the normal coordinates' vectors in binary. This structure‌ can be significantly more compact than triangulations of‌ 3-manifolds, yielding exponential gains for certain families. Even‌ with this succinct definition of complexity, we establish‌ polynomial-time algorithms for comparing and manipulating diagrams, performing‌ stabilizations, detecting trivial stabilizations and reductions, and computing‌ topological invariants of the underlying manifolds, such as‌ their fundamental and homology groups. We also contrast‌ early implementations of our techniques with standard software‌ programs for 3-manifolds, achieving faster algorithms for the‌ average cases and exponential gains in speed for‌ some particular presentations of the inputs.

8.1.8 Hardness‌ of computation of quantum invariants on 3-manifolds with‌ restricted topology

Participants: Henrique Ennes, Clément Maria.

Quantum invariants in low-dimensional topology offer a wide variety of valuable invariants of knots and 3-manifolds, presented by explicit formulas that are readily computable. Their computational complexity has been actively studied and is tightly connected to topological quantum computing. In this article 21, we prove that for any 3-manifold quantum invariant in the Reshetikhin-Turaev model, there is a deterministic polynomial-time algorithm that, given as input an arbitrary closed 3-manifold $M$, outputs a closed 3-manifold $M'$ with the same quantum invariant, such that $M'$ is hyperbolic, contains no low-genus embedded incompressible surface, and is presented by a strongly irreducible Heegaard diagram. Our construction relies on properties of Heegaard splittings and the Hempel distance. At the level of computational complexity, this proves that the hardness of computing a given quantum invariant of 3-manifolds is preserved even when severely restricting the topology and the combinatorics of the input. This positively answers a question raised by Samperton.

8.1.9 Well-quasi-orders on embedded planar graphs

Participants: Clément Maria, Corentin Lunel.

The central theorem of topological graph theory states that the graph minor relation is a well-quasi-order on graphs. It has far-reaching consequences, in particular in the study of graph structures and the design of (parameterized) algorithms. In this article 39, we study two embedded versions of classical minor relations from structural graph theory and prove that they are also well-quasi-orders on general or restricted classes of embedded planar graphs. These embedded minor relations appear naturally for intrinsically embedded objects, such as knot diagrams and surfaces in $ℝ^{3}$. Handling the extra topological constraints of the embeddings requires careful analysis and extensions of classical methods for the more constrained embedded minor relations. We prove that the embedded version of immersion induces a well-quasi-order on bounded carving-width plane graphs by exhibiting particularly well-structured tree-decompositions and leveraging a classical argument on well-quasi-orders on forests. We deduce that the embedded graph minor relation defines a well-quasi-order on plane graphs via their directed medial graphs, when their branch-width is bounded. We conclude that the embedded graph minor relation is a well-quasi-order on all plane graphs, using classical grid theorems in the unbounded branch-width case.

8.1.10 Geometric characterisation of‌ structural and regular equivalences‌‌ in undirected (hyper)graphs

Participant: Nina Otter.

In‌ collaboration with Marzieh Eidi‌ (MPI MiS).

Similarity notions between vertices in a graph, such as structural and regular equivalence, are among the main ingredients of clustering tools in complex network science. In this article 33, we generalise structural and regular equivalences to undirected hypergraphs and provide a characterisation of structural and regular equivalences of undirected graphs and hypergraphs through neighbourhood graphs and Ollivier-Ricci curvature. Our characterisation sheds new light on these similarity notions, opening a new avenue for their exploration. These characterisations also enable the construction of a possibly wide family of regular partitions, thereby offering a new route to a task that has so far been computationally challenging.
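To make the notion of structural equivalence concrete, here is a minimal Python sketch of the classical definition for simple undirected graphs: two vertices are structurally equivalent when they have the same neighbours, ignoring the two vertices themselves. It does not reproduce the hypergraph generalisation or the curvature-based characterisation of the article. The adjacency-dictionary encoding and the names `structurally_equivalent` and `structural_classes` are assumptions made for this example.

```python
from typing import Dict, Hashable, List, Set

Graph = Dict[Hashable, Set[Hashable]]


def structurally_equivalent(adj: Graph, u: Hashable, v: Hashable) -> bool:
    """True if u and v have the same neighbours, ignoring u and v themselves."""
    return adj[u] - {v} == adj[v] - {u}


def structural_classes(adj: Graph) -> List[Set[Hashable]]:
    """Partition the vertices of a simple undirected graph into structural classes."""
    classes: List[Set[Hashable]] = []
    for u in adj:
        for cls in classes:
            representative = next(iter(cls))
            if structurally_equivalent(adj, u, representative):
                cls.add(u)
                break
        else:
            # No existing class matches: u starts a new class.
            classes.append({u})
    return classes


# Star graph: centre 0 joined to the leaves 1, 2 and 3.
star: Graph = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
print(structural_classes(star))
```

In the star graph the three leaves share the single neighbour 0, so they fall into one class, while the centre forms a class of its own.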

8.2 Statistical‌ aspects of topological and‌ geometric data analysis

8.2.1‌‌ Gromov-Wasserstein Bound between Reeb and Mapper Graphs

Participant:‌ Mathieu Carrière.

In‌ collaboration with Ziyad Oulhaj‌‌ and Bertrand Michel (École Centrale de Nantes, France)‌

Since its introduction as‌ a computable approximation of‌‌ the Reeb graph, the Mapper graph has become‌ one of the most‌ popular tools from topological‌‌ data analysis for performing data visualization and inference.‌ However, finding an appropriate‌ metric (that is, a‌‌ tractable metric with theoretical guarantees) for comparing Reeb‌ and Mapper graphs, in‌ order to, e.g., quantify‌‌ the rate of convergence of the Mapper graph‌ to the Reeb graph,‌ is a difficult problem.‌‌ While several metrics have been proposed in the‌ literature, none is able‌ to incorporate measure information,‌‌ when data points are sampled according to an‌ underlying probability measure. The‌ resulting Reeb and Mapper‌‌ graphs are therefore purely deterministic and combinatorial, and‌ substantial effort is thus‌ required to ensure their‌‌ statistical validity. In this‌ article, we handle this issue by treating Reeb‌ and Mapper graphs as metric measure spaces. This‌ allows us to use Gromov-Wasserstein metrics to compare‌ these graphs directly in order to better incorporate‌ the probability measures that data points are sampled‌ from. Then, we describe the geometry that arises‌ from this perspective, and we derive rates of‌ convergence of the Mapper graph to the Reeb‌ graph in this context. Finally, we showcase the‌ usefulness of such metrics for Reeb and Mapper‌ graphs in a few numerical experiments.

8.3 Topological‌ and geometric approaches for machine learning

8.3.1 A‌ Knowledge Graph and Topological Data Analysis Framework to‌ Disentangle the Tomato-Multi Pathogens Complex Gene Regulatory Network‌

Participant: Mathieu Carrière.

In collaboration with Maxime‌ Multari, Xavier Amorós-Gabarrón, Alexina Damy, Stéphanie Jaubert, Silvia‌ Bottini (INRAE, France), Sebastian Lobentanzer, Julio Saez-Rodriguez and‌ Aurélien Dugourd (Heidelberg University, Germany)

The global population is rapidly increasing, representing a major challenge for food supply, exacerbated by climate change and environmental degradation. Despite the pivotal role of agriculture, plant health and survival are threatened by various biotic stressors. Although how plants respond to each of these individual stresses is well studied, little is known about how they respond to a combination of many of these bio-aggressors occurring together. To tackle this question, we first built TomTom, a knowledge graph gathering molecular interactions from nine publicly available databases, including transcription factor and microRNA targets, protein-protein interactions, and functional terms. Then, we selected transcriptomics data of tomato subjected to six distinct pathogens and performed an integrative analysis. We found 5561 candidate genes involved in the multi-stress response of tomato. To study how the response is orchestrated, we mapped those genes in TomTom and extracted a comprehensive gene regulatory network (GRN) composed of 71 transcription factors (TF) and 1618 target genes. By estimating the TF activity, we identified 43 TFs responding specifically to either one or multiple bio-aggressors. GRN analyses with a topological data analysis approach allowed us to identify 18 clusters of TFs with similar properties, yielding four main configurations localized in specific regions of the GRN. Finally, we found four ERF hubs which cooperatively coordinate the tomato response to multiple pathogens. Our findings make it possible to study the complex molecular reprogramming in tomato upon interaction with different biotic agents, providing tools scalable to other questions involving tomato molecular interactions and beyond.

8.3.2 Enhancer Dynamics and Spatial Organization‌ Drive Anatomically Restricted Cellular States in the Human‌ Spinal Cord

Participant: Mathieu Carrière.

In collaboration‌ with Elena K. Kandror, Alexis Peterson, Andreas Tjärnberg,‌ Yuchen Xu, Abbas H. Rizvi (University of Wisconsin,‌ USA), Anqi Wang, Jun Hou Fung, William Pangburn,‌ Raul Rabadan, Tom Maniatis (Columbia University, USA), Jackson‌ Loper (University of Michigan, USA), Will Liao (NY‌ Genome Center, USA), Krishnaa T. Mahbubani and Kourosh‌ Saeb-Parsy (University of Cambridge, UK)

Here, we report‌ the spatial organization of RNA transcription and associated‌ enhancer dynamics in the human spinal cord at single-cell and single-molecule resolution.‌ We expand traditional multiomic‌ measurements to reveal epigenetically‌‌ poised and bivalent active transcriptional enhancer states that‌ define cell type specification.‌ Simultaneous detection of chromatin‌‌ accessibility and histone modifications in spinal cord nuclei‌ reveals previously unobserved cell-type‌ specific cryptic enhancer activity,‌‌ in which transcriptional activation is uncoupled from chromatin‌ accessibility. Such cryptic enhancers‌ define both stable cell‌‌ type identity and transitions between cells undergoing differentiation.‌ We also define glial‌ cell gene regulatory networks‌‌ that reorganize along the rostrocaudal axis, revealing anatomical‌ differences in gene regulation.‌ Finally, we identify the‌‌ spatial organization of cells into distinct cellular organizations‌ and address the functional‌ significance of this observation‌‌ in the context of paracrine signaling. We conclude‌ that cellular diversity is‌ best captured through the‌‌ lens of enhancer state and intercellular interactions that‌ drive transitions in cellular‌ state. This study provides‌‌ fundamental insights into the cellular organization of the‌ healthy human spinal cord.‌

8.3.3 Fermat Distance-to-Measure: a‌‌ robust Fermat-like metric

Participants: Frédéric Chazal, Jérôme Taupin.

Given a probability measure with density, Fermat distances and density-driven metrics are conformal transformations of the Euclidean metric that shrink distances in high-density areas and enlarge distances in low-density areas. Although they have been widely studied and have been shown to be useful in various machine learning tasks, they are limited to measures with density (with respect to the Lebesgue measure, or the volume form on a manifold). In 45, by replacing the density with the Distance-to-Measure, we introduce a new metric, the Fermat Distance-to-Measure, defined for any probability measure in $ℝ^{d}$. We derive strong stability properties for the Fermat Distance-to-Measure with respect to the measure and propose an estimator from random sampling of the measure, featuring an explicit bound on its convergence speed.
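As background for the construction above, the following Python sketch computes the standard empirical Distance-to-Measure of a point to a finite sample: the root mean squared distance to its `k` nearest sample points, which corresponds to a mass parameter of `k / n` for a sample of size `n`. This is only the classical building block, not the Fermat Distance-to-Measure metric of the article, whose definition is not reproduced here. The function name `empirical_dtm` and the toy point cloud are assumptions made for this example.

```python
import math
from typing import Sequence

Point = Sequence[float]


def empirical_dtm(x: Point, sample: Sequence[Point], k: int) -> float:
    """Distance-to-Measure from x to the empirical measure on `sample`.

    Root mean squared distance from x to its k nearest sample points,
    i.e. mass parameter m = k / len(sample).
    """
    if not 1 <= k <= len(sample):
        raise ValueError("k must satisfy 1 <= k <= len(sample)")
    squared_distances = sorted(
        sum((a - b) ** 2 for a, b in zip(x, p)) for p in sample
    )
    return math.sqrt(sum(squared_distances[:k]) / k)


# Three clustered points and one outlier.
cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
print(empirical_dtm((0.0, 0.0), cloud, k=2))    # small: inside the cluster
print(empirical_dtm((10.0, 10.0), cloud, k=2))  # large: at the outlier
```

Unlike a density estimate, this quantity is defined for any sample, including degenerate or noisy ones, and it takes large values near isolated points.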

8.4‌ Miscellaneous

8.4.1 Curvature-Guided Optimal‌ Transport for Rigid Point‌‌ Cloud Registration

Participant: Mathijs Wintraecken.

In collaboration‌ with Roberto M Dyke‌ (TITANE), Marie-Aurélie Chanut (Cerema‌‌ - Centre d'Etudes et d'Expertise sur les Risques,‌ l'Environnement, la Mobilité et‌ l'Aménagement), Pierre Alliez (TITANE)‌‌

The rigid registration of pairs of point sets is a fundamental step for many downstream tasks including shape analysis, reconstruction and localization. There has been a growing interest in the use of Optimal Transport (OT) for point cloud registration problems. However, these techniques face limited adoption due to scalability issues, which render them impractical, and to their sensitivity to the missing data commonly encountered in real-world scans. We consider how geometric information may be incorporated into an OT registration framework for improved accuracy and scalability. In this work, we guide mini-batch selection by binning shape features based on local curvature estimates. We demonstrate that our method achieves better results than other OT-based methods and is comparable to the state of the art in terms of successful registrations.

8.4.2‌ Supervised Contamination Detection, with‌ Flow Cytometry Application

Participants: Gilles Blanchard, Frédéric Chazal, Solenne Gaucher.

The contamination detection problem aims to determine whether a set of observations has been contaminated, i.e. whether it contains points drawn from a distribution different from the reference distribution. In 15, we consider a supervised problem, where labeled samples drawn from both the reference distribution and the contamination distribution are available at training time. This problem is motivated by the detection of rare cells in flow cytometry. Compared to novelty detection problems or two-sample testing, where only samples from the reference distribution are available, the challenge lies in efficiently leveraging the observations from the contamination distribution to design more powerful tests. In this article, we introduce a test for the supervised contamination detection problem. We provide non-asymptotic guarantees on its Type I error, and characterize its detection rate. The test relies on estimating reference and contamination densities using histograms, and its power depends strongly on the choice of the corresponding partition. We present an algorithm for judiciously choosing the partition that results in a powerful test. Simulations illustrate the good empirical performance of our partition selection algorithm and the efficiency of our test. Finally, we showcase our method by applying it to a real flow cytometry dataset.

8.4.3 Transductive Conformal Inference‌ for Full Ranking

Participant: Gilles Blanchard.

In‌ collaboration with J-B. Fermanian (U. Montpellier, IMAG, and‌ Inria team IROKO), and P. Humbert (CNRS and‌ Sorbonne Université)

In 22, we introduce a method based on Conformal Prediction (CP) to quantify the uncertainty of full ranking algorithms. We focus on a specific scenario where $n + m$ items are to be ranked by some “black box” algorithm. It is assumed that the relative (ground truth) ranking of $n$ of them is known. The objective is then to quantify the error made by the algorithm on the ranks of the $m$ new items among the total $(n + m)$. In such a setting, the true ranks of the $n$ original items in the total $(n + m)$ depend on the (unknown) true ranks of the $m$ new ones. Consequently, we have no direct access to a calibration set to apply a classical CP method. To address this challenge, we propose to construct distribution-free bounds on the unknown conformity scores using recent results on the distribution of conformal p-values. Using these upper bounds on the scores, we provide valid prediction sets for the rank of any item. We also control the false coverage proportion, a crucial quantity when dealing with multiple prediction sets. Finally, we empirically show on both synthetic and real data the efficiency of our CP method for state-of-the-art algorithms such as RankNet or LambdaMart.
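For readers unfamiliar with the classical CP method mentioned above, the following Python sketch shows the usual split conformal p-value and the resulting prediction set when a calibration set of conformity scores is available. It is precisely this ingredient that is missing in the full ranking setting, so the sketch illustrates the baseline rather than the method of the article. The function names, the convention that larger scores mean less conforming, and the toy scores are assumptions made for this example.

```python
from typing import Dict, Hashable, List, Sequence


def conformal_p_value(calibration_scores: Sequence[float], test_score: float) -> float:
    """Classical conformal p-value; larger scores mean less conforming."""
    n = len(calibration_scores)
    at_least_as_large = sum(1 for s in calibration_scores if s >= test_score)
    return (1 + at_least_as_large) / (n + 1)


def prediction_set(
    calibration_scores: Sequence[float],
    candidate_scores: Dict[Hashable, float],
    alpha: float,
) -> List[Hashable]:
    """Keep every candidate whose conformal p-value exceeds alpha."""
    return [
        candidate
        for candidate, score in candidate_scores.items()
        if conformal_p_value(calibration_scores, score) > alpha
    ]


# Hypothetical scores: three candidate ranks for one new item.
calibration = [0.1, 0.2, 0.3, 0.4]
candidates = {1: 0.05, 2: 0.25, 3: 0.9}
print(prediction_set(calibration, candidates, alpha=0.25))
```

Under exchangeability of calibration and test scores, such a set contains the true value with probability at least `1 - alpha`.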

8.4.4 Supervised aggregation‌ of anomaly score functions for active anomaly detection‌

Participant: Martin Royer.

In collaboration with Kevin‌ Bleakley (Inria Celest), Mouhcine Mendil (IRT Saint Exupéry),‌ Benjamin Auder (Laboratoire de Mathématiques d'Orsay).

Detecting rare anomalies in batches of multidimensional data is challenging. In 11, we propose a supervised active-learning framework that sends a small number of data points from each batch to an expert for labeling as 'anomaly' or 'nominal', via two mechanisms: (i) points most likely to be an anomaly in the eyes of a supervised classifier trained on previously labeled data; and (ii) points suggested by an active learner. However, instead of training the supervised classifier directly on the current set of labeled raw data points, we treat the scores calculated by an ensemble of M unsupervised anomaly detectors on each data point as if they were the learner's input features. This approach generalizes earlier attempts to linearly aggregate unsupervised anomaly detector scores, and broadens the scope of such methods to ordered data like time series. Results suggest that this method usually outperforms linear strategies, often significantly. The Python library acanag provides an implementation of the proposed method.
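The central idea, using detector scores rather than raw coordinates as input features, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions and does not use or reproduce the acanag library: the two detectors are hypothetical, the labeled points are invented, and a k-nearest-neighbour vote is swapped in as a simple non-linear supervised learner, since the choice of classifier is not specified here.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]
Detector = Callable[[Point], float]


def score_features(x: Point, detectors: Sequence[Detector]) -> List[float]:
    """Map a raw data point to the vector of its M anomaly scores."""
    return [detector(x) for detector in detectors]


def knn_predict(
    train: Sequence[Tuple[List[float], int]], query: List[float], k: int = 3
) -> int:
    """Majority vote among the k labeled score vectors closest to `query`."""
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = sum(label for _, label in nearest)
    return int(2 * votes > len(nearest))


# Two hypothetical unsupervised detectors: distance to the origin
# and largest absolute coordinate.
detectors: List[Detector] = [
    lambda x: math.hypot(*x),
    lambda x: max(abs(c) for c in x),
]

# Invented expert labels: 1 = anomaly, 0 = nominal.
labeled = [
    ((0.1, 0.0), 0), ((0.0, 0.2), 0), ((-0.1, 0.1), 0),
    ((3.0, 3.0), 1), ((-4.0, 0.5), 1), ((0.2, 5.0), 1),
]
train = [(score_features(x, detectors), y) for x, y in labeled]

print(knn_predict(train, score_features((0.05, -0.1), detectors)))
print(knn_predict(train, score_features((4.0, -3.0), detectors)))
```

Because the learner only sees score vectors, the same pipeline applies to any data type for which unsupervised detectors exist.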

8.4.5 Curvature‌‌ penalization of strongly anisotropic interfaces models and their‌ phase-field approximation

Participant: Blanche‌ Buet.

In collaboration‌‌ with Jean-François Babadjian (LMO, Université Paris-Saclay) and Michael‌ Goldman (CMAP, Ecole Polytechnique).‌

The article 25 studies the effect of anisotropy on sharp or diffuse interface models. When the surface tension is a convex function of the normal to the interface, the anisotropy is said to be weak. This usually ensures the lower semicontinuity of the associated energy. If, however, the surface tension depends on the normal in a nonconvex way, this so-called strong anisotropy may lead to instabilities related to the lack of lower semicontinuity of the functional. We investigate the regularizing effects of adding a higher-order term of Willmore type to the energy. We consider two types of problems. The first one is an anisotropic nonconvex generalization of the perimeter, and the second one is an anisotropic nonconvex Mumford-Shah functional. In both cases, lower semicontinuity properties of the energies with respect to a natural mode of convergence are established, as well as Γ-convergence type results by means of a phase-field approximation. In comparison with related results for curvature-dependent energies, one of the original aspects of our work is that, in the context of free discontinuity problems, we are able to consider singular structures such as crack-tips or multiple junctions.

8.4.6 Approximate mean‌ curvature flows of a‌‌ general varifold, and their limit spacetime Brakke flow‌

Participant: Blanche Buet.‌

In collaboration with Gian‌‌ Paolo Leonardi (University of Trento), Simon Masnou (Université‌ Lyon 1) and Abdelmouksit‌ Sagueni.

In 28,‌‌ we propose a construction of mean curvature flows‌ by approximation for very‌ general initial data, in‌‌ the spirit of the works of Brakke and‌ of Kim & Tonegawa‌ based on the theory‌‌ of varifolds. Given a general varifold, we construct‌ by iterated push-forwards an‌ approximate time-discrete mean curvature‌‌ flow depending on both a given time step‌ and an approximation parameter.‌ We show that, as‌‌ the time step tends to 0, this time-discrete‌ flow converges to a‌ unique limit flow, which‌‌ we call the approximate‌ mean curvature flow. An interesting feature of our‌ approach is its generality, as it provides an‌ approximate notion of mean curvature flow for very‌ general structures of any dimension and codimension, ranging‌ from continuous surfaces to discrete point clouds. We‌ prove that our approximate mean curvature flow satisfies‌ several properties: stability, uniqueness, Brakke-type equality, mass decay.‌ By coupling this approximate flow with the canonical‌ time measure, we prove convergence, as the approximation‌ parameter tends to 0, to a spacetime limit‌ measure whose generalized mean curvature is bounded. Under‌ an additional rectifiability assumption, we further prove that‌ this limit measure is a spacetime Brakke flow.‌

8.4.7 Quantitative homotopy theory (Théorie de l'homotopie quantitative)

Participant: Pierre Pansu‌.

The goal of homotopy theory, in topology, is to simplify continuous maps between topological spaces by continuous deformation. The obstructions to doing so are homotopy invariants. This raises quantitative questions:

- Is computing the invariants possible (decidable)? If so, at what cost?
- Is it possible to construct representatives of low complexity with prescribed invariant values? If so, at what cost?
- What is the complexity of the required deformations?

The answers, often recent, are highly diverse. Moreover, many questions remain open, showing that topology has not said its last word, even in low dimensions.

9 Bilateral contracts and‌ grants with industry

9.1 Bilateral contracts with industry‌

Participants: David Cohen-Steiner.

Collaboration with Dassault Systèmes and Inria team Geomerix (Saclay) on the applications of methods from geometric measure theory to the modelling and processing of complex 3D shapes (PhD of Lucas Brifault, started in May 2022).

Participants: Frédéric Chazal, Myriam Frikha.

Research collaboration with Ericsson on transfer learning for temporal data with applications in telecommunications (PhD of Myriam Frikha, started in November 2024).

Participants: Frédéric Chazal, Martin Royer.

Collaboration with Thales on TDA-based anomaly detection for satellite telemetry data (started in December 2025).

Participants: Frédéric Chazal, Mathieu Carrière.

Research collaboration with Thales on topological approaches for the analysis and certification of AI-based critical systems, through the Master's internship of Louise Méric, which will continue as a CIFRE PhD at the very beginning of 2026.

10 Partnerships and‌ cooperations

10.1 International initiatives

10.1.1 Associate Teams in‌ the framework of an Inria International Lab or‌ in the framework of an Inria International Program‌

Equipe Associée TopTime

Participants: Nina Otter.

Title: Topological and statistical methods for time series data
Partner Institution(s): Australian National University (ANU), Australia
Date/Duration: 2024-2026
Additional information: Katharine Turner (ANU) is the co-PI of the EA.

10.1.2 Participation in other‌ International Programs

KTH Royal Institute of Technology Seed‌ Funding: Strengthening French – Swedish AI Collaboration

Participants:‌ Frédéric Chazal, Mathieu Carrière.

Title: Geometry-informed AI in wireless communications
Funding Institution(s): KTH Stockholm, Sweden
Date/Duration: 2025-2026
Additional information: Joint project between the SCI school, department of mathematics (PI: Martina Scolamiero), the DataShape team and Ericsson (industrial PI: Francesco Davide Calabrese).

SALTO exchange‌ program between MPG and‌ CNRS

Participants: Nina Otter‌‌.

Title: Higher-order interactions at the crossroads of geometry and topology
Partner Institution(s): Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
Date/Duration: 2024-2026
Additional information: SALTO exchange programme between the Max Planck Gesellschaft and the CNRS. Marzieh Eidi (MPI MiS) is the co-PI.

10.2 International research‌‌ visitors

10.2.1 Visits of international scientists

Wolfgang Polonik (UC Davis), September-October 2025 (1 month).
Marzieh Eidi (MPI MiS), April-May 2025 (2 months).

10.3 National‌ initiatives

Extended visit

Participants:‌ Corentin Lunel, Clément‌‌ Maria.

- Duration : 2024-2025

- Coordinator‌ : Clément Maria

- Location : Institut Montpelliérain Alexander Grothendieck (IMAG) - Université de Montpellier

The visit brings together mathematicians from IMAG working on low-dimensional and quantum topology and computer scientists from DataShape, to work at the interface of the two fields.

10.3.1 ANR

ANR‌ Chair in AI

Participants:‌ Frédéric Chazal, Marc‌‌ Glisse.

- Acronym : TopAI

- Type‌ : ANR Chair in‌ AI.

- Title :‌‌ Topological Data Analysis for Machine Learning and AI‌

- Coordinator : Frédéric‌ Chazal

- Duration :‌‌ 2020-2026.

- Other partners: Two industrial partners, the French SME Sysnav and the French start-up MetaFora.

- Abstract:

The TopAI project aims at developing a world-leading research activity on topological and geometric approaches in Machine Learning (ML) and AI, with a twofold academic and industrial/societal objective. First, building on the strong expertise of the candidate and his team in TDA, TopAI aims at designing new mathematically well-founded topological and geometric methods and tools for Data Analysis and ML, and at making them available to the data science and AI community through state-of-the-art software tools. Second, thanks to already established close collaborations and the strong involvement of French industrial partners, TopAI aims at exploiting its expertise and tools to address a set of challenging problems with high societal and economic impact in personalized medicine and AI-assisted medical diagnosis.

ANR ALGOKNOT

Participants:‌ Clément Maria.

-‌‌ Acronym : ALGOKNOT.

- Type : ANR Jeune‌ Chercheuse Jeune Chercheur.

-‌ Title : Algorithmic and‌‌ Combinatorial Aspects of Knot Theory.

- Coordinator :‌ Clément Maria.

- Duration‌ : 2020 – 2026‌‌

- Abstract: The project AlgoKnot aims at strengthening‌ our understanding of the‌ computational and combinatorial complexity‌‌ of the diverse facets of knot theory, as‌ well as designing efficient‌ algorithms and software to‌‌ study their interconnections.

- See also: Clément Maria‌ and ANR AlgoKnot.‌

ANR GeMfaceT

Participants: Blanche‌‌ Buet.

- Acronym: GeMfaceT.

- Type: ANR JCJC – CES 40 – Mathématiques

- Title: A‌‌ bridge between Geometric Measure and Discrete Surface Theories‌

- Coordinator: Blanche Buet.‌

- Duration: 2021–2026

- Abstract: This project is positioned at the interface between geometric measure and discrete surface theories. There has recently been a growing interest in non-smooth structures, both from a theoretical point of view, where singularities occur in famous optimization problems such as the Plateau problem or geometric flows such as mean curvature flow, and from an applied point of view, where complex high-dimensional data are no longer assumed to lie on a smooth manifold but are more singular and allow crossings, tree structures and dimension variations. We propose in this project to strengthen and expand the use of geometric measure concepts in discrete surface study and complex data modelling, and also to use those possibly singular discrete surfaces to compute numerical solutions to the aforementioned problems.

ANR StratMesh

Participants: Jean-Daniel‌ Boissonnat, Mathijs Wintraecken.

- Acronym: StratMesh.‌

- Type: ANR PRC

- Title: A bridge‌ between Geometric Measure and Discrete Surface Theories

- Coordinator: Mathijs Wintraecken (local), Guillaume Moroz (Gamble, Centre Inria de l'Université de Lorraine).

- Duration:‌ 2025–2029

- Abstract: StratMesh aims to develop provably-correct‌ triangulation algorithms for stratified spaces. Our focus is‌ on stratified spaces that are the projection of‌ smooth manifolds, which arise in many applications such‌ as robotics, control theory, and medial axis computation‌ for learning from geometric data.

ANR TopModel

Participants:‌ Mathieu Carrière.

- Acronym: TopModel.

- Type:‌ ANR JCJC

- Title: TopModel

- Coordinator: Mathieu‌ Carrière

- Duration: 2024–2027

- Abstract: The central tenet of this project is the use of multiparameter topological data analysis for machine learning models, for both regularizing and monitoring these models, and for the automatic generation of new features and descriptors to feed these models with. On the theoretical front, substantial effort will be devoted to the development, implementation and generalization of standard topological data analysis techniques, which (for the most part) can only study the topological variations of at most one parameter (such as the data scale), so as to make them suitable for the study of the topological variations of several parameters jointly (such as density and scale, marker genes). Then, the focus will be on applying these new multiparameter topological data analysis methods for machine learning models to specific applications for which topological data analysis is known to be relevant and efficient. More precisely, we will emphasize the usefulness of our new tools on data sets from cosmology (large-scale structures of the Universe) and biology (single-cell sequencing, mass cytometry).

PEPR SN‌

Participants: Mathieu Carrière.

- Acronym: AI4scMED.

-‌ Type: Work package in PEPR SN

- Title:‌ Multiscale AI for single-cell based precision medicine

-‌ Coordinator: Mathieu Carrière

- Duration: 2023–2027

- Abstract:‌ Cell-based precision medicine holds revolutionary potential for healthcare,‌ but realizing its full potential demands a deep‌ understanding of disease variability and multiscale aspects. Single-cell‌ (sc) multi-omics offers a unique way to obtain‌ molecular profiles of individual cells and predict disease‌ trajectories. To harness this complexity, new AI breakthroughs‌ are needed. Our consortium will tackle methodological challenges‌ to bridge the gap between sc data and personalized treatments, resolving cell‌ type differences and integrating‌ sc-multi-omics with imaging for‌‌ spatial insights.

Addressing the complexity of the human‌ body and combining genomics‌ with other assays, we‌‌ will develop AI-based methods to handle, integrate, analyze,‌ and visualize multiscale complexity‌ in diseases. Our developments‌‌ will leverage cutting-edge AI for sc-genomic data analysis.‌ To infer causal mechanisms‌ at different levels, we‌‌ will use causal/logical/stochastic modeling to integrate heterogeneous data‌ and account for temporal‌ scales and biophysical priors.‌‌

We will create network inference methods to understand‌ molecular mechanisms in clinical‌ samples, identifying key genes‌‌ and predicting therapeutic impacts. Precision medicine must also‌ integrate variability across different‌ cell decision levels. We‌‌ aim to build predictive models, digital twins, to‌ enable data-driven personalized treatments‌ by connecting intracellular dynamics,‌‌ biochemical processes, cell populations, and tissue-level organization.

10.3.2‌ Collaboration with other national‌ research institutes

Confiance.ai /‌‌ IRT SystemX

Participants: Frédéric Chazal.

Research collaboration‌ on anomaly detection for‌ multivariate time series using‌‌ TDA and ML approaches.

11 Dissemination

11.1 Promoting‌ scientific activities

11.1.1 Scientific‌ events: organisation

Clément Maria was a co-organiser of the QuantAzur Days, Nice, November 2025.
Nina Otter was a co-organiser of the conference “Topological methods for time-varying data: theory and applications (TopTime)”, at the Australian National University, Canberra, Australia, November 2025.
Nina Otter was a co-organiser of the workshop “Higher-order interactions at the crossroads of geometry and topology”, Laboratoire de Mathématiques d'Orsay, December 2025.

11.1.2 Scientific events: selection‌

Member of the conference‌‌ program committees

Clément Maria was a member of the program committee of the 43rd International Symposium on Theoretical Aspects of Computer Science (STACS) 2026.
Nina Otter was a member of the program committee of the 11th ATMCS conference (2025).
Nina Otter was a member of the program committee of the Applied Category Theory (ACT) 2025 Conference.

11.1.3 Journal

Member‌ of the editorial boards‌

Frédéric Chazal is a‌‌ member of the following journal editorial boards: Discrete‌ and Computational Geometry (Springer),‌ Journal of Applied and‌‌ Computational Topology (Springer).
Frédéric Chazal is the Editor-in-Chief‌ of the Journal of‌ Applied and Computational Topology‌‌ (Springer).

11.1.4 Leadership within the scientific community

Frédéric‌ Chazal is a member‌ of the Scientific Advisory‌‌ Board of the Centre for Topological Data Analysis‌ of the Mathematical Institute‌ at Oxford.
Frédéric Chazal is a member of the Scientific Council of EMAp (Escola de Matemática Aplicada da Fundação Getulio Vargas), Rio de Janeiro, Brazil.
Mathieu Carrière is‌ a chair holder of‌ the 3IA Institute at‌‌ Université Côte d'Azur.

11.1.5 Scientific expertise

Frédéric Chazal‌ is a member of‌ the “commission prospective de‌‌ l’I2M” (Institut de Mathématiques de Marseille).
Clément Maria‌ was a jury member‌ for the UCA-DS4H PhD‌‌ grant allocation scheme for 2025.
Nina Otter is a member of the executive committee of the DataIA institute.

11.1.6 Research administration

Marc Glisse is president‌ of the CDT at‌ Inria Saclay.
Frédéric Chazal is co-responsible for the “programme Mathématiques et IA” of the Fondation Mathématique Jacques Hadamard, Paris-Saclay University (until Oct. 2025).
Frédéric‌ Chazal is a member of the council of‌ the Graduate School in Mathematics, Paris-Saclay Univ.
Clément Maria is co-responsible for the CNRS-Groupe de Travail GeoAlgo.
Clément Maria is a member of the‌ steering committee of the QuantAzur federative institute.

11.2‌ Teaching - Supervision - Juries - Educational and‌ pedagogical outreach

11.2.1 Teaching

Master: Mathijs Wintraecken, Introduction to Scientific Research, 2h eq-TD, DS4H minor (Master and PhD).
Master: Frédéric Chazal, Analyse Topologique des‌ Données, 30h eq-TD, Université Paris-Saclay, France.
Master: Frédéric‌ Chazal and Julien Tierny, Topological Data Analysis, 38h‌ eq-TD, M2, Mathématiques, Vision, Apprentissage (MVA), ENS Paris-Saclay,‌ France.
Master: Mathieu Carrière, Basic Algebra for Data‌ Analysis, 18h eq-TD, MSc DSAI, Université Côte d'Azur‌
Master: Mathieu Carrière and Frédéric Cazals, Geometric and Topological Methods in Data Analysis, with Applications in Biology and Medicine, 15h eq-TD, MSc DSAI, Université Côte d'Azur
Master: Mathieu Carrière, Statistical Learning‌ Theory, 15h eq-TD, MSc DSAI, Université Côte d'Azur‌
PSL doctoral course: Eddie Aamari, Frédéric Chazal, Alejandro‌ Saldarriaga, 1 week, Topological Data Analysis.
Mini-course at Young Topologists Meeting 2025: Frédéric Chazal, Persistent homology for machine learning: a measure perspective, 6h, KTH Stockholm.
Master: Marc Glisse, Conception et‌ analyse d'algorithmes, 44h eq-TD, M1, École Polytechnique, France.‌

11.2.2 Supervision

PhD in progress: Rohit Roy, Triangulating stratified spaces. Started in Nov. 2025. Mathijs Wintraecken and Pierre Alliez (TITANE).
PhD in progress: Myriam‌ Frikha, Domain adaptation for temporal data. Started in‌ Nov. 2024. Frédéric Chazal.
PhD in progress: Jérôme‌ Taupin, Density-based metric learning and applications in Topological‌ Data Analysis. Started in Sept. 2025. Frédéric Chazal.‌
PhD in progress: Anna Hollands, Persistent path-homology for‌ directed-graph analysis: Statistical aspects and applications to machine‌ learning. Started in Oct. 2025. Frédéric Chazal and‌ Bertrand Michel.
PhD in progress: Alejandro Saldarriaga, Topological‌ Deep Learning. Started in Nov. 2025. Eddie Aamari‌ (ENS Paris) and Frédéric Chazal.
PhD in progress:‌ Henrique Ennes, Topological approach to quantum complexity. Started‌ in Oct. 2023. Clément Maria and Nicolas Nisse‌ (Inria).
PhD in progress: António Leitao, Persistent homology of cover refinements and applications to XAI. Started in Nov. 2024. Nina Otter and Fosca Giannotti (Scuola Normale Superiore di Pisa).

11.2.3 Juries

Marc Glisse‌ was the external reviewer for the PhD defense‌ of Dominic Desjardins Côté, Université de Sherbrooke, Canada.‌
Blanche Buet was a member of the PhD defense jury of Rémi Mougenot (12/2025), Université de Lorraine.
Mathieu Carrière was a member of the PhD defense juries of Mohamed Kissi (10/2025), Université Paris-Sorbonne, and Rayna Andreeva (05/2025), University of Edinburgh.
Nina Otter was a member of the PhD defense jury of Andreas Abildtrup Hansen (09/2025), The Technical University of Denmark.

11.2.4 Productions (articles, videos, podcasts, serious‌ games, ...)

Clément Maria: researcher portrait, Street Science exhibition in Nice.
Clément Maria: article “Mapping the algorithmic complexity of topological quantum computing” in INSIGHTS, the magazine of the Université Côte d’Azur IdEx.

11.2.5 Participation in Live events

Blanche Buet participated in Fête de la Science (at IMO, 10/2025) and in RJMI (at ENS Paris-Saclay, 11/2025). She gave a popularization seminar to L3 students (at IMO, 12/2025) and is part of the organizing committee of the FMJH Welcome Days for master's students (at IMO, 09/2025).

11.2.6 Other science outreach relevant activities

Frédéric Chazal gave a general audience‌ introductory presentation on Artificial‌ Intelligence at Université pour‌‌ Tous de Bourgogne (March 2025).
Frédéric Chazal participated in round tables and gave talks on different aspects of AI at the Académie du Renseignement.

12 Scientific production

12.1‌ Major publications

[1] Gilles Blanchard, Aniket Anand Deshmukh, Urun Dogan, Gyemin Lee and Clayton Scott. Domain Generalization by Marginal Transfer Learning. Journal of Machine Learning Research 22(2), 2021, 1-55.
[2] Jean-Daniel Boissonnat and Mathijs Wintraecken. The Topological Correctness of PL Approximations of Isomanifolds. Foundations of Computational Mathematics 22, July 2021, 967-1012.
[3] Blanche Buet and Martin Rumpf. Mean curvature motion of point cloud varifolds. ESAIM: Mathematical Modelling and Numerical Analysis 56(5), 2022, 1773-1808.
[4] Mathieu Carriere and Andrew J Blumberg. Multiparameter Persistence Images for Topological Machine Learning. NeurIPS 2020 - 33rd Conference on Neural Information Processing Systems, Vancouver / Virtual, Canada, December 2020.
[5] Mathieu Carriere, Frédéric Chazal, Marc Glisse, Yuichi Ike and Hariprasad Kannan. Optimizing persistent homology based functions. ICML 2021 - 38th International Conference on Machine Learning, PMLR 139, Virtual conference, United States, July 2021, 1294-1303.
[6] David Cohen-Steiner, André Lieutier and Julien Vuillamy. Lexicographic Optimal Homologous Chains and Applications to Point Cloud Triangulations. Discrete and Computational Geometry 68, September 2022.
[7] Rémi Gribonval, Gilles Blanchard, Nicolas Keriven and Yann Traonmilin. Compressive Statistical Learning with Random Feature Moments. Mathematical Statistics and Learning 3(2), August 2021, 113-164.
[8] Clément Maria and Jonathan Spreer. A Polynomial-Time Algorithm to Compute Turaev–Viro Invariants $TV_{4,q}$ of 3-Manifolds with Bounded First Betti Number. Foundations of Computational Mathematics 20(5), November 2019, 1013-1034.
[9] Miguel O'Malley, Sara Kalisnik and Nina Otter. Alpha magnitude. Journal of Pure and Applied Algebra 227(11), November 2023, 107396.

12.2 Publications‌ of the year

International‌‌ journals

[10] Louis Abraham, Charles Arnal and Antoine Marie. Prompt selection matters: enhancing text annotations for social sciences with large language models. Journal of Computational Social Science 8(73), July 2025.
[11] Kevin Bleakley, Mouhcine Mendil, Martin Royer and Benjamin Auder. Supervised aggregation of anomaly score functions for active anomaly detection. Transactions on Machine Learning Research Journal, 2026. In press.
[12] Erin Wolf Chambers, Christopher Fillmore, Elizabeth Stephenson and Mathijs Wintraecken. Burning or Collapsing the Medial Axis is Unstable. La Matematica, August 2025.
[13] Erin Chambers, Tao Ju, David Letscher, Hannah Schreiber and Dan Zeng. VHS: A package for homological simplification of voxelized plant root data for skeletonization. Computational Geometry 130, May 2025, 102198.
[14] Hana Dal Poz Kouřimská, André Lieutier and Mathijs Wintraecken. The medial axis of any closed bounded set is locally Lipschitz stable with respect to the Hausdorff distance under ambient diffeomorphisms. Journal of Applied and Computational Topology 9(3), July 2025, 20.
[15] Solenne Gaucher, Gilles Blanchard and Frédéric Chazal. Supervised Contamination Detection, with Flow Cytometry Application. Biometrika 112(4), August 2025.
[16] David Loiseaux, Mathieu Carrière and Andrew Blumberg. Multi-parameter Module Approximation: an efficient and interpretable invariant for multi-parameter persistence modules with guarantees. Journal of Applied and Computational Topology 9(4), October 2025, 26.
[17] Wojciech Reise, Bertrand Michel and Frédéric Chazal. Topological signatures of periodic-like signals. Bernoulli 31(3), 2025, 1955-1990.

International peer-reviewed conferences

[18] Lucas Brifault, David Cohen-Steiner and Mathieu Desbrun. Efficient and Scalable Spatial Regularization of Optimal Transport. SA Conference Papers '25: SIGGRAPH Asia 2025 Conference Papers, Hong Kong, China, ACM, December 2025, 1-10.
[19] Francesco Conti, Gianmarco Lazzini, Raffaele Gaeta, Luca Emanuele Pollina, Annalisa Comandatore, Niccolò Furbetta, Luca Morelli, Mario D’acunto, Davide Moroni and Maria Antonietta Pascali. Topological Machine Learning for Raman Spectroscopy: Perspectives for Pancreatic Diseases. AITA 2025 - 18th International Workshop on Advanced Infrared Technology and Applications, Kobe, Japan, September 2025.
[20] Francesco Conti, Davide Moroni and Maria Antonietta Pascali. Topological Machine Learning for Discriminative Spectral Band Identification in Raman Spectroscopy of Pathological Samples. AITA 2025 - 18th International Workshop on Advanced Infrared Technology and Applications, Kobe, Japan, September 2025.
[21] Henrique Ennes and Clément Maria. Hardness of computation of quantum invariants on 3-manifolds with restricted topology. ESA 2025 - European Symposium on Algorithms (part of ALGO 2025), Leibniz International Proceedings in Informatics (LIPIcs) 351, Warsaw, Poland, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2025.
[22] Jean-Baptiste Fermanian, Pierre Humbert and Gilles Blanchard. Transductive Conformal Inference for Full Ranking. NeurIPS 2025, The Thirty-Ninth Annual Conference on Neural Information Processing Systems, San Diego (CA), United States, December 2025.
[23] Corentin Lunel, Arnaud de Mesmay and Jonathan Spreer. Hard Diagrams of Split Links. 41st International Symposium on Computational Geometry (SoCG 2025), Leibniz International Proceedings in Informatics (LIPIcs), Kanazawa, Japan, Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2025.

Reports & preprints

[24] Rayna Andreeva, Haydeé Contreras-Peruyero, Sanjukta Krishnagopal, Nina Otter, Maria Antonietta Pascali and Elizabeth Thompson. Fractal dimensions of complex networks: advocating for a topological approach. 2025.
[25] Jean-François Babadjian, Blanche Buet and Michael Goldman. Curvature penalization of strongly anisotropic interfaces models and their phase-field approximation. October 2025.
[26] Gilles Blanchard, Nicolas Curien, Klara Krause and Alexander G. Reisach. A phase transition in Barak-Erdős random graphs. December 2025.
[27] Gilles Blanchard, Jean-Baptiste Fermanian and Hannah Marienwald. Estimation of multiple mean vectors in high dimension. March 2025.
[28] Blanche Buet, Gian Paolo Leonardi, Simon Masnou and Abdelmouksit Sagueni. Approximate mean curvature flows of a general varifold, and their limit spacetime Brakke flow. October 2025.
[29] Jeremie Capitao-Miniconi, Élisabeth Gassiat and Luc Lehéricy. Deconvolution of repeated measurements corrupted by unknown noise. July 2025.
[30] Jeremie Capitao-Miniconi, Élisabeth Gassiat and Luc Lehéricy. Support and distribution inference from noisy data. February 2025.
[31] Erin Wolf Chambers, Christopher Fillmore, Elizabeth Stephenson and Mathijs Wintraecken. Braiding Vineyards. November 2025.
[32] Ondřej Draganov, Herbert Edelsbrunner, Sophie Rosenmeier and Morteza Saghafian. Expected Length of the Euclidean Minimum Spanning Tree and 1-norms of Chromatic Persistence Diagrams in the Plane. November 2025.
[33] Marzieh Eidi and Nina Otter. Geometric characterisation of structural and regular equivalences in undirected (hyper)graphs. 2025.
[34] Henrique Ennes and Clément Maria. Compressed data structures for Heegaard splittings. July 2025.
[35] Olympio Hacquard, Gilles Blanchard and Clément Levrard. Statistical learning on measures: an application to persistence diagrams. May 2025.
[36] Kristóf Huszár and Clément Maria. On Sparse Representations of 3-Manifolds. December 2025.
[37] Elena K Kandror, Anqi Wang, Mathieu Carriere, Alexis Peterson, Will Liao, Andreas Tjarnberg, Jun Hou Fung, Krishnaa T Mahbubani, Jackson Loper, William Pangburn, Yuchen Xu, Kourosh Saeb-Parsy, Raul Rabadan, Tom Maniatis and Abbas H Rizvi. Enhancer Dynamics and Spatial Organization Drive Anatomically Restricted Cellular States in the Human Spinal Cord. January 2025.
[38] André Lieutier and Mathijs Wintraecken. Geodesics of length less than πR in a set of reach R are unique and continuous with respect to the end points. November 2025.
[39] Corentin Lunel and Clément Maria. Well-quasi-orders on embedded planar graphs. December 2025.
[40] Clément Maria and Hoel Queffelec. A fast algorithm for the Hecke representation of the braid group, and applications to the computation of the HOMFLY-PT polynomial and the search for interesting braids. December 2025.
[41] Lorenzo Mario Amorosa, Francesco Conti, Nicola Quercioli, Flavio Zabini, Tayebeh Lotfi Mahyari, Yiqun Ge and Patrizio Frosini. Reconstruction of SINR Maps from Sparse Measurements using Group Equivariant Non-Expansive Operators. October 2025.
[42] Maxime Multari, Mathieu Carriere, Xavier Amoros-Gabarron, Alexina Damy, Sebastien Lobentanzer, Julio Saez-Rodriguez, Stephanie Jaubert, Aurelien Dugourd and Silvia Bottini. A knowledge graph and topological data analysis framework to disentangle the tomato-multi pathogens complex gene regulatory network. April 2025.
[43] Ziyad Oulhaj, Mathieu Carrière and Bertrand Michel. Gromov-Wasserstein Bound between Reeb and Mapper Graphs. 2025.
[44] Pierre Pansu. Théorie de l'homotopie quantitative: d'après Guth, Manin, Weinberger, ... January 2025.
[45] Jérôme Taupin and Frédéric Chazal. Fermat Distance-to-Measure: a robust Fermat-like metric. April 2025.

Other scientific publications‌

[46] Roberto M Dyke, Mathijs Wintraecken, Marie-Aurélie Chanut and Pierre Alliez. Curvature-Guided Optimal Transport for Rigid Point Cloud Registration. SGP 2025 - Symposium on Geometry Processing, Bilbao, Spain, July 2025.
