
2025 Activity report: Project-Team DATASHAPE

RNSR: 201622050C

Creation of​‌ the Project-Team: 2020 October​​ 01

Each year, Inria​​​‌ research teams publish an​ Activity Report presenting their​‌ work and results over​​ the reporting period. These​​​‌ reports follow a common​ structure, with some optional​‌ sections depending on the​​ specific team. They typically​​​‌ begin by outlining the​ overall objectives and research​‌ programme, including the main​​ research themes, goals, and​​​‌ methodological approaches. They also​ describe the application domains​‌ targeted by the team,​​ highlighting the scientific or​​​‌ societal contexts in which​ their work is situated.​‌

The reports then present​​ the highlights of the​​​‌ year, covering major scientific​ achievements, software developments, or​‌ teaching contributions. When relevant,​​ they include sections on​​​‌ software, platforms, and open​ data, detailing the tools​‌ developed and how they​​ are shared. A substantial​​​‌ part is dedicated to​ new results, where scientific​‌ contributions are described in​​ detail, often with subsections​​​‌ specifying participants and associated​ keywords.

Finally, the Activity​‌ Report addresses funding, contracts,​​ partnerships, and collaborations at​​​‌ various levels, from industrial​ agreements to international cooperations.​‌ It also covers dissemination​​ and teaching activities, such​​​‌ as participation in scientific​ events, outreach, and supervision.​‌ The document concludes with​​ a presentation of scientific​​​‌ production, including major publications​ and those produced during​‌ the year.

Keywords

Computer​​ Science and Digital Science​​​‌

  • A3. Data and knowledge​
  • A3.4. Machine learning and​‌ statistics
  • A7.1. Algorithms
  • A8.​​ Mathematics of computing
  • A8.1.​​​‌ Discrete mathematics, combinatorics
  • A8.3.​ Geometry, Topology
  • A9. Artificial​‌ intelligence

Other Research Topics​​ and Application Domains

  • B1.​​​‌ Life sciences
  • B2. Digital​ health
  • B5. Industry of​‌ the future
  • B9. Society​​ and Knowledge
  • B9.5. Sciences​​​‌

1 Team members, visitors,​ external collaborators

Research Scientists​‌

  • Frederic Chazal [Team​​ leader, INRIA,​​​‌ Senior Researcher, HDR​]
  • Jean-Daniel Boissonnat [​‌INRIA, Emeritus,​​ HDR]
  • Mathieu Carrière​​​‌ [INRIA, Researcher​]
  • David Cohen-Steiner [​‌INRIA, Researcher]​​
  • Marc Glisse [INRIA​​​‌, Researcher]
  • Clément​ Maria [INRIA,​‌ Researcher]
  • Nina Lisann​​ Otter [INRIA,​​​‌ ISFP]
  • Mathijs Wintraecken​ [INRIA, ISFP​‌]

Faculty Members

  • Gilles​​ Blanchard [UNIV PARIS​​​‌ SACLAY, Associate Professor​]
  • Charly Boricaud [​‌UNIV PARIS SACLAY,​​ Professor, until Sep​​​‌ 2025]
  • Blanche Buet​ [UNIV PARIS SACLAY​‌, Associate Professor]​​
  • Remi Leclercq [UNIV​​ PARIS SACLAY, Professor​​​‌]
  • Pierre Pansu [‌UNIV PARIS SACLAY,‌​‌ Emeritus]

Post-Doctoral Fellows​​

  • Daniele Cannarsa [INRIA​​​‌, Post-Doctoral Fellow,‌ until Aug 2025]‌​‌
  • Francesco Conti [INRIA​​, Post-Doctoral Fellow]​​​‌
  • Ondrej Draganov [INRIA‌, Post-Doctoral Fellow,‌​‌ from Apr 2025]​​
  • Corentin Lunel [INRIA​​​‌, Post-Doctoral Fellow,‌ until Sep 2025]‌​‌
  • Renata Turkes [INRIA​​, Post-Doctoral Fellow,​​​‌ until Aug 2025]‌

PhD Students

  • Myriam Frikha‌​‌ [ERICSSON]
  • Hugo​​ Henneuse [UNIV PARIS​​​‌ SACLAY, until Oct‌ 2025]
  • Anna Hollands‌​‌ [UNIV PARIS SACLAY​​, from Oct 2025​​​‌]
  • Antonio Lage De‌ Sousa Leitao [Scuola‌​‌ Normale Superiore di Pisa,​​ Italy, from Nov 2024​​​‌]
  • Henrique Lovisi Ennes‌ [UNIV COTE D'AZUR‌​‌, from May 2025​​]
  • Rohit Roy [​​​‌INRIA, from Nov‌ 2025]
  • Alejandro Saldarriaga‌​‌ [DMA-ENS, from​​ Oct 2025]
  • Jérôme​​​‌ Taupin [Université Paris-Saclay‌, from Sep 2025‌​‌]

Technical Staff

  • Vincent​​ Rouvreau [INRIA,​​​‌ Engineer]
  • Hannah Schreiber‌ [INRIA, Engineer‌​‌]

Interns and Apprentices​​

  • Ludo Andrianirina Mamisoa [​​​‌UNIV COTE D'AZUR,‌ Intern, from Mar‌​‌ 2025 until Aug 2025​​]
  • Nestor Antunano Cabrera​​​‌ [INRIA, Intern‌, from Apr 2025‌​‌ until Aug 2025]​​
  • Madhav Cherupilil Sajeev [​​​‌UNIV COTE D'AZUR,‌ Intern, from Apr‌​‌ 2025 until Aug 2025​​]
  • Alberto Conforti [​​​‌INRIA, Intern,‌ from Mar 2025 until‌​‌ Aug 2025]
  • Beatriz​​ Evelbauer [ENSTA,​​​‌ from May 2025 until‌ Aug 2025]
  • Anna‌​‌ Hollands [INRIA,​​ Intern, from Apr​​​‌ 2025 until Sep 2025‌]
  • Ilian Riveiro [Université Paris-Saclay, Intern, from Mar 2025 until Aug 2025]
  • Aurora‌​‌ Rivet [UNIV COTE​​ D'AZUR, Intern,​​​‌ from Jun 2025 until‌ Aug 2025]
  • Jérôme Taupin [ENS Paris, until Aug 2025]

Administrative Assistants

  • Sophie‌ Honnorat [INRIA]‌​‌
  • Laetitia Jubely [INRIA​​, from May 2025​​​‌]

Visiting Scientists

  • Marzieh‌ Eidi [MPI MiS,‌​‌ Germany, from Apr​​ 2025 until May 2025​​​‌]
  • Yuri Gardinazzi [‌UNIV TRIESTE, from‌​‌ Nov 2025]
  • Clément​​ Levrard [UNIV PARIS​​​‌, from May 2025‌ until May 2025]‌​‌
  • Javier Perera Lago [​​Univ Séville, from​​​‌ May 2025 until Jun‌ 2025]

External Collaborators‌​‌

  • Bertrand Michel [CENTRALE​​ NANTES]
  • Martin Royer​​​‌ [SYSTEMX]

2‌ Overall objectives

During the last two decades, building on solid theoretical and algorithmic bases, geometric inference and computational topology have experienced important developments towards data analysis. New mathematically well-founded theories gave birth to the field of Topological Data Analysis (tda), which is now attracting interest from both academia and industry. Although geometric approaches to data analysis can be traced quite far back, tda really started as a field with the pioneering works of H. Edelsbrunner et al. and G. Carlsson et al. on persistent homology at the beginning of the century. tda is mainly motivated by the idea that topology and geometry provide a powerful approach to infer robust qualitative, and sometimes quantitative, information about the structure of data. It aims at providing mathematical results and methods to infer, analyze and exploit complex data (point clouds, graphs, images, 3D shapes, time series...). It also intends to give access to robust and efficient data structures and algorithms that represent these data and are amenable to precise analysis.

The overall objective of​‌ DataShape is three-fold:

  1. to settle the mathematical, statistical and algorithmic foundations of tda and, more generally, to contribute to the development of topological and geometric approaches in Machine Learning and AI;
  2. to develop a new family of well-founded and efficient data structures, algorithms and methods to uncover and exploit the geometry of data, through the development of state-of-the-art and easy-to-use open source software;
  3. to disseminate and promote tda research and outcomes among the data science community through collaborations with other domains of science and with industry.

The approach of DataShape relies on the conviction that, to reach these objectives, combining statistical, topological/geometric and computational approaches in a common framework is mandatory. For that purpose, DataShape became a joint team with the Laboratoire de Mathématiques d'Orsay in 2020 and now gathers a wide variety of expertise, ranging from fundamental mathematics to software development and industrial applications. The team also considers that tda needs to be combined with other data science approaches and tools, in particular statistical learning, to lead to successful real applications. Significant efforts have been made during the evaluation period to develop several long-term industrial research collaborations in data science and AI.

The research program​​​‌ of DataShape is organized​ around four strongly correlated​‌ axes reflecting our will​​ to address tda challenges​​​‌ in a global and​ unified framework.

The first​‌ axis focuses on the​​ algorithmic aspects of tda​​​‌ and geometric inference as​ well as the mathematical​‌ foundations of the fields.​​ Fundamental problems are the​​​‌ construction, processing and analysis​ of discrete representations of​‌ complex and possibly high​​ dimensional shapes.

The second axis is dedicated to the statistical aspects of tda. It studies, from a statistical perspective, the properties of topological information inferred from data, and intends to propose new models and approaches for the development of tda in well-founded probabilistic and statistical settings. This axis also includes the analysis and development of general-purpose statistical learning approaches and tools that are currently active in the community and of relevance for DataShape's scientific goals.

The third axis​​​‌ is driven by the​ problems raised by the​‌ use of topological and​​ geometric approaches in machine​​​‌ learning. It aims​ at better understanding the​‌ role of topological and​​ geometric structures in machine​​​‌ learning problems and at​ applying tda tools to​‌ develop specialized topological approaches​​ to be used in​​ combination with other machine​​​‌ learning methods.

The fourth axis is dedicated to software development and experimental research, mainly through the GUDHI platform. GUDHI is intended to provide a high quality, state-of-the-art implementation of data structures and algorithms dedicated to tda through easy-to-use open source software.

Each DataShape member is involved in several research axes, ensuring strong connections and interactions between them. Last, although the above four axes concentrate the main research activities of the team, DataShape always remains open and encourages its members to explore new directions and approaches related to geometric and topological methods in data analysis and machine learning. The past experience of the team has shown that such a strategy is often very fruitful and may lead to innovative research directions.

3 Research program​​​‌

3.1 Algorithmic aspects and‌ new mathematical directions for‌​‌ topological and geometric data​​ analysis

tda requires constructing and manipulating appropriate representations of complex and high dimensional shapes. A major difficulty comes from the fact that the complexity of data structures and algorithms used to approximate shapes rapidly grows as the dimensionality increases, which makes them intractable in high dimensions. We focus our research on simplicial complexes, which offer a convenient representation of general shapes and generalize graphs and triangulations. Our work includes the study of simplicial complexes with good approximation properties and the design of compact data structures to represent them.
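As a rough illustration of what a compact representation of a simplicial complex can look like, the following Python sketch stores simplices in a trie keyed by sorted vertex labels, in the spirit of a simplex tree. The class `SimplexTrie` and its methods are hypothetical names chosen for this example; they are not the gudhi API, and no filtration values are stored.

```python
from itertools import combinations


class SimplexTrie:
    """Toy trie of simplices (hypothetical helper, not the gudhi API)."""

    def __init__(self):
        self.root = {}

    def insert(self, simplex):
        """Insert a simplex together with all of its nonempty faces."""
        vertices = sorted(set(simplex))
        for k in range(1, len(vertices) + 1):
            for face in combinations(vertices, k):
                node = self.root
                for v in face:
                    node = node.setdefault(v, {})

    def __contains__(self, simplex):
        node = self.root
        for v in sorted(set(simplex)):
            if v not in node:
                return False
            node = node[v]
        return True

    def simplices(self):
        """Yield every stored simplex as a sorted tuple (no particular order)."""
        stack = [((), self.root)]
        while stack:
            prefix, node = stack.pop()
            for v, child in node.items():
                yield prefix + (v,)
                stack.append((prefix + (v,), child))

    def euler_characteristic(self):
        """Alternating sum of the number of simplices in each dimension."""
        return sum((-1) ** (len(s) - 1) for s in self.simplices())


# Boundary of a tetrahedron: a triangulated 2-sphere.
sphere = SimplexTrie()
for triangle in combinations(range(4), 3):
    sphere.insert(triangle)
print(sphere.euler_characteristic())  # 4 - 6 + 4 = 2
```

Because every face is inserted along with the simplex, each trie node corresponds to exactly one simplex, which is what makes membership queries a simple walk from the root.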

In low dimensions, effective shape reconstruction techniques exist that can provide precise geometric approximations very efficiently and under reasonable sampling conditions. Extending those techniques to higher dimensions, as is required in the context of tda, is problematic since almost all methods in low dimensions rely on the computation of a subdivision of the ambient space. A direct extension of those methods would immediately lead to algorithms whose complexities depend exponentially on the ambient dimension, which is prohibitive in most applications. A first direction to bypass the curse of dimensionality is to develop algorithms whose complexities depend on the intrinsic dimension of the data (which most of the time is small although unknown) rather than on the dimension of the ambient space. Another direction is to resort to cruder approximations that only capture the homotopy type or the homology of the sampled shape. The theory of persistent homology provides a powerful and robust tool to study the homology of sampled spaces in a stable way.
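To make the idea of persistent homology on a sampled space concrete, here is a minimal Python sketch restricted to degree 0 (connected components). It processes the edges of the Rips filtration by increasing length and records the scale at which two components merge, using a union-find structure. The function name `h0_persistence` and the convention that an edge enters the filtration at its full length are assumptions made for this example; it is not the algorithm implemented in gudhi.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Finite death times of connected components in a Rips filtration.

    Every component is born at scale 0; one component never dies and is
    therefore not reported.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge: one of them dies at this scale.
            parent[ri] = rj
            deaths.append(length)
    return deaths


# Two well-separated pairs of points on a line.
cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
print(h0_persistence(cloud))  # [1.0, 1.0, 9.0]
```

The long-lived component (death at 9.0) reflects the two clusters, while the short bars reflect sampling noise inside each cluster.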

3.2 Statistical​​ aspects of topological and​​​‌ geometric data analysis

The growing variety and size of available data - often corrupted by noise and outliers - requires considering the statistical properties of their topological and geometric features and proposing new relevant statistical models for their study.

There exist various statistical and machine learning methods intending to uncover the geometric structure of data. Manifold learning and dimensionality reduction approaches generally do not allow one to assess the relevance of the inferred topological and geometric features, and they are not well-suited for the analysis of complex topological structures. Beyond these approaches, set estimation methods intend to estimate, from random samples, a set around which the data is concentrated. In these methods, which include support and manifold estimation, principal curves/manifolds and their various generalizations, to name a few, the estimation problems are usually considered under losses, such as the Hausdorff distance or the symmetric difference, that are not sensitive to the topology of the estimated sets, which prevents these tools from directly inferring topological or geometric information.
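The insensitivity of the Hausdorff distance to topology can be seen on a small example. The Python sketch below compares a sampled circle with the same sample from which a short arc has been removed: the first set stands in for a closed curve, the second for an open arc, yet their Hausdorff distance is small. The finite samples and the helper names are assumptions made for this illustration.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point of a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Unit circle sampled every degree, and the same sample with a 10-degree gap.
circle = [(math.cos(math.radians(k)), math.sin(math.radians(k))) for k in range(360)]
open_arc = circle[10:]

gap = hausdorff(circle, open_arc)
print(gap < 0.1)  # True
```

A loss based on this distance would therefore rate the open arc as a good estimate of the circle, even though the two shapes have different homology.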

Regarding purely topological features, the statistical estimation of the homology or homotopy type of compact subsets of Euclidean spaces has only been considered recently, most of the time under the quite restrictive assumption that the data are randomly sampled from smooth manifolds.

In a more​ general setting, with the​‌ emergence of new geometric​​ inference tools based on​​​‌ the study of distance​ functions and algebraic topology​‌ tools such as persistent​​ homology, computational topology has​​​‌ recently seen an important​ development offering a new​‌ set of methods to​​ infer relevant topological and​​​‌ geometric features of data​ sampled in general metric​‌ spaces. The use of​​ these tools remains widely​​​‌ heuristic and until recently​ there were only a​‌ few preliminary results establishing​​ connections between geometric inference,​​​‌ persistent homology and statistics.​ However, this direction has​‌ attracted a lot of​​ attention over the last​​​‌ three years. In particular,​ stability properties and new​‌ representations of persistent homology​​ information have led to​​​‌ very promising results to​ which the DataShape members​‌ have significantly contributed. These​​ preliminary results open many​​​‌ perspectives and research directions​ that need to be​‌ explored.

Our goal is​​ to build on our​​​‌ first statistical results in​ tda to develop the​‌ mathematical foundations of Statistical​​ Topological and Geometric Data​​​‌ Analysis. Combined with the​ other objectives, our ultimate​‌ goal is to provide​​ a well-founded and effective​​​‌ statistical toolbox for the​ understanding of topology and​‌ geometry of data.

3.3​​ Topological and geometric approaches​​​‌ for machine learning

This​ objective is driven by​‌ the problems raised by​​ the use of topological​​​‌ and geometric approaches in​ machine learning. The goal​‌ is both to use​​ our techniques to better​​​‌ understand the role of​ topological and geometric structures​‌ in machine learning problems​​ and to apply our​​​‌ tda tools to develop​ specialized topological approaches to​‌ be used in combination​​ with other machine learning​​​‌ methods.

3.4 Experimental research​ and software development

We​‌ develop a high quality​​ open source software platform​​​‌ called gudhi which is​ becoming a reference in​‌ geometric and topological data​​ analysis in high dimensions.​​​‌ The goal is not​ to provide code tailored​‌ to the numerous potential​​ applications but rather to​​​‌ provide the central data​ structures and algorithms that​‌ underlie applications in geometric​​ and topological data analysis.​​​‌

The development of the gudhi platform also serves to benchmark and optimize new algorithmic solutions resulting from our theoretical work. Such development necessitates a whole line of research on software architecture and interface design, heuristics and fine-tuning optimization, robustness and arithmetic issues, and visualization. We aim at providing a full programming environment following the same recipes that made the cgal library, the reference library in computational geometry, a success.

Some of the algorithms implemented on the platform will also be interfaced to other software platforms, such as the R software for statistical computing, and languages such as Python, in order to make them usable in combination with other data analysis and machine learning tools. A first attempt in this direction has been made with the creation of an R package called TDA, in collaboration with the group of Larry Wasserman at Carnegie Mellon University (Inria Associated team CATS), that already includes some functionalities of the gudhi library and implements some joint results between our team and the CMU team. A similar interface with the Python language is also considered a priority. To go even further towards helping users, we will provide utilities that perform the most common tasks without requiring any programming at all.

4​​ Application domains

Our work​​​‌ is mostly of a‌ fundamental mathematical and algorithmic‌​‌ nature but finds a​​ variety of applications in​​​‌ data analysis, e.g., in‌ material science, biology, sensor‌​‌ networks, 3D shape analysis​​ and processing, to name​​​‌ a few.

More specifically,‌ DataShape has developed and‌​‌ is still developing a​​ strong expertise on new​​​‌ TDA methods for Machine‌ Learning and Artificial Intelligence‌​‌ for complex data and​​ (complex) time-dependent data. This​​​‌ includes, for example:

  • domain‌ adaptation problems for time‌​‌ series (PhD of Myriam​​ Frikha with Ericsson),
  • robust climate modelling (collaboration with University of Oxford),
  • anomaly detection (with IRT SystemX and Confiance.AI program),
  • the​​​‌ statistical significance of biological‌ phenomena (cell cycle, stem‌​‌ cell differentiation, immune system​​ responses) that occur in​​​‌ large scale single-cell RNAseq‌ and spatial transcriptomics data‌​‌ sets (collaboration with Rizvi​​ Lab, University of Wisconsin),​​​‌
  • the analysis of gene‌ regulatory networks for plant-pathogen‌​‌ interactions (collaboration with INRAE):​​ internship of Ludo Andrianirina,​​​‌
  • the analysis of satellite imaging and cartography data sets (collaboration with Thales Alenia Space).

5 Social​​​‌ and environmental responsibility

5.1‌ Footprint of research activities‌​‌

The weekly research seminar of DataShape now takes place in hybrid mode. Team members have substantially reduced their travel in recent years in order to limit the environmental footprint of the team. Travelling by train rather than by plane is strongly encouraged whenever possible.

6‌​‌ Highlights of the year​​

6.1 Awards

  • Test of time award of the Symposium on Computational Geometry (SoCG) for: David Cohen-Steiner, Herbert Edelsbrunner, John Harer: Stability of persistence diagrams, SoCG 2005.
  • One of the Prix d'excellence de l'Université Côte d'Azur awarded to David Cohen-Steiner.
  • Best‌​‌ figure award (over the​​ last 5 years) at​​​‌ the Symposium on Computational‌ Geometry (SoCG) for: Tight‌​‌ Bounds for the Learning​​ of Homotopy à la​​​‌ Niyogi, Smale, and Weinberger‌ for Subsets of Euclidean‌​‌ Spaces and of Riemannian​​​‌ Manifolds. Dominique Attali, Hana​ Dal Poz Kouřimská, Christopher​‌ Fillmore, Ishika Ghosh, André​​ Lieutier, Elizabeth Stephenson, Mathijs​​​‌ Wintraecken, SoCG 2024.
  • Best​ paper award (honorable mention)​‌ at SIGGRAPH ASIA 2025​​ for: Efficient and scalable​​​‌ spatial regularization of optimal​ transport by Lucas Brifault,​‌ David Cohen-Steiner and Mathieu​​ Desbrun.

6.2 PhD defenses​​​‌

  • Hugo Henneuse, supervised by​ F. Chazal and P.​‌ Massart. June 13, 2025.​​
  • Charly Boricaud, supervised by​​​‌ B. Buet and S.​ Masnou. December 9, 2025.​‌

7 Latest software developments,​​ platforms, open data

In​​​‌ 2025 we developed new​ software, which we discuss​‌ below.

7.1 Latest software​​ developments

7.1.1 GUDHI

  • Name:​​​‌
    Geometric Understanding in Higher​ Dimensions
  • Keywords:
    Computational geometry,​‌ Topology, Clustering
  • Scientific Description:​​

    The Gudhi library is​​​‌ an open source library​ for Computational Topology and​‌ Topological Data Analysis (TDA).​​ It offers state-of-the-art algorithms​​​‌ to construct various types​ of simplicial complexes, data​‌ structures to represent them,​​ and algorithms to compute​​​‌ geometric approximations of shapes​ and persistent homology.

    The​‌ GUDHI library offers the​​ following interoperable modules:

    . Complexes:
        + Cubical
        + Simplicial: Rips, Witness, Alpha and Čech complexes
        + Cover: Nerve and Graph induced complexes
    . Data structures and basic operations:
        + Simplex tree, Skeleton blockers and Toplex map
        + Construction, update, filtration and simplification
    . Topological descriptors computation
    . Manifold reconstruction
    . Topological descriptors tools:
        + Bottleneck and Wasserstein distance
        + Statistical tools
        + Persistence diagram and barcode

  • Functional Description:​​​‌
    The GUDHI open source library provides the central data structures and algorithms that underlie applications in geometric understanding in higher dimensions. It is intended both to help the development of new algorithmic solutions inside and outside the project, and to facilitate the transfer of results to applied fields.
  • News of the​ Year:

    Below is a​‌ list of changes made​​ since GUDHI 3.10.1:

    - Delaunay complex
        . The Delaunay complex can be equipped with different filtrations:
            * Delaunay complex (no filtration values computed)
            * Delaunay-Čech complex (using minimal enclosing ball)
            * Alpha complex (moved to this new section)
        . The Delaunay-Čech and Alpha complexes can output squared or non-squared filtration values
        . An incremental version of the Delaunay complex (only in C++)

    - Rips complex persistence scikit-learn like interface
        . A binding to Ripser when it accelerates the computation

    - Persistence graphical tools
        . Can now handle the outputs of scikit-learn like interfaces as inputs

    - Simplex tree
        . Can now store additional data on each simplex (only in C++)
        . Can be const

  • URL:
  • Publication:​​​‌
  • Contact:
    Marc Glisse​
  • Participants:
    Marc Glisse, Hannah​‌ Schreiber, 17 anonymous participants​​
  • Partners:
    Université Côte d'Azur​​​‌ (UCA), Fujitsu

7.1.2 Multipers​

  • Name:
    Multiparameter Persistence for​‌ Machine Learning
  • Keywords:
    Topology,​​ Machine learning
  • Functional Description:​​​‌
    multipers is a Python library for Topological Data Analysis, focused on Multiparameter Persistence computation and visualization for Machine Learning. It features several efficient computational and visualization tools, with integrated, easy-to-use, auto-differentiable Machine Learning pipelines that can be seamlessly interfaced with scikit-learn and PyTorch. The library is meant to be usable by non-experts in Topological or Geometrical Machine Learning. Performance-critical functions are implemented in C++ or Cython, can be parallelized with TBB, and are exposed through Python bindings. It can handle a very diverse range of datasets that can be framed as a (finite) multi-filtered simplicial or cell complex, including point clouds, graphs, time series and images.
  • URL:
  • Publication:​​
  • Contact:
    David Loiseaux​​​‌
  • Participants:
    Hannah Schreiber, an‌ anonymous participant

8 New‌​‌ results

8.1 Algorithmic aspects​​ and new mathematical directions​​​‌ for topological and geometric‌ data analysis

8.1.1 Efficient‌​‌ and Scalable Spatial Regularization​​ of Optimal Transport

Participants:​​​‌ David Cohen-Steiner.

In‌ collaboration with Lucas Brifault‌​‌ (Dassault Systèmes) and Mathieu​​ Desbrun (GEOMERIX)

In this​​​‌ paper (18),‌ we introduce a novel‌​‌ approach to spatial regularization​​ of optimal transport problems.​​​‌ Based on the notion‌ of forward and backward‌​‌ "mean maps" of a​​ transport plan, we introduce​​​‌ a convex formulation of‌ optimal transport problems that‌​‌ incorporates regularization of these​​ mean maps to promote​​​‌ spatial continuity of the‌ resulting optimal plan. Unlike‌​‌ previous regularization approaches that​​ required the optimization of​​​‌ all the transport plan‌ coefficients, our formulation translates‌​‌ into an ADMM-based solver​​ combined with Sinkhorn type​​​‌ algorithms, which drastically reduces‌ the number of variables‌​‌ and scales up to​​ large problems. We demonstrate​​​‌ the usefulness and efficiency‌ of this new computational‌​‌ tool for various applications​​ and for different regularizations.​​​‌
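For readers unfamiliar with Sinkhorn-type algorithms, the following Python sketch shows the classical Sinkhorn iteration for entropy-regularized optimal transport between two discrete histograms. It is only the standard building block: the mean-map regularization and the ADMM solver described above are not reproduced, and the function name, the parameters `eps` and `n_iter`, and the toy data are assumptions made for this example.

```python
import math


def sinkhorn(a, b, cost, eps=0.1, n_iter=500):
    """Entropic optimal transport plan between histograms a and b.

    cost[i][j] is the cost of moving mass from bin i of a to bin j of b;
    eps is the strength of the entropic regularization.
    """
    n, m = len(a), len(b)
    # Gibbs kernel associated with the cost matrix.
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


a = [0.2, 0.8]
b = [0.5, 0.5]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(a, b, cost, eps=0.5)
for row in plan:
    print([round(x, 4) for x in row])
```

Only the two scaling vectors `u` and `v` are iterated, rather than all entries of the plan, which is the property that makes Sinkhorn-type schemes attractive for large problems.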

8.1.2 Burning or Collapsing‌ the Medial Axis is‌​‌ Unstable

Participants: Mathijs Wintraecken​​.

In collaboration with​​​‌ Erin Chambers (University of‌ Notre Dame, USA), Christopher‌​‌ Fillmore (Institute of Science​​ and Technology Austria), Elizabeth​​​‌ Stephenson (Orteliu, Oslo, Norway)‌

The medial axis of‌​‌ a set consists of​​ the points in the​​​‌ ambient space without a‌ unique closest point in‌​‌ the original set. Since​​ its introduction, the medial​​​‌ axis has been used‌ extensively in many applications‌​‌ as a method of​​ computing a skeleton topologically​​​‌ equivalent to the original‌ set. Unfortunately, one limiting‌​‌ factor in the use​​ of the medial axis​​​‌ of a smooth manifold‌ is that it is‌​‌ not necessarily topologically stable​​ under small perturbations of​​​‌ the manifold. To counter‌ these instabilities, various prunings‌​‌ of the medial axis​​ have been proposed in​​​‌ the computational geometry community.‌ In this paper (‌​‌12), we examine​​ one type of pruning,​​​‌ called burning. Because of‌ the good experimental results‌​‌ it was hoped that​​ the burning method of​​​‌ simplifying the medial axis‌ would be stable. In‌​‌ this work, we show​​ a simple example that​​​‌ dashes such hopes. Based‌ on Bing's house with‌​‌ two rooms, we demonstrate​​ an isotopy of a​​​‌ shape where the medial‌ axis goes from collapsible‌​‌ to non-collapsible. More precisely,​​ we consider the standard​​​‌ deformation retract from the‌ closed ball to Bing's‌​‌ house with two rooms,​​ but stop just short​​​‌ of the point where‌ Bing's house becomes two‌​‌ dimensional. This way we​​ obtain an isotopy from​​​‌ the 3-ball to a‌ thickened version of Bing's‌​‌ house. Under this isotopy,​​​‌ the medial axis goes​ from collapsible to non-collapsible.​‌ We stress that this​​ isotopy can be made​​​‌ generic, in the sense​ of singularity theory, as​‌ developed by Arnold and​​ Thom.

8.1.3 Sparsification of​​​‌ the generalized persistence diagrams​ for scalability through gradient​‌ descent

Participant: Mathieu Carrière​​.

In collaboration with​​​‌ Seunghyun Kim, Woojin Kim​ (KAIST, South Korea)

The​‌ generalized persistence diagram (GPD)​​ is a natural extension​​​‌ of the classical persistence​ barcode to the setting​‌ of multi-parameter persistence and​​ beyond. The GPD is​​​‌ defined as an integer-valued​ function whose domain is​‌ the set of intervals​​ in the indexing poset​​​‌ of a persistence module,​ and is known to​‌ be able to capture​​ richer topological information than​​​‌ its single-parameter counterpart. However,​ computing the GPD is​‌ computationally prohibitive due to​​ the sheer size of​​​‌ the interval set. Restricting​ the GPD to a​‌ subset of intervals provides​​ a way to manage​​​‌ this complexity, compromising discriminating​ power to some extent.​‌ However, identifying and computing​​ an effective restriction of​​​‌ the domain that minimizes​ the loss of discriminating​‌ power remains an open​​ challenge.

In this work,​​​‌ we introduce a novel​ method for optimizing the​‌ domain of the GPD​​ through gradient descent optimization.​​​‌ To achieve this, we​ introduce a loss function​‌ tailored to optimize the​​ selection of intervals, balancing​​​‌ computational efficiency and discriminative​ accuracy. The design of​‌ the loss function is​​ based on the known​​​‌ erosion stability property of​ the GPD. We showcase​‌ the efficiency of our​​ sparsification method for dataset​​​‌ classification in supervised machine​ learning. Experimental results demonstrate​‌ that our sparsification method​​ significantly reduces the time​​​‌ required for computing the​ GPDs associated to several​‌ datasets, while maintaining classification​​ accuracies comparable to those​​​‌ achieved using full GPDs.​ Our method thus opens​‌ the way for the​​ use of GPD-based methods​​​‌ to applications at an​ unprecedented scale.
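As background, the single-parameter analogue of this construction can be sketched in a few lines of Python: the multiplicity of each interval in a barcode is recovered from the rank function by inclusion-exclusion, and the GPD is usually presented as a Möbius inversion of the same kind over a much larger set of intervals. That last statement, the integer indexing, and the function names are assumptions of this sketch, which does not implement the GPD or the sparsification method.

```python
def rank_function(bars, b, d):
    """Number of bars [s, t] (closed, integer endpoints) containing [b, d]."""
    return sum(1 for (s, t) in bars if s <= b and d <= t)


def diagram_by_mobius_inversion(rank, lo, hi):
    """Recover interval multiplicities on the grid lo..hi from a rank function."""

    def rk(b, d):
        # Ranks involving indices outside the grid are zero by convention.
        if b < lo or d > hi:
            return 0
        return rank(b, d)

    diagram = {}
    for b in range(lo, hi + 1):
        for d in range(b, hi + 1):
            mult = rk(b, d) - rk(b - 1, d) - rk(b, d + 1) + rk(b - 1, d + 1)
            if mult:
                diagram[(b, d)] = mult
    return diagram


bars = [(0, 3), (1, 2), (1, 2), (2, 2)]
recovered = diagram_by_mobius_inversion(lambda b, d: rank_function(bars, b, d), 0, 3)
print(recovered)  # {(0, 3): 1, (1, 2): 2, (2, 2): 1}
```

Even in this toy setting the number of candidate intervals grows quadratically with the grid size; in the multi-parameter setting the set of intervals is far larger, which is the cost that restricting the domain aims to control.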

8.1.4 Multi-parameter​‌ Module Approximation: An Efficient​​ and Interpretable Invariant for​​​‌ Multi-Parameter Persistence Modules with​ Guarantees

Participant: Mathieu Carrière​‌.

In collaboration with​​ David Loiseaux (Inria Saclay)​​​‌ and Andrew J. Blumberg​ (Columbia University, USA)

Topological data analysis (TDA) is a rapidly growing area of data science, whose most common descriptor is persistent homology, which tracks the topological changes in growing families of subsets of the data set itself, called filtrations, and encodes them in an algebraic object, called a persistence module. The algorithmic and theoretical properties of persistence modules are now well understood in the single-parameter case, that is, when there is only one filtration (e.g., feature scale) to study. In contrast, much less is known in the multi-parameter case, where several filtrations (e.g., scale and density) are used simultaneously. Since multi-parameter persistence modules usually encode information that is invisible to their single-parameter counterparts, it is critical to build tractable proxies for them, ideally with some theoretical robustness guarantees. In this article, we introduce a new parameterized family of topological descriptors, taking the form of candidate decompositions, for multi-parameter persistence modules, and we identify a subfamily of these descriptors, which we call approximate decompositions, that are controllable approximations, in the sense that they preserve diagonal barcodes. Then, we introduce MMA (Multipersistence Module Approximation): an algorithm based on matching functions for computing instances of candidate decompositions with some precision parameter $\delta > 0$. By design, MMA can handle an arbitrary number of filtrations, and has bounded complexity and running time. Moreover, we prove the robustness of MMA: when computed with so-called compatible matching functions, we show that MMA produces approximate decompositions (and we prove that such matching functions exist for $n = 2$ filtrations).
Next, we restrict the focus to modules that can be decomposed into interval summands. In that case, compatible matching functions always exist, and we show that, for small enough $\delta$, the approximate decompositions obtained with such compatible matching functions by MMA have an approximation error (in terms of the standard interleaving and bottleneck distances) that is bounded by $\delta$, and that reaches zero for an even smaller, positive precision $\delta_{\mathrm{exact}}$. Finally, we present empirical evidence validating that MMA has state-of-the-art performance and running time on several data sets.
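The bottleneck distance used to state such error bounds can be illustrated, in the single-parameter case, by a brute-force Python sketch. It enumerates all matchings between two small persistence diagrams, allowing any point to be matched to the diagonal, and returns the smallest achievable maximal displacement in the sup norm. The function name is hypothetical and the factorial cost restricts it to very small diagrams; it is not the algorithm used in gudhi or in MMA.

```python
from itertools import permutations


def bottleneck_bruteforce(dgm1, dgm2):
    """Exact bottleneck distance between two small persistence diagrams.

    Each diagram is a list of (birth, death) pairs with finite coordinates.
    """
    # Pad each side with one diagonal slot per point of the other diagram,
    # so that both sides have the same size. None marks a diagonal slot.
    left = list(dgm1) + [None] * len(dgm2)
    right = list(dgm2) + [None] * len(dgm1)

    def cost(p, q):
        if p is None and q is None:
            return 0.0
        if q is None:
            return (p[1] - p[0]) / 2  # sup-norm distance from p to the diagonal
        if p is None:
            return (q[1] - q[0]) / 2
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

    best = float("inf")
    for perm in permutations(range(len(right))):
        worst = max(
            (cost(left[i], right[j]) for i, j in enumerate(perm)), default=0.0
        )
        best = min(best, worst)
    return best


print(bottleneck_bruteforce([(0, 2)], [(0, 3)]))  # 1.0
print(bottleneck_bruteforce([(0, 1)], []))        # 0.5
```

In the first example, matching the two points directly costs 1.0, which is cheaper than sending both to the diagonal (cost 1.5).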

8.1.5 A​​ fast algorithm for the​​​‌ Hecke representation of the‌ braid group, and applications‌​‌ to the computation of​​ the HOMFLY-PT polynomial and​​​‌ the search for interesting‌ braids

Participant: Clément Maria‌​‌.

In collaboration with​​ Hoel Queffelec (CNRS -​​​‌ Institut Montpelliérain Alexander Grothendieck).‌

Knot theory is an active field of mathematics, in which combinatorial and computational methods play an important role. One side of computational knot theory that has gained interest in recent years, both for complexity analysis and practical algorithms, is quantum topology and the computation of topological invariants arising from the theory. In this article 40, we leverage the rigidity brought by the representation-theoretic origins of the quantum invariants for algorithmic purposes. We do so by exploiting braids and the algebraic properties of the braid group to describe, analyze, and implement a fast algorithm to compute the Hecke representation of the braid group. We apply this construction to design a parameterized algorithm to compute the HOMFLY-PT polynomial of knots, and demonstrate its usefulness experimentally. Finally, we combine our fast Hecke representation algorithm with Garside theory to implement a reservoir sampling search and find non-trivial braids with trivial Hecke representations with coefficients in ℤ/pℤ. We find several such braids, in particular proving that the Hecke representation of B₅ with ℤ/2ℤ coefficients is non-faithful.

8.1.6 On Sparse Representations​​ of 3-Manifolds

Participant: Clément​​​‌ Maria.

In collaboration‌ with Kristóf Huszár (TU‌​‌ Graz).

3-manifolds are commonly​​ represented as triangulations, consisting​​​‌ of abstract tetrahedra whose‌ triangular faces are identified‌​‌ in pairs. The combinatorial​​ sparsity of a triangulation,​​​‌ as measured by the‌ treewidth of its dual‌​‌ graph, plays a fundamental​​ role in the design​​​‌ of parameterized algorithms. In‌ this work 36,‌​‌ we investigate algorithmic procedures​​​‌ that transform or modify​ a given triangulation while​‌ controlling specific sparsity parameters.​​ First, we describe a​​​‌ linear-time algorithm that converts​ a given triangulation into​‌ a Heegaard diagram of​​ the underlying 3-manifold, showing​​​‌ that the construction preserves​ treewidth. We apply this​‌ construction to exhibit a​​ fixed-parameter tractable framework for​​​‌ computing Kuperberg's quantum invariants​ of 3-manifolds. Second, we​‌ present a quasi-linear-time algorithm​​ that retriangulates a given​​​‌ triangulation into one with​ maximum edge valence of​‌ at most nine, while​​ only moderately increasing the​​​‌ treewidth of the dual​ graph. Combining these two​‌ algorithms yields a quasi-linear-time​​ algorithm that produces, from​​​‌ a given triangulation, a​ Heegaard diagram in which​‌ every attaching curve intersects​​ at most nine others.​​​‌

8.1.7 Compressed data structures​ for Heegaard splittings

Participants: Henrique Ennes, Clément Maria.

Heegaard splittings​​​‌ provide a natural representation​ of closed 3-manifolds by​‌ gluing two handlebodies along​​ a common surface. These​​​‌ splittings can be equivalently​ given by two finite​‌ sets of meridians lying​​ on the surface, which​​​‌ define a Heegaard diagram.​ In this work 34​‌, we present a​​ data structure to effectively​​​‌ represent Heegaard diagrams as​ normal curves with respect​‌ to triangulations of a​​ surface, where the complexity​​​‌ is measured by the​ space required to express​‌ the normal coordinates' vectors​​ in binary. This structure​​​‌ can be significantly more​ compact than triangulations of​‌ 3-manifolds, yielding exponential gains​​ for certain families. Even​​​‌ with this succinct definition​ of complexity, we establish​‌ polynomial-time algorithms for comparing​​ and manipulating diagrams, performing​​​‌ stabilizations, detecting trivial stabilizations​ and reductions, and computing​‌ topological invariants of the​​ underlying manifolds, such as​​​‌ their fundamental and homology​ groups. We also contrast​‌ early implementations of our​​ techniques with standard software​​​‌ programs for 3-manifolds, achieving​ faster algorithms for the​‌ average cases and exponential​​ gains in speed for​​​‌ some particular presentations of​ the inputs.

8.1.8 Hardness​‌ of computation of quantum​​ invariants on 3-manifolds with​​​‌ restricted topology

Participants: Henrique Ennes, Clément Maria.

Quantum invariants in low-dimensional topology offer a wide variety of valuable invariants of knots and 3-manifolds, presented by explicit formulas that are readily computable. Their computational complexity has been actively studied and is tightly connected to topological quantum computing. In this article 21, we prove that for any 3-manifold quantum invariant in the Reshetikhin-Turaev model, there is a deterministic polynomial-time algorithm that, given as input an arbitrary closed 3-manifold M, outputs a closed 3-manifold M' with the same quantum invariant, such that M' is hyperbolic, contains no low-genus embedded incompressible surface, and is presented by a strongly irreducible Heegaard diagram. Our construction relies on properties of Heegaard splittings and the Hempel distance. At the level of computational complexity, this proves that the hardness of computing a given quantum invariant of 3-manifolds is preserved even when severely restricting the topology and the combinatorics of the input. This positively answers a question raised by Samperton.

8.1.9 Well-quasi-orders on embedded​​ planar graphs

Participants: Clément Maria, Corentin Lunel.

The central theorem of topological graph theory states that the graph minor relation is a well-quasi-order on graphs. It has far-reaching consequences, in particular in the study of graph structures and the design of (parameterized) algorithms. In this article 39, we study two embedded versions of classical minor relations from structural graph theory and prove that they are also well-quasi-orders on general or restricted classes of embedded planar graphs. These embedded minor relations appear naturally for intrinsically embedded objects, such as knot diagrams and surfaces in ℝ³. Handling the extra topological constraints of the embeddings requires careful analysis and extensions of classical methods for the more constrained embedded minor relations. We prove that the embedded version of immersion induces a well-quasi-order on bounded carving-width plane graphs by exhibiting particularly well-structured tree-decompositions and leveraging a classical argument on well-quasi-orders on forests. We deduce that the embedded graph minor relation defines a well-quasi-order on plane graphs via their directed medial graphs, when their branch-width is bounded. We conclude that the embedded graph minor relation is a well-quasi-order on all plane graphs, using classical grid theorems in the unbounded branch-width case.

8.1.10 Geometric characterisation of‌ structural and regular equivalences‌​‌ in undirected (hyper)graphs

Participant:​​ Nina Otter.

In​​​‌ collaboration with Marzieh Eidi‌ (MPI MiS).

Similarity notions between vertices in a graph, such as structural and regular equivalence, are one of the main ingredients in clustering tools in complex network science. In this article 33, we generalise structural and regular equivalences to undirected hypergraphs and provide a characterisation of structural and regular equivalences of undirected graphs and hypergraphs through neighbourhood graphs and Ollivier-Ricci curvature. Our characterisation sheds new light on these similarity notions, opening a new avenue for their exploration. These characterisations also enable the construction of a possibly wide family of regular partitions, thereby offering a new route to a task that has so far been computationally challenging.

8.2 Statistical​​​‌ aspects of topological and‌ geometric data analysis

8.2.1‌​‌ Gromov-Wasserstein Bound between Reeb​​ and Mapper Graphs

Participant:​​​‌ Mathieu Carrière.

In‌ collaboration with Ziyad Oulhaj‌​‌ and Bertrand Michel (École​​ Centrale de Nantes, France)​​​‌

Since its introduction as‌ a computable approximation of‌​‌ the Reeb graph, the​​ Mapper graph has become​​​‌ one of the most‌ popular tools from topological‌​‌ data analysis for performing​​ data visualization and inference.​​​‌ However, finding an appropriate‌ metric (that is, a‌​‌ tractable metric with theoretical​​ guarantees) for comparing Reeb​​​‌ and Mapper graphs, in‌ order to, e.g., quantify‌​‌ the rate of convergence​​ of the Mapper graph​​​‌ to the Reeb graph,‌ is a difficult problem.‌​‌ While several metrics have​​ been proposed in the​​​‌ literature, none is able‌ to incorporate measure information,‌​‌ when data points are​​ sampled according to an​​​‌ underlying probability measure. The‌ resulting Reeb and Mapper‌​‌ graphs are therefore purely​​ deterministic and combinatorial, and​​​‌ substantial effort is thus‌ required to ensure their‌​‌ statistical validity. In this​​​‌ article, we handle this​ issue by treating Reeb​‌ and Mapper graphs as​​ metric measure spaces. This​​​‌ allows us to use​ Gromov-Wasserstein metrics to compare​‌ these graphs directly in​​ order to better incorporate​​​‌ the probability measures that​ data points are sampled​‌ from. Then, we describe​​ the geometry that arises​​​‌ from this perspective, and​ we derive rates of​‌ convergence of the Mapper​​ graph to the Reeb​​​‌ graph in this context.​ Finally, we showcase the​‌ usefulness of such metrics​​ for Reeb and Mapper​​​‌ graphs in a few​ numerical experiments.
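
For reference, one common form of the Gromov-Wasserstein distance of order p between two metric measure spaces (X, d_X, μ) and (Y, d_Y, ν) is recalled below. This is the standard textbook definition, stated here as an assumption; the article may use a different normalization or variant.

```latex
\[
  \mathrm{GW}_p\big((X,d_X,\mu),(Y,d_Y,\nu)\big)
  = \frac{1}{2}\,\inf_{\pi \in \Pi(\mu,\nu)}
    \left( \int_{X\times Y}\int_{X\times Y}
      \big| d_X(x,x') - d_Y(y,y') \big|^p
      \,\mathrm{d}\pi(x,y)\,\mathrm{d}\pi(x',y') \right)^{1/p}
\]
```

Here Π(μ, ν) denotes the set of couplings of μ and ν, that is, probability measures on X × Y with marginals μ and ν. The coupling is what lets the comparison account for the sampling measure rather than the combinatorial structure alone.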

8.3 Topological​‌ and geometric approaches for​​ machine learning

8.3.1 A​​​‌ Knowledge Graph and Topological​ Data Analysis Framework to​‌ Disentangle the Tomato-Multi Pathogens​​ Complex Gene Regulatory Network​​​‌

Participant: Mathieu Carrière.​

In collaboration with Maxime​‌ Multari, Xavier Amorós-Gabarrón, Alexina​​ Damy, Stéphanie Jaubert, Silvia​​​‌ Bottini (INRAE, France), Sebastian​ Lobentanzer, Julio Saez-Rodriguez and​‌ Aurélien Dugourd (Heidelberg University,​​ Germany)

The global population is rapidly increasing, representing a major challenge for food supply, exacerbated by climate change and environmental degradation. Despite the pivotal role of agriculture, plant health and survival are threatened by various biotic stressors. Although how plants respond to each of these individual stresses is well studied, little is known about how they respond to a combination of many of these bio-aggressors occurring together. To tackle this question, first, we built TomTom, a knowledge graph gathering molecular interactions from nine publicly available databases, including transcription factor or microRNA targets, protein-protein interactions, and functional terms. Then, we selected transcriptomics data of tomato subjected to six distinct pathogens and performed an integrative analysis. We found 5561 candidate genes involved in the multi-stress response of tomato. To study how the response is orchestrated, we mapped those genes in TomTom and extracted a comprehensive gene regulatory network (GRN) composed of 71 transcription factors (TF) and 1618 target genes. By estimating the TF activity, we identified 43 TFs responding either specifically to one or to multiple bio-aggressors. GRN analyses with a topological data analysis approach allowed us to identify 18 clusters of TFs with similar properties, yielding four main configurations localized in specific regions of the GRN. Finally, we found four ERF hubs which cooperatively coordinate the tomato response to multiple pathogens. Our findings allowed us to study the complex molecular reprogramming in tomato upon interaction with different biotic agents, providing tools scalable to other questions involving tomato molecular interactions and beyond.

8.3.2 Enhancer​ Dynamics and Spatial Organization​‌ Drive Anatomically Restricted Cellular​​ States in the Human​​​‌ Spinal Cord

Participant: Mathieu​ Carrière.

In collaboration​‌ with Elena K. Kandror,​​ Alexis Peterson, Andreas Tjärnberg,​​​‌ Yuchen Xu, Abbas H.​ Rizvi (University of Wisconsin,​‌ USA), Anqi Wang, Jun​​ Hou Fung, William Pangburn,​​​‌ Raul Rabadan, Tom Maniatis​ (Columbia University, USA), Jackson​‌ Loper (University of Michigan,​​ USA), Will Liao (NY​​​‌ Genome Center, USA), Krishnaa​ T. Mahbubani and Kourosh​‌ Saeb-Parsy (University of Cambridge,​​ UK)

Here, we report​​​‌ the spatial organization of​ RNA transcription and associated​‌ enhancer dynamics in the​​ human spinal cord at​​ single-cell and single-molecule resolution.​​​‌ We expand traditional multiomic‌ measurements to reveal epigenetically‌​‌ poised and bivalent active​​ transcriptional enhancer states that​​​‌ define cell type specification.‌ Simultaneous detection of chromatin‌​‌ accessibility and histone modifications​​ in spinal cord nuclei​​​‌ reveals previously unobserved cell-type‌ specific cryptic enhancer activity,‌​‌ in which transcriptional activation​​ is uncoupled from chromatin​​​‌ accessibility. Such cryptic enhancers‌ define both stable cell‌​‌ type identity and transitions​​ between cells undergoing differentiation.​​​‌ We also define glial‌ cell gene regulatory networks‌​‌ that reorganize along the​​ rostrocaudal axis, revealing anatomical​​​‌ differences in gene regulation.‌ Finally, we identify the‌​‌ spatial organization of cells​​ into distinct cellular organizations​​​‌ and address the functional‌ significance of this observation‌​‌ in the context of​​ paracrine signaling. We conclude​​​‌ that cellular diversity is‌ best captured through the‌​‌ lens of enhancer state​​ and intercellular interactions that​​​‌ drive transitions in cellular‌ state. This study provides‌​‌ fundamental insights into the​​ cellular organization of the​​​‌ healthy human spinal cord.‌

8.3.3 Fermat Distance-to-Measure: a‌​‌ robust Fermat-like metric

Participants: Frédéric Chazal, Jérôme Taupin.

Given a probability measure with density, Fermat distances and density-driven metrics are conformal transformations of the Euclidean metric that shrink distances in high-density areas and enlarge distances in low-density areas. Although they have been widely studied and have been shown to be useful in various machine learning tasks, they are limited to measures with density (with respect to the Lebesgue measure, or to the volume form on a manifold). In 45, by replacing the density with the Distance-to-Measure, we introduce a new metric, the Fermat Distance-to-Measure, defined for any probability measure in ℝᵈ. We derive strong stability properties for the Fermat Distance-to-Measure with respect to the measure and propose an estimator from random sampling of the measure, featuring an explicit bound on its convergence speed.
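
As a reminder of the ingredient that replaces the density, the empirical Distance-to-Measure of a query point to a finite sample is commonly computed as the root mean squared distance to its k nearest sample points. The sketch below illustrates this standard empirical quantity only, not the Fermat Distance-to-Measure metric of 45; the function name `empirical_dtm` and the parameter `k` are illustrative choices.

```python
import math

def empirical_dtm(query, sample, k):
    """Root mean squared distance from `query` to its k nearest points in `sample`."""
    if not 1 <= k <= len(sample):
        raise ValueError("k must be between 1 and len(sample)")
    # Squared Euclidean distances to every sample point, smallest first.
    sq_dists = sorted(
        sum((q - s) ** 2 for q, s in zip(query, point)) for point in sample
    )
    return math.sqrt(sum(sq_dists[:k]) / k)

# Three clustered points and one outlier (toy data).
sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
inside = empirical_dtm((0.0, 0.0), sample, k=3)     # small: inside the cluster
outlier = empirical_dtm((10.0, 10.0), sample, k=3)  # large: far from most of the mass
```

Because it averages over k neighbours, this quantity is defined for any sample, with or without an underlying density, and a single outlier has little effect on its value at the other points.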

8.4​​​‌ Miscellaneous

8.4.1 Curvature-Guided Optimal‌ Transport for Rigid Point‌​‌ Cloud Registration

Participant: Mathijs​​ Wintraecken.

In collaboration​​​‌ with Roberto M Dyke‌ (TITANE), Marie-Aurélie Chanut (Cerema‌​‌ - Centre d'Etudes et​​ d'Expertise sur les Risques,​​​‌ l'Environnement, la Mobilité et‌ l'Aménagement), Pierre Alliez (TITANE)‌​‌

The rigid registration of​​ pairs of point sets​​​‌ is a fundamental step‌ for many downstream tasks‌​‌ including shape analysis, reconstruction​​ and localization. There has​​​‌ been a growing interest‌ in the use of‌​‌ Optimal Transport (OT) for​​ point cloud registration problems.​​​‌ However, these techniques face‌ limited adoption due to‌​‌ scalability issues—rendering them impractical—and​​ their sensitivity to missing​​​‌ data commonly encountered in‌ real-world scans. We consider‌​‌ how geometric information may​​ be incorporated into an​​​‌ OT registration framework for‌ improved accuracy and scalability.‌​‌ In this work, we​​ guide mini-batch selection by​​​‌ binning shape features based‌ on local curvature estimates.‌​‌ We demonstrate that our​​ method achieves better results​​​‌ than other OT-based methods‌ and is comparable to‌​‌ the state-of-the-art in terms​​ of successful registrations.

8.4.2​​​‌ Supervised Contamination Detection, with‌ Flow Cytometry Application

Participants: Gilles Blanchard, Frédéric Chazal, Solenne Gaucher.

The contamination detection problem, studied in 15, aims to determine whether a set of observations has been contaminated, i.e. whether it contains points drawn from a distribution different from the reference distribution. Here, we consider a supervised problem, where labeled samples drawn from both the reference distribution and the contamination distribution are available at training time. This problem is motivated by the detection of rare cells in flow cytometry. Compared to novelty detection problems or two-sample testing, where only samples from the reference distribution are available, the challenge lies in efficiently leveraging the observations from the contamination distribution to design more powerful tests. In this article, we introduce a test for the supervised contamination detection problem. We provide non-asymptotic guarantees on its Type I error, and characterize its detection rate. The test relies on estimating reference and contamination densities using histograms, and its power depends strongly on the choice of the corresponding partition. We present an algorithm for judiciously choosing the partition that results in a powerful test. Simulations illustrate the good empirical performance of our partition selection algorithm and the efficiency of our test. Finally, we showcase our method by applying it to a real flow cytometry dataset.

8.4.3 Transductive Conformal Inference​​​‌ for Full Ranking

Participant:​ Gilles Blanchard.

In​‌ collaboration with J-B. Fermanian​​ (U. Montpellier, IMAG, and​​​‌ Inria team IROKO), and​ P. Humbert (CNRS and​‌ Sorbonne Université)

In 22, we introduce a method based on Conformal Prediction (CP) to quantify the uncertainty of full ranking algorithms. We focus on a specific scenario where n + m items are to be ranked by some “black box” algorithm. It is assumed that the relative (ground truth) ranking of n of them is known. The objective is then to quantify the error made by the algorithm on the ranks of the m new items among the total (n + m). In such a setting, the true ranks of the n original items in the total (n + m) depend on the (unknown) true ranks of the m new ones. Consequently, we have no direct access to a calibration set to apply a classical CP method. To address this challenge, we propose to construct distribution-free bounds on the unknown conformity scores using recent results on the distribution of conformal p-values. Using these upper bounds on the scores, we provide valid prediction sets for the rank of any item. We also control the false coverage proportion, a crucial quantity when dealing with multiple prediction sets. Finally, we empirically show on both synthetic and real data the efficiency of our CP method for state-of-the-art algorithms such as RankNet or LambdaMart.
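
For readers unfamiliar with conformal p-values, the sketch below shows the classical split-conformal p-value, which requires a calibration set with known scores. It is a background illustration under that assumption, not the transductive procedure of 22, whose point is precisely that such a calibration set is unavailable; the function name and the toy scores are hypothetical.

```python
def conformal_p_value(calibration_scores, test_score):
    """Classical conformal p-value: rank of the test non-conformity score
    among the calibration scores, with the usual +1 correction."""
    n = len(calibration_scores)
    at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (1 + at_least) / (n + 1)

calibration = [0.1, 0.4, 0.35, 0.8, 0.2]                 # toy non-conformity scores
p_typical = conformal_p_value(calibration, 0.3)          # (1 + 3) / 6
p_extreme = conformal_p_value(calibration, 0.9)          # (1 + 0) / 6
```

When the test score is exchangeable with the calibration scores, this p-value is super-uniform, which is what makes thresholding it at a level α a valid test.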

8.4.4 Supervised aggregation​​​‌ of anomaly score functions​ for active anomaly detection​‌

Participant: Martin Royer.​​

In collaboration with Kevin​​​‌ Bleakley (Inria Celest), Mouhcine​ Mendil (IRT Saint Exupéry),​‌ Benjamin Auder (Laboratoire de​​ Mathématiques d'Orsay).

Detecting rare anomalies in batches of multidimensional data is challenging. In 11, we propose a supervised active-learning framework that sends a small number of data points from each batch to an expert for labeling as 'anomaly' or 'nominal', via two mechanisms: (i) points most likely to be an anomaly in the eyes of a supervised classifier trained on previously labeled data; and (ii) points suggested by an active learner. However, instead of training the supervised classifier directly on the current set of labeled raw data points, we treat the scores calculated by an ensemble of M unsupervised anomaly detectors on each data point as if they were the learner's input features. This approach generalizes earlier attempts to linearly aggregate unsupervised anomaly detector scores, and broadens the scope of such methods to ordered data like time series. Results suggest that this method usually outperforms linear strategies, often significantly. The Python library acanag provides an implementation of the proposed method.
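
The central idea, using detector scores rather than raw coordinates as classifier inputs, can be sketched as follows. This is a minimal illustration under stated assumptions and does not use or reproduce the acanag API; the two toy detectors and the function `score_features` are hypothetical.

```python
import math

def l2_norm_score(point):
    """Toy detector: distance of the point to the origin."""
    return math.sqrt(sum(x * x for x in point))

def max_abs_score(point):
    """Toy detector: largest absolute coordinate of the point."""
    return max(abs(x) for x in point)

def score_features(points, detectors):
    """Map each point to the vector of its M detector scores."""
    return [[detector(p) for detector in detectors] for p in points]

batch = [(0.0, 0.0), (3.0, 4.0)]
features = score_features(batch, [l2_norm_score, max_abs_score])
print(features)  # [[0.0, 0.0], [5.0, 4.0]]
```

Any supervised classifier can then be trained on these M-dimensional feature vectors together with the expert labels; a linear model on them corresponds to the linear aggregation strategies that the approach generalizes.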

8.4.5 Curvature‌​‌ penalization of strongly anisotropic​​ interfaces models and their​​​‌ phase-field approximation

Participant: Blanche‌ Buet.

In collaboration‌​‌ with Jean-François Babadjian (LMO,​​ Université Paris-Saclay) and Michael​​​‌ Goldman (CMAP, Ecole Polytechnique).‌

In 25, we study the effect of anisotropy on sharp or diffuse interface models. When the surface tension is a convex function of the normal to the interface, the anisotropy is said to be weak. This usually ensures the lower semicontinuity of the associated energy. If, however, the surface tension depends on the normal in a nonconvex way, this so-called strong anisotropy may lead to instabilities related to the lack of lower semicontinuity of the functional. We investigate the regularizing effects of adding a higher-order term of Willmore type to the energy. We consider two types of problems. The first one is an anisotropic nonconvex generalization of the perimeter, and the second one is an anisotropic nonconvex Mumford-Shah functional. In both cases, lower semicontinuity properties of the energies with respect to a natural mode of convergence are established, as well as Γ-convergence type results by means of a phase-field approximation. In comparison with related results for curvature-dependent energies, one of the original aspects of our work is that, in the context of free discontinuity problems, we are able to consider singular structures such as crack-tips or multiple junctions.

8.4.6 Approximate mean‌ curvature flows of a‌​‌ general varifold, and their​​ limit spacetime Brakke flow​​​‌

Participant: Blanche Buet.‌

In collaboration with Gian‌​‌ Paolo Leonardi (University of​​ Trento), Simon Masnou (Université​​​‌ Lyon 1) and Abdelmouksit‌ Sagueni.

In 28,‌​‌ we propose a construction​​ of mean curvature flows​​​‌ by approximation for very‌ general initial data, in‌​‌ the spirit of the​​ works of Brakke and​​​‌ of Kim & Tonegawa‌ based on the theory‌​‌ of varifolds. Given a​​ general varifold, we construct​​​‌ by iterated push-forwards an‌ approximate time-discrete mean curvature‌​‌ flow depending on both​​ a given time step​​​‌ and an approximation parameter.‌ We show that, as‌​‌ the time step tends​​ to 0, this time-discrete​​​‌ flow converges to a‌ unique limit flow, which‌​‌ we call the approximate​​​‌ mean curvature flow. An​ interesting feature of our​‌ approach is its generality,​​ as it provides an​​​‌ approximate notion of mean​ curvature flow for very​‌ general structures of any​​ dimension and codimension, ranging​​​‌ from continuous surfaces to​ discrete point clouds. We​‌ prove that our approximate​​ mean curvature flow satisfies​​​‌ several properties: stability, uniqueness,​ Brakke-type equality, mass decay.​‌ By coupling this approximate​​ flow with the canonical​​​‌ time measure, we prove​ convergence, as the approximation​‌ parameter tends to 0,​​ to a spacetime limit​​​‌ measure whose generalized mean​ curvature is bounded. Under​‌ an additional rectifiability assumption,​​ we further prove that​​​‌ this limit measure is​ a spacetime Brakke flow.​‌

8.4.7 Théorie de l'homotopie​​ quantitative

Participant: Pierre Pansu​​​‌.

The goal of homotopy theory, in topology, is to simplify continuous maps between topological spaces by continuous deformation. The obstructions to doing so are homotopy invariants. This raises quantitative questions. Is computing the invariants possible (decidable), and if so, at what cost? Is it possible to construct low-complexity representatives whose invariants take prescribed values, and if so, at what cost? What is the complexity of the required deformations? The answers, often recent, are highly diverse. Moreover, many questions remain open, showing that topology has not said its last word, even in low dimensions.

9 Bilateral contracts and​‌ grants with industry

9.1​​ Bilateral contracts with industry​​​‌

  • Participants: David Cohen-Steiner.​

    Collaboration with Dassault Systèmes​‌ and Inria team Geomerix​​ (Saclay) on the applications​​​‌ of methods from geometric​ measure theory to the​‌ modelling and processing of​​ complex 3D shapes (PhD​​​‌ of Lucas Brifault, started​ in May 2022).

  • Participants:​‌ Frédéric Chazal, Myriam​​ Frikha.

    Research collaboration​​​‌ with Ericsson on transfer​ learning for temporal data​‌ with applications in telecommunications​​ (PhD of Myriam Frikha,​​​‌ started in November 2024).​

  • Participants: Frédéric Chazal,​‌ Martin Royer.

    Collaboration​​ with Thales on TDA-based​​​‌ anomaly detection for satellite​ telemetry data (started in​‌ Dec. 2025).

  • Participants: Frédéric​​ Chazal, Mathieu Carrière​​​‌.

    Research collaboration with​ Thales on topological approaches​‌ for the analysis and​​ certification of AI-based critical​​​‌ systems through the Master​ internship of Louise Méric​‌ that will continue through​​ a CIFRE PhD at​​​‌ the very beginning of​ 2026.

10 Partnerships and​‌ cooperations

10.1 International initiatives​​

10.1.1 Associate Teams in​​​‌ the framework of an​ Inria International Lab or​‌ in the framework of​​ an Inria International Program​​​‌

Equipe Associée TopTime

Participants:​ Nina Otter.

  • Title:​‌
    Topological and statistical methods​​ for time series data​​​‌
  • Partner Institution(s):
    • Australian National​ University (ANU), Australia
  • Date/Duration:​‌
    2024-2026
  • Additional information:
    Katharine​​ Turner (ANU) is the​​​‌ co-PI of the EA.​

10.1.2 Participation in other​‌ International Programs

KTH Royal​​ Institute of Technology Seed​​​‌ Funding: Strengthening French –​ Swedish AI Collaboration

Participants:​‌ Frédéric Chazal, Mathieu​​ Carrière.

  • Title:
    Geometry-informed​​​‌ AI in wireless communications​
  • Funding Institution(s):
    KTH Stockholm, Sweden
  • Date/Duration:
    2025-2026​​
  • Additional information:
    Joint project between the SCI school, department of mathematics (PI: Martina Scolamiero), the DataShape team and Ericsson (industrial PI: Francesco Davide Calabrese).

SALTO exchange program between MPG and CNRS

Participants: Nina Otter‌​‌.

  • Title:
    Higher-order interactions​​ at the crossroads of​​​‌ geometry and topology
  • Partner‌ Institution(s):
    • Max Planck Institute‌​‌ for Mathematics in the​​ Sciences, Leipzig, Germany
  • Date/Duration:​​​‌
    2024-2026
  • Additional information:
    SALTO‌ exchange programme between the‌​‌ Max Planck Gesellschaft and​​ the CNRS. Marzieh Eidi​​​‌ (MPI MiS) is the‌ co-PI.

10.2 International research‌​‌ visitors

10.2.1 Visits of​​ international scientists

  • Wolfgang Polonik​​​‌ (UC Davis). September-October 2025‌ (1 month).
  • Marzieh Eidi (MPI MiS). April-May 2025 (2 months).

10.3 National​​​‌ initiatives

Extended visit

Participants:‌ Corentin Lunel, Clément‌​‌ Maria.

- Duration​​ : 2024-2025

- Coordinator​​​‌ : Clément Maria

- Location : Institut Montpelliérain Alexander Grothendieck (IMAG) - Université de Montpellier

The visit aims to bring together mathematicians from IMAG working on low-dimensional and quantum topology and computer scientists from DataShape, to work at the interface of the two fields.

10.3.1 ANR

ANR​​​‌ Chair in AI

Participants:‌ Frédéric Chazal, Marc‌​‌ Glisse.

- Acronym​​ : TopAI

- Type​​​‌ : ANR Chair in‌ AI.

- Title :‌​‌ Topological Data Analysis for​​ Machine Learning and AI​​​‌

- Coordinator : Frédéric‌ Chazal

- Duration :‌​‌ 2020-2026.

- Other partners: Two industrial partners, the French SME Sysnav and the French start-up MetaFora.

- Abstract:

The TopAI​​ project aims at developing​​​‌ a world-leading research activity‌ on topological and geometric‌​‌ approaches in Machine Learning​​ (ML) and AI with​​​‌ a double academic and‌ industrial/societal objective. First, building‌​‌ on the strong expertise​​ of the candidate and​​​‌ his team in TDA,‌ TopAI aims at designing‌​‌ new mathematically well-founded topological​​ and geometric methods and​​​‌ tools for Data Analysis‌ and ML and to‌​‌ make them available to​​ the data science and​​​‌ AI community through state-of-the-art‌ software tools. Second, thanks‌​‌ to already established close​​ collaborations and the strong​​​‌ involvement of French industrial‌ partners, TopAI aims at‌​‌ exploiting its expertise and​​ tools to address a​​​‌ set of challenging problems‌ with high societal and‌​‌ economic impact in personalized​​ medicine and AI-assisted medical​​​‌ diagnosis.

ANR ALGOKNOT

Participants:‌ Clément Maria.

-‌​‌ Acronym : ALGOKNOT.

-​​ Type : ANR Jeune​​​‌ Chercheuse Jeune Chercheur.

-‌ Title : Algorithmic and‌​‌ Combinatorial Aspects of Knot​​ Theory.

- Coordinator :​​​‌ Clément Maria.

- Duration‌ : 2020 – 2026‌​‌

- Abstract: The project​​ AlgoKnot aims at strengthening​​​‌ our understanding of the‌ computational and combinatorial complexity‌​‌ of the diverse facets​​ of knot theory, as​​​‌ well as designing efficient‌ algorithms and software to‌​‌ study their interconnections.

-​​ See also: Clément Maria​​​‌ and ANR AlgoKnot.‌

ANR GeMfaceT

Participants: Blanche‌​‌ Buet.

- Acronym:​​ GeMfaceT.

- Type: ANR JCJC, CES 40 – Mathématiques

- Title: A‌​‌ bridge between Geometric Measure​​ and Discrete Surface Theories​​​‌

- Coordinator: Blanche Buet.‌

- Duration: 2021–2026

- Abstract: This project is positioned at the interface between geometric measure and discrete surface theories. There has recently been a growing interest in non-smooth structures, both from a theoretical point of view, where singularities occur in famous optimization problems such as the Plateau problem or geometric flows such as mean curvature flow, and from an applied point of view, where complex high-dimensional data are no longer assumed to lie on a smooth manifold but are more singular and allow crossings, tree-structures and dimension variations. We propose in this project to strengthen and expand the use of geometric measure concepts in discrete surface study and complex data modelling, and also to use those possibly singular discrete surfaces to compute numerical solutions to the aforementioned problems.

ANR StratMesh

Participants: Jean-Daniel​‌ Boissonnat, Mathijs Wintraecken​​.

- Acronym: StratMesh.​​​‌

- Type: ANR PRC​

- Title: A bridge​‌ between Geometric Measure and​​ Discrete Surface Theories

- Coordinator: Mathijs Wintraecken (local), Guillaume Moroz (Gamble, Centre Inria de l'Université de Lorraine).

- Duration:​​​‌ 2025–2029

- Abstract: StratMesh​ aims to develop provably-correct​‌ triangulation algorithms for stratified​​ spaces. Our focus is​​​‌ on stratified spaces that​ are the projection of​‌ smooth manifolds, which arise​​ in many applications such​​​‌ as robotics, control theory,​ and medial axis computation​‌ for learning from geometric​​ data.

ANR TopModel

Participants:​​​‌ Mathieu Carrière.

-​ Acronym: TopModel.

- Type:​‌ ANR JCJC

- Title:​​ TopModel

- Coordinator: Mathieu​​​‌ Carrière

- Duration: 2024–2027​

- Abstract: The central tenet of this project is the use of multiparameter topological data analysis for machine learning models, both for regularizing and monitoring these models and for automatically generating new features and descriptors to feed them. On the theoretical front, much effort will be devoted to the development, implementation and generalization of standard topological data analysis techniques, which (for the most part) can only study the topological variations of at most one parameter (such as the data scale), so as to make them suitable for the study of the topological variations of several parameters jointly (such as density and scale, marker genes). Then, the focus will be on applying these new multiparameter topological data analysis methods for machine learning models to specific domains in which topological data analysis is known to be relevant and efficient. More precisely, we will emphasize the usefulness of our new tools on data sets from cosmology (large-scale structures of the Universe) and biology (single-cell sequencing, mass cytometry).

PEPR SN​​​‌

Participants: Mathieu Carrière.​

- Acronym: AI4scMED.

-​‌ Type: Work package in​​ PEPR SN

- Title:​​​‌ Multiscale AI for single-cell​ based precision medicine

-​‌ Coordinator: Mathieu Carrière

-​​ Duration: 2023–2027

- Abstract:​​​‌ Cell-based precision medicine holds​ revolutionary potential for healthcare,​‌ but realizing its full​​ potential demands a deep​​​‌ understanding of disease variability​ and multiscale aspects. Single-cell​‌ (sc) multi-omics offers a​​ unique way to obtain​​​‌ molecular profiles of individual​ cells and predict disease​‌ trajectories. To harness this​​ complexity, new AI breakthroughs​​​‌ are needed. Our consortium​ will tackle methodological challenges​‌ to bridge the gap​​ between sc data and​​ personalized treatments, resolving cell​​​‌ type differences and integrating‌ sc-multi-omics with imaging for‌​‌ spatial insights.

Addressing the​​ complexity of the human​​​‌ body and combining genomics‌ with other assays, we‌​‌ will develop AI-based methods​​ to handle, integrate, analyze,​​​‌ and visualize multiscale complexity‌ in diseases. Our developments‌​‌ will leverage cutting-edge AI​​ for sc-genomic data analysis.​​​‌ To infer causal mechanisms‌ at different levels, we‌​‌ will use causal/logical/stochastic modeling​​ to integrate heterogeneous data​​​‌ and account for temporal‌ scales and biophysical priors.‌​‌

We will create network​​ inference methods to understand​​​‌ molecular mechanisms in clinical‌ samples, identifying key genes‌​‌ and predicting therapeutic impacts.​​ Precision medicine must also​​​‌ integrate variability across different‌ cell decision levels. We‌​‌ aim to build predictive​​ models, digital twins, to​​​‌ enable data-driven personalized treatments‌ by connecting intracellular dynamics,‌​‌ biochemical processes, cell populations,​​ and tissue-level organization.

10.3.2​​​‌ Collaboration with other national‌ research institutes

Confiance.ai /‌​‌ IRT SystemX

Participants: Frédéric​​ Chazal.

Research collaboration​​​‌ on anomaly detection for‌ multivariate time series using‌​‌ TDA and ML approaches.​​

11 Dissemination

11.1 Promoting​​​‌ scientific activities

11.1.1 Scientific‌ events: organisation

  • Clément Maria was a co-organizer of the QuantAzur Days, Nice, November 2025.
  • Nina Otter was a co-organizer of the conference “Topological methods for time-varying data: theory and applications (TopTime)”, at the Australian National University, Canberra, Australia, November 2025.
  • Nina Otter was a co-organizer of the workshop “Higher-order interactions at the crossroads of geometry and topology”, Laboratoire de Mathématiques d'Orsay, December 2025.

11.1.2 Scientific events: selection‌

Member of the conference‌​‌ program committees
  • Clément Maria was a member of the program committee of the 43rd International Symposium on Theoretical Aspects of Computer Science (STACS) 2026.
  • Nina Otter was a member of the program committee of the 11th ATMCS conference (2025).
  • Nina Otter was a member of the program committee of the Applied Category Theory (ACT) 2025 Conference.

11.1.3 Journal

Member​​​‌ of the editorial boards‌
  • Frédéric Chazal is a‌​‌ member of the following​​ journal editorial boards: Discrete​​​‌ and Computational Geometry (Springer),‌ Journal of Applied and‌​‌ Computational Topology (Springer).
  • Frédéric​​ Chazal is the Editor-in-Chief​​​‌ of the Journal of‌ Applied and Computational Topology‌​‌ (Springer).

11.1.4 Leadership within​​ the scientific community

  • Frédéric​​​‌ Chazal is a member‌ of the Scientific Advisory‌​‌ Board of the Centre​​ for Topological Data Analysis​​​‌ of the Mathematical Institute‌ at Oxford.
  • Frédéric Chazal is a member of the Scientific Council of EMAp (Escola de Matemática Aplicada da Fundação Getulio Vargas), Rio de Janeiro, Brazil.
  • Mathieu Carrière is​​​‌ a chair holder of‌ the 3IA Institute at‌​‌ Université Côte d'Azur.

11.1.5​​ Scientific expertise

  • Frédéric Chazal​​​‌ is a member of‌ the “commission prospective de‌​‌ l’I2M” (Institut de Mathématiques​​ de Marseille).
  • Clément Maria​​​‌ was a jury member‌ for the UCA-DS4H PhD‌​‌ grant allocation scheme for​​ 2025.
  • Nina Otter is a member of the executive committee of the DataIA institute.

11.1.6 Research administration​​

  • Marc Glisse is president​​​‌ of the CDT at‌ Inria Saclay.
  • Frédéric Chazal is co-responsible for the “programme Mathématiques et IA” of the Fondation Mathématique Jacques Hadamard, Paris-Saclay University (until Oct. 2025).
  • Frédéric​​​‌ Chazal is a member​ of the council of​‌ the Graduate School in​​ Mathematics, Paris-Saclay Univ.
  • Clément Maria is co-responsible for the CNRS-Groupe de Travail GeoAlgo.
  • Clément Maria is​​ a member of the​​​‌ steering committee of the​ QuantAzur federative institute.

11.2​‌ Teaching - Supervision -​​ Juries - Educational and​​​‌ pedagogical outreach

11.2.1 Teaching​

  • Master: Mathijs Wintraecken, Introduction​‌ to Scientific Research, 2h​​ eq-TD, mineure DS4H (Master​​​‌ and PhD)
  • Master: Frédéric​ Chazal, Analyse Topologique des​‌ Données, 30h eq-TD, Université​​ Paris-Saclay, France.
  • Master: Frédéric​​​‌ Chazal and Julien Tierny,​ Topological Data Analysis, 38h​‌ eq-TD, M2, Mathématiques, Vision,​​ Apprentissage (MVA), ENS Paris-Saclay,​​​‌ France.
  • Master: Mathieu Carrière,​ Basic Algebra for Data​‌ Analysis, 18h eq-TD, MSc​​ DSAI, Université Côte d'Azur​​​‌
  • Master: Mathieu Carrière and Frédéric Cazals, Geometric and Topological Methods in Data Analysis, with Applications in Biology and Medicine, 15h eq-TD, MSc DSAI, Université Côte d'Azur
  • Master:​​ Mathieu Carrière, Statistical Learning​​​‌ Theory, 15h eq-TD, MSc​ DSAI, Université Côte d'Azur​‌
  • PSL doctoral course: Eddie​​ Aamari, Frédéric Chazal, Alejandro​​​‌ Saldarriaga, 1 week, Topological​ Data Analysis.
  • Mini-course at Young Topologists Meeting 2025: Frédéric Chazal, Persistent homology for machine learning: a measure perspective, 6h, KTH Stockholm.
  • Master:​​ Marc Glisse, Conception et​​​‌ analyse d'algorithmes, 44h eq-TD,​ M1, École Polytechnique, France.​‌

11.2.2 Supervision

  • PhD in progress: Rohit Roy, Triangulating stratified spaces. Started in November 2025. Mathijs Wintraecken and Pierre Alliez (TITANE).
  • PhD in progress: Myriam​​​‌ Frikha, Domain adaptation for​ temporal data. Started in​‌ Nov. 2024. Frédéric Chazal.​​
  • PhD in progress: Jérôme​​​‌ Taupin, Density-based metric learning​ and applications in Topological​‌ Data Analysis. Started in​​ Sept. 2025. Frédéric Chazal.​​​‌
  • PhD in progress: Anna​ Hollands, Persistent path-homology for​‌ directed-graph analysis: Statistical aspects​​ and applications to machine​​​‌ learning. Started in Oct.​ 2025. Frédéric Chazal and​‌ Bertrand Michel.
  • PhD in​​ progress: Alejandro Saldarriaga, Topological​​​‌ Deep Learning. Started in​ Nov. 2025. Eddie Aamari​‌ (ENS Paris) and Frédéric​​ Chazal.
  • PhD in progress:​​​‌ Henrique Ennes, Topological approach​ to quantum complexity. Started​‌ in Oct. 2023. Clément​​ Maria and Nicolas Nisse​​​‌ (Inria).
  • PhD in progress: António Leitao, Persistent homology of cover refinements and applications to XAI. Started in November 2024. Nina Otter and Fosca Giannotti (Scuola Normale Superiore di Pisa).

11.2.3 Juries

  • Marc Glisse​​​‌ was the external reviewer​ for the PhD defense​‌ of Dominic Desjardins Côté,​​ Université de Sherbrooke, Canada.​​​‌
  • Blanche Buet was a member of the PhD defense jury of Rémi Mougenot (12/2025), Université de Lorraine.
  • Mathieu Carrière was a member of the PhD defense juries of Mohamed Kissi (10/2025), Université Paris-Sorbonne, and Rayna Andreeva (05/2025), University of Edinburgh.
  • Nina Otter was a member of the PhD defense jury of Andreas Abildtrup Hansen (09/2025), The Technical University of Denmark.

11.2.4 Productions​​ (articles, videos, podcasts, serious​​​‌ games, ...)

  • Clément Maria: researcher portrait, Street Science exhibition in Nice.
  • Clément Maria: article “Mapping the algorithmic complexity of topological quantum computing” in INSIGHTS, the magazine of the Université Côte d’Azur IdEx.

11.2.5 Participation​ in Live events

  • Blanche Buet participated in Fête de la Science (at IMO, 10/2025) and in RJMI (at ENS Paris-Saclay, 11/2025). Blanche Buet gave a popularization seminar to L3 students (at IMO, 12/2025). Blanche Buet is part of the organizing committee of the FMJH Welcome days for master's students (at IMO, 09/2025).

11.2.6 Other relevant science outreach activities

  • Frédéric Chazal​​ gave a general audience​​​‌ introductory presentation on Artificial‌ Intelligence at Université pour‌​‌ Tous de Bourgogne (March​​ 2025).
  • Frédéric Chazal participated in round tables and gave talks on different aspects of AI at the Académie du Renseignement.

12 Scientific production

12.1‌ Major publications

  • 1 (article) Gilles Blanchard, Aniket Anand Deshmukh, Urun Dogan, Gyemin Lee and Clayton Scott. Domain Generalization by Marginal Transfer Learning. Journal of Machine Learning Research 22(2), 2021, 1-55.
  • 2 (article) Jean-Daniel Boissonnat and Mathijs Wintraecken. The Topological Correctness of PL Approximations of Isomanifolds. Foundations of Computational Mathematics 22, July 2021, 967-1012.
  • 3 (article) Blanche Buet and Martin Rumpf. Mean curvature motion of point cloud varifolds. ESAIM: Mathematical Modelling and Numerical Analysis 56(5), 2022, 1773-1808.
  • 4 (inproceedings) Mathieu Carrière and Andrew J. Blumberg. Multiparameter Persistence Images for Topological Machine Learning. NeurIPS 2020 - 33rd Conference on Neural Information Processing Systems, Vancouver / Virtual, Canada, December 2020.
  • 5 (inproceedings) Mathieu Carrière, Frédéric Chazal, Marc Glisse, Yuichi Ike and Hariprasad Kannan. Optimizing persistent homology based functions. ICML 2021 - 38th International Conference on Machine Learning, PMLR 139, Proceedings of the 38th International Conference on Machine Learning, Virtual conference, United States, July 2021, 1294-1303.
  • 6 (article) David Cohen-Steiner, André Lieutier and Julien Vuillamy. Lexicographic Optimal Homologous Chains and Applications to Point Cloud Triangulations. Discrete and Computational Geometry 68, September 2022.
  • 7 (article) Rémi Gribonval, Gilles Blanchard, Nicolas Keriven and Yann Traonmilin. Compressive Statistical Learning with Random Feature Moments. Mathematical Statistics and Learning 3(2), August 2021, 113-164.
  • 8 (article) Clément Maria and Jonathan Spreer. A Polynomial-Time Algorithm to Compute Turaev–Viro Invariants $\mathrm{TV}_{4,q}$ of 3-Manifolds with Bounded First Betti Number. Foundations of Computational Mathematics 20(5), November 2019, 1013-1034.
  • 9 (article) Miguel O'Malley, Sara Kalisnik and Nina Otter. Alpha magnitude. Journal of Pure and Applied Algebra 227(11), November 2023, 107396.

12.2 Publications‌ of the year

International‌​‌ journals

International peer-reviewed conferences

Reports & preprints

Other scientific publications​‌