FLOWERS - 2025

2025 Activity report: Project-Team FLOWERS

RNSR: 200820949R​​​‌
  • Research center Inria Centre‌ at the University of‌​‌ Bordeaux
  • In partnership with:​​Ecole nationale supérieure des​​​‌ techniques avancées - Institut‌ polytechnique de Paris, Université‌​‌ de Bordeaux
  • Team name:​​ FLOW in Exploration, leaRning,​​​‌ and diScovery

Creation of‌ the Project-Team: 2025 March‌​‌ 01

Each year, Inria​​ research teams publish an​​​‌ Activity Report presenting their‌ work and results over‌​‌ the reporting period. These​​ reports follow a common​​​‌ structure, with some optional‌ sections depending on the‌​‌ specific team. They typically​​ begin by outlining the​​​‌ overall objectives and research‌ programme, including the main‌​‌ research themes, goals, and​​ methodological approaches. They also​​​‌ describe the application domains‌ targeted by the team,‌​‌ highlighting the scientific or​​ societal contexts in which​​​‌ their work is situated.‌

The reports then present‌​‌ the highlights of the​​ year, covering major scientific​​​‌ achievements, software developments, or‌ teaching contributions. When relevant,‌​‌ they include sections on​​ software, platforms, and open​​​‌ data, detailing the tools‌ developed and how they‌​‌ are shared. A substantial​​ part is dedicated to​​​‌ new results, where scientific‌ contributions are described in‌​‌ detail, often with subsections​​ specifying participants and associated​​​‌ keywords.

Finally, the Activity‌ Report addresses funding, contracts,‌​‌ partnerships, and collaborations at​​ various levels, from industrial​​​‌ agreements to international cooperations.‌ It also covers dissemination‌​‌ and teaching activities, such​​ as participation in scientific​​​‌ events, outreach, and supervision.‌ The document concludes with‌​‌ a presentation of scientific​​ production, including major publications​​​‌ and those produced during‌ the year.

Keywords

Computer‌​‌ Science and Digital Science​​

  • A5.1.1. Engineering of interactive​​​‌ systems
  • A5.1.2. Evaluation of‌ interactive systems
  • A5.1.4. Brain-computer‌​‌ interfaces, physiological computing
  • A5.1.5.​​ Body-based interfaces
  • A5.1.6. Tangible​​​‌ interfaces
  • A5.1.7. Multimodal interfaces‌
  • A5.8. Natural language processing‌​‌
  • A5.10.5. Robot interaction (with​​ the environment, humans, other​​​‌ robots)
  • A5.10.8. Cognitive robotics‌ and systems
  • A6.3.1. Inverse‌​‌ problems
  • A9.2. Machine learning​​
  • A9.4. Natural language processing​​​‌
  • A9.5. Robotics and AI‌
  • A9.7. AI algorithmics
  • A9.9.‌​‌ Distributed AI, Multi-agent
  • A9.10.​​ Hybrid approaches for AI​​​‌
  • A9.11. Generative AI
  • A9.12.1.‌ Object recognition
  • A9.12.2. Activity‌​‌ recognition
  • A9.13. Agentic AI​​​‌
  • A9.14. Evaluation of AI​ models
  • A9.16. Societal impact​‌ of AI

Other Research​​ Topics and Application Domains​​​‌

  • B1.2.1. Understanding and simulation​ of the brain and​‌ the nervous system
  • B1.2.2.​​ Cognitive science
  • B5.8. Learning​​​‌ and training
  • B9. Society​ and Knowledge
  • B9.1. Education​‌
  • B9.1.1. E-learning, MOOC
  • B9.1.2.​​ Serious games
  • B9.2. Art​​​‌
  • B9.2.1. Music, sound
  • B9.2.4.​ Theater
  • B9.6. Humanities
  • B9.6.1.​‌ Psychology
  • B9.6.8. Linguistics
  • B9.7.​​ Knowledge dissemination

1 Team​​​‌ members, visitors, external collaborators​

Research Scientists

  • Pierre-Yves Oudeyer​‌ [Team leader,​​ INRIA, Senior Researcher​​​‌, HDR]
  • Clément​ Moulin-Frier [INRIA,​‌ Researcher, until Apr​​ 2025]
  • Hélène Sauzéon​​​‌ [INRIA, Professor​ Detachement, HDR]​‌

Faculty Member

  • Cécile Mazon​​ [UNIV BORDEAUX,​​​‌ Associate Professor]

Post-Doctoral​ Fellows

  • Olivier Clerc [​‌INRIA, Post-Doctoral Fellow​​]
  • Cedric Colas [​​​‌INRIA, Post-Doctoral Fellow​]
  • Sina Khajehabdollahi [​‌INRIA, Post-Doctoral Fellow​​, from Apr 2025​​​‌ until Jul 2025]​
  • Marion Pech [INRIA​‌, Post-Doctoral Fellow,​​ until Mar 2025]​​​‌
  • Leslie Tricoche [UNIV​ BORDEAUX, from Oct​‌ 2025]

PhD Students​​

  • Timothe Boulet [INRIA​​​‌]
  • Thomas Carta [​INRIA, from Feb​‌ 2025 until Nov 2025​​]
  • Thomas Carta [​​​‌UNIV BORDEAUX, until​ Jan 2025]
  • Marko​‌ Cvjetko [UNIV BORDEAUX​​]
  • Marie-Sarah Desvaux [​​​‌UNIV BORDEAUX]
  • Juliette​ Deyts [UNIV BORDEAUX​‌]
  • Loris Gaven [​​INRIA]
  • Gautier Hamon​​​‌ [INRIA, until​ Apr 2025]
  • Sina​‌ Khajehabdollahi [INRIA,​​ until Mar 2025]​​​‌
  • Grgur Kovac [INRIA​, from Aug 2025​‌ until Nov 2025]​​
  • Grgur Kovac [INRIA​​​‌, until Jul 2025​]
  • Jeremy Perez [​‌UNIV BORDEAUX]
  • Matisse​​ Poupard [UNIV BORDEAUX​​​‌, from Sep 2025​]
  • Matisse Poupard [​‌INRIA, from Apr​​ 2025 until Jul 2025​​​‌]
  • Matisse Poupard [​CATIE, CIFRE,​‌ until Mar 2025]​​
  • Julien Pourcel [INRIA​​​‌]
  • Clément Romac [​HUGGING FACE SAS,​‌ CIFRE]
  • Julien Rosenberger​​ [INRIA, from​​​‌ Oct 2025]
  • Julien​ Rosenberger [UCL,​‌ from May 2025 until​​ Sep 2025]
  • Isabeau​​​‌ Saint-Supery [UNIV BORDEAUX​, until Apr 2025​‌]
  • Paul Tabbara [​​INRIA, from Dec​​​‌ 2025]
  • Nicolas Yax​ [ENS Paris]​‌

Technical Staff

  • Camille Anthounet​​ [UNIV BORDEAUX,​​​‌ Engineer, from Nov​ 2025]
  • Zacharie Bugaud​‌ [INRIA, Engineer​​, until Jan 2025​​​‌]
  • Ludovic Matar [​INRIA, Engineer,​‌ from Feb 2025]​​

Interns and Apprentices

  • Hana​​​‌ Al Mrayati [UNIV​ BORDEAUX, Intern,​‌ from Mar 2025 until​​ Jun 2025]
  • Loic​​​‌ Blouin [UNIV BORDEAUX​, Intern, from​‌ May 2025 until Aug​​ 2025]
  • Sophie Lepennetier​​​‌ [UNIV PADOVA,​ Intern, from Sep​‌ 2025 until Nov 2025​​]
  • Eliott Poisson [​​​‌UNIV BORDEAUX, Intern​, from May 2025​‌ until Jul 2025]​​
  • Paul Tabbara [INRIA​​​‌, Intern, from​ May 2025 until Nov​‌ 2025]
  • Kan Yao​​ [INRIA, Intern​​, from Mar 2025​​​‌ until Jul 2025]‌

Administrative Assistants

  • Fabienne Cuyollaa‌​‌ [INRIA]
  • Nathalie​​ Robin [INRIA]​​​‌

External Collaborators

  • Eleni Nisioti‌ [UNIV COPENHAGUE]‌​‌
  • Didier Roy [EPFL​​]

2 Overall objectives​​​‌

Abstract: This project-team aims‌ to study the fundamental‌​‌ mechanisms that can enable​​ open-ended learning and development​​​‌ in humans and machines,‌ i.e. how individuals, or‌​‌ groups of individuals, can​​ continuously discover and learn​​​‌ novel skills of increasing‌ complexity. We also aim‌​‌ to leverage this fundamental​​ understanding for human-centered real-world​​​‌ applications in education and‌ in assisted scientific discovery.‌​‌

In particular, we focus on studying mechanisms enabling Autotelic and Aligned Intelligence in humans and machines. A first key ingredient of open-ended learning is curiosity-driven autotelic learning, which is the ability of individuals to set and pursue their own goals (from the Greek ‘telos’/goal, and ‘auto’/self), a form of intrinsic motivation pushing organisms to continuously seek new knowledge and skills, self-organizing their own learning curriculum, using meta-cognition and leading to creative exploration.

To enable abstraction, collective intelligence, and alignment of autotelic systems with human cultures (values, preferences), we also aim to study how language and social interaction, both as a communication system and as a cognitive tool, can guide autotelic exploration. Symmetrically, using multi-scale models, we aim to study how curiosity-driven autotelic exploration could self-organize at the group level. We also aim to study the ecosystemic and evolutionary origins of autotelic systems.

Context: Humans continuously explore, learn and discover novel skills and knowledge through open-ended processes. In fact, humans, and some other life forms, are equipped with intrinsic motivation systems (“curiosity”) pushing them to spontaneously explore and actively seek new knowledge 102, setting and pursuing their own goals (they are autotelic 90), ranging from the most concrete (e.g. stack cubes) to the most abstract (e.g. invent new maths problems). This happens at the level of individuals, starting with children who eagerly and spontaneously explore their bodies and their environment as they develop, up to adults of all ages and all backgrounds. This autotelic exploration process also benefits from social and collective dynamics, leveraging past discoveries and being guided to align with the culture (values, preferences, ethics) of a given group 111. The iteration of this collective intelligence process, accumulating and transmitting discoveries over generations, gives rise to open-ended cultural evolution, and to autotelic exploration at the level of collectives. The mechanisms that enable the origins and functionalities of autotelic learning in interaction with social groups and culture, giving rise to open-endedness, remain a major mystery for science.

We study‌​‌ these mechanisms from four​​ complementary scientific perspectives, structured​​​‌ into three main objectives:‌

  • Objective 1: Improve understanding‌​‌ of human autotelic and​​ aligned intelligence.
    A first objective of this project is to advance our fundamental understanding of the origins, mechanisms and functionalities of autotelic learning and exploration, and of how they interact and align with collective dynamics. This will involve a combination of computational models for developing new theories and hypotheses, as well as the design and analysis of new human experimental paradigms analysed with these computational models. Particular scientific questions we target include studying links between autotelic learning, metacognition (one’s ability to know and control one’s own knowledge and cognitive functions) and creativity in humans, e.g. how do these skills develop in childhood, and which internal and external factors influence them? How do they link to language and processes of social interaction?
  • Objective​​ 2: Building curiosity-driven autotelic​​​‌ and aligned AI systems.​
    Our second major objective will be to build and study curiosity-driven autotelic artificial agents that learn by interacting with external environments and within socio-cultural collectives. To do this, we leverage and extend state-of-the-art deep reinforcement learning algorithms and transform them into autotelic RL systems. Crucially, we will also study how algorithms for autotelic learning can be made better aligned (teachable, driveable), more robust and more creative by using language both as a tool for social interaction and as a cognitive tool to make abstractions and leverage knowledge acquired by others. To achieve this, we will use pre-trained generative AI models as a cognitive tool bootstrap, enabling us to address the poor sample efficiency (i.e. the need for large amounts of environment interaction) and poor generalisation of classical (autotelic) RL algorithms. In addition, autotelic architectures will enable us to achieve incremental grounding and alignment of generative AI models with external physical and social dynamics. To further improve abstraction and generalisation, we aim to establish links between autotelic learning and program synthesis techniques, whereby 1) autotelic generative models will self-improve their coding abilities by setting learnable coding problems of increasing complexity; 2) we will use code for autotelic procedural self-generation of environments, tasks and policies. Because code is expressive and abstract, we believe working in code spaces will open new perspectives on open-endedness.

    We also aim to study how autotelic algorithms, using the learning progress theory, will enable frugal adaptation of generative models thanks to automatic curriculum learning. Finally, we aim to study how groups of autotelic language-augmented agents can work together and give rise to higher-order forms of autotelic collective intelligence, using multi-agent autotelic reinforcement learning techniques and measurement tools from the field of cultural evolution to track collective innovations.
  • Objective 3: Applications in​‌ education and assisted scientific​​ discovery.

    Objective 3.1: Train curiosity-driven autotelic learning in humans across the lifespan. We aim to develop educational technologies and interventions that help children and adults across the lifespan to learn in ways that are more motivating and more efficient, for example by stimulating curiosity and meta-cognition and using both models of curiosity and generative AI. Our approach combines 1) an interdisciplinary perspective drawing on cognitive science, educational sciences and machine learning; 2) a user-centric approach with real-world field studies, in particular with real classrooms in the French educational system, or with field studies with adult or ageing populations; 3) consideration of both neurotypical and neurodiverse populations. Beyond directly training curiosity and meta-cognition, and given their transversal role, we also aim to study how personalised training techniques (e.g. using adaptive curricula with algorithms that maximize learning progress measures) can enable more efficient and more motivating training of disciplinary skills (e.g. maths, languages) and other cognitive dimensions (attention, working memory, etc.). Beyond showing effective impact in RCTs (randomized controlled trials), our objective is that our techniques and interventions be used at large scale in the real world. To achieve this, we will combine focused collaboration with educational institutions and the edTech industry with user-centered design of open-licence pedagogical material that aims to be directly and easily reusable by teachers. We also aim to support public action in this domain, through interaction with and advice to national and European public institutions.

    Objective 3.2: Assisted scientific discovery with autotelic exploration algorithms. Some of the greatest scientific challenges include the study and design of novel materials, molecules or networks with complex dynamics, where the space of possible self-organized behaviour is often initially mostly unknown, and the space of parameters very large, making exploration and discoveries very costly and difficult for physicists, chemists or biologists. We aim to study and show how autotelic aligned exploration algorithms can be used as powerful discovery assistants in these contexts. We believe they have specific capabilities making them highly relevant for this application: they are made to explore and discover, in a sample-efficient manner, a high diversity of behaviours in complex systems (autotelic), while being driveable so that scientists can steer them in directions of interest (aligned). To maximize diversity, we aim to develop methods learning a diversity of goal representations (autotelic and quality-diversity exploration using meta-diversity search). To enable abstractness and high-level guidance from human scientists (e.g. to provide feedback on measures of interestingness), we aim to leverage language and multimodal generative models. To make fast progress in this direction, we first aim to use artificial life environments, such as continuous cellular automata, as an experimental domain, using autotelic exploration algorithms to help discover the origins of autopoietic systems (and even autotelic systems self-organised from the ground up) as well as to study how evolutionary processes themselves could self-organise.

    For further real-world impact, we aim to develop collaborations with physics/chemistry/biology academic labs, as well as various industrial companies working on the design of new physical or biomolecular systems.
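The exploration loop described under Objective 3.2 can be illustrated with a minimal sketch: sample a goal in a behaviour space, reuse the parameters whose observed behaviour came closest to that goal, perturb them, and record the new outcome. Everything named here is an assumption made for illustration: `toy_system` is a hypothetical stand-in for a costly simulation such as a continuous cellular automaton, the behaviour descriptor is hand-written rather than learned, and goals are sampled uniformly rather than guided by an interestingness measure.

```python
import random


def toy_system(params):
    """Hypothetical stand-in for an expensive simulation or experiment."""
    x, y = params
    return (x * x - y, x + y * y)  # 2-D behaviour descriptor


def squared_distance(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def goal_exploration(n_iterations=200, n_bootstrap=20, sigma=0.1, seed=0):
    """Goal-directed exploration of toy_system; returns all (params, behaviour) pairs."""
    rng = random.Random(seed)
    archive = []
    for i in range(n_iterations):
        if i < n_bootstrap:
            # Bootstrap phase: random parameters, no goal yet.
            params = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        else:
            # Self-generated goal: a target point in behaviour space.
            goal = (rng.uniform(-1, 2), rng.uniform(-1, 2))
            # Reuse the parameters whose past outcome was closest to the goal.
            best_params, _ = min(
                archive, key=lambda entry: squared_distance(entry[1], goal)
            )
            # Perturb them to try to get closer.
            params = tuple(p + rng.gauss(0, sigma) for p in best_params)
        archive.append((params, toy_system(params)))
    return archive
```

Calling `goal_exploration()` returns the full archive of discoveries. In this sketch a scientist could steer the search by changing how `goal` is sampled, which is one simple reading of "driveable".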

Beyond core scientific​​​‌ questions across disciplines, this‌ project addresses two key‌​‌ societal challenges: 1) How​​ can we build AI​​​‌ systems that serve humans‌ and human societies in‌​‌ their diversity, helping their​​ curiosity and cultures to​​​‌ bloom? 2) How can‌ we provide educational opportunities‌​‌ for all children, and​​ adults across the lifespan,​​​‌ in a world with‌ many challenges, to become‌​‌ intrinsically motivated learners, critical​​​‌ thinkers, autotelic explorers?

3​ Research program

3.1 Background:​‌

Around the mid-20th century, psychologists started studying the hypothesis that humans, and some other animals, are endowed with mechanisms of intrinsic motivation, also called “curiosity” in everyday language, leading them to spontaneously explore novel activities for their own sake. Such curiosity-driven exploration processes were hypothesised to play important roles in learning, both in cognitive and educational sciences. However, until the start of the 21st century, research on the underlying mechanisms was still very scarce. This also explains why such mechanisms were overlooked in machine learning and robotics.

In the early 2000s, several labs around the world began studying these mechanisms by proposing various computational theories and hypotheses. Among these groups, Pierre-Yves Oudeyer and his colleagues, first at Sony CSL Paris and then at Inria Bordeaux, proposed several theoretical ideas and techniques to build some of the foundations of a new emerging field studying curiosity at the crossroads of AI, machine learning, cognitive sciences, psychology and neuroscience. In particular, one major contribution has been the development of the Learning Progress Hypothesis (LPH), proposing that human brains are intrinsically motivated to explore activities with high learning progress, leveraging meta-cognitive processes and leading to the self-organisation of efficient learning curricula 113, 143. A second major contribution has been the development of a theoretical framework to account for autotelic learning, a form of learning where individuals learn to represent, sample and pursue their own goals 141, 89.
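As an illustration of the idea behind the Learning Progress Hypothesis, the sketch below selects activities in proportion to the absolute change in recent performance, so that neither mastered nor unlearnable activities attract much practice. It is a toy model under stated assumptions, not the team's algorithm: the class name, the window-based progress estimate, the epsilon-random exploration and the three simulated activities are all hypothetical choices made for this example.

```python
import random
from collections import deque


class LearningProgressSelector:
    """Chooses among activities in proportion to absolute learning progress."""

    def __init__(self, n_activities, window=10, epsilon=0.2, seed=0):
        self.window = window
        self.epsilon = epsilon  # share of purely random choices
        self.rng = random.Random(seed)
        # Keep the last 2 * window scores observed for each activity.
        self.scores = [deque(maxlen=2 * window) for _ in range(n_activities)]

    def learning_progress(self, activity):
        history = list(self.scores[activity])
        if len(history) < 2 * self.window:
            return 0.0  # not enough evidence yet
        older = sum(history[: self.window]) / self.window
        recent = sum(history[self.window:]) / self.window
        return abs(recent - older)

    def choose(self):
        n = len(self.scores)
        progress = [self.learning_progress(a) for a in range(n)]
        if sum(progress) == 0 or self.rng.random() < self.epsilon:
            return self.rng.randrange(n)
        return self.rng.choices(range(n), weights=progress)[0]

    def update(self, activity, score):
        self.scores[activity].append(score)


def simulate(steps=500):
    """Toy learner: activity 0 is mastered, 1 is unlearnable, 2 improves with practice."""
    selector = LearningProgressSelector(n_activities=3)
    practice = [0, 0, 0]
    for _ in range(steps):
        a = selector.choose()
        practice[a] += 1
        if a == 0:
            score = 1.0
        elif a == 1:
            score = 0.0
        else:
            score = min(1.0, 0.002 * practice[2])
        selector.update(a, score)
    return practice
```

`simulate()` returns how often each activity was practised. In this toy setting most practice should go to the activity on which performance is still changing, which is the self-organised curriculum effect the hypothesis describes.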

Based on several proof-of-concept studies of these computational theories 142, the Flowers team was founded in 2011 by Pierre-Yves Oudeyer (joining Inria) and David Filliat (Ensta ParisTech), with a research program aiming at scaling up these theories along two main dimensions: 1) showing how the LPH could account for key properties of sensorimotor development in human infants; 2) showing how it was possible to develop curiosity-driven autotelic learning algorithms that would enable high-dimensional real-world robots to acquire complex sensorimotor skills in a human-like way. Several major results were achieved in the 2011-2016 period along these lines.

In the 2017-25 period, we made a strategic pivot in both science and applications: while keeping curiosity-driven autotelic learning in humans and machines as our core research activity, we 1) started projects testing our theoretical predictions in human psychology experiments, and articulated links between curiosity and metacognition; 2) integrated modern Deep RL techniques with autotelic algorithms, and shifted from the developmental robotics community to the machine learning community as the target of our contributions to the design of more open, flexible and robust learning machines; 3) shifted from sensorimotor autotelic learning to language-based abstract yet grounded autotelic learning, and built synergetic bridges with recent advances in generative AI; 4) scaled up our research in educational technology by taking a translational approach and developing industrial collaborations, with actions to support public policies; 5) started the new application domain of automated scientific discovery. These constitute the pillars of our current research program, structured as follows:

3.2 Understanding Autotelic​​​‌ Learning in Humans

3.2.1 Curiosity, meta-cognition and agency across the lifespan.

The Learning Progress hypothesis, as well as other theories of curiosity-driven learning, all assume meta-cognitive competencies (e.g. the ability to evaluate one’s own uncertainty, knowledge gaps or learning progress) as well as forms of agency. However, experimental studies of human curiosity have so far mostly overlooked the influence of meta-cognition and agency, and have rarely even measured them together with various dimensions of curiosity 132. Another major limitation of current models and experimental studies of human curiosity is that they have not studied how curiosity develops across the lifespan. In fact, the scientific community knows very little about how various forms of curiosity change across childhood, adolescence, and up to ageing populations.

We will​​ aim to address some​​​‌ of these limitations by‌ collaborating with various international‌​‌ groups, including M. Gruber​​ (Univ. Cardiff) and Y.​​​‌ Fandakova (Max Planck Institute‌ for Human Development) with‌​‌ whom we just submitted​​ a major ANR/DFG/ESRC project​​​‌ on this topic. In‌ particular, we propose an‌​‌ interdisciplinary approach to make​​ new breakthroughs in understanding​​​‌ how metacognition contributes to‌ the development of curiosity-based‌​‌ learning, and set the​​ stage for educational interventions​​​‌ that could help children‌ develop their curiosity. Given‌​‌ the links between curiosity​​ and metacognition, and the​​​‌ fact that metacognition continues‌ to improve across childhood‌​‌ and adolescence, we formulate​​ the hypothesis that the​​​‌ efficiency of curiosity-based learning,‌ i.e. the ability to‌​‌ inquire about and prioritise​​ learning of information associated​​​‌ with high curiosity, improves‌ across child and adolescent‌​‌ development.

3.2.2 Experimental paradigms​​ for studying autotelic learning​​​‌ in humans.

Another limitation of existing experimental studies of human curiosity, including the ones mentioned above, is that most of them have so far focused on how humans prefer exploring one of several pre-existing stimuli or learning activities 163. However, as shown in our theoretical and AI work described above, and as argued in complementary work by Laura Schulz and Junyi Chu 84, exploration of self-generated goals, including arbitrary goals or games, may be key in accounting for human development, and further in accounting for human innovation and cultural evolution 85. Only very few exploratory experimental protocols have started to be investigated in the literature 93: we aim to further develop this form of experimental protocol, informed by predictions made by our theoretical models, in collaboration with researchers such as G. Molinaro and A. Collins (Univ. Berkeley, both on a six-month research visit at Inria Flowers and Mnemosyne in 2024), J. Chu (Harvard Univ., US), L. Rat-Fischer (Univ. Nanterre) and A. Ruggeri (TU Munich).

3.2.3 Links​​ between curiosity and creativity​​​‌ for autotelic learning in‌ children:

The ability to imagine abstract and new goals is essential for creative discovery and open-ended learning throughout life. Children achieve this by using the compositionality of language as a tool to imagine situations they have never experienced before, targeting them as goals during play 147, 168. Echoing the IMAGINE architecture 88, an intrinsically motivated deep reinforcement learning architecture modelling compositional imagination (the creation of new linguistic associations for new goals), we aim to investigate the links between curiosity and creativity in humans, focusing on the metacognitive role of language in guiding autonomous learning behaviours. Although the nature of the links between curiosity and creativity is currently not well defined, a recent meta-analysis shows that higher levels of curiosity are significantly associated with higher levels of creativity 156. Divergent thinking mechanisms are said to be the cognitive resource common to both skills 104, 117, and some authors even identify curiosity as a facilitator, a trigger for creativity 106: high curiosity states induce better ideation and greater idea associations conducive to problem solving. Also, both creativity and curiosity are governed by metacognitive processes of self-regulation of learning 151, enabling the identification of information gaps, problem situations or uncertainties, the generation of ideas and paths to resolution, and the monitoring and evaluation of the value of ideas as creative output or as gains in knowledge 109, 112.
We aim to investigate developmental differences in curiosity-based learning and problem-solving tasks while studying their relationships and their dependence on intrapersonal factors (especially metacognitive skills and personality dimensions such as epistemic curiosity, creativity or intellectual humility traits) in late childhood (from 6 to 11 years old). To carry out the experiments needed to address these topics, we will leverage an educational Léa-Ifé collaboration network established with 10 primary schools around Bordeaux. As a whole, in this part of the project, we aim to demonstrate that curiosity as a process to seek knowledge in the face of self-generated goals of knowledge gaps, or as a metacognitive feeling 103, leads to better initiation of the creative process.

3.2.4​​​‌ Curiosity to learn about​ others and social interaction:​‌

Social curiosity is defined as the desire to acquire knowledge about others in society, encompassing an interest in their emotions, thoughts, and behaviours. This type of curiosity can be divided into two forms 146: 1) empathetic curiosity (the desire to acquire knowledge about others), and 2) relational curiosity (the desire to interact with others). Like other types of curiosity, social curiosity motivates people to engage in exploratory behaviours directed toward the social world, seeking novel information about how people think, behave, and feel. Three functions of social curiosity have been proposed 110: 1) acquiring information useful for learning and development, 2) establishing interpersonal relationships and increasing a sense of social belonging, and 3) controlling the social world by making it more predictable and manageable. Thus, social curiosity enhances social functioning and has been linked to improved social behaviour adaptation, the ability to establish and maintain social relationships, and better social judgement abilities 110. Recently, another distinction has been proposed in social curiosity 114: 1) overt social curiosity, an explicit interest in understanding other people, which motivates direct communication with others; 2) covert social curiosity, a “hidden” interest that motivates more indirect and furtive behaviours to understand others, such as discreetly observing people, listening to others’ conversations, and reading tabloids and human-interest stories.
Covert curiosity is often associated with negative outcomes like gossiping or spying 114, but it can also drive the understanding of the social world through observation and ultimately motivate interactions with others 105. On the other hand, overt social curiosity has been linked with open-mindedness, extraversion, and sociability 114, and has been associated with better job performance 105.

3.2.5 Autotelic‌​‌ game invention and cultural​​ transmission.

Leveraging the theoretical ideas on the interaction between autotelic learning and cultural evolution as described in the previous section, we also aim to study these interactions experimentally in chains of humans incentivized to transmit to each other games or artefacts of their own intrinsically motivated invention (either physical or video games, e.g. using experimental setups like 93). We aim to design new experimental protocols and run them across various age ranges in European populations as well as in non-Western populations, leveraging collaborations with Maxime Derex at IAST, Toulouse, Sheina Lew-Levy at Durham University, and Sarah Pope-Caldwell at Georgia State University.

3.3‌​‌ Building Curiosity-Driven Autotelic and​​ Aligned AI

3.3.1 Language-Augmented​​​‌ Autotelic Agents with Foundational‌ Models

We will develop‌​‌ architectures where LLMs function​​ as cognitive tools for​​​‌ autotelic RL agents across‌ five dimensions: (1) LLM-based‌​‌ agents with environmental alignment—extending​​ our work on grounding​​​‌ LLMs through online RL‌ 79 where LLMs generate‌​‌ goals, evaluate achievement, relabel​​ experiences, and provide natural​​​‌ language interfaces. We will‌ extend goal generation to‌​‌ creative, time-extended, and learning-oriented​​ goals including self-generated causal​​​‌ questions and hypotheses 101‌, while correcting hallucinations‌​‌ through incremental LLM updates​​ via environment interaction. (2)​​​‌ Multimodal grounding and social‌ environments—extending our SocialAI School‌​‌ framework 118 to incorporate​​ theory of mind, joint​​​‌ intentionality, and social norms,‌ investigating whether social curiosity‌​‌ can drive efficient acquisition​​ of complex social skills.​​​‌ (3) Real-time human-in-the-loop learning—enabling‌ agents to interpret instructions,‌​‌ respond to feedback, explain​​ exploration processes, and adapt​​​‌ to user preferences for‌ education and discovery applications.‌​‌ (4) Learning to use​​ cognitive tools—agents will learn​​​‌ when to invoke APIs,‌ generate/execute code, query knowledge‌​‌ bases, or request human​​ assistance, including chain-of-thought and​​​‌ self-reflection mechanisms. (5) Metacognitive‌ curriculum learning with coordinated‌​‌ interestingness measures—leveraging our MAGELLAN​​ architecture 47 which enables​​​‌ LLM agents to learn‌ metacognitive predictions of their‌​‌ own competence and learning​​ progress across large language-defined​​​‌ goal spaces. 
By capturing‌ semantic relationships between goals,‌​‌ MAGELLAN enables sample-efficient progress​​ estimation and dynamic adaptation​​​‌ to evolving goal spaces.‌ We will extend this‌​‌ to develop meta-diversity search​​ algorithms 97 leveraging LLMs​​​‌ to generate novel conceptual‌ dimensions, enabling exploration across‌​‌ objective (learning progress, novelty)​​​‌ and subjective, culturally-contextualized criteria​ 158. Long-term objectives​‌ will include studying how​​ agents pursue goals across​​​‌ extended timescales using cultural​ artifacts for long-term planning.​‌
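To make the benefit of semantic relationships between goals concrete, here is a deliberately simple sketch in which competence on an untried goal is estimated from outcomes observed on similar goals. It is not the MAGELLAN architecture: word overlap (Jaccard similarity) stands in for the representations an LLM would provide, and the function names, the `prior` value and the example goals are assumptions introduced for this illustration.

```python
def tokens(goal):
    """Lower-cased word set of a language-defined goal."""
    return set(goal.lower().split())


def similarity(goal_a, goal_b):
    """Jaccard word overlap, a crude stand-in for semantic similarity."""
    a, b = tokens(goal_a), tokens(goal_b)
    if not (a | b):
        return 0.0
    return len(a & b) / len(a | b)


def estimate_competence(goal, outcomes, prior=0.5):
    """Similarity-weighted success rate over goals that were actually tried.

    outcomes maps a goal description to a list of 0/1 results.
    """
    weighted_sum, weight_total = 0.0, 0.0
    for tried_goal, results in outcomes.items():
        weight = similarity(goal, tried_goal)
        if weight > 0 and results:
            weighted_sum += weight * sum(results) / len(results)
            weight_total += weight
    # Fall back to the prior when nothing similar has been tried.
    return weighted_sum / weight_total if weight_total > 0 else prior


outcomes = {
    "grasp the red cube": [1, 1, 1, 0],
    "open the blue door": [0, 0],
}
grasp_estimate = estimate_competence("grasp the green cube", outcomes)
door_estimate = estimate_competence("open the green door", outcomes)
```

Neither of the two queried goals was ever attempted, yet the estimate for the grasping goal comes out higher than for the door goal because it borrows evidence from its nearest tried neighbour. Tracking how such estimates change over time would give a learning progress signal without evaluating every goal directly.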

3.3.2 Program Synthesis for​​ Abstract and Verifiable Intelligence​​​‌

We will explore autotelic​ learning in formal language​‌ spaces where goals and​​ policies are represented as​​​‌ programs. Our ACES architecture​ demonstrates autotelic LLMs self-improving​‌ coding skills by iteratively​​ generating diverse problems and​​​‌ solutions using code interpreters​ 150. This addresses​‌ three limitations: environments are​​ not truly open-ended (code​​​‌ is), LLMs lack grounding​ (interpreters provide it), and​‌ code LLMs struggle beyond​​ training distributions (autotelic learning​​​‌ enables self-improvement). We will​ train small models with​‌ advanced coding capabilities and​​ extend to mathematical problem​​​‌ invention and theorem proving.​ Within Inria LLM4Code, we​‌ will develop autotelic LLMs​​ interacting with proof assistants​​​‌ like Coq and Lean​ 157. This approach​‌ provides compact interpretable representations,​​ formal verification, and compositional​​​‌ generalization. Progress on benchmarks​ like ARC 83 will​‌ validate these methods.
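The grounding role of the interpreter can be sketched as follows: a generated problem and a generated solution are only kept if executing them confirms that the solution satisfies the problem. The representation is an assumption chosen for this example, not a description of ACES itself: a problem is Python source defining `f(answer) -> bool`, a solution is source defining `g()`, and `verify` and `keep_verified` are hypothetical helper names. A real system would run candidates in a sandbox with time limits, since `exec` on untrusted code is unsafe and this sketch does not stop infinite loops.

```python
def verify(problem_src, solution_src):
    """Return True only if running the code shows that g() satisfies f."""
    namespace = {}
    try:
        exec(problem_src, namespace)   # defines f(answer) -> bool
        exec(solution_src, namespace)  # defines g() -> answer
        return namespace["f"](namespace["g"]()) is True
    except Exception:
        return False  # crashes and missing definitions count as failures


def keep_verified(candidates):
    """Filter (problem, solution) pairs down to those the interpreter confirms."""
    return [(p, s) for p, s in candidates if verify(p, s)]


PROBLEM = "def f(x):\n    return x > 0 and x * x == 144\n"
CANDIDATES = [
    (PROBLEM, "def g():\n    return 12\n"),     # correct
    (PROBLEM, "def g():\n    return -12\n"),    # wrong sign
    (PROBLEM, "def g():\n    return 1 / 0\n"),  # raises an exception
]
verified = keep_verified(CANDIDATES)
```

Only the first candidate survives the filter. Verified pairs of this kind are what a self-improvement loop could reuse as training data, because their correctness does not depend on the model's own judgement.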

3.3.3​​ Curiosity in Cultural Evolution,​​​‌ Collective Intelligence, and AI​ Science Teams

Understanding how​‌ groups coordinate curiosity-driven exploration​​ and self-organize collective intelligence​​​‌ represents both a fundamental​ scientific question and a​‌ path toward transformative applications.​​ This research direction addresses​​​‌ several interconnected challenges. First,​ how individual curiosity combines​‌ with social transmission and​​ collective innovation 111,​​​‌ 95 is still poorly​ understood and modeled. Second,​‌ as generative AI increasingly​​ participates in human cultural​​​‌ production, understanding cultural evolution​ in hybrid human-AI groups​‌ becomes essential for anticipating​​ societal impacts. Third, many​​​‌ scientific and creative challenges​ require coordinated teams leveraging​‌ complementary expertise and perspectives—motivating​​ our vision of autotelic​​​‌ AI science teams collaborating​ with human researchers.

Near-term​‌ work will investigate coordination​​ when self-generated goals conflict​​​‌ in shared environments. Some​ goals require collaboration with​‌ agents possessing complementary skills—agents​​ must negotiate joint goals​​​‌ serving individual curiosity. We​ will study how network​‌ topology influences innovation dynamics​​ 137, how agents​​​‌ develop communication protocols 75​, and whether groups​‌ display curiosity at collective​​ levels—pursuing structured exploration maximizing​​​‌ diversity of learned goals​ through simple individual-level mechanisms.​‌ The increasing role of​​ generative AI in cultural​​​‌ production necessitates understanding these​ dynamics in hybrid settings—the​‌ emerging field of "Machine​​ Culture" 77. We​​​‌ will systematically investigate how​ interaction protocols, social structures,​‌ and model capabilities shape​​ cultural dynamics in LLM​​​‌ populations and mixed human-AI​ groups (collaboration with Derex,​‌ IAST).

3.4 Applications in​​ Education and Scientific Discovery​​​‌

3.4.1 Training Curiosity and​ Metacognition Across the Lifespan​‌

Addressing 21st century educational​​ challenges—inclusive education, cross-disciplinary skills​​​‌ (attention, curiosity, learning to​ learn), and digital transformation—requires​‌ technologies fostering curiosity and​​ metacognition. Our ZPDES algorithm​​​‌ personalizes curricula by maximizing​ learning progress 87.​‌ RCTs with >1,000 children​​ demonstrated enhanced learning efficiency​​​‌ and motivation versus expert-designed​ curricula. We will extend​‌ ZPDES to new domains​​ (attention training, language learning)​​​‌ and populations: aging adults,​ neurodiverse learners, professional contexts​‌ (sports, gaming). To address​​ ZPDES's requirement for expert-formatted​​​‌ content, we will leverage​ generative AI to automate​‌ exercise generation from textbooks,​​ training smaller LLMs for​​ lightweight systems avoiding foreign-hosted​​​‌ dependencies. We will scale‌ up metacognitive skills and‌​‌ curious question training 69​​, studying transfer to​​​‌ creativity and including pioneering‌ work using GPT-3 conversational‌​‌ agents in real classrooms​​ 70.
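The idea of personalizing a curriculum by maximizing learning progress can be illustrated with a small Python sketch. It is not the ZPDES algorithm itself, which also relies on expert-defined activity structures; the class name, the window-based progress estimate and the epsilon exploration rate below are all assumptions made for illustration.

```python
import random
from collections import deque


class LearningProgressSelector:
    """Toy activity selector that favours activities with high learning progress."""

    def __init__(self, activities, window=10, epsilon=0.2, seed=0):
        # Keep only the most recent `window` outcomes per activity.
        self.history = {a: deque(maxlen=window) for a in activities}
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def learning_progress(self, activity):
        """Absolute change in success rate between older and recent outcomes."""
        outcomes = list(self.history[activity])
        if len(outcomes) < 2:
            return 0.0
        mid = len(outcomes) // 2
        older, recent = outcomes[:mid], outcomes[mid:]
        return abs(sum(recent) / len(recent) - sum(older) / len(older))

    def choose(self):
        """Sample an activity proportionally to its estimated learning progress."""
        activities = list(self.history)
        progress = [self.learning_progress(a) for a in activities]
        # Explore uniformly with probability epsilon, or when nothing is known yet.
        if self.rng.random() < self.epsilon or sum(progress) == 0:
            return self.rng.choice(activities)
        return self.rng.choices(activities, weights=progress, k=1)[0]

    def update(self, activity, success):
        """Record whether the learner succeeded on the proposed activity."""
        self.history[activity].append(1.0 if success else 0.0)


if __name__ == "__main__":
    selector = LearningProgressSelector(["fractions", "decimals"], window=4)
    for outcome in [0, 0, 1, 1]:
        selector.update("fractions", outcome)   # learner is improving
    for outcome in [1, 1, 1, 1]:
        selector.update("decimals", outcome)    # already mastered
    print(selector.learning_progress("fractions"),
          selector.learning_progress("decimals"))
    print(selector.choose())
```

In this sketch an activity that is already mastered, or still far too hard, shows a flat success rate and is rarely proposed, while an activity on which the learner is currently improving is proposed more often.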

A distinctive​​​‌ objective involves metacognitive empowerment:‌ developing interventions helping children‌​‌ understand their own learning​​ progress to self-generate curricula​​​‌ independently—addressing limited technology contexts‌ while fostering autonomy. Long-term‌​‌ work will include teacher​​ training programs embedding curiosity-fostering​​​‌ practices (Peterson, 2020) and‌ unplugged activity versions. Partnerships‌​‌ with educational institutions (Académie​​ de Bordeaux), industry (EvidenceB,​​​‌ Ubisoft), and NGOs (France‌ IOI) will enable deployment‌​‌ and policy influence.

3.4.2​​ Assisted Scientific Discovery with​​​‌ Autotelic Exploration

Scientists studying‌ complex systems face challenges‌​‌ mapping behavioral spaces when​​ lacking models and representations​​​‌ with scarce experimental resources.‌ Autotelic algorithms offer sample-efficient‌​‌ diverse behavior discovery while​​ remaining steerable through natural​​​‌ language. Proof-of-concept work efficiently‌ maps spaces in cellular‌​‌ automata 98 and gene​​ regulatory networks (99​​​‌, with Levin, Harvard),‌ independently adopted by physics‌​‌ researchers (U. Washington). We​​ will maximize diversity through​​​‌ meta-diversity search and leverage‌ language/multimodal models for abstract‌​‌ guidance.

Near-term work will​​ leverage artificial life as​​​‌ testbeds, studying origins of‌ autopoietic systems and self-organizing‌​‌ evolutionary processes in cellular​​ automata 34, 149​​​‌ (2023 Best Paper ALife).‌ Long-term objectives will transition‌​‌ to real-world systems through​​ collaborations: Levin (synthetic biology),​​​‌ Murugan (soft condensed matter,‌ U. Chicago), Aymonier (chemistry,‌​‌ ICMCB Bordeaux), with applications​​ to power networks, neuromuscular​​​‌ models, and artistic domains.‌ We will explore autotelic‌​‌ algorithms for mathematical problem/proof​​ exploration within LLM4Code. Success​​​‌ will establish autotelic exploration‌ as methodology for materials‌​‌ science, systems biology, and​​ mathematical discovery, with industrial​​​‌ translation potential (e.g. Solvay/Syensqo).‌

4 Application domains

Neuroscience, Developmental Psychology and Cognitive Sciences Being primarily experts in curiosity and its links with open-ended learning, our aim has been to build and grow internationally an integrated science of curiosity. By leveraging and integrating concepts and techniques often used in AI, psychology and education, we aim to reinforce our existing contributions in this direction, ranging from building theories and experiments that add to the corpus of scientific knowledge on curiosity, to leading the organisation of international events dedicated to this integrated science. One example is our co-leading of the organisation of a Gordon Research Conference series entitled “The New Science of Curiosity”. Another is a European ORA project on the cognitive science of curiosity and metacognition (with M. Gruber and Y. Fandakova). Other examples include the study of the role of intrinsic motivation in the adoption of technologies fostering autonomy in ageing populations, with a view to assessing its value as a protective factor against cognitive aging. This includes the CuriousTECH associate team with M. Fernendes from the Cognitive Neuroscience Lab of the University of Waterloo, the InnovCare project (with S. Lechevalier) within the PPR Autonomie-France 2030 (and with Fondation France-Japan of EHESS), and the project VBHI - France 2030 (IHU, S. Debette), with F. Lotte and F. Wagner from Inria.

Development​​​‌ and open-endedness in generative​ AI There have been​‌ revolutionary advances in AI​​ in the last few​​​‌ years, especially around generative​ systems such as multi-modal​‌ foundational models. However, as​​ described above, these systems​​​‌ are still strongly limited​ in several key dimensions:​‌ they are not pro-active​​ agents interacting with external​​​‌ environments, they lack grounding,​ meta-cognition and curiosity. One​‌ of our goals is​​ to make fundamental scientific​​​‌ and technological contributions to​ adapt and extend current​‌ generative AI systems by​​ integrating forms of curiosity,​​​‌ meta-cognition and grounding, for​ which we recently made​‌ proofs of concepts, and​​ vice-versa take advantage of​​​‌ powerful capabilities of foundational​ models to build new​‌ kinds of curiosity-driven learning​​ systems capable of creative​​​‌ and abstract exploration learning​ and discovery.

Machine culture Beyond technological advances, generative AI is also starting to have a major influence on human cultural evolution. Generative AI systems are now massively used as intermediation platforms between individuals and existing corpora of knowledge and culture, conveying multiple forms of biased cultural perspectives that they can amplify. This phenomenon has recently grown in scale as social networks are pervaded by bots powered by generative AI systems, playing the roles of humans with particular opinions or backgrounds, and increasingly interacting directly with each other, beyond interaction with humans. While generative AI offers unique potential in enabling humans to make discoveries and to know and understand each other's cultures, these properties have also been leveraged by diverse organisations to influence what populations think and do in unfair and dangerous ways. Even though this poses major societal issues, this evolution has been so rapid that basic scientific understanding of cultural evolution in hybrid human-machine groups is strongly lacking. Thus, we believe that the parts of our project which aim at modelling cultural evolution in groups of generative AI agents, or hybrid groups, as well as its links with properties of curiosity-driven learning at the level of individuals, have the potential to make very useful contributions to these high-stakes issues.

Translational educational technologies that foster curiosity-driven learning and critical thinking We live in a world that is evolving fast: global factors such as climate change and geopolitical processes make the context children live in more fragile. New technologies, such as generative AI, are profoundly impacting economic dynamics, democracy and cultural evolution. Yet, in most educational contexts, including in Europe, what is taught in classrooms is very similar to what was taught 50 years ago. And even for so-called “fundamental knowledge”, studies such as PISA show a worrying decrease of skills and motivation in children. As mentioned in a recent report from OECD 166, we believe it is essential to train children to become autonomous lifelong learners, by fostering and training their curiosity and their critical thinking, their ability to search for new information by themselves, and to question the validity of the information they collect, as well as their own knowledge and opinions. Thus, our research program, which aims to train curiosity and the associated metacognitive skills that underlie the critical mindset, has the potential to contribute to this goal. We aim to leverage our fundamental research in translational projects where we will work directly with major educational stakeholders from the start (e.g. students, teachers, parents, educational institutions like individual schools and the Académie de Bordeaux, edTech companies like EvidenceB, and government, in particular the ministry of education) to build educational interventions that will be efficient, adapted to the needs and constraints of real-world educational contexts, and designed for large-scale adoption and use (a first step in this direction is the AdaptivMaths and MIA Seconde educational software, now deployed in all French primary schools and supported by the French ministry of education).

Generative AI and education: scientific understanding of stakes and opportunities in support of public policies One particular topic we focus on is the study of the opportunities and challenges of generative AI in education. While very recent (ChatGPT was introduced only 1.5 years ago), generative AI has already had a major impact on the educational world in the last few months. More than 50% of children in the 12-18 age range have already used generative AI systems for their homework, and this trend is quickly rising, including in Europe. Associated challenges include uses of generative AI by students that may harm their ability to learn and understand, and their motivation to put in effort and engage actively in these processes. It also profoundly affects the way teachers design homework, for which students are already massively using these tools. On the other hand, generative AI opens unique opportunities for rich personalised tutoring, ranging from obtaining tailored explanations and feedback to discussing and training in foreign languages. Such opportunities may be particularly magnified for countries where the educational system is underdeveloped 135. Key aspects of our research program are geared towards studying these opportunities and challenges, for example by running field studies in middle and high schools to understand how students currently (mis)understand and (mis)use generative AI tools. 
In complement, we continue working on outreach, especially developing educational tools to improve generative AI literacy among students, teachers and parents, for example by further developing and disseminating our pedagogical video series “ChatGPT explained in 5mn”, which has been integrated in various tools from DNE and in the European MOOC made for introducing teachers to AI (the AI4Teacher MOOC). Participating in popular science events, visiting middle and high schools and welcoming students in the lab, writing popular science books (such as “C'est pas moi, c'est l'IA”, published by Nathan), and participating in discussions on these topics in wide-audience media constitute another application axis. Given the high societal stakes associated with this line of work, we also aim to strongly develop our activities in informing and supporting public policies: a key vector for such public support is actively participating in interactions and discussions with public bodies that analyze current stakes and propose new actions and laws. In this vein, we recently supported Inria in writing notes on generative AI and its societal dimensions for the cabinet of E. Macron, and we participated in interviews conducted by senators preparing a report on AI and education. We presented the stakes associated with training curiosity and metacognition using AI technologies at the Conseil Scientifique de l'Education Nationale and at an annual scientific event organized by DNE (Direction du Numérique Educatif), and we were invited by BPI to participate in the evaluation and monitoring of education/edTech projects run by this institution. We are also working to develop collaborations between Inria and the UK AI safety institute, towards building a French institution similar to the UK one. 
This includes developing a collaboration with Chris Summerfield on field studies to assess the current state of use of generative AI in middle and high schools, in order to inform public policies on this topic. Lastly, our activities aim at sharing AI models and data that are fully open-source (open weights and open data) and trained on data associated with appropriate rights (here we are also collaborating with the Hugging Face company to distribute these open models and data on their platform). For example, we recently built a project with the EvidenceB company, in collaboration with Région Ile-de-France, to build an open model trained on data from free manuals, for which authors will be remunerated in an appropriate manner: this kind of model will enable wider and legally compliant access to AI models by the edTech ecosystem in France.

Automated discovery in science Machine learning algorithms integrating intrinsically-motivated goal exploration processes (IMGEPs) with flexible modular representation learning are very promising directions to help human scientists discover novel structures in complex dynamical systems, in fields ranging from biology to physics. The automated discovery project aims to boost the efficiency of these algorithms to empower discovery in science and engineering. These entail real-world applications with high societal stakes, such as helping scientists make new discoveries that may, for example, help build more sustainable materials, generate cleaner energy or save energy, find molecules with medical applications, design accessible and efficient educational tools, or design more sustainable forms of plant growing in agriculture. In many cases, the complexity of self-organising materials or biological systems poses significant scientific and engineering challenges for understanding, controlling and inventing. Following several of our recent proof-of-concept projects 96, 99, we aim to do translational research in this domain as well, enabling chemists, physicists and biologists, in both academia and industry, to efficiently use our tools for curiosity-driven exploration to help them make new discoveries. 
In particular, we are now exploring several new collaborations in these fields: with Solvay/Syensqo, we have started several discussions to develop collaborations on using autotelic exploration algorithms to efficiently explore and map the space of material design and properties, with the aim of helping scientists at Syensqo discover new materials with strong environmental and functional properties; with IRT Saint Exupery, we have an ongoing consortium collaboration around the project AIxIA, where we study the use of autotelic exploration algorithms to map the space of interference behaviours on embedded software and hardware.
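To make the notion of a goal exploration process more concrete, here is a minimal Python sketch of a commonly described random-goal variant: after a few random trials, the loop repeatedly samples a goal in behaviour space, reuses the parameters whose observed behaviour was closest to that goal, perturbs them, and records the new outcome. The function names, the toy system and the Gaussian mutation are assumptions for illustration; real uses would replace `system` with a simulator or experiment and typically learn the behaviour representation.

```python
import random


def imgep(system, sample_params, sample_goal, mutate,
          n_bootstrap=10, n_iter=90, seed=0):
    """Toy goal exploration loop returning a list of (params, behaviour) pairs."""
    rng = random.Random(seed)
    history = []

    # Bootstrap phase: try random parameters to seed the history.
    for _ in range(n_bootstrap):
        params = sample_params(rng)
        history.append((params, system(params)))

    # Goal-directed phase: aim at self-generated goals in behaviour space.
    for _ in range(n_iter):
        goal = sample_goal(rng)
        # Reuse the past parameters whose behaviour was nearest to the goal.
        nearest_params, _ = min(
            history,
            key=lambda pb: sum((b - g) ** 2 for b, g in zip(pb[1], goal)),
        )
        params = mutate(nearest_params, rng)
        history.append((params, system(params)))
    return history


def toy_system(params):
    """Hypothetical system mapping two parameters to a 2D behaviour."""
    x, y = params
    return (x ** 2, x + y)


if __name__ == "__main__":
    discovered = imgep(
        toy_system,
        sample_params=lambda rng: (rng.uniform(-1, 1), rng.uniform(-1, 1)),
        sample_goal=lambda rng: (rng.uniform(0, 1), rng.uniform(-2, 2)),
        mutate=lambda p, rng: tuple(v + rng.gauss(0, 0.1) for v in p),
    )
    print(len(discovered), "behaviours recorded")
```

Sampling goals in behaviour space rather than parameters directly is what pushes such a loop towards a diverse set of outcomes, since regions of behaviour space that are rarely reached still receive goals.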

Building self-organising AI from the ground up When using continuous cellular automata as a playground for designing and evaluating our algorithms for curiosity-driven automated discovery for the sciences, we are also making direct contributions to the domain of Artificial Life. In particular, we believe the tools and approach we are taking, especially exploring the self-organisation of sensorimotor agency and open-ended evolutionary processes, have the potential to make a significant impact in this domain. This was attested recently by our Best Paper award at the ALife 2023 conference (and also by wider impact, e.g. through more than 2 million views of the popular science videos of Sciences Etonnantes and EGO presenting, in part, our work on this topic). As we are aiming to study the self-organisation of basic forms of memory, learning, and even autotelic learning in such environments, this may also constitute a foundational approach to build AI systems from the ground up, possibly opening new possibilities in terms of robustness, adaptivity and generalisation.

5 Social and‌​‌ environmental responsibility

5.1 Footprint​​ of research activities

AI​​​‌ is a field of‌ research that currently requires‌​‌ a lot of computational​​ resources, which is a​​​‌ challenge as these resources‌ have an environmental cost.‌​‌ In the team we​​ try to address this​​​‌ challenge in two ways:‌

  • by working on developmental machine learning approaches that model how humans manage to learn open-ended and diverse repertoires of skills under severe limits of time, energy and compute: for example, curiosity-driven learning algorithms can be used to guide agents' exploration of their environment so that they learn a world model in a sample-efficient manner, i.e. by minimizing the number of runs and computations they need to perform in the environment;
  • by monitoring the number of CPU and GPU hours required to carry out our experiments. For instance, our work 9 used a total of 2.5 CPU years. More globally, our work uses large-scale computational resources, such as the Jean Zay supercomputer platform, on which we use several hundred thousand hours of GPU and CPU each year.

5.2 Impact of research‌​‌ results

Our research activities are organized along two fundamental research axes (models of human learning and algorithms for developmental machine learning) and one application research axis (involving multiple domains of application, see the Application Domains section). This entails different dimensions of potential societal impact:

  • Towards autonomous agents that can be shaped to human preferences and be explainable We work on reinforcement learning architectures where autonomous agents interact with a social partner to explore a large set of possible interactions and learn to master them, using language as a key communication medium. As a result, our work contributes to facilitating human intervention in the learning process of agents (e.g. digital assistants, video game characters, robots), which we believe is a key step towards more explainable and safer autonomous agents.
  • Reproducibility of research: By releasing the code of our research papers, we believe that we help efforts in reproducible science and allow the wider community to build upon and extend our work in the future. In that spirit, we also provide clear explanations of the statistical testing methods used when reporting results.
  • Digital transformation and competence challenges facing schools in the 21st century. We expect our findings to inform the broader societal challenges inherent to the School of the 21st Century, ranging from helping children (and their teachers) develop cross-domain learning skills such as curiosity and meta-cognition, to improving inclusivity in schools (learners with disabilities, especially cognitive disabilities), to promoting lifelong learning in older adults (successful aging), using cognitive-based research findings.
  • AI and personalized educational technologies to reduce inequalities due to neurodiversity The Flowers team develops AI technologies aiming to personalize sequences of educational activities in digital educational apps: this entails the central challenge of designing systems which can have equitable impact over a diversity of students and reduce inequalities in academic achievement. Using models of curiosity-driven learning to design AI algorithms for such personalization, we have been working to enable them to be positively and equitably impactful across several dimensions of diversity: for young learners or for aging populations; for learners with low initial levels as well as for learners with high initial levels; for "normally" developing children and for children with developmental disorders; and for learners of different socio-cultural backgrounds (e.g. we could show in the KidLearn project that the system is equally impactful along these various kinds of diversity).
  • Health: Bio-printing The Flowers team is studying the use of curiosity-driven exploration algorithms in the domain of automated discovery, enabling scientists in physics/chemistry/biology to efficiently explore and build maps of the possible structures of various complex systems. One particular domain of application we are studying is bio-printing, where a challenge consists in exploring and understanding the space of morphogenetic structures self-organized by bio-printed cell populations. This could facilitate the design and bio-printing of personalized skins or organoids for people who need transplants, and thus could have a major impact on their health.
  • Tools​‌ for human creativity and​​ the arts Curiosity-driven exploration​​​‌ algorithms could also in​ principle be used as​‌ tools to help human​​ users in creative activities​​​‌ ranging from writing stories​ to painting or musical​‌ creation, which are domains​​ we aim to consider​​​‌ in the future, and​ thus this constitutes another​‌ societal and cultural domain​​ where our research could​​ have impact.
  • Education to AI As artificial intelligence takes a greater role in human society, it is of foremost importance to empower individuals with an understanding of these technologies. For this purpose, the Flowers lab has been actively involved in educational and popularization activities, in particular by designing educational robotics kits that form a motivating and tangible context to understand basic concepts in AI: these include the Inirobot kit (used by >30k primary school students in France) and the Poppy Education kit, now supported by the Poppy Station educational consortium.
  • Health: optimization of intervention strategies during pandemic events Modelling the dynamics of epidemics helps in proposing control strategies based on pharmaceutical and non-pharmaceutical interventions (contact limitation, lockdown, vaccination, etc.). Hand-designing such strategies is not trivial because of the number of possible interventions and the difficulty of predicting long-term effects. This task can be cast as an optimization problem where state-of-the-art machine learning algorithms, such as deep reinforcement learning, might bring significant value. However, the specificity of each domain (epidemic modelling or solving optimization problems) requires strong collaborations between researchers from different fields of expertise. Due to its fundamentally multi-objective nature, the problem of optimizing intervention strategies can benefit from the goal-conditioned reinforcement learning algorithms we develop at Flowers. In this context, we have developed EpidemiOptim, a Python toolbox that facilitates collaborations between researchers in epidemiology and optimization.

6 Highlights of the​​​‌ year

  • Renewal of the‌ team: After a decade‌​‌ of research and applications,​​ the team was renewed​​​‌ and is now named‌ the Flowers AI &‌​‌ CogSci Lab. This new​​ name highlights our activities​​​‌ at the cross-roads of‌ AI and cognitive sciences,‌​‌ studying curiosity and its​​ roles in open-ended learning​​​‌ in humans and machines,‌ from individuals to collectives.‌​‌ Our new detailed research​​ program is available here​​​‌. The team is‌ associated with both Inria‌​‌ and the University of​​ Bordeaux, France.
  • Understanding human curiosity and metacognition. We started a new European project in collaboration with the cognitive neuroscience labs of M. Gruber (Univ. Cardiff) and Y. Fandakova (University of Trier), aiming to study the joint development of curiosity and metacognition in adolescents through a set of behavioural and neuro-imaging studies. This project also aims to apply the resulting insights in educational technologies. We collaborated with Alexandr Ten, Michiko Sasaki and Kou Murayama (Univ. Tuebingen) in producing a theoretical framework that integrates multiple theories of curiosity developed across the literature and relates them to the well-known "Curious U" effect: this framework was published in the Open Mind journal 42. Collaborating with A. Tricot (Univ. Montpellier), we developed a theoretical perspective to study the links between intrinsic motivation and cognitive load in the context of extended-reality educational interventions 39, and published an associated study about the use of virtual reality for optimizing cognitive load and intrinsic motivation in educational technologies 38. Finally, we collaborated with M. Derex (IAST, Toulouse), as well as with Sheina Lew-Levy (Durham University) and Sarah Pope-Caldwell (Georgia State University), in the design and implementation of a study of cross-cultural similarities and differences in curiosity-driven exploration, conducted in Congo with Bayaka and Bandongo populations.
  • Autotelic curiosity and open-ended learning in agentic generative AI. We continued building the foundations of a new generation of genAI systems that are open-ended, curious, autotelic, grounded and continuously self-improving. To do so, we leveraged GLAM, an approach we designed 3 years ago to turn LLMs into agents that learn to solve goals in interactive environments through online RL (not produce texts that humans like, but achieve practical goals!), as the basis for building curious agents that sample their own goals. We designed MAGELLAN (published at ICML 2025), a method enabling genAI agents to navigate very large spaces of goals, where millions of them may be either too easy or too difficult 47. MAGELLAN makes this possible by leveraging the learning progress hypothesis, which we developed to account for human curiosity-driven learning: goals that are sampled in priority are those with high expected learning progress. Achieving this requires advanced metacognitive skills, which LLMs lacked so far: MAGELLAN learns these metacognitive skills, making it possible to predict learning progress on goals that were never sampled, using semantic information in embedding spaces. Curious LLM agents can also act as artificial scientists that explore the environment to hypothesize, experiment, test, confirm or revise abstract rules to build human-readable world models. First steps in this direction were made in WorldLLM 59. Imagining new abstract goals that maximize learning progress is another challenge for autotelic genAI systems. One approach is to formulate them directly as code, as in the AutotelicLLM agent (40), enabling open-ended exploration in Crafter (a 2D Minecraft). These projects involved collaborations with S. Aissi, O. Sigaud, L. Soulier, N. Tome (Sorbonne Université), S. Lamprier (Univ. Angers), T. Wolf (Hugging Face) and G. Pourcel (Univ. Amsterdam).
  • Autotelic Generative AI for Self-Improving Program Synthesis and ARC-Prize. In the context of the LLM4Code Inria challenge, the team started collaborating on projects at the intersection of generative AI, program synthesis and AI-assisted discovery in mathematics. In particular, we developed collaborations with N. Fijalkow (Labri, CNRS), X. Hinault (Mnemosyne), G. Baudart (PiCube). This year we continued leveraging the ACES method, which enables large language models to self-generate diverse and challenging programming puzzles (using autotelic exploration), to transpose and adapt it to the domain of mathematics. In the domain of program synthesis, we developed a new approach, called SOAR, enabling continuous self-improvement of LLMs as operators of evolutionary algorithms. This new method, published at ICML 2025, pushed the state of the art on the ARC-AGI 1 benchmark (category of approaches based on open-source models and program synthesis), and was awarded 2nd place at the ARC-Prize (paper category). We also started to explore how full-fledged generative AI agents can be used to search and optimize program controllers for simulated agents 44.
  • Collective intelligence and social learning in AI systems. We continued exploring key questions at the crossroads of AI and society, studying how methods from human sciences may (or may not) be used to understand socio-cultural properties of genAI. In particular, this year we focused on studying fundamental properties of GenAI systems as cultural transmission technologies: they massively (re)produce cultural artifacts (e.g. texts) which are in turn viewed by, and influence, both humans and other GenAI systems. It is important to understand the dynamics of the evolution of cultural artefacts when GenAI systems are part of the transmission chains. As first steps in this direction, we adapted the so-called "iterative chain design" from the cultural evolution community, where LLMs basically play a version of the telephone game. This allowed us to identify dynamical properties like collapse or attractors that depend on various properties of data. This work resulted in one paper published at ICLR 2025 60, and another at EMNLP 2025 49. This involved collaborations with Maxime Derex (IAST Toulouse), Cédric Colas (MIT), Gaia Molinaro (UC Berkeley), Eleni Nisioti and Sebastian Risi (ITU Copenhagen), Peter-Ford Dominey (INSERM), Ida Momennejad (Microsoft Research), Remy Portelas (Ubisoft).
  • Education, generative AI and cognitive training. We started a large-scale collaborative project, called GAIMHE, to study the design of educational technologies that combine the power of pedagogically grounded ITS for cross-exercise personalization with the flexibility of generative AI for pre-generation of exercises and within-exercise personalization. This project, funded by BPI, involves collaboration with the EvidenceB company, as well as the ClassCode and Café Pédagogique educational NGOs.

    We also continued developing and evaluating in classrooms various pedagogical interventions training curiosity and metacognition (both conceptually and procedurally), and focused on studying the comparative impact of interventions when delivered by teachers themselves as opposed to researchers. Furthermore, we continued conducting our series of experiments in middle schools to study whether schoolchildren understand and know how to use generative AI tools in the context of educational exercises, showing strong limits and pointing to two needs: training their metacognition and their AI literacy (a first series of results is available here). Also, we developed a software library (the LLM4humanities library) that uses LLMs to partially automate qualitative analysis methods in social sciences, leveraging our prior work in this direction 170, opening new perspectives for the qualitative study of large text corpora or verbal data from psychology or educational experiments. We also continued evaluating the use of adaptive personalization algorithms (in particular ZPDES, based on the learning progress theory) for cognitive training, with diverse populations. This was associated with a review of AI-based approaches to cognitive training, published in PLOS ONE. This involved collaborations with R. Abdelghani and Kou Murayama (University of Tuebingen), C. Kidd (Univ. Berkeley).

    We also continued working​​​‌ on developing frameworks and​ tools to support social​‌ curiosity among the stakeholders​​ working with children with​​​‌ ASD 41, 35​, 58. Finally,​‌ we continued to develop​​ and adapt AI-based personalization​​​‌ technologies for supporting learning​ and well-being in aging​‌ populations, for example through​​ cognitive training 36 and​​​‌ monitoring of daily activities​ 46.

  • Curiosity-driven AI for assisted scientific discovery: We continued studying how curiosity-driven AI algorithms can enable scientists (physicists, chemists, biologists, etc.) to explore and map the space of self-organized behaviours in diverse complex systems 96. We published a milestone article (34) in Science Advances presenting the results of our multi-year projects using autotelic AI algorithms to investigate the possibilities for guided self-organization of robust sensorimotor agents from low-level interactions in continuous cellular automata (see web site). We also started to explore how generative AI semantic models can be used to drive open-ended exploration of self-organized patterns in cellular automata (ALife 2025 paper 48), how autotelic algorithms can explore full ecosystems (ALife 2025 paper 51), and how autotelic reinforcement learning techniques can be used to control and grow such self-organized patterns in an online manner (ALife 2025 paper 45), and to study human guidance in these loops 52. On this line of research, we continued developing collaborations with M. Levin at Tufts University. In particular, we studied how autotelic AI systems (IMGEP algorithms) can enable cost-effective discovery of diverse, sophisticated and robust behaviors in gene regulatory networks, resulting in a milestone paper published in eLife 99.
  • Clément Moulin-Frier recently moved for personal reasons to Inria Lyon, but we are still strongly collaborating with him. He joined the BioTiC Inria team, which focuses on fundamental research in theoretical and computational biology, with a specific interest in modeling evolutionary processes. We are exploring two main research directions in collaboration with BioTiC: (1) large-scale eco-evolutionary simulations in cellular automata, which is an important topic in both teams (from a Computational Biology perspective in BioTiC 92 vs. an Artificial Life perspective in Flowers 55), and (2) studying the similarities and differences between biological and cultural evolution, both in the natural world and in computer simulation (see Section 8.2.6).
  • Reverse engineering large​​​‌ language models and cheaply​ predicting their performances in​‌ benchmarks. In collaboration with​​ N. Yax and Stefano​​​‌ Palminteri (ENS), we developed​ a new algorithmic method,​‌ called PhyloLM and published​​ at ICLR 2025 56​​​‌, that aims at​ reverse-engineering generative AI model​‌ origins from only black-box​​ access — which models​​​‌ derive from which (e.g.​ reusing data or algorithm​‌ or architecture or other​​ features). It proved remarkably​​​‌ powerful at reconstructing model​ evolutionary trees and predicting​‌ benchmark performance cheaply. This​​ approach opens new possibilities​​​‌ for safety applications as​ hundreds of new models​‌ appear daily.
  • Software. The team continued to develop several key software libraries: Lamorel, enabling LLMs to be used as agents in interactive environments; ADTool, enabling easy use of autotelic exploration algorithms for automated discoveries in physics/chemistry/ALife; Vivarium, for building and running multi-agent simulations using JAX, with a focus on educational use; LLM4Humanities, to enable researchers in human sciences to leverage generative AI models for tasks like annotation or analysis of text corpora, using a solid methodological approach.
  • Outreach. The team participated in multiple events such as the science festival at Cap Sciences (Bordeaux), AI days for teachers (Bordeaux), Main à la pâte foundation, Learning Show (Rennes), Science with and for society (Bordeaux), the Chiche program (Nouvelle Aquitaine), Academic days of Poitiers academy, or CogniForum, and welcomed several middle and high-school students for their internships. The team also continued to produce the pedagogical video series "ChatGPT explained in 5 mn", aimed at training generative AI literacy in a wide diversity of students (e.g. high school), available here. The videos are under a Creative Commons licence, CC-BY, enabling open and free reuse. They have already been integrated in the MOOC AI4T (see here), as well as in an internal training platform of "Académie du Numérique du Ministère de la défense", in a mobile app made by Inria with educational materials related to AI (see here), and are being adapted and integrated in a training platform for the whole population of civil servants in France, coordinated by DINUM.
  • Support to public policy. The team was involved in several major actions to support public policies on the topic of AI and education. Members of the team designed and conducted training sessions in different academies for supervisory staff and teachers, e.g. ETAPP-IA day in Nouvelle-Aquitaine (January 2025); departmental training of CPE and documentary teachers of Nouvelle-Aquitaine during a day at the Lycée Les Iris in Lormont (May 2025); Academic Days of Innovation for teachers of Nouvelle-Aquitaine, Spring Days of Education Research at INSPEs (June 2025); PhilosophIA Citizens' Convention (April 2025); twin conference of Cnesco/Cardie Charente-Maritime (January 2025); working group Education and Cognitive Sciences of the academies of Créteil, Versailles and Paris, scheduled for March 2026. H. Sauzéon and PY. Oudeyer were interviewed and wrote reports to contribute to the report of the French Senate on AI and education. PY. Oudeyer was heard by the commission on cultural and educational affairs of the French parliament, to discuss the major challenges and opportunities of AI and education.
  • PY. Oudeyer was selected by the French National Research Agency (ANR) as one of 20 researchers across all disciplines to highlight research projects funded in the last 20 years, on the occasion of the 20th anniversary of ANR. He was also invited to give a keynote talk on curiosity-driven learning in humans (learning progress, autotelic exploration and open-ended development) at the Budapest Conference on Cognitive Development, see video.

6.1 Awards​​​‌

  • Julien Pourcel, Cédric Colas​ and Pierre-Yves Oudeyer were​‌ awarded the 2nd place​​ ARC-Prize in the paper​​​‌ category, for their article​ and method SOAR,​‌ published at ICML 2025.​​ This method introduced a​​​‌ novel approach to enable​ self-improvement of LLMs when​‌ used as operators of​​ evolutionary search algorithms in​​​‌ general program synthesis, and​ pushed the frontier of​‌ state-of-the-art results on the​​ ARC-AGI 1 benchmark (in​​​‌ the category of approaches​ using open-source models and​‌ program synthesis).
  • We received two Best Paper Awards at the Evostar 2025 conference for our paper Emergent kin selection of altruistic feeding via non-episodic neuroevolution 55 (Best paper of the EvoApp track + Best student paper award to the first author Max Taylor-Davis).
  • Didier Roy, Pierre-Yves Oudeyer (authors) and Clémentine Latron (illustrator) obtained the 38th Prize Roberval, for the best popular science book in the youth category, for their book C'est pas (moi), c'est l'IA (Nathan), and were selected among the 3 finalists of the Prize "Goût des Sciences" organized by the French Ministry of Higher Education, Science and Space. D. Roy and P-Y. Oudeyer gave many general public presentations in the context of this book.
  • Matisse​​​‌ Poupart was awarded the​ Best PhD prize from​‌ R3NumEd, the research​​ network on educational technologies​​​‌ in Nouvelle-Aquitaine, for his​ PhD entitled Curious and​‌ therefore not overloaded: Towards​​ an integrated understanding of​​​‌ curiosity and cognitive load​ in XR learning environments​‌.
  • Leana Petitot, Hélène​​ Sauzéon and Pierre Dragicevic​​​‌ obtained an Honorable mention​ at the ACM-CHI2 conference​‌ for their paper entitled​​ "The Effect of Augmented​​​‌ Reality on Involuntary Autobiographical​ Memory", on co-design of​‌ an augmented reality (AR)​​ application simulating a museum​​​‌ visit in the context​ of the I-am Associated​‌ Team, 2023, integrated with​​ an evaluation of involuntary​​​‌ and uncontrollable memory revival.​ This study confirmed our​‌ hypothesis: AR enhances this​​ type of memory compared​​​‌ to 3D images 53​, suggesting potential cognitive​‌ manipulations.

6.2 PhD defenses​​

7​​ Latest software developments, platforms,​​​‌ open data

7.1 Latest​ software developments

7.1.1 SocialAI​‌

  • Name:
    SocialAI: Benchmarking Socio-Cognitive​​ Abilities in Deep Reinforcement​​ Learning Agents
  • Keywords:
    Artificial​​​‌ intelligence, Deep learning, Reinforcement‌ learning, Large Language Models‌​‌
  • Functional Description:

    Source code​​ for the paper https://arxiv.org/abs/2107.00956.​​​‌

    A suite of environments‌ for testing socio-cognitive abilities‌​‌ of artificial agents. Environments​​ can be used in​​​‌ the multimodal setting (suitable‌ for RL agents) and‌​‌ in the pure text​​ setting (suitable for Large​​​‌ Language Model-based agents). Also‌ contains RL and LLM‌​‌ baselines.

  • URL:
  • Contact:​​
    Grgur Kovac

7.1.2 AutoDisc​​​‌

  • Keyword:
    Complex Systems
  • Functional‌ Description:
    AutoDisc is a software tool built for automated scientific discoveries in complex systems (e.g. self-organizing systems). It can be used to experiment with automated discovery in various systems using exploration algorithms (e.g. curiosity-driven). Our software is fully open source and allows users to add their own systems, exploration algorithms or visualization methods.
  • URL:
  • Contact:
    Clément Romac

7.1.3​​​‌ ADTool

  • Keywords:
    Machine learning,‌ Python, Cellular automaton, Physical‌​‌ simulation, Pattern discovery, Exploration​​
  • Functional Description:

    ADTool is​​​‌ a versatile and open-source‌ Python framework designed to‌​‌ explore complex parametric systems​​ using IMGEP algorithms (Intrinsic​​​‌ Motivation for Goal Exploration‌ Processes) as described in‌​‌ https://arxiv.org/pdf/1708.02190. This curiosity-driven approach​​ enables automatic exploration and​​​‌ the discovery of new‌ behaviors across a wide‌​‌ range of domains, offering​​ a novel way to​​​‌ study complex systems.

    With‌ ADTool, users can explore‌​‌ cellular automata such as​​ Lenia, Particle Lenia, and​​​‌ Flowlenia to uncover patterns‌ and emergent behaviors. Its‌​‌ capabilities extend to drug​​ discovery, exploring chemical spaces​​​‌ to identify promising protein-ligand‌ affinity profiles. The framework‌​‌ also ventures into physics,​​ with applications such as​​​‌ searching for trajectories in‌ the N-body problem, simulating‌​‌ the Kuramoto model, exploring​​ the Gray-Scott reaction-diffusion system,​​​‌ and studying hypergraph rewriting‌ systems for Wolfram physics.‌​‌ In digital art, ADTool​​ fosters creativity by exploring​​​‌ processes like subtractive sound‌ synthesis and other artistic‌​‌ methods.

    The framework is​​ designed to be flexible​​​‌ and extensible, allowing users‌ to define their own‌​‌ systems and integrate custom​​ exploration strategies. It includes​​​‌ mechanisms for saving discoveries‌ to disk, making it‌​‌ easier to resume experiments​​ or share results with​​​‌ collaborators. Additionally, an integrated‌ visualization tool provides a‌​‌ user-friendly interface to track​​ exploration progress, enhancing the​​​‌ understanding and analysis of‌ results.

    The scientific foundation‌​‌ of ADTool lies in​​ "curiosity-search" algorithms, which autonomously​​​‌ explore behavioral spaces to‌ identify interesting phenomena without‌​‌ predefined objectives. These algorithms,​​ initially developed for robotic​​​‌ learning, are now applied‌ to the study of‌​‌ emergent behaviors in various​​ systems.
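    The goal-exploration loop described above can be sketched as follows. This is a minimal illustration and not ADTool's code: the toy `system`, the function names and the nearest-neighbour-plus-mutation policy are assumptions chosen for brevity.

```python
import random

def system(params):
    """Toy 'complex system' (hypothetical): maps 2 parameters to a 2-D behavior."""
    x, y = params
    return (x * x - y, x + y * y)

def sq_dist(a, b):
    """Squared Euclidean distance between two behavior descriptors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def imgep(n_random=10, n_goals=90, sigma=0.1, seed=0):
    rng = random.Random(seed)
    history = []  # (params, behavior) pairs discovered so far
    # Bootstrap phase: sample parameters uniformly at random.
    for _ in range(n_random):
        params = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        history.append((params, system(params)))
    # Goal-directed phase: self-generate a goal in behavior space, then
    # mutate the parameters whose past outcome was closest to that goal.
    for _ in range(n_goals):
        goal = (rng.uniform(-2, 2), rng.uniform(-2, 2))
        best_params, _ = min(history, key=lambda pb: sq_dist(pb[1], goal))
        params = tuple(p + rng.gauss(0, sigma) for p in best_params)
        history.append((params, system(params)))
    return history

history = imgep()
print(len(history), "experiments stored")
```

    No objective is optimized here: goals are sampled only to spread the discovered behaviors, which is the property the paragraph above attributes to curiosity-driven search.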

    Whether you are​​​‌ a physicist, chemist, biologist,‌ or digital artist, ADTool‌​‌ can help you explore​​ and understand complex systems.​​​‌

    Reproducibility is guaranteed with a predefined Python environment, and experiments can be launched with a simple command line: python3 run.py --config_file examples/grayscott/gray_scott.json

  • Contact:
    Zacharie‌​‌ Bugaud

7.1.4 Kids Ask​​

  • Keywords:
    Human Computer Interaction,​​​‌ Cognitive sciences
  • Functional Description:‌
    Kids Ask is a‌​‌ web-based educational platform that​​ involves an interaction between​​​‌ a child and a‌ conversational agent. The platform‌​‌ is designed to teach​​ children how to generate​​​‌ curiosity-based questions and use‌ them in their learning‌​‌ in order to gain​​​‌ new knowledge in an​ autonomous way.
  • URL:
  • Contact:
    Rania Abdelghani

7.1.5​​ ToGather

  • Keywords:
    Education, Handicap,​​​‌ Environment perception
  • Scientific Description:​
    With participatory design methods,​‌ we have designed an​​ interactive website application for​​​‌ educational purposes. This application​ aims to provide interactive​‌ services with continuously updated​​ content for the stakeholders​​​‌ of school inclusion of​ children with specific educational​‌ needs.
  • Functional Description:
    Website gathering information on middle school students with neurodevelopmental disorders. Authentication is required to access the site's content. Each user can only access the student file(s) of the young person(s) they are accompanying. A student file contains 6 tabs, in which each type of user can add, edit or delete information:
    1. Profile: to quickly get to know the student
    2. Skills: evaluation at a given moment and evolution over time
    3. Compendium of tips: includes psycho-educational tips
    4. Meetings: manager and reports
    5. News: share information over time
    6. Contacts: contact information for stakeholders
    The student only has the right to view information about him/her.
  • Publication:
  • Contact:​‌
    Cécile Mazon
  • Participant:
    4​​ anonymous participants

7.1.6 mc_training​​​‌

  • Name:
    Platform for metacognitive​ training
  • Keywords:
    Human Computer​‌ Interaction, Education
  • Functional Description:​​

    This is a web platform for children between 9 and 11 years old, designed to help them practice 4 metacognitive skills that are thought to be involved in curiosity-driven learning:
    - the ability to identify uncertainties
    - the ability to generate informed hypotheses
    - the ability to ask questions
    - the ability to evaluate the value of a preconceived inference.

    Children work on a reading-comprehension task and, for each of these skills, the platform offers help through a "conversation" with conversational agents that give instructions to perform the task, with respect to each skill, and can give suggestions if the child asks for them.

  • Contact:
    Rania Abdelghani

7.1.7​‌ Evolution of adaptation mechanisms​​ in complex environments

  • Name:​​​‌
    Plasticity and evolvability under​ environmental variability: the joint​‌ role of fitness-based selection​​ and niche-limited competition
  • Keywords:​​​‌
    Evolution, Ecology, Dynamic adaptation​
  • Functional Description:

    This is the code accompanying our paper "Plasticity and evolvability under environmental variability: the joint role of fitness-based selection and niche-limited competition", which is to be presented at the GECCO 2022 conference.

    In this​‌ work we have studied​​ the evolution of a​​​‌ population of agents in​ a world where the​‌ fitness landscape changes with​​ generations based on climate​​​‌ function and a latitudinal​ model that divides the​‌ world in different niches.​​ We have implemented different​​​‌ selection mechanisms (fitness-based selection​ and niche-limited competition).

    The​‌ world is divided into​​ niches that correspond to​​​‌ different latitudes and whose​ state evolves based on​‌ a common climate function.​​

    We model the plasticity of an individual using tolerance curves originally developed in ecology. Plasticity curves have the form of a Gaussian that captures the benefits and costs of plasticity when comparing a specialist (left) with a generalist (right) agent.
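    As an illustration of such a tolerance curve, here is a minimal sketch. It assumes a normalized Gaussian density, so that a wider curve (more plasticity) necessarily has a lower peak; the function name, the parameter names and the numerical values are hypothetical and are not taken from the repository.

```python
import math

def tolerance(env_state, preferred_state, sigma):
    """Fitness of an individual in a given environmental state.

    Assumption: a normalized Gaussian density. A larger sigma (more
    plasticity) widens the curve but lowers its peak, which models the
    cost of plasticity.
    """
    z = (env_state - preferred_state) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical individuals sharing the same preferred environmental state.
specialist = dict(preferred_state=2.0, sigma=0.2)
generalist = dict(preferred_state=2.0, sigma=1.0)

for env_state in (2.0, 3.0):
    print(env_state,
          tolerance(env_state, **specialist),
          tolerance(env_state, **generalist))
```

    Under this assumption the specialist is fitter when the environment matches its preferred state, while the generalist is fitter once the environment drifts away from it.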

    The repo contains the​‌ following main elements :​​

    - folder source contains the main functionality for running a simulation
    - scripts/run/reproduce_gecco.py can be used to rerun all simulations in the paper
    - scripts/evaluate contains scripts for reproducing figures. reproduce_figures.py will produce all figures (provided you have already run scripts/run/reproduce_gecco.py to generate the data)
    - folder projects contains data generated from running a simulation

    How to run

    To install all package dependencies you can create a conda environment as:

    conda env​​ create -f environment.yml

    All script executions need to be run from folder source. Once there, you can use simulate.py, the main interface of the codebase, to run a simulation. For example:

    python simulate.py --project test_stable --env_type stable --num_gens 300 --capacity 1000 --num_niches 10 --trials 10 --selection_type NF --climate_mean_init 2

    will run a simulation with an environment with a climate function whose state is constantly 2, consisting of 10 niches, for 300 generations and 10 independent trials. The maximum population size will be 1000*2 and selection will be fitness-based (higher fitness means higher chances of reproduction) and niche-limited (individuals reproduce independently in each niche and compete only within a niche).

    You can​​ also take a look​​​‌ at scripts/run/reproduce_gecco.py to see‌ which flags were used‌​‌ for the simulations presented​​ in the paper.

    Running​​​‌ all simulations requires some‌ days. You can instead‌​‌ download the data produced​​ by running scripts/run/reproduce_gecco.py from​​​‌ this google folder and‌ unzip them under the‌​‌ projects directory.

  • URL:
  • Contact:
    Eleni Nisioti

7.1.8​​​‌ SAPIENS

  • Name:
    SAPIENS: Structuring‌ multi-Agent toPology for Innovation‌​‌ through ExperieNce Sharing
  • Keywords:​​
    Reinforcement learning, Multi-agent
  • Functional​​​‌ Description:

    SAPIENS is a‌ reinforcement learning algorithm where‌​‌ multiple off-policy agents solve​​ the same task in​​​‌ parallel and exchange experiences‌ on the go. The‌​‌ group is characterized by​​ its topology, a graph​​​‌ that determines who communicates‌ with whom.

    All agents are DQNs, and exchanged experiences take the form of transitions from their replay buffers.

    Using SAPIENS we can define groups of agents that are connected with others based on: a) a fully-connected topology, b) a small-world topology, c) a ring topology or d) a dynamic topology.
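    To make the role of the topology concrete, here is a minimal sketch of experience sharing over a graph. It is not the SAPIENS code: the helper names, the transition format and the sharing rule (a fixed number of transitions sampled uniformly and sent to each neighbor) are assumptions, and only the fully-connected and ring cases are covered.

```python
import random

def fully_connected(n):
    """Every agent communicates with every other agent."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}

def ring(n):
    """Every agent communicates with its two neighbors on a ring."""
    return {i: sorted({(i - 1) % n, (i + 1) % n} - {i}) for i in range(n)}

def share_experiences(buffers, neighbors, n_shared, rng):
    """Each agent sends n_shared transitions from its buffer to each neighbor."""
    # Sample all outgoing batches first so that received transitions are
    # not forwarded within the same round.
    outgoing = {
        i: rng.sample(buf, min(n_shared, len(buf))) for i, buf in buffers.items()
    }
    for sender, batch in outgoing.items():
        for receiver in neighbors[sender]:
            buffers[receiver].extend(batch)

rng = random.Random(0)
n_agents = 4
# Hypothetical transitions: (state, action, reward, next_state) tuples.
buffers = {i: [(i, a, 0.0, i + 1) for a in range(5)] for i in range(n_agents)}
share_experiences(buffers, ring(n_agents), n_shared=2, rng=rng)
print({i: len(buf) for i, buf in buffers.items()})
```

    Swapping `ring(n_agents)` for `fully_connected(n_agents)` changes who receives which transitions without touching the agents themselves.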

    Install required packages

    You can install all required Python packages by creating a new conda environment containing the packages in environment.yml:

    conda env create -f​​ environment.yml

    And then activating​​​‌ the environment:

    conda activate‌ sapiens

    Example usages

    Under notebooks there is a Jupyter notebook that will guide you through setting up simulations with a fully-connected and a dynamic social network structure for solving Wordcraft tasks. It also explains how you can access visualizations of the metrics produced during th$

    Reproducing the paper results

    Scripts under the scripts directory are useful for reproducing results and figures appearing in the paper.

    With scripts/reproduce_runs.py you‌​‌ can run all simulations​​ presented in the paper​​​‌ from scratch.

    This file is useful for looking at how the experiments were configured, but it is better to avoid running it: simulations will run locally and sequentially and will take months to complete.

    Instead,​​​‌ you can access the‌ data files output by‌​‌ simulations on this online​​​‌ repo.

    Download this zip​ file and uncompress it​‌ under the projects directory.​​ This should create a​​​‌ projects/paper_done sub-directory.

    You can now reproduce all visualizations presented in the paper. Run:

    python scripts/reproduce_visuals.py

    This​​​‌ will save some general​ plots under visuals, while​‌ project-specific plots are saved​​ under the corresponding project​​​‌ in projects/paper_done

  • URL:
  • Contact:
    Eleni Nisioti

7.1.9​‌ architect-builder-abig

  • Name:
    Architect-Builder Iterated​​ Guiding
  • Keyword:
    Artificial intelligence​​​‌
  • Functional Description:

    Codebase for​ the paper Learning to​‌ guide and to be​​ guided in the Architect-Builder​​​‌ Problem

    ABIG stands for​ Architect-Builder Iterated Guiding and​‌ is an algorithmic solution​​ to the Architect-Builder Problem.​​​‌ The algorithm leverages a​ learned model of the​‌ builder to guide it​​ while the builder uses​​​‌ self-imitation learning to reinforce​ its guided behavior.

  • URL:​‌
  • Contact:
    Tristan Karch​​

7.1.10 EAGER

  • Name:
    Exploit​​​‌ question-Answering Grounding for effective​ Exploration in language-conditioned Reinforcement​‌ learning
  • Keywords:
    Reinforcement learning,​​ Language, Question Generation Question​​​‌ Answering, Reward shaping
  • Functional​ Description:
    A novel QG/QA framework for RL called EAGER. In EAGER, an agent reuses the initial language goal sentence to generate a set of questions (QG): each of these self-generated questions defines an auxiliary objective. Here, generating a question consists in masking a word of the initial language goal. Then the agent tries to answer these questions (guess the missing word) only by observing its trajectory so far. When it manages to answer a question correctly (QA), it obtains an intrinsic reward proportional to its confidence in the answer. The QA module is trained using a set of successful example trajectories. If the agent follows a path too different from correct ones at some point in its trajectory, the QA module will not answer the question correctly, resulting in zero intrinsic reward. The sum of all the intrinsic rewards measures the quality of a trajectory in relation to the given goal. In other words, maximizing this intrinsic reward incentivizes the agent to produce behaviour that unambiguously explains various aspects of the given goal.
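    The mechanism can be sketched in a few lines. This is an illustrative toy and not the EAGER implementation: `toy_qa` is a hypothetical stand-in for the trained QA module, the trajectory is reduced to a list of observed words, and the proportionality constant of the reward is assumed to be 1.

```python
def generate_questions(goal, mask="<MASK>"):
    """One question per word of the goal: the word is masked and is the answer."""
    words = goal.split()
    return [
        (" ".join(words[:i] + [mask] + words[i + 1:]), word)
        for i, word in enumerate(words)
    ]

def intrinsic_reward(questions, qa_module, trajectory):
    """Sum of confidences over correctly answered questions; wrong answers give 0."""
    total = 0.0
    for question, answer in questions:
        guess, confidence = qa_module(question, trajectory)
        if guess == answer:
            total += confidence
    return total

def toy_qa(question, trajectory):
    """Hypothetical QA module: guesses the first observed word that is not
    already in the question, with a fixed confidence of 0.5."""
    seen = set(question.split())
    for word in trajectory:
        if word not in seen:
            return word, 0.5
    return None, 0.0

questions = generate_questions("pick up the red ball")
reward = intrinsic_reward(questions, toy_qa, trajectory=["red", "ball"])
print(reward)
```

    With this toy module, only the questions whose missing word was actually observed along the trajectory contribute to the reward.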
  • URL:
  • Contact:​
    Thomas Carta

7.1.11 Flow-Lenia​‌

  • Name:
    Flow Lenia: Mass​​ conservation for the study​​​‌ of virtual creatures in​ continuous cellular automata
  • Keywords:​‌
    Cellular automaton, Self-organization
  • Functional​​ Description:

    This repo contains the code to run the Flow Lenia system, a continuous parametrized cellular automaton with mass conservation. This work extends the classic Lenia system with mass conservation and makes it possible to implement new features such as local parameters, environment components, etc.

    Several variants of the system (one or several channels, etc.) are available.

    Please refer to the associated paper for the details of the system.

    Implemented in JAX.

  • URL:
  • Contact:
    Gautier​‌ Hamon

7.1.12 Kidlearn: money​​ game application

  • Functional Description:​​​‌
    The game is instantiated in a browser environment where students are offered exercises in the form of money/token games (see Figure 1). For a given exercise type, one object is presented with a given tagged price and the learner has to choose which combination of bank notes, coins or abstract tokens needs to be taken from the wallet to buy the object, with various constraints depending on exercise parameters. The games have been developed using web technologies: HTML5, JavaScript and Django.
    Figure 1 (panels a to d): Four principal regions are defined in the graphical interface. The first is the wallet location, where users can pick and drag the money items and drop them on the repository location to compose the correct price. The object and the price are shown in the object location. Four different types of exercises exist: M: customer/one object, R: merchant/one object, MM: customer/two objects, RM: merchant/two objects.
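    The core check behind such an exercise (did the learner compose the correct price with items available in the wallet?) can be sketched as follows. This is a hypothetical illustration and not Kidlearn's code: the function name, the use of integer cents and the wallet contents are assumptions.

```python
from collections import Counter

def is_correct_payment(wallet, selection, price_cents):
    """True if the selected items are available in the wallet and sum to the price."""
    available = Counter(wallet)
    chosen = Counter(selection)
    # The learner cannot use more copies of an item than the wallet holds.
    if any(count > available[value] for value, count in chosen.items()):
        return False
    return sum(selection) == price_cents

# Hypothetical wallet, with values expressed in cents to avoid float rounding.
wallet = [500, 200, 100, 50, 20, 20, 10]
print(is_correct_payment(wallet, [200, 50, 20], 270))
print(is_correct_payment(wallet, [100, 100, 50, 20], 270))
```

    The second call fails although the values sum to the price, because the wallet holds a single 100 item.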
  • URL:
  • Contact:‌​‌
    Benjamin Clement

7.1.13 cognitive-testbattery​​

  • Name:
    Cognitive test battery​​​‌ of human attention and‌ memory
  • Keywords:
    Open Access,‌​‌ Cognitive sciences
  • Scientific Description:​​
    Cognitive test batteries are widely used in diverse research fields, such as cognitive training, cognitive disorder assessment, or brain mechanism understanding. Although they need flexibility according to the objectives of their usage, most test batteries are not available as open-source software and cannot be tuned by researchers in detail. The present study introduces an open-source cognitive test battery to assess attention and memory, using a JavaScript library, p5.js. Because of the ubiquitous nature of dynamic attention in our daily lives, it is crucial to have tools for its assessment or training. For that purpose, our test battery includes seven cognitive tasks (multiple-objects tracking, enumeration, go/no-go, load-induced blindness, task-switching, working memory, and memorability), common in the cognitive science literature. Using the test battery, we conducted an online experiment to collect benchmark data. Results collected on two separate days showed high cross-day reliability: task performance did not change substantially between days. Besides, our test battery captures diverse individual differences and can evaluate them based on the cognitive factors extracted from latent factor analysis. Since we share our source code as open-source software, users can expand and manipulate experimental conditions flexibly. Our test battery is also flexible in terms of the experimental environment, i.e., it is possible to run experiments either online or in a laboratory environment.
  • Functional Description:​​
    The evaluation battery consists of seven cognitive activities (serious games: multi-object tracking, enumeration, go/no-go, Corsi, load-induced blindness, task-switching, memorability). Easily deployable as a web application, it can be re-used and modified for new experiments. The tool is documented in order to facilitate the deployment and the analysis of results.
  • URL:
  • Publication:​​​‌
  • Contact:
    Maxime Adolphe‌
  • Participant:
    4 anonymous participants‌​‌

7.1.14 Sensorimotor-lenia

  • Keywords:
    Cellular​​ automaton, Gradient descent, Curriculum​​​‌ Learning
  • Functional Description:
    Source code for the search for sensorimotor agency in cellular automata, associated with this blogpost https://developmentalsystems.org/sensorimotor-lenia/. The code makes it possible to find rules in the Lenia cellular automaton (through gradient descent, curriculum learning and diversity search) that lead to the self-organization of moving agents robust to perturbation by obstacles.
  • URL:‌
  • Contact:
    Gautier Hamon‌​‌

7.1.15 Lamorel

  • Keywords:
    Large​​​‌ Language Models, Reinforcement learning,​ Distributed computing
  • Scientific Description:​‌

    Lamorel allows for seamless​​ scaling of LLMs when​​​‌ using embodied artificial agents​ such as Reinforcement Learning​‌ agents. One can use​​ and modify the LLM​​​‌ in any part of​ such agents (policy, goal​‌ sampler, social peer...). Lamorel​​ is particularly useful when​​​‌ performing large-scale experiments on​ clusters.

    It has already been used in several papers, notably leading to the first paper performing online RL on an LLM-based agent in an embodied environment (Carta et al., 2023).

  • Functional Description:

    Lamorel was initially designed to easily use LLMs in interactive environments. It is especially made for high throughput using a distributed architecture. The philosophy of Lamorel is to be very permissive and allow as many uses of LLMs as possible while maintaining scaling: the application should run with 1 or N LLMs.

    For this reason, it is specialised neither in RL nor, in particular, in RLHF. Our examples illustrate how Lamorel can be used for various applications including RLHF-like finetuning. However, one must understand that Lamorel's philosophy means that users must implement themselves what they want to do with the LLM(s).

    This is why​​ we advise users knowing​​​‌ in advance they want​ to do RLHF, especially​‌ without any modification of​​ classic implementations, to use​​​‌ libs specialised in RLHF​ that already come with​‌ RL implementations (e.g. RL4LMs,​​ TRL). On the other​​​‌ hand, users more inclined​ to experiment with implementations​‌ or looking for an​​ LLM lib they can​​​‌ use in different projects​ may prefer Lamorel.

    Here are Lamorel's key features:

    1. Abstracts the use of LLMs (e.g. tokenization, batches) into simple calls
    2. Provides a method to compute the log probability of token sequences (e.g. action commands) given a prompt
    3. Is made for scaling up your experiments by deploying multiple instances of the LLM and dispatching the computation thanks to a simple configuration file
    4. Provides access to open-source LLMs from the Hugging Face hub, along with Model Parallelism to use multiple GPUs for an LLM instance
    5. Allows one to give their own PyTorch modules to compute custom operations (e.g. to add new heads on top of the LLM)
    6. Allows one to train the LLM (or part of it) thanks to a Data Parallelism setup where the user provides their own update method
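    The second feature (scoring token sequences given a prompt) can be illustrated independently of Lamorel. The following is a minimal sketch that does not use Lamorel's API: `toy_next_token_probs` is a hypothetical stand-in for an LLM, and all function names are assumptions made for illustration. It shows the chain-rule computation of the log probability of each candidate action given a prompt.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical stand-in for an LLM: maps a token prefix to a next-token distribution.
NextTokenModel = Callable[[Sequence[str]], Dict[str, float]]


def toy_next_token_probs(prefix: Sequence[str]) -> Dict[str, float]:
    """Toy model (illustration only): after 'go', the token 'north' is most likely."""
    if prefix and prefix[-1] == "go":
        return {"north": 0.7, "south": 0.2, "go": 0.1}
    return {"go": 0.6, "north": 0.2, "south": 0.2}


def sequence_log_prob(
    model: NextTokenModel, prompt: Sequence[str], candidate: Sequence[str]
) -> float:
    """log p(candidate | prompt) = sum over t of log p(token_t | prompt, tokens before t)."""
    prefix = list(prompt)
    total = 0.0
    for token in candidate:
        probs = model(prefix)
        total += math.log(probs[token])
        prefix.append(token)  # condition the next step on the token just scored
    return total


def score_actions(
    model: NextTokenModel, prompt: Sequence[str], actions: List[List[str]]
) -> List[float]:
    """Score every candidate action (a token sequence) against the same prompt."""
    return [sequence_log_prob(model, prompt, action) for action in actions]


prompt = ["You", "see", "a", "door", "."]
actions = [["go", "north"], ["go", "south"]]
scores = score_actions(toy_next_token_probs, prompt, actions)
best_action = actions[scores.index(max(scores))]
```

    An agent can turn such scores into a policy, for instance by applying a softmax over the candidate actions; with a real LLM, the scoring of all candidates would typically be batched.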

  • URL:
  • Publications:
  • Contact:
    Clément​ Romac

7.1.16 GLAM

  • Name:​‌
    Grounding LAnguage Models
  • Keywords:​​
    Large Language Models, Reinforcement​​​‌ learning
  • Scientific Description:
    Recent​ works successfully leveraged Large​‌ Language Models' (LLM) abilities​​ to capture abstract knowledge​​​‌ about world's physics to​ solve decision-making problems. Yet,​‌ the alignment between LLMs'​​ knowledge and the environment​​​‌ can be wrong and​ limit functional competence due​‌ to lack of grounding.​​ In this paper, we​​​‌ study an approach (named​ GLAM) to achieve this​‌ alignment through functional grounding:​​ we consider an agent​​​‌ using an LLM as​ a policy that is​‌ progressively updated as the​​ agent interacts with the​​ environment, leveraging online Reinforcement​​​‌ Learning to improve its‌ performance to solve goals.‌​‌ Using an interactive textual​​ environment designed to study​​​‌ higher-level forms of functional‌ grounding, and a set‌​‌ of spatial and navigation​​ tasks, we study several​​​‌ scientific questions: 1) Can‌ LLMs boost sample efficiency‌​‌ for online learning of​​ various RL tasks? 2)​​​‌ How can it boost‌ different forms of generalization?‌​‌ 3) What is the​​ impact of online learning?​​​‌ We study these questions‌ by functionally grounding several‌​‌ variants (size, architecture) of​​ FLAN-T5.
  • Functional Description:
    GLAM​​​‌ is a new approach‌ to achieve alignment between‌​‌ a Large Language Model​​ (LLM) and a considered​​​‌ environment/world through functional grounding:‌ we consider an agent‌​‌ using an LLM as​​ a policy that is​​​‌ progressively updated as the‌ agent interacts with the‌​‌ environment, leveraging online Reinforcement​​ Learning to improve its​​​‌ performance to solve goals.‌
  • URL:
  • Publication:
  • Contact:
    Clément Romac

7.1.17​​ SBMLtoODEjax

  • Keywords:
    SBML, JAX,​​​‌ Python, Numerical simulations, Numerical‌ optimization, Automatic differentiation, Ordinary‌​‌ differential equations, Biomedical data​​
  • Scientific Description:
    Advances in​​​‌ bioengineering and biomedicine demand‌ a deep understanding of‌​‌ the dynamic behavior of​​ biological systems, ranging from​​​‌ protein pathways to complex‌ cellular processes. Biological networks‌​‌ like gene regulatory networks​​ and protein pathways are​​​‌ key drivers of embryogenesis‌ and physiological processes. Comprehending‌​‌ their diverse behaviors is​​ essential for tackling diseases,​​​‌ including cancer, as well‌ as for engineering novel‌​‌ biological constructs. Despite the​​ availability of extensive mathematical​​​‌ models represented in Systems‌ Biology Markup Language (SBML),‌​‌ researchers face significant challenges​​ in exploring the full​​​‌ spectrum of behaviors and‌ optimizing interventions to efficiently‌​‌ shape those behaviors. Existing​​ tools designed for simulation​​​‌ of biological network models‌ are not tailored to‌​‌ facilitate interventions on network​​ dynamics nor to facilitate​​​‌ automated discovery. Leveraging recent‌ developments in machine learning‌​‌ (ML), this paper introduces​​ SBMLtoODEjax, a lightweight library​​​‌ designed to seamlessly integrate‌ SBML models with ML-supported‌​‌ pipelines, powered by JAX.​​ SBMLtoODEjax facilitates the reuse​​​‌ and customization of SBML-based‌ models, harnessing JAX's capabilities‌​‌ for efficient parallel simulations​​ and optimization, with the​​​‌ aim to accelerate research‌ in biological network analysis.‌​‌
  • Functional Description:
    SBMLtoODEjax extends SBMLtoODEpy, a Python library developed in 2019 for converting SBML files into Python files written in NumPy/SciPy. The chosen conventions for the generated variables and modules are slightly different from the standard SBML conventions (used in the SBMLtoODEpy library), with the aim of accommodating more flexible manipulations while preserving a JAX-like functional programming style.
  • URL:‌
  • Publication:
  • Contact:‌​‌
    Mayalen Etcheverry
  • Partner:
    Tufts​​ University

7.1.18 Vivarium

  • Name:​​​‌
    Large-scale simulator for research‌ and teaching in Artificial‌​‌ Intelligence and Artificial Life​​
  • Keywords:
    Simulation, Artificial intelligence,​​​‌ Artificial Life, Multi-Agents System,‌ Teaching of programming, Research‌​‌
  • Functional Description:

    This project​​ aims to seize these​​​‌ opportunities through the design‌ and implementation of a‌​‌ software platform providing an​​ integrated simulation environment for​​​‌ research, teaching, and dissemination‌ in the fields of‌​‌ Artificial Intelligence (AI) and​​ Artificial Life (AL). The​​​‌ project is titled The‌ Vivarium, which reflects a‌​‌ fundamental aspect of the​​​‌ convergence between these two​ domains: the emergence of​‌ complex behaviors, whether in​​ the natural or artificial​​​‌ world, necessarily relies on​ a need for adaptation​‌ to a complex environment​​ in which many autonomous​​​‌ entities interact.

    It will be used as educational software in a course from CISC Master at UPF-Barcelona in January 2025.

  • Release Contributions:

    This release corresponds to the state of the repo after all fixes were made following the SDIC course at Universitat Pompeu Fabra in Barcelona (CSIM Master) in January 2025.

    This version mostly​‌ focuses on educational purposes,​​ with ready-to-use practical sessions​​​‌ in notebooks/sessions. Corentin Léger​ was the main contributor​‌ over the last year.​​

  • News of the Year:​​​‌
    Corentin Léger, a research engineer recruited on the ANR JCJC ECOCURL project (led by Clément Moulin-Frier), carried out substantial development work on the software during 2024. Its applications for teaching are now validated. In particular, the software was used for 10 hours of practical sessions in the Master CSIC of Universitat Pompeu Fabra in Barcelona, Spain.
  • URL:
  • Contact:
    Clément Moulin-Frier
  • Participant:​​​‌
    3 anonymous participants

7.1.19​ LLM_Culture

  • Keywords:
    LLM, Multi-Agents​‌ System, Natural language processing​​
  • Functional Description:

    Code for​​​‌ the 'Cultural evolution in​ populations of Large Language​‌ Models' paper. This repository​​ provides a comprehensive framework​​​‌ for studying the cultural​ evolution of linguistic content​‌ in populations of Large​​ Language Models (LLM).

    It allows organizing LLM agents into networks wherein each agent interacts with neighboring agents by exchanging texts. Each agent can be assigned specific personalities and transmission instructions, serving as prompts for generating new texts from their neighbors' narratives. Once the network structure and agent characteristics are defined, you can simulate the cultural evolution of texts across generations of agents. We also provide built-in metrics and visualizations to analyze the results.

  • URL:
  • Contact:​‌
    Jeremy Perez
  • Participant:
    2​​ anonymous participants

7.1.20 TelephoneGameLLMs​​​‌

  • Keywords:
    Large Language Models,​ Multi-Agents System, Cultural Evolution​‌
  • Functional Description:
    Code for the paper "When LLMs Play the Telephone Game: Cumulative Changes and Attractors in Iterated Cultural Transmissions" (https://arxiv.org/abs/2407.04503). In this paper, we introduce conceptual and methodological tools for evaluating Large Language Models in multi-turn settings. Those tools are inspired by cultural evolutionary theory, and in particular by the concept of cultural attractors.
  • URL:​​​‌
  • Publication:
  • Contact:​
    Jeremy Perez

7.1.21 styr​‌

  • Name:
    Stick To Your​​ Role
  • Keywords:
    LLM, Cognitive​​​‌ sciences
  • Functional Description:

    Code​ for our paper https://arxiv.org/abs/2402.14846​‌ and leaderboard https://huggingface.co/spaces/flowers-team/StickToYourRoleLeaderboard.

    Enables evaluating LLMs using personal value questionnaires (PVQ, SVS). More precisely, it instructs the LLM to simulate various personas and exposes it to different contexts (e.g. long Reddit posts). Then it evaluates the value stability of the simulated population between those contexts. Additionally, it computes confirmatory factor analysis metrics (CFI, SRMR, RMSEA) and the structure of expressed values (stress metric).

  • URL:
  • Contact:
    Grgur Kovac

7.1.22​​ transformerXL_PPO_JAX

  • Keywords:
    Reinforcement learning,​​​‌ Transformer
  • Functional Description:

    This repository provides a JAX implementation of TransformerXL with PPO in an RL setup, following "Stabilizing Transformers for Reinforcement Learning" by Parisotto et al. (https://arxiv.org/abs/1910.06764).

    The code uses the PureJaxRL template for PPO and adapts some of the code from the Hugging Face Transformer-XL repository, porting it to JAX. We also took inspiration from the PyTorch code in https://github.com/MarcoMeter/episodic-transformer-memory-ppo, which simplifies gradient propagation and positional encoding compared to TransformerXL as described in the original paper (https://arxiv.org/abs/1901.02860). The training handles [Gymnax](https://github.com/RobertTLange/gymnax) environments.

    We also tested it on Craftax, on which it beat the baselines presented in the paper (https://arxiv.org/abs/2402.16801), including PPO-RNN, training with unsupervised environment design, and intrinsic motivation. Notably, we reach the 3rd level (the sewer) and obtain several advanced achievements, which was not achieved by the methods presented in the paper. See Craftax Results for more information.

    The training of a 5M-parameter transformer on Craftax for 1e9 steps (with 1024 environments) takes about 6h30 on a single A100.

  • Contact:
    Gautier Hamon‌​‌

7.1.23 ER-MRL

  • Keywords:
    Reinforcement​​ learning, Evolutionary Algorithms, Recurrent​​​‌ network
  • Functional Description:

    Code‌ for the "Evolving-Reservoirs-for-Meta-Reinforcement-Learning" (ER-MRL)‌​‌ paper (https://arxiv.org/abs/2312.06695).

    We adopt​​ a computational framework based​​​‌ on meta reinforcement learning,‌ modeling the interplay between‌​‌ evolution and development. At​​ the evolutionary scale, we​​​‌ evolve reservoirs, a family‌ of recurrent neural networks‌​‌ generated from hyperparameters. These​​ evolved reservoirs are then​​​‌ utilized to facilitate the‌ learning of a behavioral‌​‌ policy through reinforcement learning.​​ This is done by​​​‌ encoding the environment state‌ through the reservoir before‌​‌ providing it to the​​ agent's policy.

  • Contact:
    Corentin​​​‌ Leger

7.1.24 LLM4Humanities

  • Keywords:‌
    LLM, Python, Data Generator,‌​‌ Generative AI
  • Scientific Description:​​
    Qualitative research in experimental​​​‌ psychology and the humanities‌ often relies on manual‌​‌ annotation of textual data​​ using defined codebooks. This​​​‌ process is indispensable but‌ time-consuming and costly. Moreover,‌​‌ best practices require at​​ least two independent annotators​​​‌ in order to compute‌ inter-rater reliability (IRR), which‌​‌ further increases the required​​ resources. IRR is crucial​​​‌ to distinguish variance due‌ to coder subjectivity from‌​‌ variance due to the​​ phenomenon under study, yet​​​‌ in practice it is‌ frequently omitted, misreported, or‌​‌ computed using inadequate metrics​​ (e.g., raw percentage agreement​​​‌ or simple correlations). The‌ objective of the LLM4Humanities‌​‌ project is to design​​ an open-source, Python-based toolkit​​​‌ and web application that‌ leverages large language models‌​‌ (LLMs) to support, accelerate,​​ and improve the methodological​​​‌ rigor of qualitative annotation‌ workflows. In addition to‌​‌ annotation assistance, LLM4Humanities includes​​ a generation mode designed​​​‌ to support the creation‌ of experimental material. In‌​‌ this mode, users can​​ select one or several​​​‌ template items (e.g., a‌ mathematics exercise) and specify‌​‌ a set of constraints.​​ The system then generates​​​‌ multiple new variants of‌ the item. These generated‌​‌ items can subsequently be​​ passed through the same​​​‌ annotation and evaluation pipeline,‌ providing a first automated‌​‌ assessment of the quality​​ and consistency of the​​​‌ generated content.
  • Functional Description:‌
    LLM4Humanities is an open-source, Python-based toolkit and web application that integrates LLM-assisted annotation, inter-rater reliability analysis, and the generation of controlled variants of experimental material within a single end-to-end workflow.
  • URL:
  • Contact:
    Olivier Clerc
  • Participant:​​
    Olivier Clerc

7.2 New​​​‌ platforms

7.2.1 ToGather application​

Participants: Cécile Mazon,​‌ Hélène Sauzéon, Eric​​ Meyer, Isabeau Saint-Supery​​​‌.

  • Name:
    Application for​ Specialized education
  • Keywords:
    Parent-professional​‌ relationships; user-centered design; school​​ inclusion; autism spectrum disorder;​​​‌ ecosystemic approach
  • Participants:
    Isabeau​ Saint-supery, Cécile Mazon, Hélène​‌ Sauzéon, Agilonaute
  • Scientific Description:​​
    Using participatory design methods, we have designed an interactive web application for educational purposes. This application aims to provide interactive services with continuously updated content for the stakeholders of school inclusion of children with specific educational needs. In particular, the services provide: 1) the student's profile with strengths and weaknesses; 2) an evaluation and monitoring over time of the student's repertoire of acquired, emerging or targeted skills; 3) a shared notebook of effective psycho-educational solutions for the student; 4) a shared messaging system for exchanging "news" about the student and his/her family; and 5) a meeting manager allowing updates of evaluations (student progress). This application is currently being assessed in a field study. It will then be transferred to the Academy of Nouvelle-Aquitaine-Bordeaux of the National Education Ministry.
  • URL:​​
    The website is not online yet, but all information, such as tutorials, is available here.
  • Publication:​​

8 New results​​​‌

The team's research program,​ within the domain of​‌ developmental artificial intelligence, aims​​ to study mechanisms of​​​‌ open-ended learning, and in​ particular the role of​‌ curiosity-driven autotelic learning and​​ the role of language​​​‌ as a cognitive tool.​ We study these topics​‌ both in humans and​​ AI systems, both at​​​‌ the level of individuals​ and at the level​‌ of cultural groups, and​​ both at the fundamental​​​‌ and application levels.

Here,​ we present our recent​‌ results along the following​​ research dimensions:

  • Open-ended learning​​​‌ and autotelic AI with​ large language models;
  • Models​‌ of cultural evolution in​​ humans and AI systems;​​​‌
  • An Eco-Evo-Devo perspective on​ Artificial Intelligence;
  • Generative AI​‌ and educational technologies;
  • Theories of human curiosity-driven learning;
  • Curiosity-driven learning in educational technologies;
  • Curiosity-driven AI for assisted scientific discovery.

8.1​​ Open-ended learning and autotelic​​​‌ AI with large language​ models

The team continued to lay the foundations of autotelic AI 89, i.e. the science studying mechanisms enabling artificial agents to learn to represent and sample their own goals and achieve open-ended learning.

8.1.1 ACES:​‌ Generating a Diversity of​​ Challenging Programming Puzzles with​​​‌ Autotelic Generative Models

Participants:​ Julien Pourcel [correspondant],​‌ Cédric Colas, Gaia​​ Molinaro, Pierre-Yves Oudeyer​​​‌, Laetitia Teodorescu.​

Motivation. 

In this project,​‌ we examine how one​​ can generate an interesting​​​‌ diversity of programming puzzles​ (same domain as Codeplay).​‌ We recall that this​​ is an important case​​​‌ study for linguistic autotelic​ agents because it is​‌ a first step towards​​ generalist agents inventing their​​​‌ own problems. Inspired by​ the Evolution Through Large​‌ Models (ELM) method where​​ authors evolve robot morphologies​​​‌ expressed as Sodarace programs​ using a Large Language​‌ Model as a mutation​​ operator, we aim to​​ develop an evolutionary method​​​‌ to create a diverse‌ population of problems using‌​‌ pretrained Language Models. We​​ remark that diversity-producing methods​​​‌ (such as Map-Elites) need‌ a Behavioral Characterization (BC)‌​‌ space in which to​​ measure the diversity of​​​‌ their evolved populations; this‌ is feasible with virtual‌​‌ creatures but seems pretty​​ hard with programming puzzles.​​​‌ We thus introduce the‌ notion of a Semantic‌​‌ BC space, composed of​​ abstract categories, and labelling​​​‌ inside this space is‌ done through LLM responses.‌​‌ In our case, we​​ introduce 10 programming descriptors:​​​‌

  • 0 - Sorting and‌ Searching
  • 1 - Counting‌​‌ and Combinatorics
  • 2 -​​ Trees and Graphs
  • 3​​​‌ - Mathematical Foundations
  • 4‌ - Bit Manipulation
  • 5‌​‌ - String Manipulation
  • 6​​ - Geometry and Grid​​​‌ Problems
  • 7 - Recursion‌ and Dynamic Programming
  • 8‌​‌ - Stacks and Queues​​
  • 9 - Optimization Algorithms​​​‌

We then define an‌ archive of generated programming‌​‌ puzzles and their solutions,​​ and the position of​​​‌ a puzzle in the‌ archive is given by‌​‌ the combination of descriptors​​ that the puzzle-solution pair​​​‌ belongs to (the semantic‌ representation of a puzzle‌​‌ thus being a 10-dimensional​​ vector). The semantic archive​​​‌ is used to store‌ puzzles.

We then perform‌​‌ experiments with the following​​ algorithms:

  • ACES: our proposed method samples a target cell (combination of descriptors) in the archive at random and populates a few-shot prompt for the language model with puzzles from neighboring cells in the archive. See Figure 2 for an illustration.
  • ELM Semantic: based on ELM, example puzzles and solutions are given as few-shot in-context examples, and a puzzle sampled from the archive is then mutated.
  • ELM: same as the previous one, except that we do not use the semantic archive for sampling: instead, we build an archive with centroidal Voronoi tessellations from the embeddings of puzzles in the latent space of a language model. This baseline allows us to compare the semantic archive with a more classical one.
  • Static Gen: in this method, puzzles are sampled from the train set and added as few-shot examples in the prompt.

For all experiments​​ we seed the archive​​​‌ with the P3 train‌ set.
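The semantic archive and the ACES sampling step described above can be sketched as follows. This is an illustrative sketch, not the team's implementation: the class and function names are hypothetical, cells are assumed to be 10-dimensional binary descriptor vectors, and "neighboring cells" is assumed to mean filled cells at the smallest Hamming distance from the target.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

N_DESCRIPTORS = 10  # the ten programming descriptors listed above
Cell = Tuple[int, ...]  # binary vector: 1 if the puzzle exhibits the descriptor


def hamming(a: Cell, b: Cell) -> int:
    """Number of descriptors on which two cells disagree."""
    return sum(x != y for x, y in zip(a, b))


class SemanticArchive:
    """Stores puzzles in cells indexed by their combination of descriptors."""

    def __init__(self) -> None:
        self.cells: Dict[Cell, List[str]] = defaultdict(list)

    def add(self, puzzle: str, descriptors: Cell) -> None:
        if len(descriptors) != N_DESCRIPTORS:
            raise ValueError("expected one binary flag per descriptor")
        self.cells[descriptors].append(puzzle)

    def nearest_examples(self, target: Cell, k: int) -> List[str]:
        """Return up to k puzzles taken from the filled cells closest to `target`."""
        ranked = sorted(self.cells, key=lambda cell: hamming(cell, target))
        examples: List[str] = []
        for cell in ranked:
            for puzzle in self.cells[cell]:
                examples.append(puzzle)
                if len(examples) == k:
                    return examples
        return examples


def sample_target_cell(rng: random.Random) -> Cell:
    """Sample a target combination of descriptors uniformly at random."""
    return tuple(rng.randint(0, 1) for _ in range(N_DESCRIPTORS))


archive = SemanticArchive()
archive.add("puzzle_sort", (1, 0, 0, 0, 0, 0, 0, 0, 0, 0))
archive.add("puzzle_string", (0, 0, 0, 0, 0, 1, 0, 0, 0, 0))
archive.add("puzzle_sort_string", (1, 0, 0, 0, 0, 1, 0, 0, 0, 0))

# Fixed target for illustration: sorting + string manipulation + optimization.
target = (1, 0, 0, 0, 0, 1, 0, 0, 0, 1)
few_shot = archive.nearest_examples(target, k=2)
random_target = sample_target_cell(random.Random(0))
```

In a full loop, the selected examples and the target descriptors would be inserted into the prompt of the puzzle generator, and valid generated pairs would be labelled and added back to the archive.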

Results. 

We report results of our runs in Figure 3. Overall, the methods based on semantic archives, ACES and ELM-Semantic, achieve the highest diversity in the semantic space. We report diversity measures inside the embedding spaces of various smaller language models in Figure 4. In these figures, we see that ACES overall outperforms the other methods on this measure of diversity. We additionally test the suitability of generated puzzles as finetuning data for smaller LMs. For all methods, we finetune a smaller model (OpenLlama-3b) on the generated set and measure the pass@k metric for different values of k on the P3 test set; we report the scores in Figure 5. This figure shows a tradeoff between how diverse the data is and how useful it is for reaching a high score on the P3 test set. Further work is needed to obtain data that is both diverse and useful.


Figure 2: Overview of ACES. ACES maintains an archive of discovered puzzles grouped into cells indexed by their semantic representation (skill combination). ACES runs in several steps: 1) sample a target semantic goal and relevant examples from the archive. 2) given these, generate a puzzle f and its solution g with the puzzle generator. 3) test the validity of that pair by running assert(f(g())) in the interpreter. 4) if the pair is valid, obtain its semantic representation with the puzzle labeler. 5) add the new pair to its corresponding cell in the archive.
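
The validity test in step 3 of the caption can be sketched as below. This is a minimal illustration under stated assumptions: `is_valid_pair` is a hypothetical helper, puzzles and solutions are assumed to be Python source strings defining `f` and `g`, and the sketch has no sandboxing or timeout, both of which would be needed before executing model-generated code in practice.

```python
def is_valid_pair(puzzle_src: str, solution_src: str) -> bool:
    """Return True if the solution g makes the puzzle f return True.

    Any exception raised while defining or running f and g counts as invalid.
    """
    namespace: dict = {}
    try:
        exec(puzzle_src, namespace)    # expected to define f
        exec(solution_src, namespace)  # expected to define g
        return namespace["f"](namespace["g"]()) is True
    except Exception:
        return False


# Hypothetical puzzle: find a string whose reverse is 'olleh'.
puzzle = "def f(s: str) -> bool:\n    return s[::-1] == 'olleh'\n"
good_solution = "def g():\n    return 'hello'\n"
bad_solution = "def g():\n    return 'world'\n"
broken_solution = "def g():\n    return 1 / 0\n"

valid = is_valid_pair(puzzle, good_solution)
invalid = is_valid_pair(puzzle, bad_solution)
crashed = is_valid_pair(puzzle, broken_solution)
```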

Figure 3: Diversity of generated puzzles in semantic space. We report the evolution of several diversity metrics computed in the semantic space as a function of the number of puzzle-solution pairs generated by the puzzle generator. Semantic algorithms (ACES and ELM Semantic) achieve higher diversity in the semantic space.

Figure 4​​: Diversity of generated​​​‌ puzzles in embedding spaces.​ We report the evolution​‌ of the pairwise distance​​ between puzzle-solution pair embeddings​​​‌ as a function of​ the number of generated​‌ puzzle-solution pairs, for three​​ different embedding representation spaces​​​‌ (average across seeds).

Figure 5: Downstream performance on the P3 test set. Pass@k is the fraction of puzzles solved after k attempts ($k \in [1:10]$). Green overlaps with yellow.
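
The pass@k definition given in the caption can be computed directly from per-puzzle attempt outcomes. The sketch below follows that definition literally; the function name and the data layout are assumptions, and it is not necessarily the estimator used to produce the figure.

```python
from typing import Dict, List


def pass_at_k(attempts: Dict[str, List[bool]], k: int) -> float:
    """Fraction of puzzles with at least one success among their first k attempts."""
    if not attempts:
        raise ValueError("no puzzles to evaluate")
    solved = sum(any(outcomes[:k]) for outcomes in attempts.values())
    return solved / len(attempts)


# Hypothetical outcomes: one list of attempt results per test puzzle.
results = {
    "p1": [False, True, False],
    "p2": [False, False, False],
    "p3": [True, False, False],
    "p4": [False, False, True],
}
curve = {k: pass_at_k(results, k) for k in (1, 2, 3)}
```

By construction, the resulting curve is non-decreasing in k, since allowing more attempts can only add solved puzzles.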

8.1.2 MAGELLAN: Metacognitive Generalization​​ of Learning Progress for​​​‌ Online RL in LLM​ agents

Participants: Loris Gaven​‌ [correspondant], Thomas Carta​​, Clément Romac,​​​‌ Cédric Colas, Pierre-Yves​ Oudeyer, Olivier Sigaud​‌ [ISIR Sorbonne Université, Paris,​​ France], Sylvain Lamprier​​ [Univ Angers, LERIA].​​​‌

We are developing MAGELLAN‌47, a method‌​‌ designed to enable LLM-based​​ reinforcement learning (RL) agents​​​‌ to estimate their own‌ Learning Progress (LP) and‌​‌ use it to dynamically​​ organize their training curriculum.​​​‌ By leveraging the LLM's‌ rich semantic representations, MAGELLAN‌​‌ allows agents to generalize​​ LP estimations to unseen,​​​‌ language-defined goals, overcoming limitations‌ of classical methods that‌​‌ require direct evaluation of​​ each goal.

MAGELLAN uses​​​‌ the LLM to generate‌ latent representations of goals‌​‌ and tasks, capturing their​​ semantic relationships. It continuously​​​‌ monitors the agent's performance‌ over time, estimating LP‌​‌ as the change in​​ success rates for specific​​​‌ goals. This approach enables‌ the agent to identify‌​‌ goals where it is​​ making progress and focus​​​‌ its training on those‌ areas. MAGELLAN's integration ensures‌​‌ that the LLM-based agent​​ can simultaneously refine its​​​‌ policy and competence estimations,‌ adapting both to new‌​‌ tasks in real time.​​
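
For reference, the classical way of estimating LP as a change in success rate, which requires direct evaluation of every goal, can be sketched as follows. This is the kind of baseline MAGELLAN aims to improve on, not MAGELLAN itself; the class name, the window size and the use of an absolute difference are assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Dict


class WindowedLPEstimator:
    """Per-goal LP: |recent success rate - older success rate| over two windows."""

    def __init__(self, window: int = 4) -> None:
        self.window = window
        self.history: Dict[str, Deque[float]] = {}

    def update(self, goal: str, success: bool) -> None:
        """Record the outcome of one attempt at `goal`."""
        buffer = self.history.setdefault(goal, deque(maxlen=2 * self.window))
        buffer.append(1.0 if success else 0.0)

    def learning_progress(self, goal: str) -> float:
        """Return 0.0 until two full windows of outcomes are available."""
        buffer = list(self.history.get(goal, []))
        if len(buffer) < 2 * self.window:
            return 0.0
        older, recent = buffer[: self.window], buffer[self.window:]
        return abs(sum(recent) / self.window - sum(older) / self.window)


estimator = WindowedLPEstimator(window=4)
outcomes = {
    "grow plant": [False, False, False, False, True, True, True, False],
    "impossible goal": [False] * 8,
    "mastered goal": [True] * 8,
}
for goal, successes in outcomes.items():
    for success in successes:
        estimator.update(goal, success)

lp = {goal: estimator.learning_progress(goal) for goal in outcomes}
next_goal = max(lp, key=lp.get)
```

Such an estimator returns zero for any goal that has never been attempted, which is the limitation that motivates generalizing LP estimates through the LLM's representations of language-defined goals.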

Our experiments in the​​​‌ Little-Zoo environment, which‌ features hierarchical and commonsense-driven‌​‌ tasks, demonstrate that MAGELLAN​​ effectively prioritizes high-LP goals,​​​‌ even when faced with‌ novel or unseen tasks.‌​‌ Unlike traditional LP estimation​​ methods, which rely on​​​‌ direct evaluations and struggle‌ with generalization, MAGELLAN enables‌​‌ the agent to quickly​​ identify meaningful learning opportunities.​​​‌ This results in faster‌ adaptation, improved sample efficiency,‌​‌ and more effective curriculum​​ organization, paving the way​​​‌ for truly autonomous agents‌ capable of navigating vast‌​‌ and complex goal spaces.​​

8.1.3 When goals are​​​‌ beyond reach: Metacognitive monitoring‌ guides autonomous discovery of‌​‌ frugal assistance-seeking in LLMs​​

Participants: Clément Romac [correspondant]​​​‌, Pierre-Yves Oudeyer.‌

Enhancing LLMs with metacognitive capabilities has been identified as a key challenge for improving the trustworthiness and interpretability of these models. In this work, we investigate how such metacognitive abilities can be leveraged to trigger external assistance when the model's own capabilities are insufficient. While improving LLMs' capabilities is essential, it is equally critical that models learn to recognize their own limitations, and to seek or rely on external support in real-world settings where functional competence may be partial or underdeveloped. This ability forms a crucial part of a broader learning loop: requesting help when needed, then internalizing the knowledge or skills acquired through that assistance.

Augmenting LLMs with external​​​‌ assistance and, in particular,‌ what has been named‌​‌ "tools", has become a​​ well-established practice. These augmentations​​​‌ range from calculators and‌ retrieval systems to code‌​‌ interpreters, and even other​​ LLMs. This shift has​​​‌ led to a rethinking‌ of the role of‌​‌ LLMs—not as general-purpose solvers,​​ but as assistants (often​​​‌ referred to as action‌ models) that must learn‌​‌ to orchestrate the use​​ of external resources and​​​‌ integrate their outputs into‌ coherent, human-readable responses.

This‌​‌ reframing introduces a new​​ class of decision-making problems:​​​‌ LLMs must determine when‌ and which external assistance‌​‌ to invoke. However, the​​ optimal assistance strategy is​​​‌ not known in advance.‌ Some tasks may be‌​‌ solvable independently by the​​ LLM, while others may​​​‌ require external help. Additionally,‌ the tools themselves may‌​‌ be fallible—for instance, even​​​‌ large or specialized LLMs​ can return suboptimal results.​‌ To address this, most​​ prior approaches rely on​​​‌ supervised learning, fine-tuning LLMs​ on curated datasets containing​‌ examples of effective tool​​ use. More recently, several​​​‌ works have begun exploring​ how RL can be​‌ used to learn assistance-seeking​​ strategies from scratch, without​​​‌ requiring predefined tool-use demonstrations.​

While both RL and more conventional supervised learning approaches have shown promise, an important dimension of the assistance-seeking problem remains largely understudied: external assistance comes at a cost. This cost may take the form of increased latency in the LLM's response, financial charges for calling APIs, or computational overhead. Although early work on tool use acknowledged this issue, it has received limited attention since. A recent exception introduced a first approach to this multi-objective problem: maximizing task performance while minimizing assistance costs. Its method involves a multi-stage learning pipeline: (1) an estimator of LLM performance is trained using interaction data between the LLM and the task space; (2) a separate model is trained to simulate the outputs of both the LLM and its assistance sources; and (3) given a predefined cost budget, Dynamic Programming is used to derive the optimal assistance strategy. While effective, this method is computationally intensive and requires extensive data collection and training across multiple stages.

In this work, we propose a fully online approach (see Figure 6) based on multi-objective contextual multi-armed bandits. Given a task, we frame the decision of whether to keep the task with the LLM or delegate it to external assistance as the selection of an arm. We consider the dual objective of maximizing the answer's performance $R$ while minimizing its cost $C$, and we adopt scalarization, i.e., combining the two objectives into a single weighted sum $U = \beta R + (1 - \beta) C$ that our approach aims to maximize. Crucially, our method naturally adapts to any user-specified budget by treating the budget as the scalarization weight $\beta$ that balances the two objectives. A central challenge of this approach lies in efficiently estimating the performance and cost associated with each option (i.e., the LLM and all available assistance sources), using as few interactions as possible. To address this, we draw inspiration from MAGELLAN and leverage the LLM itself to learn these estimations.
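
The arm selection implied by the scalarized utility can be sketched as follows. This is an illustration rather than the method's implementation: the function names and the numerical estimates are hypothetical, and the cost term is assumed to be expressed as a negated cost, so that maximizing the weighted sum penalizes expensive options.

```python
from typing import Dict, Tuple


def scalarized_utility(performance: float, cost_term: float, beta: float) -> float:
    """U = beta * R + (1 - beta) * C, with C assumed to be a negated cost."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    return beta * performance + (1.0 - beta) * cost_term


def select_arm(estimates: Dict[str, Tuple[float, float]], beta: float) -> str:
    """Pick the option (LLM alone or an assistance source) with the highest utility."""
    return max(
        estimates,
        key=lambda arm: scalarized_utility(*estimates[arm], beta),
    )


# Hypothetical per-task estimates: (expected performance, negated cost).
estimates = {
    "llm_alone": (0.55, 0.0),
    "calculator": (0.95, -0.5),
}

performance_focused = select_arm(estimates, beta=0.9)
cost_focused = select_arm(estimates, beta=0.3)
```

With a weight close to 1 the calculator is selected despite its cost, whereas a lower weight keeps the task with the LLM; in the online setting, the estimates themselves would be updated after each interaction.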

We first​​​‌ evaluate our method on​ a set of carefully​‌ designed math problems with​​ calculator tools as assistance,​​​‌ for which the optimal​ strategy is known. This​‌ setting enables us to​​ investigate how the strategy​​​‌ discovered by our method​ compares to the optimal​‌ one, as well as​​ the sample efficiency of​​​‌ our approach (i.e., the​ number of interactions required​‌ to converge). We notably​​ show that our LLM-based​​​‌ estimation of performance and​ cost reaches similar or​‌ even better performance than​​ a classic moving average​​​‌ approach which has access​ to privileged information—namely, the​‌ problem category. Finally, we​​ demonstrate the broader applicability​​ of our method by​​​‌ applying it to real-world‌ problems faced by LLMs.‌​‌ In particular, we apply​​ it to a standard​​​‌ question-answering benchmark: MMLU-Pro. The‌ results show that our‌​‌ approach is scalable to​​ complex natural language tasks​​​‌ without access to any‌ external expert knowledge.

Figure 6

We‌​‌ study how LLMs can​​ autonomously learn how and​​​‌ when to use tools‌ when solving tasks.

Figure‌​‌ 6: We frame​​ the decision of whether​​​‌ an LLM should trigger‌ external assistance as a‌​‌ multi-objective contextual multi-armed bandit​​ problem. For each task,​​​‌ the LLM estimates the‌ performance and cost of‌​‌ all available options. These​​ estimates are combined into​​​‌ a single utility score‌ using a user-defined scalarization‌​‌ weight, which specifies the​​ trade-off between maximizing performance​​​‌ and minimizing assistance cost.‌ The estimator is continuously‌​‌ updated through online interactions.​​

8.1.4 LLM-based goal generation for autotelic exploration with goal-conditioned RL

Participants: Guillaume‌​‌ Pourcel [correspondant], Grgur​​ Kovač, Thomas Carta​​​‌, Cédric Colas,‌ Pierre-Yves Oudeyer.

Designing autotelic agents capable of autonomously generating and pursuing their own goals represents a promising endeavor for open-ended learning and skill acquisition in reinforcement learning. This challenge is especially difficult in open worlds that require inventing new, previously unobserved goals. In this work, we propose an architecture where a single generalist autotelic agent is trained on an automatic curriculum of goals. We leverage large language models (LLMs) to generate goals as code for reward functions based on learnability and difficulty estimates. The goal-conditioned RL agent is trained on those goals, sampled based on learning progress. We compare our method to an adaptation of OMNI-EPIC to goal-conditioned RL. Our preliminary experiments suggest that our method generates a higher proportion of learnable goals, indicating better adaptation to the goal-conditioned learner. This work is described in this technical report.

Figure 7

The architecture used.

Figure 7: The architecture used.
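
Sampling goals based on learning progress can be sketched as below. This is only an illustration under assumptions the text does not state: learning progress is taken to be the absolute difference between the recent and the older success rate on a goal, and a small uniform-sampling probability `epsilon` is added so that new goals still get tried. The function names and default values are hypothetical.

```python
import random


def learning_progress(successes, window=10):
    """Absolute change in success rate between the last two windows of episodes."""
    if len(successes) < 2 * window:
        return 0.0
    recent = successes[-window:]
    older = successes[-2 * window:-window]
    return abs(sum(recent) / window - sum(older) / window)


def sample_goal(history, rng, epsilon=0.2, window=10):
    """Sample a goal proportionally to its learning progress.

    `history` maps each goal to its list of 0/1 episode outcomes.
    """
    goals = list(history)
    progress = [learning_progress(history[g], window) for g in goals]
    # Fall back to uniform sampling for exploration or when nothing is progressing.
    if rng.random() < epsilon or sum(progress) == 0:
        return rng.choice(goals)
    return rng.choices(goals, weights=progress, k=1)[0]


# Toy usage: goal "a" just went from failing to succeeding, goal "b" is already mastered.
history = {"a": [0] * 10 + [1] * 10, "b": [1] * 20}
chosen = sample_goal(history, random.Random(0), epsilon=0.0)
```

In this toy case only goal "a" has non-zero learning progress, so it is the one selected when exploration is disabled.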

8.1.5 Self-Improving Language​​ Models for Evolutionary Program​​​‌ Synthesis: A Case Study‌ on ARC-AGI

Participants: Julien‌​‌ Pourcel [correspondant], Cédric​​ Colas, Pierre-Yves Oudeyer​​​‌.

In our work‌ on SOAR (Self-improving Operators‌​‌ for Automated program Refinements),​​ published at ICML 2025,​​​‌ we address a fundamental‌ limitation in program synthesis:‌​‌ while large language models​​ struggle to solve complex​​​‌ tasks in single attempts,‌ traditional evolutionary approaches are‌​‌ constrained by the fixed​​ capabilities of their underlying​​​‌ generative models. We developed‌ a framework that integrates‌​‌ language models into a​​ self-improving evolutionary loop, enabling​​​‌ continuous performance enhancement through‌ experience rather than relying‌​‌ on static model capabilities.​​

Figure 8

SOAR architecture.

Figure 8​​​‌: SOAR architecture.

Our‌ method operates through an‌​‌ iterative two-phase process that​​ we designed to create​​​‌ a virtuous cycle of‌ improvement. First, an evolutionary‌​‌ search phase employs a​​ language model to sample​​​‌ and refine candidate program‌ solutions. Second, a hindsight‌​‌ learning phase converts these​​ search attempts—both successful and​​​‌ unsuccessful—into valid problem-solution pairs‌ that we use to‌​‌ fine-tune the LLM's sampling​​ and refinement capabilities. This​​​‌ approach leverages positive transfer‌ between the sampling and‌​‌ refinement fine-tuning tasks, allowing​​​‌ the system to bootstrap​ its own improvement without​‌ requiring human-engineered training data.​​
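
The hindsight learning phase can be illustrated with a small sketch. The assumption here, which goes beyond what the text spells out, is that a program that fails its target task is still a correct solution to the synthetic task defined by its own outputs. Programs are plain Python callables in this toy version, and `hindsight_relabel` and `solves` are hypothetical names.

```python
def solves(program, task):
    """A task is a list of (input, expected_output) pairs."""
    return all(program(x) == y for x, y in task)


def hindsight_relabel(program, task):
    """Build a synthetic task that `program` solves by construction."""
    inputs = [x for x, _ in task]
    synthetic_task = [(x, program(x)) for x in inputs]
    return synthetic_task, program


# Toy usage: the target task is doubling, the sampled candidate adds one instead.
target_task = [(1, 2), (2, 4), (3, 6)]


def candidate(x):
    return x + 1


failed_on_target = not solves(candidate, target_task)
synthetic_task, solution = hindsight_relabel(candidate, target_task)
```

Here the candidate fails the target task, but the relabeled pair `(synthetic_task, solution)` is a valid problem-solution example of the kind that can be added to a fine-tuning set.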

We evaluated SOAR on​​​‌ the challenging ARC-AGI benchmark,​ which tests abstract reasoning​‌ and program induction capabilities.​​ Our framework solves 52%​​​‌ of the public test​ set, establishing state-of-the-art results​‌ for program synthesis using​​ open-source language models. These​​​‌ improvements compound through iterations,​ with models showing enhanced​‌ abilities to both generate​​ initial program ideas and​​​‌ refine existing solutions. Notably,​ the gains carry over​‌ to test-time adaptation, enabling​​ continuous improvement on target​​​‌ problems even after deployment.​

Our research demonstrates that​‌ program synthesis systems can​​ transcend the limitations of​​​‌ their base models through​ self-improvement, opening new possibilities​‌ for autonomous AI development.​​ By showing how iterative​​​‌ model improvement can overcome​ performance plateaus inherent to​‌ search methods, SOAR provides​​ a drop-in upgrade for​​​‌ existing systems like AlphaEvolve​ or ShinkaEvolve, transforming their​‌ fixed LLM operators into​​ continuously improving ones.

8.1.6​​​‌ WorldLLM: Improving LLMs' world​ modeling using curiosity-driven theory-making​‌

Participants: Guillaume Levy, Cédric Colas, Pierre-Yves Oudeyer, Thomas Carta, Clément Romac [correspondant].

Large Language Models​​ (LLMs) possess broad knowledge​​​‌ about the world, but​ leveraging this knowledge for​‌ precise dynamics modeling remains​​ challenging. While LLMs can​​​‌ engage in general reasoning,​ they struggle to make​‌ accurate predictions in specific​​ domains with structured observations​​​‌ and dynamics, such as​ physics simulations or video​‌ games. This limitation stems​​ from the gap between​​​‌ their general capabilities and​ the need for grounded,​‌ domain-specific understanding.

In this paper, we present WorldLLM 59, a framework for autonomous improvement of an LLM's world modeling abilities. Our approach combines 1) probabilistic theory induction to produce hypotheses that are given in our LLM's prompt to improve its predictions and 2) curiosity-driven RL to explore the environment and collect transitions poorly predicted by the current hypotheses (see Figure 9). Formally, our LLM's world model is the conditional probability $P(s_{t+1} \mid s_t, a_t, H)$, where $s_t$ represents a state, $a_t$ an action, and $H$ a set of natural language hypothesized theories. This probability is computed by the LLM by giving it $s_t$, $a_t$, and $H$ in its prompt and taking the probability of $s_{t+1}$ to follow this prompt. Our key insight is that natural language theories can help ground an LLM's broad knowledge into precise predictive power by providing domain-specific rules. Our approach consists of three interacting components: (1) our LLM that computes $P(s_{t+1} \mid s_t, a_t, H)$ by conditioning its predictions on both a state-action pair and the current hypotheses, (2) a theory generator that updates natural language hypotheses using Bayesian inference, and (3) a curiosity-driven reinforcement learning agent trained to collect evidence against the current hypotheses. Inspired by how humans, from children to scientists, actively update their internal world model by performing experiments, our agent's exploration provides new evidence for hypothesis refinement, creating a virtuous cycle of improvement.

Figure 9

WorldLLM.​​

Figure 9: WorldLLM.​​​‌
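
The interplay between the world model, the hypotheses, and the curiosity signal can be sketched as below. Several parts are assumptions rather than the method of the paper: the LLM is replaced by a toy `toy_log_prob` function, the curiosity reward is taken to be the negative log-likelihood of a transition, and the Bayesian update is replaced by a greedy comparison of log-likelihoods. All names are hypothetical.

```python
import math


def toy_log_prob(state, action, next_state, hypotheses):
    """Toy stand-in for the LLM's log P(s_next | s, a, H)."""
    rule = f"{state}+{action}->{next_state}"
    return math.log(0.9 if rule in hypotheses else 0.1)


def log_likelihood(log_prob, transitions, hypotheses):
    """Sum of log P(s_next | s, a, H) over the collected transitions."""
    return sum(log_prob(s, a, s_next, hypotheses) for s, a, s_next in transitions)


def curiosity_reward(log_prob, transition, hypotheses):
    """Higher reward for transitions that the current hypotheses predict poorly."""
    s, a, s_next = transition
    return -log_prob(s, a, s_next, hypotheses)


def keep_better_hypotheses(log_prob, transitions, current, candidate):
    """Greedy stand-in for the Bayesian update: keep the better-scoring set."""
    current_score = log_likelihood(log_prob, transitions, current)
    candidate_score = log_likelihood(log_prob, transitions, candidate)
    return candidate if candidate_score > current_score else current


# Toy usage: one observed transition and a candidate theory that explains it.
transitions = [("water", "heat", "steam")]
current = set()
candidate = {"water+heat->steam"}
reward_before = curiosity_reward(toy_log_prob, transitions[0], current)
updated = keep_better_hypotheses(toy_log_prob, transitions, current, candidate)
reward_after = curiosity_reward(toy_log_prob, transitions[0], updated)
```

Once the candidate theory is adopted, the same transition yields a lower curiosity reward, which pushes the exploring agent towards transitions that remain poorly explained.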

We demonstrate our approach‌ in a video game‌​‌ environment where agents manipulate​​ and combine objects, showing​​​‌ that WorldLLM successfully learns‌ accurate predictive models while‌​‌ generating human-interpretable theories about​​ environment dynamics. This work​​​‌ contributes to a growing‌ body of research on‌​‌ improving LLMs' world modeling​​ capabilities and grounding their​​​‌ knowledge in specific domains.‌ By combining ideas from‌​‌ theory-based RL, Bayesian inference,​​ and active exploration, we​​​‌ provide a framework for‌ learning structured, interpretable world‌​‌ models that leverage both​​ the broad knowledge of​​​‌ LLMs and domain-specific experiences‌ without any costly gradient-based‌​‌ learning.

8.1.7 HERAKLES: Hierarchical​​ Skill Compilation for Open-ended​​​‌ LLM Agents

Participants: Thomas Carta [correspondant], Clément Romac, Loris Gaven, Pierre-Yves Oudeyer, Olivier Sigaud, Sylvain Lamprier.

In our‌​‌ work on HERAKLES (HiERarchicAl​​ sKill compiLation for open-Ended​​​‌ llm agentS), we address‌ a fundamental challenge in‌​‌ open-ended AI: as goal​​ spaces expand, increasingly complex​​​‌ goals require composing multiple‌ elementary actions, leading to‌​‌ combinatorial explosion that impedes​​ learning progress. While existing​​​‌ hierarchical reinforcement learning approaches‌ rely on expert-defined skill‌​‌ spaces and pre-trained low-level​​ policies, such designs are​​​‌ inadequate for open-ended scenarios‌ where goal spaces naturally‌​‌ diversify across a broad​​ spectrum of difficulties. We​​​‌ developed a framework that‌ enables continuous skill compilation,‌​‌ dynamically expanding the agent's​​ capabilities through experience rather​​​‌ than relying on fixed,‌ predefined abstractions.

Our method‌​‌ operates through a two-level​​ hierarchical architecture designed to​​​‌ create a virtuous cycle‌ of skill acquisition. A‌​‌ high-level policy, instantiated as​​ a Large Language Model,​​​‌ decomposes complex goals into‌ subgoals and selects skills‌​‌ from an evolving skill​​ space. A low-level policy,​​​‌ implemented as lightweight neural‌ networks, executes these skills‌​‌ through primitive actions. Crucially,​​ as the hierarchical agent​​​‌ masters a goal, the‌ complete trajectory is compiled‌​‌ into the low-level policy​​ as a new reusable​​​‌ skill. A competence estimator‌ predicts the low-level policy's‌​‌ success probability for each​​ skill, ensuring the high-level​​​‌ policy only invokes skills‌ that can be reliably‌​‌ executed. This approach leverages​​ language's compositional and combinatorial​​​‌ properties to structure the‌ skill space, enabling generalization‌​‌ across semantically related goals.​​
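
The gating role of the competence estimator and the growth of the skill space can be sketched as follows. This is a toy illustration under assumptions: the competence estimator is a plain dictionary of success probabilities, a single threshold is used both for mastery and for invocation, and a newly compiled skill starts with zero low-level competence until it has been trained. The class `SkillLibrary` and its methods are hypothetical.

```python
class SkillLibrary:
    """Evolving skill space gated by a competence estimate (toy version)."""

    def __init__(self, primitive_skills, threshold=0.8):
        self.threshold = threshold
        # Assumption: primitive skills are always reliably executable.
        self.competence = {skill: 1.0 for skill in primitive_skills}

    def compile(self, goal, hierarchical_success_rate):
        """Register a mastered goal as a new skill; return True if it was added."""
        if hierarchical_success_rate < self.threshold or goal in self.competence:
            return False
        # The low-level policy has not been trained on this skill yet.
        self.competence[goal] = 0.0
        return True

    def update_competence(self, skill, success_probability):
        """Record the estimated success probability of the low-level policy."""
        self.competence[skill] = success_probability

    def invocable(self):
        """Skills the high-level policy is allowed to call."""
        return sorted(s for s, p in self.competence.items() if p >= self.threshold)


# Toy usage with Crafter-like goal names (illustrative only).
library = SkillLibrary(["move", "collect wood"])
added = library.compile("make wood pickaxe", hierarchical_success_rate=0.9)
before_training = library.invocable()
library.update_competence("make wood pickaxe", 0.85)
after_training = library.invocable()
```

The new skill only becomes available to the high-level policy once the estimated success probability of the low-level policy passes the threshold.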

We evaluated HERAKLES in​​​‌ the Crafter environment, a‌ procedurally generated Minecraft-like world‌​‌ designed to assess agent​​ capabilities within a unified​​​‌ open-ended framework. Our framework‌ achieves Crafter scores above‌​‌ 70, while baselines plateau​​ below 30. More importantly,​​​‌ HERAKLES scales near-linearly with‌ goal difficulty, whereas non-hierarchical‌​‌ methods exhibit exponential growth​​ in learning time. The​​​‌ framework also demonstrates strong‌ generalization: when tested on‌​‌ synonymous goal formulations, HERAKLES​​ shows only a 16%​​​‌ performance drop compared to‌ 24-27% for baselines, and‌​‌ maintains robust performance on​​ compositional variants requiring repeated​​​‌ skill execution.

Our research‌ demonstrates that open-ended agents‌​‌ can transcend the limitations​​ of fixed skill spaces​​​‌ through continuous compilation, opening‌ new possibilities for lifelong‌​‌ learning systems. By showing​​ how mastered behaviors can​​​‌ be recursively encoded at‌ lower levels for rapid‌​‌ reuse—mirroring how humans overcome​​​‌ complexity barriers through hierarchical​ learning—HERAKLES provides a principled​‌ approach for building agents​​ that autonomously expand their​​​‌ competencies over time.

8.1.8 Software Engineering Agents for Embodied Controller Generation: A Study in Minigrid Environments

Participants: Timothé Boulet​ [correspondant], Xavier Hinaut​‌ [Mnemosyne, Inria Bordeaux],​​ Clément Moulin-Frier, Nathanaël​​​‌ Fijalkow.

Motivation. 

Software​ Engineering Agents (SWE-Agents) have​‌ proven effective for traditional​​ software engineering tasks with​​​‌ accessible codebases, but their​ performance for embodied tasks​‌ requiring well-designed information discovery​​ remains unexplored. In this​​​‌ paper 44, we​ present the first extended​‌ evaluation of SWE-Agents on​​ controller generation for embodied​​​‌ tasks, adapting Mini-SWE-Agent (MSWEA)​ to solve 20 diverse​‌ embodied tasks from the​​ Minigrid environment. Our experiments​​​‌ compare agent performance across​ different information access conditions:​‌ with and without environment​​ source code access, and​​​‌ with varying capabilities for​ interactive exploration. We quantify​‌ how different information access​​ levels affect SWE-Agent performance​​​‌ for embodied tasks and​ analyze the relative importance​‌ of static code analysis​​ versus dynamic exploration for​​​‌ task solving. This work​ establishes controller generation for​‌ embodied tasks as a​​ crucial evaluation domain for​​​‌ SWE-Agents and provides baseline​ results for future research​‌ in efficient reasoning systems.​​

This work investigates a fundamental question: how do SWE-Agents perform in controller generation for embodied tasks? Our approach involves a code-agent (the SWE-Agent interacting with a code-environment involving codebases and terminals) that generates controller-agents (Python programs) to solve tasks in an embodied setup, creating a two-level agency structure that differs from direct LLM-environment interaction approaches common in embodied AI. Figure 10 illustrates this two-level agency structure. The agent can evaluate its proposed solutions by executing them in the environment and receiving feedback in the form of success/failure and reward. A task terminates either when the agent validates with a special command or when the maximum number of steps or cost is reached.

Figure 10

Two-level agency structure: a code-agent interacts with a code-environment to generate controller-agents (Python programs) to solve the embodied task.

Figure 10: Two-level agency structure: a code-agent interacts with a code-environment to generate controller-agents (Python programs) to solve the embodied task.

We evaluate the​‌ challenge of controller generation​​ for embodied tasks by​​​‌ adapting Mini-SWE-Agent (MSWEA) to​ solve diverse Minigrid tasks​‌ under different information access​​ conditions:

  • Source Code Access​​​‌: When the agent​ can read Minigrid environment​‌ code, it can analyze​​ environment mechanics, constraints, and​​​‌ object interactions to inform​ controller design.
  • Interactive Exploration: When the agent can write and execute scripts to probe the environment, it can discover dynamics through exploration, observing the outcomes of actions in various states.
Results.​ 

The best@5 success rates​‌ of MSWEA across different​​ tasks and information access​​​‌ conditions are summarized in​ Figure 11. We​‌ display standard deviation as​​ error bars in all​​​‌ our plots.
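
As a reading aid, the best@5 metric can be sketched as below. The definition is an assumption, since the text does not spell it out: for each task, the agent is run several times, each run yields a controller with some success rate, and the metric keeps the best of these. The function names and the task names in the usage are illustrative.

```python
def best_at_k(success_rates_per_task):
    """Map each task to the best success rate among its k runs."""
    return {task: max(rates) for task, rates in success_rates_per_task.items()}


def mean_score(best_scores):
    """Average of the per-task best@k scores."""
    return sum(best_scores.values()) / len(best_scores)


# Toy usage: five runs per task (made-up numbers).
runs = {
    "task_a": [0.0, 0.5, 1.0, 0.25, 0.5],
    "task_b": [0.0, 0.0, 0.0, 0.0, 0.0],
}
best = best_at_k(runs)
overall = mean_score(best)
```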

Minigrid PO was very hard for MSWEA, with many tasks remaining unsolved even with full access. In Minigrid FO, however, all tasks except one are solved at least by MSWEA with full access. Partial observability, as a component of embodied tasks, is thus a hard obstacle for SWE-Agents to overcome.

Figure 11

Best@5 success rate of‌​‌ MSWEA across different tasks​​ and information access conditions​​​‌ in Fully Observable Minigrid.‌

Figure 11: Best@5‌​‌ success rate of MSWEA​​ across different tasks and​​​‌ information access conditions in‌ Fully Observable Minigrid.

To identify how the type of task influences performance under the different information access conditions, we grouped the average best@5 success rate into 4 categories (navigation, manipulation, hazard, memory) and also report the overall average across all tasks. The results are shown in Figures 12 and 13.

Figure 12

Mean-by-category​​​‌ best@5 success rate in‌ Fully Observable Minigrid

Figure‌​‌ 12: Mean-by-category best@5​​ success rate in Fully​​​‌ Observable Minigrid
Figure 13

Mean-by-category best@5‌ success rate in Partially‌​‌ Observable Minigrid

Figure 13​​: Mean-by-category best@5 success​​​‌ rate in Partially Observable‌ Minigrid

In the Fully Observable benchmark, comparing MSWEA (blue bars) with its fully ablated version (red bars), which has neither source code read access nor interactive exploration, we observe that performance drops dramatically. An agent with only the Test-Access capability (i.e., being able to test its solution to obtain the success rate of its controller on the task) obtains much worse results, but surprisingly still manages to solve some tasks through iterated submissions.

If we try to recover the MSWEA performance level by adding only code access (cyan bars), we see very limited improvement, which means that reading the code only partially helps the agent and that the difficulty lies elsewhere. If we add only the interactive execution capability, however (orange bars), performance returns to a level comparable to MSWEA. This pattern is consistent across all task categories and is particularly marked for manipulation tasks, where exact knowledge of how the environment operates is required to solve the task. This systematic pattern means that interactive access is an essential capability of SWE-Agents that allows them to perform significantly better in embodied tasks.

In the Partially Observable benchmark, performance is much lower than in Minigrid FO, in particular for the complex manipulation tasks. There are different patterns depending on the task category, but we do not try to interpret them, as they may arise either from statistical variability, given the relatively high standard errors, or from subtle, hard-to-infer, task-specific factors that bias the agent's behavior in ways not observed in similar tasks. The overall performance does not vary significantly with the information access conditions. We interpret this as the PO tasks being inherently too hard for MSWEA, such that the agent only solves the simplest tasks, such as the easiest navigation tasks, and can make little use of different information access to increase performance. This leads us to believe that strongly embodied tasks such as Minigrid PO tasks represent a good benchmark for SWE-Agents: they perform decently on some tasks, but on others, even with good LLMs and access to source code and execution, they still have significant room for improvement in understanding how the environment functions. These results encourage the use of embodied tasks in future benchmarks for software engineering agents.

8.2 Models​​ of cultural evolution in​​​‌ humans and AI systems​

As generative AI systems​‌ become powerful cultural transmission​​ technologies that influence human​​​‌ cultural evolution in important​ ways, and can also​‌ have their own cultural​​ processes through machine-machine large​​​‌ scale interaction, the study​ of the dynamics of​‌ cultural processes in populations​​ of AI systems/humans becomes​​​‌ crucial.

8.2.1 The effect​ of social network structure​‌ on collective innovation

Participants: Eleni Nisioti [correspondant], Mateo Mahaut, Pierre-Yves Oudeyer, Ida Momennejad, Sebastian Risi, Clément Moulin-Frier.

Innovations are a central component of open-ended skill acquisition: they denote the emergence of new solutions by the recombination of existing ones, and their presence is necessary to ensure a continuous complexification of an agent's cultural repertoire. While we often tend to attribute discoveries to certain innovative individuals, a broad look at the history of our species shows that human innovation is primarily a collective process. Fields such as psychology and anthropology have been studying the ability of human groups to innovate for some time, with studies indicating that the social network structure has a significant impact: fully-connected structures are better suited for quick convergence in easy problems with clear global optima, while partially-connected structures perform best in difficult tasks where local optima may lure agents away from the globally optimal solution 94. At the same time, a parallel story is unfolding in reinforcement learning (RL): distributed RL is a sub-field where multiple agents solve a task collectively 134. Compared to the single-agent paradigm, distributed RL algorithms converge quicker and often achieve superior performance. However, these algorithms have only considered full connectivity. In this inter-disciplinary project, we presented a novel learning framework that augments distributed RL with the notion of a social network structure and employed it to study the hypothesis from human studies that partial connectivity performs best in innovation tasks.

Cultural evolution​‌ in populations of RL​​ agents. 

We implemented such innovation tasks using Wordcraft, a recently introduced RL playground inspired by the Little Alchemy 2 game (see left of Figure 14 for an illustration of how this task works). We considered a wide diversity of social network structures: static structures that remain constant throughout learning (fully-connected, ring, small-world) and a dynamic structure where the group oscillates between phases of low and high connectivity (we illustrate this dynamic structure on the right of Figure 14). Each agent in our implementation employs the DQN learning algorithm and exchanges experiences, in the form of sequences of state-action combinations, with its neighbors.

Figure 14.a
Figure 14.b

Figure 14: (Left)​​ Illustration of an innovation​​​‌ task, consisting of an‌ initial set of elements‌​‌ (Earth, Water) and a​​ recipe book indicating which​​​‌ combinations create new elements.‌ Upon creating a new‌​‌ element the player moves​​ up an innovation level​​​‌ and receives a reward‌ that increases monotonically with‌​‌ levels. (Right) Dynamic social​​ network structures oscillate between​​​‌ phases of low connectivity,‌ where experience sharing takes‌​‌ place within clusters, and​​ high connectivity, where experiences​​​‌ spread between clusters.
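
The social network structures can be sketched as neighbor maps. This is an illustration under assumptions: the small-world structure is omitted, the low-connectivity phase is modeled as fully connected clusters with no links between them, and the high-connectivity phase is modeled as a fully-connected group, whereas the text only says that experiences spread between clusters in that phase. All function names and parameters are hypothetical.

```python
def fully_connected(n):
    """Every agent shares experiences with every other agent."""
    return {i: {j for j in range(n) if j != i} for i in range(n)}


def ring(n):
    """Each agent shares experiences with its two ring neighbors (n >= 3)."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def clustered(n, cluster_size):
    """Agents only share experiences within their own cluster."""
    return {
        i: {j for j in range(n) if j != i and j // cluster_size == i // cluster_size}
        for i in range(n)
    }


def dynamic(n, cluster_size, step, period):
    """Alternate between within-cluster sharing and full connectivity."""
    low_connectivity_phase = (step % period) < period // 2
    if low_connectivity_phase:
        return clustered(n, cluster_size)
    return fully_connected(n)


# Toy usage: 4 agents in 2 clusters, phases switching every 5 steps.
neighbors_early = dynamic(4, cluster_size=2, step=0, period=10)[0]
neighbors_late = dynamic(4, cluster_size=2, step=5, period=10)[0]
```

Each agent would then push its DQN experiences only to the agents returned for the current step.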

A central conclusion of our empirical analysis was that the dynamic social network structure performs best. In addition to the performance that groups achieve, we measured behavioral and mnemonic metrics such as behavioral conformity and mnemonic diversity. Such metrics were inspired by human studies and helped us further analyze the behavior of groups. For example, one empirical observation was that sharing experiences did not help the group learn quicker in a very simple innovation task; instead, the fully-connected group was the slowest. By looking at the diversity in the memories of the agents, we observed that the fully-connected structure had the highest individual diversity (left of Figure 15) and the lowest group diversity (right of Figure 15): sharing experiences with others diversifies an individual's experiences but also homogenizes the group, which is bad for its performance.
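
The contrast between individual and group diversity can be made concrete with a small sketch. The definitions are assumptions, as the text does not give formulas: individual diversity is taken as the average number of distinct experiences in one agent's memory, and group diversity as the number of distinct experiences across all memories. The function names are hypothetical.

```python
def individual_diversity(memories):
    """Average number of distinct experiences held by a single agent."""
    return sum(len(set(memory)) for memory in memories) / len(memories)


def group_diversity(memories):
    """Number of distinct experiences held by the group as a whole."""
    return len(set().union(*memories))


# Toy usage: three agents that share everything versus three that share nothing.
sharing_group = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b", "c"}]
isolated_group = [{"a", "d"}, {"b", "e"}, {"c", "f"}]
sharing_scores = (individual_diversity(sharing_group), group_diversity(sharing_group))
isolated_scores = (individual_diversity(isolated_group), group_diversity(isolated_group))
```

In this toy example the sharing group has the higher individual diversity but the lower group diversity, which mirrors the homogenization effect described above.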

Figure 15.a
Figure 15.b

Illustration of an innovation​​​‌ task, consisting of an‌ initial set of elements‌​‌ (Earth, Water) and a​​ recipe book indicating which​​​‌ combinations create new elements.‌ Upon creating a new‌​‌ element the player moves​​ up an innovation level​​​‌ and receives a reward‌ that increases monotonically with‌​‌ levels. Dynamic social network​​ structures oscillate between phases​​​‌ of low connectivity, where‌ experience sharing takes place‌​‌ within clusters, and high​​ connectivity, where experiences spread​​​‌ between clusters.

Figure 15‌: (Left) Illustration of‌​‌ an innovation task, consisting​​ of an initial set​​​‌ of elements (Earth, Water)‌ and a recipe book‌​‌ indicating which combinations create​​ new elements. Upon creating​​​‌ a new element the‌ player moves up an‌​‌ innovation level and receives​​ a reward that increases​​​‌ monotonically with levels. (Right)‌ Dynamic social network structures‌​‌ oscillate between phases of​​ low connectivity, where experience​​​‌ sharing takes place within‌ clusters, and high connectivity,‌​‌ where experiences spread between​​ clusters.

We see the contribution of this project as two-fold. From the perspective of fields studying human intelligence, we have shown that using RL algorithms as a computational tool is a promising direction towards increasing the verisimilitude of simulations and analyzing both behavior and memory. From the perspective of RL, we have shown that distributed RL algorithms should move beyond the fully-connected architecture and explore groups with dynamic topologies. This work is currently a preprint 136 and is about to be submitted to PNAS. We open-source the code at this link.

Cultural​​ evolution in populations of​​​‌ LLM agents. 

In 2024, we extended this framework with agents equipped with Large Language Models (LLMs) playing Little Alchemy 2, a creative video game originally developed for humans (Figure 16). We first study an LLM in isolation and discover that it exhibits both useful skills and crucial limitations. We then study groups of LLMs that share information related to their behaviour and focus on the effect of social connectivity on collective performance. In agreement with previous human and computational studies (including the one described above), we observe that groups with dynamic connectivity out-compete fully-connected groups. Our work reveals opportunities and challenges for future studies of collective innovation that are becoming increasingly relevant as Generative Artificial Intelligence algorithms and humans innovate alongside each other. We published this work at the ALife 2024 conference 139.

Figure 16

Figure​​​‌ 16: Studying collective​ innovation in groups of​‌ LLMs: A) we experiment​​ with Little Alchemy 2,​​​‌ a game where players​ combine real-world items to​‌ create new ones. A​​ knowledge graph describes the​​​‌ possible combinations (we only​ present a small sub-part​‌ of the graph which​​ contains 720 items in​​​‌ total) B) Alice-LLM and​ Bob-LLM are two LLMs​‌ playing the game together.​​ They are provided with​​​‌ the same intro prompt,​ explaining the rules of​‌ the game, and the​​ same task (they start​​​‌ with the same set​ of items). Alice-LLM and​‌ Bob-LLM have identical weights​​ but behave differently because​​​‌ the state prompt depends​ on their crafting history.​‌ They are informed about​​ the actions of others​​​‌ through their prompt. In​ this paper, we study​‌ how groups of such​​ LLM agents are able​​​‌ to efficiently explore a​ knowledge graph, focusing in​‌ particular on the effect​​ of different social structures​​ specifying with whom and​​​‌ when they can share‌ information

8.2.2 When LLMs‌​‌ Play the Telephone Game:​​ Cultural Attractors as Conceptual​​​‌ Tools to Evaluate LLMs‌ in Multi-turn Settings

Participants:‌​‌ Jérémy Perez [correspondant],​​ Grgur Kovač, Corentin​​​‌ Léger, Cédric Colas‌, Gaia Molinaro,‌​‌ Maxime Derex, Pierre-Yves​​ Oudeyer, Clément Moulin-Frier​​​‌.

As large language‌ models (LLMs) start interacting‌​‌ with each other and​​ generating an increasing amount​​​‌ of text online, it‌ becomes crucial to better‌​‌ understand how information is​​ transformed as it passes​​​‌ from one LLM to‌ the next. While significant‌​‌ research has examined individual​​ LLM behaviors, existing studies​​​‌ have largely overlooked the‌ collective behaviors and information‌​‌ distortions arising from iterated​​ LLM interactions. Small biases,​​​‌ negligible at the single‌ output level, risk being‌​‌ amplified in iterated interactions,​​ potentially leading the content​​​‌ to evolve towards attractor‌ states.

In this project,‌​‌ we ran a series​​ of telephone game experiments,​​​‌ applying a transmission chain‌ design borrowed from the‌​‌ human cultural evolution literature:​​ LLM agents iteratively receive,​​​‌ produce, and transmit texts‌ from the previous to‌​‌ the next agent in​​ the chain.

Figure 17

This figure depicts the method used to estimate the strength and position of theoretical attractors. Each dot corresponds to one chain, for a total of 100 chains (20 initial texts * 5 seeds). The position of a dot on the x-axis corresponds to the value of the property (positivity in this example) in the initial text, while the position on the y-axis corresponds to the value of this property in the text produced after 50 generations. We then used these 100 data points to fit a linear regression predicting the relationship between the initial and final values of the property.

Figure 17:‌​‌ Method for estimating attractor​​ strength and position.
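
The regression step can be sketched as follows. Reading the attractor position as the fixed point where the fitted line crosses y = x, and the attractor strength as one minus the slope, is an assumption made for this illustration; the text only states that a linear regression is fitted between initial and final property values. The function names are hypothetical.

```python
def fit_line(initial_values, final_values):
    """Ordinary least squares fit of final = slope * initial + intercept."""
    n = len(initial_values)
    mean_x = sum(initial_values) / n
    mean_y = sum(final_values) / n
    sxx = sum((x - mean_x) ** 2 for x in initial_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(initial_values, final_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def attractor(slope, intercept):
    """Return (position, strength) under the assumed definitions."""
    strength = 1 - slope
    if strength == 0:
        return None, 0.0  # the line never crosses y = x at a single point
    return intercept / strength, strength


# Toy usage: final values are pulled towards 0.5 whatever the initial value.
initial = [0.0, 0.5, 1.0]
final = [0.4, 0.5, 0.6]
slope, intercept = fit_line(initial, final)
position, strength = attractor(slope, intercept)
```

Under these definitions, a slope near 0 means that chains end at the same value regardless of where they started (strong attractor), while a slope near 1 means that the initial value is largely preserved (weak attractor).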

Our​​​‌ main contributions are:

  • We propose that there might be a gap in current LLM evaluation methods (single-turn evaluations might not be suited to assess the properties of multi-turn interactions).
  • We empirically confirm​​​‌ this hypothesis by showing‌ that multi-turn interactions indeed‌​‌ often lead to distributions​​ of text properties that​​​‌ are significantly different from‌ what is observed after‌​‌ a single interaction.
  • We​​ introduce novel conceptual and​​​‌ methodological tools to fill‌ this gap, grounded in‌​‌ research in cultural evolution,​​ and in particular the​​​‌ concept of cultural attractor.‌
  • We showcase the potential‌​‌ of this method by​​ applying it to compare​​​‌ the effect of different‌ tasks, of different models,‌​‌ of temperature, and of​​ fine-tuning on the properties​​​‌ of multi-turn interactions.
  • We find several robust effects, such as the fact that less constrained tasks lead to stronger attractors, that some properties possess stronger attractors than others, and that fine-tuning can shift the position and modify the strength of attractors.
Figure 18

The height of the bars represents the position (top row) and strength (bottom row) of theoretical attractors, for each property (columns), task, and model. Less constrained tasks, such as Continue, appear to produce stronger attractors than more constrained tasks, such as Rephrase. Attractors appear to be stronger for toxicity than for length. Finally, the position of attractors appears to vary between models.

Figure 18: Attractor strength and position.

These findings highlight​​ the importance of accounting​​​‌ for multi-step transmission dynamics​ and represent a first​‌ step towards a more​​ comprehensive understanding of LLM​​​‌ cultural dynamics.

This work was presented during a 15-minute talk given at the 2024 Cultural Evolution Society conference, and was accepted as a conference paper at the International Conference on Learning Representations 2025 (ICLR 2025) 144. The code is available here. We also created a website featuring a Data Explorer tool, which allows direct inspection of the texts generated during our experiments.

8.2.3 Recursive Training Loops​ in LLMs: How training​‌ data properties modulate distribution​​ shift in generated data?​​​‌

Participants: Grgur Kovač [correspondant]​, Jérémy Perez [correspondant]​‌, Remy Portelas,​​ Peter Ford Dominey,​​​‌ Pierre-Yves Oudeyer.

Large language models (LLMs) are increasingly used in the creation of online content, creating feedback loops as subsequent generations of models will be trained on this synthetic data. Such loops were shown to lead to distribution shifts, i.e. models misrepresenting the true underlying distributions of human data (also called model collapse). However, how human data properties affect such shifts remains poorly understood. In this paper, we provide the first empirical examination of the effect of such properties on the outcome of recursive training. We first confirm that using different human datasets leads to distribution shifts of different magnitudes. Through exhaustive manipulation of dataset properties combined with regression analyses, we then identify a set of properties associated with distribution shift magnitudes. Lexical diversity is found to amplify these shifts, while semantic diversity and data quality mitigate them. Furthermore, we find that these influences are highly modular: data scraped from a given internet domain has little influence on the content generated for another domain. Finally, experiments on political bias reveal that human data properties affect whether the initial bias will be amplified or reduced. Overall, our results portray a novel view, where different parts of the internet may undergo different types of distribution shift.

The main contributions of​‌ this work are:

  • We propose and experimentally confirm the hypothesis that different training datasets lead to different distribution shift dynamics, motivating an investigation into the underlying causes.
  • Through an extensive set of experiments (four datasets over three domains), we identify several data properties that influence distribution shift dynamics.
  • We reveal that these​​ influences are highly modular,​​​‌ with generated content being​ mostly influenced by human​‌ data properties from the​​ same domain.
  • We find​​​‌ that distribution shifts also​ occur in terms of​‌ political lean, and that​​ the type of shift​​​‌ (bias amplification, reduction or​ inversion) depends on the​‌ political lean of the​​ human data.

Figure 19: Iterative chain. In each generation, a fresh base model is fine-tuned on texts sampled from the accumulated data pool (except generation 0, where it is trained only on human posts). The model generates posts, which are added to the pool alongside some newly sampled human posts.
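The iterative chain can be sketched as a short simulation loop. The sketch below is illustrative only: the "model" is a Gaussian fitted to a single scalar text property, standing in for the fine-tuning of a real LLM, and every function name and parameter value (`fine_tune`, `generate`, `run_chain`, pool and sample sizes) is hypothetical rather than taken from the paper.

```python
import random
import statistics


def fine_tune(training_texts):
    """Stand-in for fine-tuning a fresh base model: fit a Gaussian
    to a scalar property of the training texts."""
    return statistics.mean(training_texts), statistics.stdev(training_texts)


def generate(model, n_posts, rng):
    """Stand-in for text generation: sample the property from the fitted Gaussian."""
    mean, std = model
    return [rng.gauss(mean, std) for _ in range(n_posts)]


def run_chain(human_posts, n_generations=5, train_size=200,
              n_generated=100, n_new_human=50, seed=0):
    rng = random.Random(seed)
    human_posts = list(human_posts)
    rng.shuffle(human_posts)
    # Generation 0 sees only human posts: the pool starts purely human.
    pool = [human_posts.pop() for _ in range(train_size)]
    history = []
    for _ in range(n_generations):
        training_set = rng.sample(pool, min(train_size, len(pool)))
        model = fine_tune(training_set)            # a fresh model every generation
        pool.extend(generate(model, n_generated, rng))
        n_fresh = min(n_new_human, len(human_posts))
        pool.extend(human_posts.pop() for _ in range(n_fresh))
        history.append(model)
    return history, pool


data_rng = random.Random(42)
human = [data_rng.gauss(0.0, 1.0) for _ in range(1000)]
history, pool = run_chain(human)
```

Tracking how the fitted mean and standard deviation in `history` drift across generations gives a toy analogue of the distribution shifts studied in this work.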

This work was published as a conference paper at EMNLP 2025 49.

8.2.4 Intrinsic motivation‌​‌ is key to understanding​​ peer cultures

Participants: Jérémy​​​‌ Perez [correspondant], Maxime‌ Derex, Pierre-Yves Oudeyer‌​‌, Clément Moulin-Frier.​​

This paper 64 is a commentary on 123, written in response to the call for open peer commentary on this target article in Behavioral and Brain Sciences, 1–68. In the target article, the authors make an intriguing case that peer cultures could play a key role in cultural adaptation by generating qualitatively different cultural variation compared to adult cultures. However, the mechanisms responsible for this distinction remain unclear. In our commentary, we discuss how accounting for the role of intrinsic motivation in shaping the content of peer cultures may help explain their evolutionary dynamics.

8.2.5 Cultural variation​​ and regularities in intrinsically​​​‌ motivated exploration: investigating autonomous‌ goal selection in BaYaka‌​‌ foragers and Bandongo fisher-farmers​​

Participants: Jérémy Perez [correspondant]​​​‌, Sarah Pope-Caldwell,‌ Sheina Lew-Levy, Pierre-Yves‌​‌ Oudeyer, Maxime Derex​​, Clément Moulin-Frier.​​​‌

TLDR: This study investigates how recent performance and recent progress influence autonomous goal selection in children and adults from two cultural groups in the Congo Basin. All data necessary for this project were collected during Jérémy Perez's mission in Congo in July-August 2025, and analyses are still ongoing. This project was made possible through a collaboration with Sheina Lew-Levy from Durham University and Sarah Pope-Caldwell from Georgia State University. An abstract for this project will be submitted to the 2026 conference of the European Human Behaviour & Evolution Association.

Objective:

By influencing which goals individuals set for themselves, intrinsic motivation plays a central role in structuring autonomous learning trajectories. Grounded in theoretical work, recent empirical studies have uncovered the features that make an activity intrinsically motivating. For instance, having experienced recent progress towards a goal was found to influence the probability of selecting it, reflecting curiosity-driven exploration. However, these studies have exclusively focused on humans from Western cultures. The role of the cultural environment in determining the strategies used during intrinsically motivated goal exploration thus remains unclear.

Method:

In‌ the present study, we‌​‌ investigated how recent performance​​ and recent progress influence​​​‌ autonomous goal selection in‌ 60 Congolese BaYaka foragers‌​‌ (30 children, 30 adults)​​ and 57 Bandongo fisher-farmers​​​‌ (29 children, 28 adults).‌ To do so, we‌​‌ adapted the free-choice paradigm​​​‌ used in the previous​ studies, in which participants​‌ are free to select,​​ and switch between, learning​​​‌ activities of different difficulties.​ Pre-registered analyses were used​‌ to uncover how recent​​ performance and recent progress​​​‌ predict activity choices.

Preliminary​ results:

Preliminary results indicate that the strategies used by participants in the present study are qualitatively similar to those previously observed in Western participants. Specifically, many Bandongo and BaYaka participants rely on recent progress to guide their activity choices. However, clear cross-cultural differences exist: for instance, recent performance had a greater influence on goal choices in Bandongo participants than in BaYaka participants. Our results also indicate noticeable heterogeneity within cultural groups with respect to the strategies guiding self-directed learning.

Conclusion:​

By taking a cross-cultural​‌ perspective on intrinsic motivation,​​ this study highlights the​​​‌ role of the cultural​ niche in shaping the​‌ mechanisms underlying self-directed learning,​​ and contributes to building​​​‌ a more representative picture​ of human curiosity-driven exploration.​‌

8.2.6 The cultural evolution​​ of human goals: How​​​‌ individuals generate, select, and​ transmit goals

Participants: Jérémy​‌ Perez [correspondant], Cédric​​ Colas, Gaia Molinaro​​​‌, Pierre-Yves Oudeyer,​ Maxime Derex, Clément​‌ Moulin-Frier.

This work​​ has been submitted to​​​‌ the special issue on​ Goal Dynamics in Cognition​‌ of the journal Topics​​ in Cognitive Science, and​​​‌ is currently under review.​

Abstract:

Humans pursue goals​‌ that are remarkably diverse​​ and vary over time​​​‌ and cultures. These goals​ shape which behaviors are​‌ explored, valued, and socially​​ transmitted, yet most theories​​​‌ of cultural evolution focus​ on how behaviors evolve​‌ while leaving the origins​​ of goals unexamined. We​​​‌ argue that a complete​ understanding of cultural evolution​‌ requires explaining how goals​​ themselves emerge, vary, and​​​‌ persist across generations. Building​ on studies of motivation​‌ and curiosity in cognitive​​ science and artificial intelligence,​​​‌ we introduce the notion​ of cultural autotelic agents—individuals​‌ who actively generate, select,​​ and transmit their own​​​‌ goals within social environments.​ By highlighting the cognitive​‌ and motivational mechanisms that​​ drive goal formation and​​​‌ selection, this framework extends​ existing models of cultural​‌ evolution and helps explain​​ the open-ended, self-propelling character​​​‌ of human culture.


Figure 20: Humans are cultural autotelic agents. We introduce the notion of cultural autotelic agents, i.e. agents that combine individual and social learning to represent, generate, select, and transmit their own goals. This model departs from the historical conceptualization of agents as problem-solvers (left column). This standard perspective focuses on how agents optimize behaviors toward goals that are externally imposed. This view is largely present in research on individual cognition (top-left) and has inspired most experimental paradigms in cultural evolution (bottom-left, e.g. transmission chains). Research on motivation, developmental psychology, and developmental artificial intelligence has extended this view toward the concept of autotelic agents, i.e. agents able to self-generate and pursue their own goals (top-right). This conceptualization affords a more complete understanding of proactively exploratory behaviors in humans, in particular how their past behavior influences their goal generation and selection mechanisms. Here, we propose integrating such insights from cultural evolution and autotelic learning to introduce the concept of cultural autotelic agents (bottom-right). Under this view, agents are active in the generation and selection of the goals they pursue. These goal generation and selection mechanisms are influenced both by social information and individually collected information. We argue that this conceptualization is necessary to think about the cultural evolutionary dynamics of goals.

8.2.7 Evolving Interaction Protocols​​​‌ for Open-Ended Collective Innovation‌

Participants: Akhi Mocherla,‌​‌ Jérémy Perez, Eleni​​ Nisioti, Cédric Colas​​​‌.

In exploratory domains such as science, art, and design, progress emerges not from achieving predefined objectives but from accumulating novel and meaningful discoveries 140. Lab and field studies of human collective innovation have shown that a group's ability to explore, and thus to innovate, critically depends on how individuals communicate with each other 94, 130, 124. For example, increasing group connectivity speeds up innovation in the short term but reduces diversity within the collective, negatively impacting long-term innovation. Partially connected groups thus accumulate the most innovations in deceptive search spaces. Computational studies have confirmed this in groups of evolving agents 121, 78, reinforcement learning agents 137, and Large Language Models (LLMs) 138, highlighting the key role of collective dynamics in engineered multi-agent systems. Despite this, systematic approaches for optimising how groups interact remain underdeveloped. In this work, we propose an approach for designing interaction protocols (IPs) that govern who communicates with whom, what is communicated, and when. As in past computational studies 137, 138, we use the text-based game Little Alchemy 2 (LA2) as a test-bed for collective innovation. To explore the IP space systematically, we employ a Quality-Diversity (QD) algorithm 152 that discovers repertoires with high performance and behavioral diversity. We maintain an archive of IPs, each evaluated via multiple trials. As in previous work 100, our approach follows the Novelty Search with Local Competition (NS-LC) paradigm and employs LLMs within QD for solution generation and novelty estimation.
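The archive logic described above can be illustrated with a simplified sketch in the spirit of NS-LC. It is not the implementation used in this work: here a protocol is reduced to two hypothetical numbers (`connectivity`, `share_rate`), new candidates come from random mutation instead of an LLM, novelty is a Euclidean distance to the nearest archived protocols instead of an LLM-based estimate, and the fitness function is an arbitrary toy.

```python
import math
import random


def evaluate(protocol, rng, n_trials=5):
    """Toy fitness averaged over several noisy trials (arbitrary function)."""
    connectivity, share_rate = protocol
    base = 1.0 - 4.0 * (connectivity - 0.5) ** 2 - (share_rate - 0.7) ** 2
    return sum(base + rng.gauss(0.0, 0.05) for _ in range(n_trials)) / n_trials


def nearest_indices(descriptor, archive, k):
    """Indices of the k archived entries closest to the descriptor."""
    order = sorted(range(len(archive)),
                   key=lambda i: math.dist(descriptor, archive[i]["descriptor"]))
    return order[:k]


def try_insert(candidate, archive, k=3, novelty_threshold=0.15):
    """Add the candidate if it is novel; otherwise let it compete locally
    against its nearest archived neighbour."""
    if not archive:
        archive.append(candidate)
        return True
    idx = nearest_indices(candidate["descriptor"], archive, k)
    novelty = sum(math.dist(candidate["descriptor"], archive[i]["descriptor"])
                  for i in idx) / len(idx)
    if novelty > novelty_threshold:
        archive.append(candidate)
        return True
    if candidate["fitness"] > archive[idx[0]]["fitness"]:
        archive[idx[0]] = candidate            # local competition
        return True
    return False


def evolve(n_iterations=200, seed=0):
    rng = random.Random(seed)
    archive = []
    for _ in range(n_iterations):
        if archive:
            parent = rng.choice(archive)["protocol"]
            protocol = [min(1.0, max(0.0, p + rng.gauss(0.0, 0.1))) for p in parent]
        else:
            protocol = [rng.random(), rng.random()]
        candidate = {"protocol": protocol, "descriptor": protocol,
                     "fitness": evaluate(protocol, rng)}
        try_insert(candidate, archive)
    return archive


archive = evolve()
```

The two acceptance paths are what yields both diversity (novel candidates extend the archive) and quality (similar candidates only survive if they outperform their nearest neighbour).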

This work was presented as a poster at the 2025 workshop on Intrinsically Motivated Open-ended Learning (IMOL 2025).


Figure 21:​ (Left) Overview of the​‌ framework used to evolve​​ Interaction Protocols for groups​​​‌ of agents playing the​ Little Alchemy 2 game.​‌ The system iteratively generates​​ new IPs using a​​​‌ language model (LLM), evaluates​ their performance, and maintains​‌ an archive of candidate​​ solutions. Candidate IPs are​​​‌ debugged and tested, then​ evaluated for fitness and​‌ novelty relative to archived​​ solutions. The archive is​​​‌ updated based on fitness,​ and novel or improved​‌ protocols are used to​​ guide further LLM generations,​​​‌ enabling continual improvement and​ diversity in discovered solutions.​‌ (Right) Comparison of the​​ performance of an evolved​​​‌ IP to dynamic and​ fully-connected IPs from past​‌ studies.

8.2.8 Inferring the​​ Phylogeny of Large Language​​​‌ Models and Predicting their​ Performances in Benchmarks

Participants:​‌ Nicolas Yax [correspondant],​​ Pierre-Yves Oudeyer, Stefano​​​‌ Palminteri.

In recent months, the number of Large Language Models (LLMs) released has reached an unprecedented level. On the one hand, multiple private companies such as OpenAI, Anthropic, Google, Mistral, etc. are building cutting-edge models that are highly visible in society and science. However, as the number of LLMs rises, training methods are becoming more secretive, making the field increasingly opaque to science. On the other hand, a few hundred open-access language models are uploaded to the Hugging Face Hub every day, which is far too many to keep track of the evolution of LLMs in the field. Given that not all of these open models are fully transparent about their training methods, and that only very few of them are benchmarked (due to the high cost of benchmarking), there is an increasing need for methods that help keep track of the progress and evolution of these models.

We developed an algorithm named PhyloLM, inspired by phylogenetics, to compute evolutionary trees of LLMs. We show that this method is effective at reconstructing the evolutionary history of LLMs within families 22, at discriminating between families, and at finding similarities between them. Additionally, the genetic information can be used to predict LLM capabilities such as benchmark scores, with a very significant correlation between predicted and true scores. These advances could be instrumental in navigating the field of LLMs, making it more transparent at a very low cost. This work was published at ICLR 2025.


Figure 22: Phylogenetic tree reconstruction. Left: the ground truth relations among some LLMs of the Mistral family. Right: the tree reconstructed by PhyloLM for the five latest models of this family (the "leaves" of the phylogenetic tree). The numerical labels (0:3) map the true common ancestors to the inferred ones. The true and the reconstructed trees are topologically equivalent.

8.3 An Eco-Evo-Devo​​ perspective on Artificial Intelligence​​​‌

8.3.1 Research perspective: The‌ Ecology of Open-Ended skill‌​‌ Acquisition

Participants: Clément Moulin-Frier​​ [correspondant], Eleni Nisioti​​​‌, Pierre-Yves Oudeyer.‌

An intriguing feature of the human species is our ability to continuously invent new problems and to proactively acquire new skills in order to solve them, an ability called Open-Ended Skill Acquisition (OESA). Understanding the mechanisms underlying OESA is an important scientific challenge in both cognitive science (e.g. by studying infant cognitive development) and in artificial intelligence (aiming at computational architectures capable of open-ended learning). Both fields, however, mostly focus on cognitive and social mechanisms at the scale of an individual's life. It is rarely acknowledged that OESA, an ability that is fundamentally related to the characteristics of human intelligence, has necessarily been shaped by ecological, evolutionary and cultural mechanisms interacting at multiple spatiotemporal scales.


Figure 23: The ORIGINS framework identifies central components (boxes) and their interactions (arrows) driving Open-Ended Skill Acquisition, both in terms of its evolution from environmental complexity (roughly: left-to-right arrows) as well as its open-ended aspect through feedback mechanisms (right-to-left arrows). The employed terminology reflects a diversity of mechanisms considered in both Artificial Intelligence and Human Behavioral Ecology.

We have recently initiated a new research direction aiming at understanding, modeling and simulating the dynamics of OESA in artificial systems, grounded in theories studying its eco-evolutionary bases in the human species. To this end, we have proposed a conceptual framework, called ORIGINS (illustrated in Fig. 23 and developed in 131), expressing the complex interactions between environmental, adaptive, multi-agent and cultural dynamics. This framework raises three main research questions:

  • What​‌ are the ecological conditions​​ favoring the evolution of​​​‌ autotelic agents?
  • How to​ bootstrap the formation of​‌ a cultural repertoire in​​ populations of adaptive agents?​​​‌
  • What is the role​ of cultural feedback effects​‌ in the open-ended dynamics​​ of human skill acquisition?​​​‌

The contributions described below address some aspects of these research questions. Note that there might be a thematic overlap between the last two research questions outlined above and the previous section on Models of Cultural Evolution (Sec. 8.2), where we also present related results.

8.3.2 Eco-evolutionary Dynamics​​​‌ of Non-episodic Neuroevolution in​ Large Multi-agent Environments

Participants:​‌ Gautier Hamon [correspondant],​​ Eleni Nisioti, Clément​​​‌ Moulin-Frier.

This work was published in 2023, but we keep it in this report as it introduces a general computational framework, called non-episodic neuroevolution, that forms the basis of the next two contributions.

This contribution focuses on eco-evolutionary dynamics where "organisms are not solely products but, by modifying their niche and therefore its associated fitness landscape, are also causes of evolution" 120. The main objective of this paper is to propose a method for studying large-scale eco-evolutionary dynamics in agent-based simulations with a reasonable level of biological and ecological plausibility. To this end, we implement a system with the following properties (see Fig. 24 for an illustration):

  • Non-episodic simulation environment with complex intrinsic dynamics. We model our environment after common-pool resource (CPR) appropriation problems, where a group of agents competes for finite resources. We extend an existing CPR appropriation environment 145 with multiple niches: resources regrow proportionally to the density of nearby resources, at different rates in different regions of the environment (Fig. 24). We prevent any environment or population reset during a whole simulation run, enabling coupled environmental and population dynamics that lead to complex eco-evolutionary feedback effects.
  • Continuous neuroevolution in a large, size-varying agent population. The environment contains thousands of agents, each controlled by a neural network whose weights are optimized using neuroevolution 161.
  • Physiology-driven death and reproduction. There is no notion of reward; agents are instead equipped with a physiological system that modulates their energy level, in a non-linear way, according to the resources they consume. At the evolutionary scale, agents reproduce as long as they are able to maintain their energy level within a reasonable range, and die if this level goes below a minimum threshold. This is a departure from fitness-based selection and is more in line with minimal criterion selection 76. Note that the population size can vary with time.
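The properties listed above can be sketched in a few lines of code. This is a deliberately simplified illustration, not the simulator used in this work: the grid is binary, the latitudinal gradient is linear, the energy update is a capped linear rule (the actual physiological model is non-linear), and all names, thresholds and rates are hypothetical.

```python
import random


def neighbour_density(grid, row, col):
    """Fraction of the 8 surrounding cells holding a resource (no wrap-around)."""
    height, width = len(grid), len(grid[0])
    count = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0) and 0 <= row + dr < height and 0 <= col + dc < width:
                count += grid[row + dr][col + dc]
    return count / 8


def regrow(grid, rng, max_rate=0.3, spontaneous=0.001):
    """One regrowth step: local growth proportional to nearby resources,
    scaled by a row-dependent ("latitudinal") rate, plus sparse spontaneous growth."""
    height = len(grid)
    new_grid = [row[:] for row in grid]
    for r in range(height):
        latitude_rate = max_rate * (r + 1) / height   # assumed linear gradient
        for c in range(len(grid[0])):
            if grid[r][c] == 0:
                p = spontaneous + latitude_rate * neighbour_density(grid, r, c)
                if rng.random() < p:
                    new_grid[r][c] = 1
    return new_grid


def physiology_step(agent, ate, rng, metabolic_cost=0.05, gain=0.5,
                    death_threshold=0.0, repro_threshold=1.0,
                    repro_cost=0.5, max_energy=2.0, mutation_std=0.02):
    """Update an agent's energy; return (status, offspring_or_None).
    There is no reward signal: survival and reproduction depend on energy only."""
    energy = agent["energy"] - metabolic_cost + (gain if ate else 0.0)
    agent["energy"] = min(max_energy, energy)
    if agent["energy"] < death_threshold:
        return "dead", None
    if agent["energy"] > repro_threshold:
        agent["energy"] -= repro_cost
        # The offspring inherits mutated copies of the parent's network weights.
        child = {"energy": repro_cost,
                 "weights": [w + rng.gauss(0.0, mutation_std) for w in agent["weights"]]}
        return "alive", child
    return "alive", None


rng = random.Random(0)
grid = [[1 if rng.random() < 0.3 else 0 for _ in range(10)] for _ in range(10)]
grid = regrow(grid, rng)   # never reset: the same grid is updated for the whole run
```

Because neither the grid nor the population is ever reset, over-consumption in one region lowers future regrowth there, which is what couples the ecological and evolutionary dynamics.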

Figure 24: Our simulation environment (Left) is an extension of the Common Pool Resource (CPR) environment 145, 122: a two-dimensional grid-world where some cells contain resources (in green) that the agents (in black) can collect. Resources grow depending on the presence of other resources around them (local growth, Middle) with an additional very sparse spontaneous growth, which means that over-consumption may lead to their local depletion. We introduce a latitudinal model of resource regrowth. We prevent any environment and population reset during a whole simulation, enabling continual eco-evolutionary dynamics to take place. Each agent may reproduce or die according to a physiological model modulating its energy level as a function of lifetime and resource consumption (Top-Right). The population size varies during the simulation according to the current amount of available resources and the current ability of agents to collect them. Evolution occurs through the mutation of a parent's network weights when it produces an offspring.

In addition to experiments conducted in the large environment presented above, we also conduct experiments in a "lab environment" (as opposed to the "natural environment") to isolate the study of certain behaviors (which are often intertwined with many other dynamics in the natural environment).

One interesting result of these simulations is the emergence of sustainable foragers which, as shown in the lab environment (Fig. 25), tend not to overconsume when there are enough resources in their neighbourhood. This leaves a certain amount of resources to spread, which benefits their future survival as well as the survival of their offspring (since the environment is never reset).


Figure 25: Greediness of a sustainable forager agent across evaluation environments that differ in the amount of resources. Sustainable agents are far less greedy in environments where a certain amount of resources is available. This strategy preserves resources so that they can spread, and avoids over-depletion.

This work was published at the Genetic and Evolutionary Computation Conference (GECCO) 2023. The computational framework it introduced led to the next two contributions.

8.3.3 Emergent kin selection​ of altruistic feeding via​‌ non-episodic neuroevolution

Participants: Max Taylor-Davies, Gautier Hamon, Timothé Boulet, Clément Moulin-Frier [correspondant].

This work extends the project presented in the previous contribution (Sec. 8.3.2). It results from a visit to the team by Max Taylor-Davies, a PhD student at the School of Informatics, University of Edinburgh, Scotland. It has been accepted at the International Conference on the Applications of Evolutionary Computation (part of EvoStar) 55.

At first​ glance, it seems difficult​‌ to square the phenomenon​​ of purely altruistic behaviour​​​‌ (acts which confer a​ benefit to the recipient​‌ at a cost to​​ the actor) with the​​​‌ basic principle of natural​ selection: how can a​‌ gene be selected for​​ when it decreases, rather​​​‌ than increases, the fitness​ of its host? One​‌ plausible account can be​​ made through the theory​​​‌ of inclusive fitness. Key​ to this theory is​‌ the recognition that individual​​ organisms within a social​​​‌ environment are not isolated​ from their conspecifics in​‌ terms of fitness. Whether​​ a given gene is​​​‌ selected for is thus​ determined by its effect(s)​‌ on the fitness of​​ any bearers of copies​​​‌ of that gene. Under​ this view, we can​‌ think of an altruistic​​ act as an exchange​​​‌ of fitness from one​ agent to another. If​‌ the exchange is positive-sum​​ and both sides are​​​‌ bearers of the gene​ in question, then from​‌ the gene's perspective the​​ behaviour confers a fitness​​​‌ benefit–even while it decreases​ the fitness of the​‌ acting individual.

Kin selection​​ theory 160 has proven​​​‌ to be a popular​ and widely accepted account​‌ of how altruistic behaviour​​ can evolve under natural​​​‌ selection. Hamilton's rule, first​ published in 1964 107​‌, 108, has​​ since been experimentally validated​​​‌ across a range of​ different species and social​‌ behaviours. In contrast to​​ this large body of​​​‌ work in natural populations,​ however, there has been​‌ relatively little study of​​ kin selection in silico​​​‌. In the current​ work, we offer what​‌ is to our knowledge​​ the first demonstration of​​ kin selection emerging naturally​​​‌ within a population of‌ agents undergoing continuous neuroevolution.‌​‌ Specifically, we find that​​ zero-sum transfer of resources​​​‌ from parents to their‌ infant offspring evolves through‌​‌ kin selection in environments​​ where it is hard​​​‌ for offspring to survive‌ alone. In an additional‌​‌ experiment, we show that​​ kin selection in our​​​‌ simulations relies on a‌ combination of kin recognition‌​‌ and population viscosity. We​​ believe that our work​​​‌ may contribute to the‌ understanding of kin selection‌​‌ in minimal evolutionary systems,​​ without explicit notions of​​​‌ genes and fitness maximisation.‌
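For reference, Hamilton's rule mentioned above can be stated compactly. The inequality is the standard textbook form; the numerical case is only an illustration (relatedness of one half, as between a parent and its offspring in a sexually reproducing diploid species) and is not a value taken from the simulations described here.

```latex
% Hamilton's rule: an altruistic trait is favoured by selection when
%   r = genetic relatedness between actor and recipient,
%   b = fitness benefit to the recipient,
%   c = fitness cost to the actor.
\[
  r\,b > c
\]
% Illustrative case with r = 1/2: the benefit to the recipient
% must exceed twice the cost paid by the actor.
\[
  \tfrac{1}{2}\,b > c \quad\Longleftrightarrow\quad b > 2c
\]
```

Note that b and c are measured in fitness, not in resource units: a transfer that is zero-sum in resources may still satisfy the inequality if the resource is worth more, in survival terms, to a struggling infant than to its parent.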


Figure 26:‌ The relationship between the‌​‌ estimated benefit to infants​​ of being fed and​​​‌ both the amount and‌ selectivity of feeding observed,‌​‌ shown separately for each​​ of the three experimental​​​‌ parameters we varied (and‌ combined in the rightmost‌​‌ column). Each scatterplot point​​ represents a single 500k-timestep​​​‌ simulation run (with values‌ averaged over the final‌​‌ 50k timesteps); regression lines​​ (with 95% confidence intervals)​​​‌ are shown in green.‌ Note that the y‌​‌-axis shows log(measure)​​ for both amount and​​​‌ selectivity.

This paper was‌ accepted at The International‌​‌ Conference on the Applications​​ of Evolutionary Computation (EvoAPPS)​​​‌ 2025 (part of EvoStar).‌

8.3.4 Evolving large populations‌​‌ of adaptive neural agents​​ in ecologically plausible environments​​​‌

Participants: Timothé Boulet [correspondent], Gautier Hamon, Clément Moulin-Frier.

This work continues the project presented in the previous section, with a focus on the ability of agents to develop adaptive behaviors. Specifically, we extend the framework by adding fruits, a spatially variable resource, and by giving agents a memory of the value of each type of fruit. The goal is to observe whether agents manage to exploit their knowledge of fruit values to decide which fruit to exploit.

Results: the agents were able to exploit the fruit value information to optimize their behavior. There were also some results that we did not necessarily expect and that stem from our choice of model. Notably, an agent's choice to exploit a cluster appears to be heavily influenced by social criteria (the number of agents already exploiting it) and cultural criteria (whether the cluster is empty or full of fruits). This effect exceeds the adaptability effect in the latest stages of the simulations.

8.4 Theories and experiments‌​‌ on human curiosity-driven learning​​

8.4.1 DevCur Project: studying​​​‌ the co-development of curiosity,‌ metacognition and agency in‌​‌ adolescents

Participants: Julien Rosenberger​​ [correspondent], Pierre-Yves Oudeyer​​​‌, Hélène Sauzéon.‌

Under the scope of the DevCur project, the PhD of Julien Rosenberger was started on the following topic: “How curiosity enhances learning across childhood and adolescence: Models and experimentation of the role of metacognition and agency”. After a review of the literature, a specific project was defined that aims to compare personality constructs related to the intellect. The investigated constructs are metacognitive skills (e.g., 148), curiosity traits (e.g., 115), sense of agency (e.g., 162) and intellectual humility (e.g., 73). Intellectual humility is about accurately appraising one’s cognitive limitations 169. The self-report scale of Alfano et al. 73 grounds intellectual humility in other intellectual traits: open-mindedness (recognizing one’s cognitive limitations and having an appetite for knowledge without concern for social status), intellectual modesty (having low concern for being deemed smart), engagement (being able to confront what one doesn’t understand or what differs from one’s perspective) and corrigibility (being emotionally stable when one is intellectually challenged). The labels are slightly odd but emphasize the diversity of intellectual traits one could consider in learning.

A first axis is to understand the organization of those constructs. For instance, intellectual humility has been linked to greater general knowledge and to a tendency to underestimate one’s cognitive ability 119. Those dependent variables are also related, respectively, to trait curiosity 167 and to low self-esteem and low metacognition 165. A second axis is to obtain behavioral markers of those constructs. This need for situationally bounded measures is crucial for intellectual humility 164, which is currently measured through self-reports or other-reports. Yet self-reports pose an issue because being intellectually humble is socially desirable 119, requires some recall to form that self-referenced attribute, and faces the paradox of self-attribution (i.e., some humility is required to say whether one is humble). Other-reports, in turn, are resource-intensive and introduce other factors (context, relationship, etc.).

8.5 Generative​​​‌ AI and educational technologies​

8.5.1 Investigating the use of LLMs in middle school

Participants: Pierre-Yves Oudeyer, Hélène Sauzéon [correspondent], Rania Abdelghani.

ChatGPT, one of​​ the most widely used​​​‌ generative AI (LLM) tools,​ has made accessing mass​‌ and personalized information easy​​ and straightforward, even for​​​‌ users without expertise in​ AI. More particularly, recent​‌ reports indicate that the​​ majority of surveyed students​​​‌ aged nine and older​ have already used this​‌ tool for school-related tasks.​​ However, while we know​​​‌ that students are using​ ChatGPT, there is limited​‌ understanding of how they​​ use it and its​​​‌ effects on their learning​ processes and outcomes, particularly​‌ among middle and high​​ school students and in​​​‌ subjects outside programming.

Investigating these patterns of use is a critical step toward identifying the necessary educational interventions to mitigate risks associated with misuse or harmful interactions with ChatGPT, which are particularly likely among non-expert users. To address this, we recruited 63 students aged 14 to 15 and asked them to solve science problems using ChatGPT. We examined their prompt choices, evaluations of ChatGPT's responses, and final problem-solving outcomes. Overall, our results indicate that students are still inefficient users of AI tools such as ChatGPT and are vulnerable to incorporating its misinformation, even when they report high domain knowledge and previous experience with generative AI. This highlights potential misconceptions about these tools’ capabilities and the skills required to use them effectively. Furthermore, domain knowledge alone appears insufficient to shield students from adopting misinformation generated by ChatGPT. Implementing formal educational interventions to correct these misconceptions and train students for informed usage thus seems both timely and essential, given the growing reliance on generative AI tools in education. In the longer term, fostering metacognitive skills may further promote responsible and effective use of such tools (paper in preparation).

8.5.2 Studying the impact of a pedagogical intervention on GenAI use in middle school students

Participants: Pierre-Yves Oudeyer, Hélène Sauzéon, Olivier Clerc [correspondent], Chloé Desvaux, Rania Abdelghani, Eliott Poisson, Kan Yao, Didier Roy.

Context and Objective.​​​‌ 

Generative AI (GenAI) systems‌ such as ChatGPT are‌​‌ increasingly used by students,​​ including for schoolwork. A​​​‌ pilot study conducted in‌ 2024 by the Flowers‌​‌ team with 63 students​​ aged 14–15 showed that​​​‌ students experience major difficulties‌ in formulating effective prompts‌​‌ and in evaluating the​​ quality of AI-generated answers,​​​‌ which negatively impacts their‌ performance in scientific problem-solving‌​‌ tasks. Building on this​​ work, we evaluated the​​​‌ impact of a short‌ pedagogical intervention (2 hours)‌​‌ aimed at improving students’​​ ability to formulate and​​​‌ critically evaluate prompts before‌ querying a large language‌​‌ model (LLM).

Task 

Students​​ had to solve six​​​‌ middle-school science problems using‌ ChatGPT (or a similar‌​‌ system such as DuckDuckAI).​​ Each problem included a​​​‌ statement, an image, a‌ question, and a suggested‌​‌ prompt. Students could choose​​ to use or ignore​​​‌ the suggested prompt. Two‌ types of prompts were‌​‌ provided: valid prompts (clear​​ context and precise instructions)​​​‌ and invalid prompts (insufficient‌ or vague context).

Figure‌​‌ 27: Schematic representation​​ of the science exercise​​​‌ proposed to the children.‌ The experimental task consisted‌​‌ of six science exercises​​ to be completed within​​​‌ 90 minutes. Exercises were‌ provided on paper sheets‌​‌ to prevent students from​​ directly copying and pasting​​​‌ the task description and‌ accompanying image into the‌​‌ chatbot interface.
Pedagogical Intervention.​​ 

For the experimental group,​​​‌ the study took place‌ in two phases. Two‌​‌ days before the task​​ session, students participated in​​​‌ a two-hour classroom workshop‌ designed to strengthen their‌​‌ theoretical and practical understanding​​ of GenAI. The workshop​​​‌ consisted of three parts:‌

  1. an introduction explaining how‌​‌ generative AI systems work,​​
  2. a discussion of their​​​‌ limitations, risks, and biases,‌
  3. a practical session in‌​‌ which students trained to​​​‌ analyze and reformulate prompts.​
Main Results. 

Overall, students​‌ who benefited from the​​ pedagogical intervention achieved higher​​​‌ performance than those in​ the control group. In​‌ quantitative terms, the mean​​ score (out of 20)​​​‌ was approximately 10.3 in​ the control group—comparable to​‌ the 2024 pilot study—and​​ approximately 11.4 in the​​​‌ experimental group. This difference​ is statistically significant (p​‌ < .05). The experimental​​ group not only obtained​​​‌ significantly higher scores, but​ also demonstrated more strategic​‌ use of the AI​​ system. In particular, they​​​‌ rejected invalid prompts more​ frequently and were more​‌ likely to reformulate or​​ refine their queries when​​​‌ the initial answer was​ unsatisfactory. Moreover, formulating their​‌ own prompts tended to​​ maintain or improve performance,​​​‌ even in cases where​ the suggested prompt was​‌ already valid. Importantly, self-reported​​ prior knowledge about AI​​​‌ was not associated with​ better performance, suggesting that​‌ explicit instruction and practice​​ played a more decisive​​​‌ role than familiarity with​ the technology alone.

Figure 28​​​‌: Workshop effects on​ performance and prompt acceptance.​‌ (A) Students in the​​ experimental group achieved higher​​​‌ scores than those in​ the control group on​‌ the science exercises. (B)​​ Sensitivity (d'​​​‌) was positively associated​ with performance, accounting for​‌ differences between groups. (C)​​ Predictors of prompt acceptance​​​‌ from a generalized linear​ model, showing the effects​‌ of condition, confidence, and​​ prompt validity.
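
The sensitivity index d' reported in panel (B) comes from signal detection theory. The report does not detail how it was computed; the following is a minimal sketch, assuming that accepting a valid suggested prompt counts as a hit and accepting an invalid one as a false alarm, and assuming a log-linear correction for extreme rates. The function name `d_prime` and the example counts are illustrative, not taken from the study.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumed mapping (not stated in the report):
      hit               = a valid suggested prompt was accepted
      miss              = a valid suggested prompt was rejected
      false alarm       = an invalid suggested prompt was accepted
      correct rejection = an invalid suggested prompt was rejected
    """
    # Log-linear correction: keeps both rates strictly between 0 and 1,
    # where the inverse normal CDF would otherwise be undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Hypothetical student who accepts all 3 valid prompts and rejects all 3 invalid ones.
discerning = d_prime(hits=3, misses=0, false_alarms=0, correct_rejections=3)
# Hypothetical student who accepts prompts at the same rate regardless of validity.
indifferent = d_prime(hits=2, misses=1, false_alarms=2, correct_rejections=1)
print(round(discerning, 2), round(indifferent, 2))
```

A student who accepts valid and invalid prompts at the same rate obtains a d' of zero, whatever that rate is; this is what separates sensitivity from a general tendency to accept suggestions.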
Future Research.​​​‌ 

Future work will aim​ to extend and consolidate​‌ these findings in several​​ directions. First, longitudinal studies​​​‌ will be needed to​ assess whether the strategies​‌ acquired during the workshop​​ are retained over time​​​‌ and whether students continue​ to apply them beyond​‌ the immediate post-intervention period.​​ Second, the present study​​​‌ focused on science problem​ solving with middle-school students.​‌ Future research should examine​​ whether similar short pedagogical​​​‌ interventions yield comparable benefits​ in other subjects (e.g.,​‌ mathematics, history, writing) and​​ with learners of different​​​‌ age groups.

8.5.3 LLM4Humanities:​ An Open-Source Toolkit for​‌ LLM-Assisted Qualitative Research

Participants: Olivier Clerc [correspondent], Grgur Kovač, Chloé Desvaux, Gaia Molinaro, Pierre-Yves Oudeyer.

Context and Objective. 

Qualitative​​​‌ research in experimental psychology​ and the humanities often​‌ relies on manual annotation​​ of textual data using​​​‌ defined codebooks. This process​ is indispensable but time-consuming​‌ and costly. Moreover, best​​ practices require at least​​​‌ two independent annotators in​ order to compute inter-rater​‌ reliability (IRR), which further​​ increases the required resources.​​​‌ IRR is crucial to​ distinguish variance due to​‌ coder subjectivity from variance​​ due to the phenomenon​​​‌ under study, yet in​ practice it is frequently​‌ omitted, misreported, or computed​​ using inadequate metrics (e.g.,​​​‌ raw percentage agreement or​ simple correlations). The objective​‌ of the LLM4Humanities project​​ is to design an​​​‌ open-source, Python-based toolkit and​ web application that leverages​‌ LLMs to support, accelerate,​​ and improve the methodological​​​‌ rigor of qualitative annotation​ workflows.

System and Workflow.​‌ 

LLM4Humanities provides an end-to-end​​ pipeline combining manual annotation,​​​‌ automated classification, and statistical​ evaluation. In a typical​‌ workflow, researchers first manually​​ annotate a small subset​​ of the dataset. An​​​‌ LLM is then used‌ to automatically classify the‌​‌ remaining data. The system​​ subsequently compares the model’s​​​‌ predictions to the human-annotated‌ subset using appropriate IRR‌​‌ metrics, confidence intervals, and​​ decision guidance, allowing researchers​​​‌ to assess both annotation‌ reliability and model performance.‌​‌
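
To illustrate the comparison step, here is a minimal sketch of a chance-corrected agreement measure between human labels and model labels. Cohen's kappa is used as one common IRR metric; the report does not state which metrics the toolkit implements, and the function below is written from scratch for illustration rather than taken from LLM4Humanities. The labels are made up.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two raters who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes on the manually annotated subset.
human = ["A", "A", "B", "B", "A", "B"]
model = ["A", "A", "B", "A", "A", "B"]
raw_agreement = sum(h == m for h, m in zip(human, model)) / len(human)
print(f"raw agreement = {raw_agreement:.2f}, kappa = {cohen_kappa(human, model):.2f}")
```

On this toy example the raw percentage agreement (5 out of 6) is noticeably higher than kappa, which shows why raw agreement alone tends to overstate reliability.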

Generation Mode. 

In addition​​ to annotation assistance, LLM4Humanities​​​‌ includes a generation mode‌ designed to support the‌​‌ creation of experimental material.​​ In this mode, users​​​‌ can select one or‌ several template items (e.g.,‌​‌ a mathematics exercise) and​​ specify a set of​​​‌ constraints. The system then‌ generates multiple new variants‌​‌ of the item. These​​ generated items can subsequently​​​‌ be passed through the‌ same annotation and evaluation‌​‌ pipeline, providing a first​​ automated assessment of the​​​‌ quality and consistency of‌ the generated content.

8.5.4‌​‌ GAIMHE: Generative AI and​​ Hybrid Models for Education​​​‌

Participants: Pierre-Yves Oudeyer, Olivier Clerc, Hélène Sauzéon, EvidenceB.

Context and Objective. 

Recent​​​‌ advances in generative AI‌ have opened new possibilities‌​‌ for personalized education, but​​ fully LLM-based educational systems​​​‌ raise major concerns in‌ terms of cost, scalability,‌​‌ robustness, pedagogical control, and​​ environmental impact. At the​​​‌ same time, classical Intelligent‌ Tutoring Systems (ITS) offer‌​‌ strong pedagogical structure and​​ efficiency, but lack flexibility​​​‌ for open-ended interaction and‌ content generation. The GAIMHE‌​‌ project aims to design​​ and evaluate hybrid educational​​​‌ architectures that combine the‌ strengths of both approaches:‌​‌ frugal and pedagogically robust​​ ITS for macro-level orchestration,​​​‌ and generative AI models‌ (LLMs/SLMs) for micro-level personalization,‌​‌ feedback, and content generation.​​ Beyond technical integration, the​​​‌ project also aims to‌ structure an open ecosystem‌​‌ of methods, data, and​​ benchmarks to support reproducible​​​‌ and scalable uses of‌ generative AI in education.‌​‌

Project Architecture

The proposed​​ architecture is organized around​​​‌ two complementary modes of‌ use of generative models.‌​‌ First, large banks of​​ pedagogical exercises are pre-generated​​​‌ using large language models‌ and then validated by‌​‌ human experts and orchestrated​​ by structured teaching algorithms​​​‌ within ITS. This content‌ is stored and reused‌​‌ in order to minimize​​ live calls to large​​​‌ models during learner interactions.‌ Second, smaller language models,‌​‌ or external APIs to​​ larger proprietary models when​​​‌ needed, are used in‌ real time to provide‌​‌ specific feedback. This design​​ ensures personalized support while​​​‌ preserving computational efficiency, pedagogical‌ control, and scalability.
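
The two modes of use can be sketched as follows. This is a hypothetical illustration of the design principle only: the class names, fields, and the trivial selection rule are assumptions and do not describe the actual GAIMHE implementation, and the feedback model is a stub standing in for a small language model or an external API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Exercise:
    exercise_id: str
    statement: str
    validated: bool = False  # set to True once a human expert has reviewed it


@dataclass
class HybridTutor:
    # (statement, learner answer) -> feedback text; stands in for an SLM or API call
    feedback_model: Callable[[str, str], str]
    bank: Dict[str, Exercise] = field(default_factory=dict)
    live_calls: int = 0  # number of real-time model calls made so far

    def add_pregenerated(self, exercise: Exercise) -> None:
        """Mode 1: store an exercise generated offline by a large model."""
        self.bank[exercise.exercise_id] = exercise

    def next_exercise(self, completed: List[str]) -> Exercise:
        """Placeholder orchestration: first validated exercise not yet done.

        A real ITS would use a structured teaching algorithm here; no model
        call is needed because the content is read from the stored bank.
        """
        for exercise in self.bank.values():
            if exercise.validated and exercise.exercise_id not in completed:
                return exercise
        raise LookupError("no validated exercise left in the bank")

    def feedback(self, exercise_id: str, answer: str) -> str:
        """Mode 2: the only place where a model is called in real time."""
        self.live_calls += 1
        return self.feedback_model(self.bank[exercise_id].statement, answer)


tutor = HybridTutor(feedback_model=lambda statement, answer: f"Check your answer: {answer}")
tutor.add_pregenerated(Exercise("e1", "Draft exercise", validated=False))
tutor.add_pregenerated(Exercise("e2", "Why does ice float on water?", validated=True))
chosen = tutor.next_exercise(completed=[])
print(chosen.exercise_id, tutor.live_calls)
print(tutor.feedback(chosen.exercise_id, "It is lighter"), tutor.live_calls)
```

Keeping exercise selection free of model calls is what lets the number of live calls grow only with the amount of feedback requested, not with the amount of content served.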

Data‌​‌ Generation and Evaluation Strategy.​​ 

A central component of​​​‌ the project concerns the‌ large-scale generation, structuring, and‌​‌ validation of pedagogical datasets.​​ This work relies on​​​‌ two existing software tools:‌ Sphinx, an internal platform‌​‌ used for the annotation​​ and creation of pedagogical​​​‌ content, and LLM4Humanities, an‌ open-source toolkit providing similar‌​‌ functionalities through a Streamlit-based​​ interface. In parallel, the​​​‌ project is developing unified‌ data structures for representing‌​‌ exercises and real student​​ learning trajectories collected from​​​‌ educational platforms, with the‌ goal of sharing these‌​‌ resources as digital commons​​ through open repositories such​​​‌ as GitHub and Hugging‌ Face. We are also‌​‌ developing a web-based visualization​​ platform for exploring learning​​​‌ trajectories and learner profiles,‌ aimed at both researchers‌​‌ and non-technical stakeholders. A​​​‌ first prototype of this​ platform has already been​‌ implemented.

8.6 Curiosity-driven learning​​ in educational technologies

Since 2019 (Idex cooperation fund between the University of Bordeaux and the University of Waterloo, Canada) and the creation of the CuriousTECH associate team in 2022 (led by the Flowers team and involving F. Lotte from the Potioc team and M. Fernandes and E. Law from the University of Waterloo), we have continued our work on the development of new curiosity-driven interaction systems. Substantial progress has been made in this area of application of FLOWERS work (see the website of the CuriousTECH team).

8.6.1 New digital approaches​ for studying curiosity-driven learning​‌

Participants: Hélène Sauzéon [correspondent], Pierre-Yves Oudeyer [correspondent], Rania Abdelghani, Mehdi Alaimi, Fabien Lotte, Aurélien Appriou, Myra Fernandes, Edith Law, Yadurshana Sivashankar.

As curiosity is a recent research topic, we studied some basic mechanisms of curiosity-based learning through three completed studies.

The first study concerns a new interactive educational application to foster curiosity-driven question-asking in children. To improve children’s curiosity, we developed a new interactive system aiming to foster curiosity-related question-asking from texts and children's perception of curiosity. To assess its efficiency, we conducted a study with 95 fifth-grade students from Bordeaux elementary schools. Two types of interventions were designed, one focusing children on the construction of low-level (i.e., convergent) questions and one focusing them on high-level (i.e., divergent) questions, with the help of prompts or question-starter models. We observed that both interventions increased the number of divergent questions and question fluency performance, but did not significantly improve the perception of curiosity, despite the high intrinsic motivation scores they elicited in children. The curiosity-trait score positively impacted the divergent question score in the divergent condition, but not in the convergent condition. Overall, the results support the efficiency and usefulness of digital applications for fostering children’s curiosity, which we need to explore further. These results are published in CHI'20 72. In parallel to these first experimental works, we wrote this year a review of the existing work on the subject 80.

The second study investigates the neurophysiological underpinnings of curiosity and the opportunities they offer for brain-computer interaction 74. Understanding the neurophysiological mechanisms underlying curiosity, and therefore being able to identify a person's level of curiosity, would provide useful information for researchers and designers in numerous fields such as neuroscience, psychology, and computer science. A first step towards uncovering the neural correlates of curiosity is to collect neurophysiological signals during states of curiosity, in order to develop signal processing and machine learning (ML) tools that distinguish curious states from non-curious ones. Thus, we ran an experiment in which we used electroencephalography (EEG) to measure the brain activity of participants as they were induced into states of curiosity, using trivia question and answer chains. We used two ML algorithms, i.e., Filter Bank Common Spatial Pattern (FBCSP) coupled with Linear Discriminant Analysis (LDA), and a Filter Bank Tangent Space Classifier (FBTSC), to discriminate curious EEG signals from non-curious ones. Global results indicate that both algorithms performed better in the 3-to-5 s time windows, suggesting an optimal time window length of 4 seconds for estimating curiosity states from EEG signals. These results have been published 74.

Using a virtual reality device, a third study investigates the role of intrinsic motivation in spatial learning in children 159. In this study, state curiosity is manipulated as a preference for a level of uncertainty during the exploration of new virtual environments. To this end, a series of virtual environments was created and presented to children. During encoding, participants explore routes in environments with one of three levels of uncertainty (low, medium, and high), using a virtual reality headset and controllers, and are later asked to retrace the routes they travelled. Two measures are collected: the exploration area and the wayfinding score, i.e., the route overlap between the encoding and retrieval phases (an indicator of spatial memory accuracy). Neuropsychological tests are also performed. The results showed better performance under the medium uncertainty condition in terms of exploration area and wayfinding score. These first results support the idea that curiosity states are a learning booster. In the study by Sivashankar et al., 10-year-old children (20 females; 22 males) with low to high trait curiosity actively explored virtual environments (Fig. 29) containing varying levels of uncertainty (low, medium, high; Fig. 30), after which memory for the route travelled was assessed 159.
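
As an illustration of a route-overlap measure, the sketch below scores the share of route segments travelled during encoding that are travelled again during retrieval. This is one possible operationalization under stated assumptions (routes encoded as sequences of named waypoints, segments treated as undirected); it is not the scoring procedure used in the study, and the function name `route_overlap` is hypothetical.

```python
def route_overlap(encoded, retraced):
    """Fraction of encoded route segments that also appear in the retraced route."""

    def segments(route):
        # A segment is an undirected link between two consecutive, distinct waypoints.
        return {frozenset(pair) for pair in zip(route, route[1:]) if pair[0] != pair[1]}

    encoded_segments = segments(encoded)
    if not encoded_segments:
        raise ValueError("the encoded route needs at least two distinct waypoints")
    return len(encoded_segments & segments(retraced)) / len(encoded_segments)


# Hypothetical waypoints: the child deviates through E instead of C at retrieval.
encoding_route = ["A", "B", "C", "D"]
retrieval_route = ["A", "B", "E", "D"]
print(round(route_overlap(encoding_route, retrieval_route), 2))
```

A score of 1 means every encoded segment was retraced, and a score of 0 means none was.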

Figure 29‌​‌: First-person view and​​ bird’s-eye view of the​​​‌ three styles of virtual‌ environments. Participants only experienced‌​‌ the environments from a​​ first-person perspective.

Figure 30: From​​ left to right: Condition​​​‌ 1 with Low Uncertainty‌ (1 character); Condition 2‌​‌ with Medium Uncertainty (3​​ characters); and Condition 3​​​‌ with High uncertainty (7‌ characters)

As trait curiosity increased (Fig. 31), so did memory performance in the high uncertainty condition, suggesting that children with high levels of curiosity can better recruit cognitive resources within such environments. Children with high compared to low curiosity also had higher feelings of presence during the immersive experience. Importantly, in environments with medium uncertainty, children with low trait curiosity were able to perform as well as those with high curiosity. Results suggest that individual differences in trait curiosity influence route memory in environments with varying levels of uncertainty.

Figure 31:​​​‌ Route Memory Score (black​ circles) and Intrinsic Motivation​‌ Score (white circles) in​​ Low-and High-curiosity Groups as​​​‌ a Function of the​ Three Uncertainty Conditions (Low,​‌ Medium and High)

8.6.2​​ Fostering curiosity and metacognition​​​‌ in classrooms

Participants: Pierre-Yves Oudeyer, Hélène Sauzéon [correspondent], Rania Abdelghani, Chloé Desvaux.

Promoting curiosity by supporting divergent thinking

Previous work​‌ aimed to propose new​​ educational technologies driven by​​​‌ epistemic curiosity. A central​ question of this work​‌ was to specify the​​ impact of self-questioning aroused​​​‌ by states of curiosity​ (i.e., the identification of​‌ knowledge gaps and formulation​​ of learning goals) on​​​‌ student performance. To this​ end, a web platform​‌ called "Kids Ask" was​​ designed, developed, and tested​​​‌ in primary schools. The​ tool offered an interaction​‌ with a conversational agent​​ that trained children's abilities​​​‌ to generate curiosity-driven questions​ and use these questions​‌ to explore a learning​​ environment and acquire new​​​‌ knowledge. Results from this​ study suggested that the​‌ configuration helped enhance children's​​ questioning and exploratory behaviors;​​​‌ they also showed that​ learning progress differences in​‌ children can be explained​​ by differences in their​​​‌ curiosity-driven behaviors 69.​

Figure 32:​​ Illustration of a conversational​​​‌ agent's strategies in the​ different work spaces of​‌ the "Kids Ask" platform​​

The ability to formulate curiosity-driven questions (i.e., new learning goals) likely relies upon divergent thinking mechanisms, as suggested by literature highlighting links between curiosity and creativity 117, 156. In this regard, a novel version of the Kids Ask training was proposed and tested in a field study involving a total of 130 children aged 9 to 11 years. These experiments aimed to further assess the interplay between curiosity and creativity in question-asking behaviors. Drawing from creativity literature, we examined the process of question formulation through associative thinking involved in creativity. To do so, the conversational agent's behavior in "Kids Ask" was modified to prompt children to identify important keywords from a text, then generate free associations based on their prior knowledge. Given the intricate interplay between curiosity and creativity, it was hypothesized that this associative guidance would further enhance children's ability to formulate divergent, curiosity-driven questions (see Figure 33).

Figure 33: Illustration of the conversational agent's behavior in the question-asking workspace of the "Kids Ask" platform. The screenshot shows the associative method of prompting: children start by reading a text containing highlighted keywords. The conversational agent prompts them to choose one keyword from the list and make a free association with it, based on prior knowledge. They then use one or both words to ask a divergent question.
Promoting curiosity and​‌ metacognition in authentic settings​​

Curiosity-driven learning is crucial​​​‌ for academic achievement and​ autonomous learning, yet remains​‌ scarce in primary classrooms.​​ Building on our previous​​ work with the IGSA​​​‌ framework (Identify-Guess-Seek-Assess) introduced in‌ 68, we developed‌​‌ a training paradigm that​​ teaches curiosity-driven learning through​​​‌ metacognitive skills training. This‌ approach leverages Murayama's framework‌​‌ 133 by personifying the​​ four basic metacognitive skills​​​‌ as animated characters: the‌ referee (identify knowledge gaps),‌​‌ the detective (formulate predictions),​​ the explorer (seek information),​​​‌ and the second referee‌ (assess information quality).

Figure 34​​: Curiosity-driven learning framework​​​‌ and link with the‌ metacognitive skills we propose‌​‌ to train as facilitators​​ during our IGSA-based intervention​​​‌

The two-part intervention combined‌ declarative knowledge about curiosity‌​‌ and metacognition with procedural​​ training of the four​​​‌ metacognitive strategies. The first‌ step consisted of animated‌​‌ videos explaining key concepts​​ related to curiosity, metacognition,​​​‌ and the four skills‌ through 2D characters. The‌​‌ second step involved the​​ "Kids Reflect" web-based platform,​​​‌ where conversational agents with‌ the same appearance and‌​‌ roles as the video​​ characters prompted children to​​​‌ use these skills appropriately‌ during reading-comprehension tasks (see‌​‌ figure below).

Figure 35: Screenshot of the "Kids Reflect" platform during the training, given one text

Our earlier pilot‌​‌ studies with small classroom​​ samples demonstrated the accessibility​​​‌ and positive impact of‌ this training on metacognitive‌​‌ efficiency, curiosity-driven question-asking, and​​ learning outcomes. These promising​​​‌ initial results motivated a‌ larger-scale validation study to‌​‌ assess both the intervention's​​ effectiveness and its scalability​​​‌ in authentic educational settings.‌

Study design and implementation‌​‌

Assessing scalability implies considering the intervention's effectiveness when teachers implement it themselves in their classrooms. Therefore, the multimedia-based metacognitive intervention was tested in a field study using a pseudo-RCT design, conducted in collaboration with the Académie de Bordeaux with 159 students aged 9-10 years and 4 teachers across five elementary schools in Bordeaux Métropole. Three main experimental conditions were compared: intervention led by researchers, intervention led by trained in-service teachers, and a control group. Additionally, complete and partial versions of the intervention were contrasted. Prior to the intervention, teachers underwent short training sessions that covered the concepts of curiosity and metacognition and familiarized them with the format and content of the intervention, enabling them to implement it autonomously in their classrooms during regular school hours.

Main findings

Results demonstrated that intervention groups significantly improved their divergent question-asking abilities and developed more positive perceptions of curiosity compared to the control group. Importantly, this held in the ecological setting of classrooms where teachers managed the intervention themselves, and also with a lighter, easy-to-implement version of the training (see figure below).

Figure 36: Post-intervention scores on a divergent question-asking fluency test for children in each condition

However, nuanced findings emerged regarding teacher delivery conditions. Teacher-led groups showed lower performance during the intervention and poorer learning outcomes, alongside higher cognitive load, compared to researcher-led groups. This suggests that while the intervention can be effectively scaled to teacher-led implementations, there is room for improvement. This point was further informed by qualitative interviews conducted with volunteer teachers who had facilitated the intervention in the study. Teachers rated the intervention highly on acceptability and usefulness, recognizing its pedagogical value. However, they provided lower ratings on usability, citing the complexity of metacognitive concepts and digital interface challenges as primary obstacles. The impact of these lower usability reports on students' performance highlights critical considerations for scaling educational interventions. While teachers appreciated the theoretical foundations and goals of the training, the cognitive demands of simultaneously managing complex pedagogical concepts and digital tools during classroom implementation appeared to affect their delivery quality, which in turn influenced student outcomes.

Implications and future directions​​​‌

Together, these findings demonstrate​ that the metacognitive intervention​‌ can enhance curiosity-driven learning​​ in authentic classroom settings.​​​‌ However, successful scaling requires​ strengthened teacher training. Future​‌ iterations of this work​​ will focus on simplifying​​​‌ the intervention, providing more​ comprehensive teacher training programs,​‌ and developing materials increasing​​ perceived usability for teachers​​​‌ as a way to​ favor adoption of such​‌ workshops. In response to​​ these identified needs, we​​​‌ initiated in 2025 the​ creation of comprehensive resources​‌ for teachers around metacognitive​​ interventions, motivation, and curiosity-driven​​​‌ learning. This development work​ focuses on providing teachers​‌ with accessible, evidence-based materials​​ that bridge the gap​​​‌ between research findings and​ classroom practice. The resources​‌ include short, evidence-based exercises​​ designed for direct implementation​​​‌ in the classroom, accompanied​ by detailed recommendations and​‌ pedagogical guidance. These materials​​ aim to reduce the​​​‌ cognitive load on teachers​ by providing ready-to-use activities​‌ while maintaining the theoretical​​ rigor and pedagogical effectiveness​​​‌ demonstrated in our research.​ The exercises are structured​‌ to be modular and​​ adaptable to different classroom​​​‌ contexts, addressing the complexity​ concerns raised by teachers​‌ in our scalability study.​​

This latter point contributes​​​‌ to a broader research​ agenda of developing practical​‌ teacher resources on curiosity-driven​​ learning in educational settings​​​‌ as a way to​ bridge the research-to-practice gap​‌ in educational interventions focused​​ on curiosity and metacognition.​​​‌

8.6.3 Machine Learning for​ Adaptive Personalization in Intelligent​‌ Tutoring Systems

Participants: Pierre-Yves Oudeyer [correspondant], Hélène Sauzéon [correspondant], Benjamin Clément, Didier Roy, Cécile Mazon.

The Kidlearn project. 

Kidlearn is a research project studying how machine learning can be applied to intelligent tutoring systems (ITS). It aims at developing methodologies and software which adaptively personalize sequences of learning activities to the particularities of each individual student. Our systems aim at proposing to the student the right activity at the right time, concurrently maximizing their learning progress and their motivation. In addition to contributing to the efficiency of learning and motivation, the approach is also designed to reduce the time needed to design ITS.

We continued to develop an approach to Intelligent Tutoring Systems which adaptively personalizes sequences of learning activities to maximize the skills acquired by students, taking into account their limited time and motivational resources. At a given point in time, the system proposes to the students the activity which makes them progress fastest. We introduced two algorithms that rely on the empirical estimation of learning progress: RiARiT, which uses information about the difficulty of each exercise, and ZPDES, which uses much less knowledge about the problem.

The system is based on the combination of three approaches. First, it leverages recent models of intrinsically motivated learning by transposing them to active teaching, relying on empirical estimation of the learning progress provided by specific activities to particular students. Second, it uses state-of-the-art Multi-Armed Bandit (MAB) techniques to efficiently manage the exploration/exploitation challenge of this optimization process. Third, it leverages expert knowledge to constrain and bootstrap the initial exploration of the MAB, while requiring only coarse guidance from the expert and allowing the system to deal with didactic gaps in its knowledge. The system was evaluated in several large-scale experiments relying on a scenario where 7-8 year old schoolchildren learn how to decompose numbers while manipulating money 87. Systematic experiments were also presented with simulated students.
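To make the combination of learning-progress estimation and bandit selection more concrete, here is a minimal sketch. It is not the ZPDES or RiARiT implementation: it assumes, purely for illustration, that each activity keeps a sliding window of recent answers, that learning progress is estimated as the absolute difference in success rate between the older and the newer half of that window, and that activities are sampled in proportion to this estimate plus a small uniform exploration term. The names `LPBandit`, `window` and `explore` are hypothetical, and the expert-knowledge constraints described above are left out.

```python
import random
from collections import deque


class LPBandit:
    """Toy bandit that favours activities with high recent learning progress."""

    def __init__(self, activities, window=6, explore=0.2, seed=0):
        self.activities = list(activities)
        self.explore = explore  # share of probability spread uniformly (assumed value)
        # Sliding window of the last `window` answers per activity (assumed size).
        self.history = {a: deque(maxlen=window) for a in self.activities}
        self.rng = random.Random(seed)

    def learning_progress(self, activity):
        """Absolute change in success rate between older and newer answers."""
        answers = list(self.history[activity])
        if len(answers) < 2:
            return 0.0
        half = len(answers) // 2
        old, new = answers[:half], answers[half:]
        return abs(sum(new) / len(new) - sum(old) / len(old))

    def probabilities(self):
        """Mix progress-proportional sampling with uniform exploration."""
        progress = [self.learning_progress(a) for a in self.activities]
        total = sum(progress)
        n = len(self.activities)
        if total == 0:
            return [1.0 / n] * n
        return [(1 - self.explore) * p / total + self.explore / n for p in progress]

    def choose(self):
        """Sample the next activity to propose."""
        return self.rng.choices(self.activities, weights=self.probabilities(), k=1)[0]

    def update(self, activity, success):
        """Record whether the student succeeded on the proposed activity."""
        self.history[activity].append(1 if success else 0)


# Usage: "medium" shows recent progress, "easy" is already mastered, "hard" is untried.
bandit = LPBandit(["easy", "medium", "hard"])
for outcome in [0, 0, 0, 1, 1, 1]:
    bandit.update("medium", outcome)
for outcome in [1, 1, 1, 1, 1, 1]:
    bandit.update("easy", outcome)
probs = bandit.probabilities()
```

In this toy run the activity on which the simulated student is currently improving receives most of the sampling probability, while mastered and untried activities keep a small exploration share.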

Kidlearn‌ Experiments 2018-2019: Evaluating the‌​‌ impact of ZPDES and​​ choice on learning efficiency​​​‌ and motivation. 

An experiment was held between March 2018 and July 2019 in order to test the Kidlearn framework in classrooms in Bordeaux Métropole, with 600 participating students. This study had several goals. The first goal was to evaluate the impact of the Kidlearn framework on motivation and learning compared to an Expert Sequence without machine learning. The second goal was to observe the impact of using learning progress (LP) to select exercise types within the ZPDES algorithm compared to a random policy. The third goal was to observe the impact of combining ZPDES with the ability to let children make different kinds of choices during the use of the ITS. The last goal was to use the psychological and contextual measures to see whether correlations can be observed between the evolution of students' psychological state, their profile, their motivation and their learning. We first show that LP-based personalization improves learning performance (reproducing and solidifying previous results) while producing a positive and motivating learning experience. We then show that the addition of self-choice as a playful feature triggers intrinsic motivation in the learner and reinforces the learning effectiveness of LP-based personalization. In doing so, it strengthens the links between intrinsic motivation and performance progress during the serious game. Conversely, deleterious effects of the playful feature are observed for hand-designed linear paths. Thus, the intrinsic motivation elicited by a playful feature is beneficial only if the curriculum personalization is effective for the learner. Such a result deserves great attention due to the increased use of playful features in non-adaptive educational technologies available on the market.
Details of these new results, as well as the overall results of this project, are presented in Benjamin Clément's PhD thesis 86 and are currently being prepared for publication.

Kidlearn and Adaptiv'Math. 

The algorithms developed during the Kidlearn project and Benjamin Clément's thesis 86 are being used in an innovation partnership for the development of a pedagogical assistant based on artificial intelligence intended for teachers and students of cycle 2. The algorithms are being written in TypeScript for the needs of the project. The team's expertise in creating the pedagogical graph and defining the graph parameters used by the algorithms is also a crucial part of its role in the project. One of the team's main goals here is to transfer its technologies into a project with the perspective of industrial scaling, and to assess the impact and feasibility of such scaling.

Kidlearn​​ for numeracy skills with​​​‌ individuals with autism spectrum​ disorders. 

Few digital interventions targeting numeracy skills have been evaluated with individuals with autism spectrum disorder (ASD) 128, 127. Yet, some children and adolescents with ASD have learning difficulties and/or a significant academic delay in mathematics. While ITS have been successfully developed for typically developing students to personalize the learning curriculum and thereby foster the motivation-learning coupling, they are rarely or never offered today to students with specific needs. The objective of this pilot study is to test the feasibility of a digital intervention using an ITS with high school students with ASD and/or intellectual disability (ID). This application (KidLearn) provides calculation training through currency exchange activities, with a dynamic exercise sequence selection algorithm (ZPDES). 24 students with ASD and/or ID enrolled in specialized classrooms were recruited and divided into two groups: 14 students used the KidLearn application, and 10 students received a control application. Pre-post evaluations show that students using KidLearn improved their calculation performance, and had a higher level of motivation at the end of the intervention than the control group. These results encourage the use of an ITS with students with specific needs to teach numeracy skills, but need to be replicated on a larger scale. Adjustments to the interface and teaching method are proposed to improve the impact of the application on students with autism 125.

8.6.4 Machine learning​ for adaptive cognitive training​‌

Participants: Pierre-Yves Oudeyer,​​ Hélène Sauzéon [correspondant],​​​‌ Masataka Sawayama, Benjamin​ Clément, Maxime Adolphe​‌, Marion Pech,​​ Juliette Deyts.

Because it cuts across all cognitive activities, such as learning tasks, attention is a hallmark of good cognitive health throughout life, and more particularly in the current context of a societal crisis of attention. Recent work has shown the great potential of computerized attention training, with effective transfer of training to other cognitive activities, and this over a wide spectrum of individuals (children, older adults, individuals with cognitive conditions such as Attention Deficit Hyperactivity Disorder). Despite these promising results, a major hurdle remains: the high inter-individual variability in responding to such interventions. Some individuals are good responders (significant improvement) to the intervention, others respond variably, and finally some respond poorly, not at all, or occasionally. A central limitation of computerized attention training systems is that the training sequences operate in a linear, non-personalized manner: difficulty increases in the same way and along the same dimensions for all subjects. However, different subjects require in principle a progression at a different, personalized pace according to the different dimensions that characterize attentional training exercises.

To tackle the issue of inter-individual variability, the present project proposes to apply some principles from intelligent tutoring systems (ITS) to the field of attention training. In this context, we have already developed automatic curriculum learning algorithms, such as those of the KidLearn project, which customize the learner's path according to his/her progress and thus optimize his/her learning trajectory while stimulating his/her motivation through the progress made. ITS are widely identified in intervention research as a successful way to address the challenge of personalization, but no studies to date have actually been conducted for attention training. Thus, whether ITS, and in particular personalization algorithms, can optimize the number of responders to an attention training program remains an open question.

Grounded‌​‌ state-of-the-art. 

To investigate this question, we first conducted a systematic review exploring existing methods in computerized cognitive training (CT) and analyzing their outcomes in terms of learning mechanics (intra-training performance) and effectiveness (near, far and everyday-life transfer effects of CT) 71. A search of multiple databases up to June 2023 identified 19 computerized CT studies. Only two of them emphasized the favorable influence of individualization on CT effectiveness, while five underscored its capacity to enhance the training experience by boosting motivation, engagement, and offering diverse learning pathways. In sum, despite promising results in this new research avenue, more research is needed to fully understand and empirically support individualized techniques in cognitive training.

Figure 37​​​‌: Distribution of AI‌ techniques depending on type‌​‌ of CT studied (multi​​ or single domain) from​​​‌ Adolphe et al., 2024‌

Complementing the study of adaptive methods applied to cognitive training, we have attempted, through a review of the literature on the subject, to gain a better understanding of the Multiple Object Tracking (MOT) task, which seems to have the best results in terms of attentional training efficiency in young and older adults. Our investigation pursues three main objectives: (1) identifying the cognitive processes influenced by each adjustable parameter of the MOT task; (2) determining which parameters, when progressively adapted during repeated MOT practice, produce the greatest enhancements in task performance; and (3) evaluating how improvements in MOT performance translate into effective transfer effects, including practical, real-world outcomes. The evidence suggests that the MOT task involves a nuanced interplay of visual processing, attentional resources, and working memory, shaped by the intrinsic properties of the objects and the task conditions. The results of this work highlight that: (1) multiple cognitive mechanisms are identified as active in the task (divided and sustained attention; foveal and peripheral attention; automatic and controlled inhibition, etc.); (2) a limited number of studies have actually implemented the MOT task in computer-assisted cognitive training; and (3) near (attention tasks) and far (other cognitive tasks) transfer effects are well documented as positive outcomes of MOT-based training, while there is a scarcity of research that has thoroughly analyzed the ecological effects of attentional training, namely the potential transfer effects in everyday life (paper in progress).

ZPDES calibration for MOT​‌ training (Young participants). 

In parallel to this, a web platform has been designed for planning and implementing remote behavioural studies. This tool provides means for registering recruited participants remotely and executing complete experimental protocols: from presenting instructions and obtaining informed consent, to administering behavioural tasks and questionnaires, potentially throughout multiple sessions spanning days or weeks. In addition to this platform, a cognitive test battery composed of seven classical behavioural tasks has been developed. This battery aims to evaluate the evolution of the cognitive performance of participants before and after training. Fully open-source, it mainly targets attention and memory. A preliminary study on a large sample of 50 healthy participants showed that the developed tasks reproduced the results of previous studies, that there were large differences between individuals (no ceiling effect) and that the results showed significant reliability between two measurements taken on two days separated by one night 4.

Randomized controlled trial in young and older adults: predefined vs. ZPDES condition. 

Utilizing these tools, a​​​‌ pilot study campaign was​ conducted to evaluate the​‌ impact of our AI-based​​ personalized cognitive training program.​​​‌ The first pilot experiment​ involved n=27 participants and​‌ aimed to compare the​​ effectiveness of a cognitive​​​‌ training program using a​ linear difficulty management procedure​‌ (staircase procedure) to a​​ program using an ITS​​​‌ for difficulty manipulation. The​ online training lasted for​‌ 10 hours over a​​ period of 2 weeks.​​​‌ The results indicated that​ the ITS-based intervention produced​‌ diverse learning trajectories compared​​ to the linear procedure​​​‌ 38, leading to​ broader improvements in pre-post​‌ cognitive assessment. However, no​​ significant differences were observed​​​‌ in subjective measures of​ motivation and engagement between​‌ the two groups. Subsequent​​ to this initial experiment,​​​‌ two pilot studies (n=11​ and n=10, respectively) were​‌ conducted with the goal​​ of enhancing motivation and​​​‌ engagement in the game.​ The first study implemented​‌ gamified components such as​​ scores and feedback, while​​​‌ the second study examined​ hyperparameter updates to the​‌ ITS. The analysis of​​ learning trajectories, learning outcomes,​​​‌ and subjective measures yielded​ promising results in favor​‌ of the AI-based personalized​​ procedure.
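For readers unfamiliar with staircase procedures, the following minimal sketch illustrates the general idea of such a linear difficulty rule. It is an illustration under stated assumptions, not the procedure used in the study: the update rule (one level up after `n_up` consecutive successes, one level down after any failure), the level bounds and the function name `staircase` are all hypothetical.

```python
def staircase(outcomes, start=1, step=1, n_up=1, lowest=1, highest=10):
    """Return the difficulty level presented on each trial.

    Difficulty rises by `step` after `n_up` consecutive successes and falls by
    `step` after any failure (an assumed rule, chosen for illustration).
    """
    level = start
    streak = 0
    levels = []
    for success in outcomes:
        levels.append(level)  # level shown on this trial
        if success:
            streak += 1
            if streak >= n_up:
                level = min(highest, level + step)
                streak = 0
        else:
            level = max(lowest, level - step)
            streak = 0
    return levels


# Usage: five successes and one failure on the third trial.
trajectory = staircase([True, True, False, True, True, True])
```

Because every participant follows the same update rule along a single difficulty dimension, the resulting trajectories differ only through the sequence of successes and failures, which is consistent with the lower trajectory diversity reported for the linear procedure.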

Figure 38: Different‌​‌ learning trajectories for a​​ selected participant in the​​​‌ staircase group (left) and‌ the ITS group (right).‌​‌ The color of a​​ dot indicates the initial​​​‌ presentation of the parameter‌ value, while the size‌​‌ of the dot represents​​ the frequency of the​​​‌ parameter value.

Building on‌ the preliminary findings, we‌​‌ expanded our research scope​​ with a more comprehensive​​​‌ experimental setup involving two‌ distinct studies. The first‌​‌ study encompassed 64 young​​ adults, sourced through the​​​‌ Prolific platform, while the‌ second study consisted of‌​‌ 50 older adults, recruited​​ from the "Université du​​​‌ temps libre". Our experimental‌ methodology mirrored that of‌​‌ our initial pilot studies,​​ with a notable enhancement:​​​‌ the integration of new‌ gamified elements (including mini-story‌​‌ creation and new visual​​ content) aimed at boosting​​​‌ participant motivation and engagement.‌

Figure 39: (a) The MOT task. (b) Several visual snapshots of the intervention. (c) Schedule proposed to participants.

The data analysis‌​‌ encompassed three primary dimensions:​​ initially, an exploratory phase​​​‌ to delineate learning trajectories‌ between control and intervention‌​‌ groups; subsequently, a comparative​​ analysis of pre- and​​​‌ post-test performance on the‌ cognitive battery; and lastly,‌​‌ an examination of participants'​​ self-reported experiences during training,​​​‌ providing insights into their‌ subjective perceptions of the‌​‌ experiment.

The pilot studies'​​ preliminary outcomes were corroborated​​​‌ in these larger sample‌ groups. Notably, learning trajectories‌​‌ exhibited greater diversity in​​ the group undergoing the​​​‌ intervention procedure. This group‌ also demonstrated a more‌​‌ pronounced improvement across a​​ wider range of cognitive​​​‌ assessment tasks. Although participants‌ engaging in the personalized‌​‌ cognitive training reported a​​ higher cognitive load via​​​‌ questionnaires, the levels of‌ engagement and frustration did‌​‌ not significantly differ between​​ the two groups.

The​​​‌ results showed that ZPDES‌ could be more effective‌​‌ than a control condition,​​ with improved performance on​​​‌ trained tasks in both‌ studies, underlining the benefits‌​‌ of individualized training paths.​​ However, motivation and engagement​​​‌ were lower in the‌ groups using ZPDES, probably‌​‌ due to cognitive load​​ and metacognitive factors. Overall,​​​‌ individualizing cognitive training through‌ systems like ZPDES provides‌​‌ a promising direction for​​ future research by providing​​​‌ automatic methods for taking‌ individual differences into account‌​‌ in CT programs while​​ respecting methodological standards for​​​‌ evaluating the effectiveness of‌ CT. As a result,‌​‌ our work contributes to​​ the growing body of​​​‌ knowledge in both ITS‌ and CT domains while‌​‌ stressing the crucial role​​ of challenges related to​​​‌ motivation and engagement to‌ optimize the effectiveness of‌​‌ these individualized approaches for​​ cognitive and educational outcomes.​​​‌

As part of the creation of the new University Hospital Institute (UHI) VBHI (Vascular Brain Health Institute), we aim to develop and test a personalized, multimodal digital therapeutic approach to slow down the functional consequences of small vessel disease. More specifically, the objectives are to:

  • Evaluate the impact​​​‌ of personalized cognitive training​ compared to non-personalized conditions​‌ (comparative efficacy).
  • Identify potential electroencephalographic (EEG) biomarkers that reflect cognitive activity impacted by small vessel disease and could later (in a subsequent study) be used as targets for exploratory EEG neurofeedback therapy.
  • Identify brain areas to​​ target for delivering non-invasive​​​‌ HD-tACS electrical stimulation, using​ previously acquired MRI data.​‌
  • Evaluate the impact of​​ this stimulation on brain​​​‌ activity, neural synchronization, and​ cognitive performance.

To achieve this, 80 participants from the SHIVA cohort will be divided into two subgroups according to the severity of the disease:

  • Severe group: presenting​​​‌ multiple lesions on MRI​
  • Non-severe group: presenting a​‌ few lesions

These groups​​ will then be further​​​‌ divided based on the​ type of training: personalized​‌ tests (ZPDES) versus standard​​ tests.

Figure​ 40: SHIVA study​‌ protocol and materials.

During​​ the pre- and post-training​​​‌ sessions, participants will perform​ cognitive tests on a​‌ computer. Participants will be​​ equipped with an EEG​​​‌ headset, which, combined with​ a tACS stimulator, will​‌ allow for both brain​​ activity recording and stimulation.​​​‌

We are carrying out an ancillary study with Myra Fernandez's laboratory in Canada, through our participation in the Inria Curiositytech international associate team. We have proposed to collaboratively analyze certain data and dimensions of interest to our respective laboratories (e.g. physical activities) associated with the cognitive training proposed in the SHIVA-DTX-COG project.

Qualitative Analysis with​‌ LLMs: 

As it is well known that dropout is more frequent among older adults than among young ones, we aimed to better understand the learning experience of trainees through feedback analyses. For this, we designed a new method relying on several Large Language Models (LLMs) to extract, from participants' verbatim comments, the hot topics or main motivations for dropout related to the pragmatic, hedonic and/or aesthetic dimensions of cognitive training. The results analyzed through various LLMs are encouraging (paper in progress). To support this new approach, we are exploring different prompts on other data corpora in order to ultimately propose a tutorial accessible to anyone wishing to carry out an LLM-based thematic qualitative analysis.

8.6.5 ToGather: Interactive website to foster collaboration among stakeholders of school inclusion for pupils with neurodevelopmental disorders

Participants: Hélène​​ Sauzéon [correspondant], Cécile​​​‌ Mazon, Eric Meyer​, Isabeau Saint-Supery,​‌ Christelle Maillart [Uni. Liège,​​ Belgium], Kamélia Belassel​​​‌, Mathieu Périé,​ Valentin Strahm.

Sustaining and supporting the follow-up of the school inclusion of children with neurodevelopmental disorders (e.g., autism, attention disorders, intellectual disabilities) has become urgent: the higher the school level, the lower the number of enrolled pupils with cognitive disabilities.

Technology-based interventions to improve school inclusion of children with neurodevelopmental disorders have mostly been individual-centered, focusing on their socio-adaptive and cognitive impairments and implying that they have to adapt themselves in order to fit society's expectations. Although this approach centered on the normalization of the person has some advantages (reduction of clinical symptoms), it carries social stereotypes and misconceptions of cognitive disability that are not respectful of the cognitive diversity and intrinsic motivations of the person, and in particular of the student's wishes in terms of school curriculum to achieve his or her future life project 129.

The "ToGather" project aims at shedding new light on the field of educational technologies for special education by proposing an approach centered on the educational needs of the students and bringing a concerted and informed answer between all the stakeholders, including the student and all their support spheres (family, school, medico-social care). To this end, the ToGather project, which stems from participatory design methods, primarily consisted in developing a pragmatic tool (an interactive website) to help students with cognitive disability and their caregivers formalize and visualize the student's repertoire of academic skills, and make it evolve according to his or her zone of proximal development (in the sense of Vygotsky) on the one hand, and to the intrinsic motivations of the student (his or her own educational and life project) on the other 126.

This project is in partnership with the School Academy of Bordeaux of the French Ministry of Education, the ARI association, and the Centre of Autism of Aquitaine. It is funded by the FIRAH (foundation) and the Nouvelle-Aquitaine Region (see the dedicated webpages).

First, usability studies were conducted to evaluate the ergonomic qualities of the ToGather website, yielding positive results in French and Belgian contexts. Then, we conducted a large field study to assess the effectiveness of the tool in helping stakeholders to support children with neurodevelopmental disorders (NDD) 155, 153, 154.

The study protocol consisted of a longitudinal non-randomized controlled trial, with baseline, 3-month, and 6-month follow-up assessments. The recruitment was conducted across the entire French territory. Our local partners facilitated the dissemination of the call for participation in Gironde and provided us with contacts to extend it to other regions. Additionally, a recruitment campaign through social media was carried out to communicate about the study and encourage participants to test the ToGather tool.

As the tool was designed to support the co-educational process between parents and professionals, a support team had to consist of at least two stakeholders, including at least one of the parents. Initially, 157 participants were recruited in 37 support teams, but 30 individuals did not answer the baseline questionnaire, leading to the exclusion of 11 support teams. After baseline assessment, 13 support teams were allocated to the experimental condition (ToGather app) and 11 to the control condition (usual follow-up).

Primary outcome measures covered stakeholders’ relationships, self-efficacy, and attitudes towards inclusive education, while secondary outcome measures were related to stakeholders’ burden and quality of life, as well as children’s school well-being and quality of life.

As the study ended recently, data analysis is still ongoing. Preliminary analyses after 3 months of use are encouraging, showing an improvement in communication between stakeholders and in their respective quality of life (paper in progress).

8.6.6 Curious and therefore not overloaded: Study of the links between curiosity and cognitive load in learning mediated by immersive technologies

Participants: Hélène Sauzéon [correspondant]​​​‌, Matisse Poupard,​ André Tricot [Cosupervisor -​‌ Univ. Montpellier], Florian​​ Larrue [Industrialist - Le​​​‌ Catie].

Conducted in collaboration with CATIE (industrial partner) and the EPSYLON laboratory at the University of Montpellier (under the supervision of Prof. André Tricot), this doctoral research was initiated in April 2022 and the resulting thesis was defended on September 11th, 2025. It pursued two main objectives:

  • To establish​​​‌ theoretical links between cognitive​ load theory and models​‌ of curiosity-driven learning.
  • To​​ experimentally examine how the​​​‌ choice of educational technology​ modulates the relationship between​‌ pedagogical approaches (guided instruction​​ vs. exploration) and learner​​​‌ expertise.

To address these​ objectives, the thesis was​‌ structured into three main​​ phases.

Literature Review. 

A​​​‌ systematic review examining the​ contributions and limitations of​‌ Virtual Reality (VR) and​​ Augmented Reality (AR) for​​​‌ learning was conducted, with​ a specific focus on​‌ their effects on cognitive​​ load and intrinsic motivation.​​​‌ This review identified both​ the pedagogical potential of​‌ immersive technologies and persistent​​ methodological limitations in the​​​‌ field, particularly regarding the​ measurement of motivation and​‌ cognitive processes. The results​​ were published in the​​​‌ British Journal of Educational​ Technology (BJET) 39.​‌

Experimental Research in XR-Based​​ Anatomy Learning. 

Two experimental​​​‌ studies were conducted in​ 2023 with 131 second-year​‌ medical students and replicated​​ in 2024 with 164​​​‌ medical students from the​ second to fifth years.​‌

The first experiment investigated​​ whether supporting students’ drawing​​​‌ activity during lectures using​ augmented and mixed reality​‌ could reduce cognitive load​​ and enhance motivation. Participants​​​‌ followed a 20-minute neuroanatomy​ video lecture while simultaneously​‌ reproducing drawings demonstrated by​​ the instructor. Four experimental​​​‌ conditions were compared:

  • Spatial​ Augmented Reality (SAR): A​‌ digital overlay of the​​ anatomical structure was projected​​​‌ onto paper, allowing learners​ to trace it using​‌ a projector and tracking​​ system.
  • Mixed Reality (MR):​​​‌ The digital overlay was​ displayed through a HoloLens​‌ 2 headset.
  • Mixed Reality​​ with 3D Model (MR+3D):​​​‌ In addition to the​ digital overlay, learners could​‌ manipulate a 3D anatomical​​ model.
  • Control Condition: No​​​‌ digital overlay was provided.​

Figure 41: Experimental conditions for experiment 1: Support Drawing with Augmented Reality

Results from the 2023 dataset showed that both AR- and MR-supported drawing conditions significantly reduced extraneous cognitive load, increased intrinsic motivation, and improved drawing accuracy. However, no significant differences in knowledge acquisition were observed between conditions. Notably, in the stereoscopic 3D visualization condition, learners with higher intrinsic motivation exhibited poorer learning outcomes, possibly due to increased attentional focus on system interaction rather than conceptual understanding. Visuospatial ability and prior knowledge moderated the effectiveness of AR and MR interventions, with more experienced learners benefiting the most. These results are reported in a manuscript currently under review in the Journal of Computing in Higher Education 65.

The second‌ experiment explored a different‌​‌ learning paradigm using virtual​​ reality (VR), manipulating levels​​​‌ of interactivity and instructional‌ guidance. This design enabled‌​‌ the examination of how​​ exploration and embodied interaction​​​‌ with a 3D anatomical‌ model affect learning outcomes,‌​‌ cognitive load, and curiosity.​​

Figure 42: Experimental conditions for experiment 2: Embodied learning in virtual reality, effect of interactivity

Analyses, published in Computers & Education 38, showed that VR conditions led to superior learning performance, particularly in the passive and active interaction conditions. These conditions were associated with higher intrinsic motivation and a more optimized cognitive load profile. Moreover, intrinsic motivation was positively correlated with germane cognitive load (i.e., cognitive resources devoted to learning) and negatively correlated with extraneous cognitive load. In other words, highly motivated learners experienced fewer irrelevant cognitive demands, allowing them to allocate more resources to meaningful learning processes.

Following the systematic review, which highlighted the lack of reliable and context-sensitive measures of intrinsic motivation in XR research, a third study leveraged a key affordance of VR: the continuous collection of behavioral data. By analyzing head and hand movements during the neuroanatomy learning task, this study aimed to identify implicit behavioral indicators of curiosity and cognitive engagement. Results showed that increased hand movement was associated with lower intrinsic motivation, whereas greater head movement was positively associated with both germane cognitive load and intrinsic motivation, suggesting deeper cognitive engagement. Additionally, movement entropy emerged as a significant predictor of curiosity-driven learning, highlighting its potential as an implicit marker of learning-related behaviors in immersive environments. These findings are presented in a manuscript currently under review in the International Journal of Human–Computer Studies 37.

Figure 43: Illustration​​​‌ of movement entropy calculations‌ in the virtual environment.‌​‌ The top section depicts​​ the three spatial axes​​​‌ used to track hand‌ and head positions, as‌​‌ well as head rotation​​ along the X, Y,​​​‌ and Z axes. The‌ bottom section presents heatmaps‌​‌ of time spent by​​ participants' heads in different​​​‌ spatial regions, visualizing entropy‌ patterns in 2D space.‌​‌
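
As an illustration of the kind of measure described above, the following minimal sketch computes a Shannon entropy over the time spent in each cell of a 2D grid. The grid binning, the `cell_size` parameter and the function name are assumptions made for this example; the report does not specify how movement entropy was actually computed in the study.

```python
import math
from collections import Counter

def movement_entropy(positions, cell_size=0.1):
    """Shannon entropy (in bits) of the time spent per grid cell.

    positions: (x, y) samples recorded at a fixed rate.
    cell_size: hypothetical grid resolution, in the unit of the positions.
    """
    # Count how many samples fall in each grid cell (a discrete heatmap).
    cells = Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size))
        for x, y in positions
    )
    total = sum(cells.values())
    if total == 0:
        return 0.0
    # H = sum_i p_i * log2(1 / p_i), with p_i the fraction of time in cell i.
    return sum((n / total) * math.log2(total / n) for n in cells.values())

# A head that never leaves one cell versus one that visits 16 cells equally.
still = [(0.5, 0.5)] * 100
roaming = [(i % 4 + 0.5, (i // 4) % 4 + 0.5) for i in range(160)]
print(movement_entropy(still, cell_size=1.0))
print(movement_entropy(roaming, cell_size=1.0))
```

Under this definition, a participant who stays in one region has zero entropy, while one who spreads time evenly over many regions has high entropy.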
Toward a Unified Model:​​ Cognitive Load, Motivation, and​​​‌ Expertise. 

Building on the‌ empirical results of the‌​‌ previous studies, which revealed​​ dynamic interactions between cognitive​​​‌ load and intrinsic motivation,‌ this final phase addressed‌​‌ the second overarching objective​​ of the thesis: the​​​‌ empirical integration of cognitive‌ load theory and the‌​‌ Learning Progress Hypothesis. Using​​ structural equation modeling (SEM),​​​‌ this study tested a‌ comprehensive model describing the‌​‌ relationships between XR technologies,​​ cognitive load, intrinsic motivation,​​​‌ perceived learning progress, and‌ learning outcomes.

Results indicated‌​‌ that both AR and​​​‌ VR significantly reduced extraneous​ cognitive load and increased​‌ intrinsic motivation. However, intrinsic​​ motivation did not directly​​​‌ predict immediate learning performance.​ Instead, extraneous cognitive load​‌ negatively affected perceived learning​​ progress and autonomy, which​​​‌ in turn predicted intrinsic​ motivation, revealing a key​‌ mediating pathway.

Figure 44: Resulting model from the SEM.

Overall, these findings demonstrate that unnecessary cognitive demands not only hinder learning efficiency but also disrupt learners’ perceived progress and sense of control, thereby undermining curiosity and intrinsic motivation. This work contributes to a unified theoretical framework by showing how optimizing extraneous cognitive load in XR environments supports both cognitive efficiency and curiosity-driven learning. The results are presented in a manuscript currently under review in Educational Psychology Review 66.

Effect of XR​​​‌ technology displays on everyday​ memory 

In addition to this work, we co-designed an augmented reality (AR) application simulating a museum visit (co-led with P. Dragicevic, Bivouac, under the I-am AEx project, 2023) and combined it with an evaluation of involuntary, uncontrollable memory revival. This study provided a first demonstration that AR enhances this type of memory compared to 3D images 53, suggesting that AR could be used for cognitive manipulation (Honorable Mention at CHI 2025).

Display features and personal factors (e.g., intellectual curiosity/humility) are being studied 54 to develop robust usage recommendations (L. Petiot’s PhD thesis, co-supervised by H. Sauzéon).

8.6.7 Self-determination-driven digital services for supporting aging in place and well-being: a study of relationships between longitudinal data from smart home and clinical data

Participants:​‌ Hélène Sauzéon, Juliette​​ Deyts, Lucile Dupuy​​​‌, Rafik Belloum.​

This work relies on​‌ longitudinal data collected from​​ frail older adults living​​​‌ alone at home who​ used the HomeAssist ambient​‌ assisted living platform for​​ up to 24 months.​​​‌ HomeAssist was designed according​ to a self-determination and​‌ user-centered approach, covering three​​ domains of need: daily​​​‌ activities, home safety, and​ social participation. The objective​‌ of this research is​​ to analyze relationships between​​​‌ clinical data (e.g., cognitive​ assessments, frailty, autonomy, self-determination)​‌ and usage-related data (user​​ experience questionnaires, usage diaries,​​​‌ and actimetric data derived​ from environmental sensors), in​‌ order to both assess​​ the benefits of assistive​​​‌ and monitoring services and​ explore the predictive value​‌ of sensor-based data for​​ explaining clinical outcomes.

A​​​‌ first study focused on​ identifying factors influencing user​‌ experience (UX) and long-term​​ adoption of HomeAssist, based​​​‌ on data from 131​ participants. Despite a user-centered​‌ design, long-term adoption remained​​ limited, with only 18​​​‌ users continuing after 24​ months and 38 requesting​‌ removal within the first​​ six months. Regression analyses​​​‌ showed that UX dimensions​ were mainly predicted by​‌ other UX dimensions rather​​ than by individual health​​​‌ or psychosocial characteristics. In​ contrast, long-term adoption was​‌ weakly predicted by level​​ of education and computer​​​‌ ownership, suggesting that while​ user-centered design may reduce​‌ the impact of individual​​ characteristics on user experience,​​​‌ adoption remains influenced by​ digital literacy and social​‌ inequalities.

Overall, these activities​​ contribute to the design​​ of a user-centered visualization​​​‌ tool intended for clinicians‌ (psychologists and physicians), enabling‌​‌ them to better understand​​ the links between long-term​​​‌ usage data and clinical‌ evolution, and to detect‌​‌ early “weak signals” of​​ decline (e.g., changes in​​​‌ sleep patterns), thereby facilitating‌ timely and targeted interventions.‌​‌

8.7 Curiosity-driven AI for​​ assisted scientific discovery

8.7.1​​​‌ Design of an Interactive‌ Software for Automated Discovery‌​‌ in Complex Systems

Participants: Clément Romac [correspondent], Zacharie Bugaud, Clément Moulin-Frier, Pierre-Yves Oudeyer.

We further developed​​ our Automated Discovery software​​​‌ and particularly focused on‌ adding and experimenting with‌​‌ new systems.

Our public software now features more than ten examples, ranging from artificial life to physics and protein docking. The software was publicly released in 2024: presentation thread.

Figure 45: Technical architecture of our software.

8.7.2‌ Discovering Sensorimotor Agency in‌​‌ Cellular Automata using Diversity​​ Search

Participants: Gautier Hamon [correspondent], Mayalen Etcheverry, Bert Chan, Clément Moulin-Frier, Pierre-Yves Oudeyer.

As a continuation of the previous projects in Automated Discovery in Self-Organizing Systems, we have been working on expanding the set of discoveries of possible structures in continuous CAs such as Lenia 82, 81, and in particular we have been searching for emerging agents with sensorimotor capabilities. Understanding what has led to the emergence of life and sensorimotor agency as we observe it in living organisms is a fundamental question. In our work, we initially only assume environments made of low-level elements of matter (called atoms, molecules or cells) locally interacting via physics-like rules. There is no predefined notion of agent embodiment, and yet we aim to answer the following scientific question: is it possible to find environments in which a subpart exists or emerges that could be called a sensorimotor agent?

We use the Lenia continuous cellular automaton as our artificial "world" 81. We introduce a novel method based on gradient descent and curriculum learning combined within an intrinsically-motivated goal exploration process (IMGEP) to automatically search for parameters of the CA rule that can self-organize spatially localized 1 and moving patterns 2 within Lenia. The IMGEP defines an outer exploratory loop (generation of training goal/loss) and an inner optimization loop (goal-conditioned). We use a population-based version of IMGEP 12, 91 but introduce two novel elements compared to previous papers in the IMGEP literature. First, whereas previous work in 29 and 10 used a very basic nearest-neighbor goal-achievement strategy, our work relies on gradient descent for the local optimization of the (sensitive) parameters of the complex system, which has proven to be very powerful. To do so, we made a differentiable version of the Lenia framework, which is also a contribution of this work. Second, we propose to control subparts of the environmental dynamics with functional constraints (through predefined channels and kernels in Lenia) to build a curriculum of tasks, and to integrate this stochasticity in the inner optimization loop. This has proven central to training the system so that the emerging sensorimotor agents are robust to stochastic perturbations in the environment. In particular, we focus on modeling obstacles in the environment physics and propose to probe the agent's sensorimotor capability as its ability to move forward under a variety of obstacle configurations. We also provide tests and metrics to measure the robustness of the obtained agents.
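
The two-loop structure described above can be sketched as follows. This is a toy illustration only: `simulate` is a hypothetical two-parameter linear system standing in for the differentiable Lenia implementation, finite differences stand in for automatic differentiation, and the curriculum and obstacle components are omitted. All function names are assumptions of this sketch.

```python
import random

def simulate(theta):
    """Toy stand-in for a differentiable simulator: parameters -> behavior."""
    a, b = theta
    return (a + 0.5 * b, a - b)

def loss(theta, goal):
    """Squared distance between the reached behavior and the goal."""
    bx, by = simulate(theta)
    return (bx - goal[0]) ** 2 + (by - goal[1]) ** 2

def grad(theta, goal, eps=1e-5):
    """Central finite differences; a differentiable simulator would use autodiff."""
    g = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += eps
        down[i] -= eps
        g.append((loss(up, goal) - loss(down, goal)) / (2 * eps))
    return g

def reach(goal, theta, steps=50, lr=0.1):
    """Inner loop: goal-conditioned gradient descent on the parameters."""
    theta = list(theta)
    for _ in range(steps):
        theta = [t - lr * g for t, g in zip(theta, grad(theta, goal))]
    return theta

def imgep(n_goals=20, seed=0):
    """Outer loop: sample goals, warm-start from the closest past outcome."""
    rng = random.Random(seed)
    start = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    history = [(start, simulate(start))]
    for _ in range(n_goals):
        goal = (rng.uniform(-2, 2), rng.uniform(-2, 2))
        # Pick the stored parameters whose behavior is closest to the goal.
        nearest, _ = min(history, key=lambda h: loss(h[0], goal))
        theta = reach(goal, nearest)
        history.append((theta, simulate(theta)))
    return history
```

The history of (parameters, behavior) pairs plays the role of the population: each new goal is optimized starting from the most relevant previous discovery instead of from scratch.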

Figure 46 (panels a to e): Robustness test to harder/unseen obstacle configurations: straight wall, bigger obstacle, dead ends.
Figure 47 (panels a and b): Change of scale obtained by changing the kernel size and initialization; the grid is the same size in both panels.

While many complex behaviors have already been observed in Lenia, among which some could qualify as sensorimotor behaviors, they have so far been discovered "by chance" as the result of time-consuming manual search or with simple evolutionary algorithms. Our method provides a more systematic way to automatically learn the CA rules leading to the emergence of basic sensorimotor structures, as shown in Figure 48. Moreover, we investigated and provided ways to measure the (zero-shot) generalization of the discovered sensorimotor agents to several out-of-distribution perturbations that were not encountered during training. Impressively, even though the agents still fail to preserve their integrity in certain configurations, they show very strong robustness to most of the tested variations. The agents are able to navigate in unseen and harder environmental configurations while self-maintaining their individuality (Figure 46). The agents are able to recover their individuality not only when subjected to external perturbations but also when subjected to internal perturbations: they resist variations of the morphogenetic processes such as less frequent cell updates, quite drastic changes of scale, and changes of initialization (Figure 47). Furthermore, when tested in a multi-entity initialization and despite having been trained alone, the agents not only preserve their individuality but also show forms of coordinated interactions (attractiveness and reproduction).
Our results suggest that, contrary to the (still predominant) mechanistic view on embodiment, biologically-inspired embodiment could pave the way toward agents with strong coherence and generalization to out-of-distribution changes, mimicking the remarkable robustness of living systems to maintain specific functions despite environmental and body perturbations 116. Searching for rules at the cell level in order to give rise to higher-level cognitive processes at the level of the organism and at the level of the group of organisms opens many exciting opportunities for the development of embodied approaches in AI in general.

Figure 48: Scatter plot of the agents according to their measured robustness to obstacles (y axis) and speed in obstacles (x axis), obtained by IMGEP (red), random search with the same compute resources as IMGEP (blue), and the one from the original Lenia paper (green).

The work was released in 2022 as a Distill-like article, which is currently hosted at this link. This article contains an interactive demo in WebGL and JavaScript, as well as many videos and animations of the results. A Colab notebook with the source code of the work is publicly available.

In 2024, additional quantitative experiments and ablations were conducted. This work was published in 2025 in the journal Science Advances.

8.7.3​​​‌ Semantic Open-Endedness in Flow-Lenia‌ using Vision Language Models‌​‌ and IMGEP

Participants: Sina​​ Khajehabdollahi [correspondent], Gautier​​​‌ Hamon, Marko Cvjetko‌, Cédric Colas,‌​‌ Pierre-Yves Oudeyer, Clément​​ Moulin-Frier.

Discovering diverse​​​‌ visual patterns in continuous‌ cellular automata (CA) is‌​‌ challenging due to the​​ vastness and redundancy of​​​‌ high-dimensional behavioral spaces. Traditional‌ exploration methods like Novelty‌​‌ Search (NS) expand locally​​ by mutating known novel​​​‌ solutions but often plateau‌ when local novelty is‌​‌ exhausted, failing to reach​​ distant, unexplored regions. We​​​‌ introduce Expedition & Expansion‌ (E&E), a hybrid strategy‌​‌ where exploration alternates between​​ local novelty-driven expansions and​​​‌ goal-directed expeditions. During expeditions,‌ E&E leverages a Vision-Language‌​‌ Model (VLM) to generate​​ linguistic goals—descriptions of interesting​​​‌ but hypothetical patterns that‌ drive exploration toward uncharted‌​‌ regions. By operating in​​ semantic spaces that align​​​‌ with human perception, E&E‌ both evaluates novelty and‌​‌ generates goals in conceptually​​ meaningful ways, enhancing the​​​‌ interpretability and relevance of‌ discovered behaviors. Tested on‌​‌ Flow Lenia, a continuous​​ CA known for its​​​‌ rich, emergent behaviors, E&E‌ consistently uncovers more diverse‌​‌ solutions than existing exploration​​ methods. A genealogical analysis​​​‌ further reveals that solutions‌ originating from expeditions disproportionately‌​‌ influence long-term exploration, unlocking​​ new behavioral niches that​​​‌ serve as stepping stones‌ for subsequent search. These‌​‌ findings highlight E&E's capacity​​ to break through local​​​‌ novelty boundaries and explore‌ behavioral landscapes in human-aligned,‌​‌ interpretable ways, offering a​​ promising template for open-ended​​​‌ exploration in artificial life‌ and beyond. The project‌​‌ was published at the​​ Artificial Life 2025 conference​​​‌ 48. A summary‌ and the result visualization‌​‌ are available on the​​ project website.
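
The alternation between expansions and expeditions can be illustrated with the toy sketch below. It is not the published E&E implementation: the behavior space is one-dimensional, `propose_goal` is a hypothetical stub replacing the Vision-Language Model, goals are assumed to be directly reachable, and all parameter values are arbitrary.

```python
import random

def novelty(candidate, archive, k=3):
    """Mean distance to the k nearest archived behaviors (1-D toy space)."""
    dists = sorted(abs(candidate - b) for b in archive)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)

def propose_goal(archive, rng):
    """Hypothetical stub for the VLM: targets a point beyond the explored range."""
    return max(archive) + rng.uniform(1.0, 2.0)

def explore(n_iters=40, expedition_every=10, min_novelty=0.02, seed=0):
    rng = random.Random(seed)
    archive = [0.0]
    for it in range(1, n_iters + 1):
        if it % expedition_every == 0:
            # Expedition: goal-directed jump; the goal is assumed reachable here.
            archive.append(propose_goal(archive, rng))
            continue
        # Expansion: mutate a known solution, keep it only if novel enough.
        candidate = rng.choice(archive) + rng.gauss(0.0, 0.1)
        if novelty(candidate, archive) >= min_novelty:
            archive.append(candidate)
    return archive
```

In this toy setting, local mutations only widen the archive slowly, whereas each expedition opens a distant region that later expansions can then populate.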

8.7.4​​​‌ Exploring Flow-Lenia Universes with‌ a Curiosity-driven AI Scientist:‌​‌ Discovering Diverse Ecosystem Dynamics​​

Participants: Thomas Michel [correspondent]​​​‌, Marko Cvjetko [correspondent]‌, Gautier Hamon,‌​‌ Pierre-Yves Oudeyer, Clément​​ Moulin-Frier.

We present a method for the automated discovery of system-level dynamics in Flow-Lenia (a continuous cellular automaton with mass conservation and parameter localization) using a curiosity-driven AI scientist. This method aims to uncover processes leading to self-organization of evolutionary and ecosystemic dynamics in CAs. We build on previous work which uses diversity search algorithms in Lenia to find self-organized individual patterns, and extend it to large environments that support distinct interacting patterns. We adapt Intrinsically Motivated Goal Exploration Processes (IMGEPs) to drive exploration of diverse Flow-Lenia environments using simulation-wide metrics, such as evolutionary activity, compression-based complexity, and multi-scale entropy. We test our method in two experiments, showcasing its ability to illuminate significantly more diverse dynamics compared to random search. We show qualitative results illustrating how ecosystemic simulations enable self-organization of complex collective behaviors not captured by previous individual pattern search and analysis. We complement automated discovery with an interactive exploration tool, creating an effective human-AI collaborative workflow for scientific investigation. Though demonstrated specifically with Flow-Lenia, this methodology provides a framework potentially applicable to other parameterizable complex systems where understanding emergent collective properties is of interest.

This work​​ was published at the​​​‌ Artificial Life 2025 conference​ 51, with a​‌ companion website containing videos​​ of the discoveries, the​​​‌ interactive exploration tool and​ source code.
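
To give an idea of what a compression-based complexity metric can look like, here is a minimal sketch using the compression ratio of a quantized simulation state. The 8-bit quantization, the use of `zlib` and the function name are assumptions of this example, not the metric definition used in the paper.

```python
import random
import zlib

def compression_complexity(grid):
    """Compressed size over raw size for a grid of values in [0, 1]."""
    # Quantize each cell to one byte, then measure how well the state compresses.
    raw = bytes(min(255, max(0, int(v * 255))) for row in grid for v in row)
    return len(zlib.compress(raw, 9)) / len(raw)

rng = random.Random(0)
uniform = [[0.5] * 64 for _ in range(64)]                        # highly regular state
noisy = [[rng.random() for _ in range(64)] for _ in range(64)]   # unstructured state
print(compression_complexity(uniform), compression_complexity(noisy))
```

A regular state compresses to a small fraction of its size, while an unstructured one barely compresses, so the ratio provides a simulation-wide descriptor that can serve as one axis of a goal space.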

Figure 49: A​ showcase of discovered diversity​‌ while searching for ecosystemic​​ dynamics.

8.7.5 Discovering and​​​‌ Controlling Diverse Self-Organized Patterns​ in Cellular Automata Using​‌ Autotelic Reinforcement Learning

Participants:​​ Marko Cvjetko [correspondent],​​​‌ Gautier Hamon, Pierre-Yves​ Oudeyer, Clément Moulin-Frier​‌.

Autotelic AI algorithms,​​ which pursue self-generated goals,​​​‌ have proven to be​ effective as automated discovery​‌ assistants in cellular automata.​​ Previous work in this​​​‌ domain focused on algorithms​ which produce diverse behaviors​‌ by setting the automaton’s​​ initial conditions. Here, we​​​‌ extend these methods beyond​ initial-condition search and adapt​‌ them to systems that​​ support sequences of closed-loop​​​‌ interventions. Using Lenia (a​ continuous cellular automaton) as​‌ a test environment, we​​ train goal-conditioned reinforcement learning​​​‌ agents to perform targeted​ interventions during the system’s​‌ evolution, guiding it towards​​ desired states. The resulting​​​‌ agent behaviors are robust​ and diverse, demonstrating the​‌ potential of closed-loop interaction​​ for discovery and control.​​​‌ Furthermore, we show that​ goal-conditioned RL agents performing​‌ interventions can discover novel​​ self-organising patterns and generalize​​​‌ to previously unseen and​ noisy environments. The project​‌ was presented as a​​ late-breaking abstract at the​​​‌ Artificial Life 2025 conference​ 45, and disseminated​‌ through a website.​​

9 Bilateral contracts and​​​‌ grants with industry

9.1​ Bilateral contracts with industry​‌

  • CATIE: CIFRE PhD grant​​ of Matisse Poupard with​​​‌ CATIE and EPSYLON Lab​ (Univ. Montpellier) until April​‌ 2025.
  • Hugging Face: PhD of Clément Romac with Hugging Face on "Augmenting curiosity-driven exploration with very large language models in deep reinforcement learning agents".

We received a 70 k€ grant from Google, as a PhD fellowship for Julien Pourcel.

9.2 Bilateral grants with foundations

CLEMENCE​ Cohort (Fondation de France​‌ and Théa Pharma)

Participants: Hélène Sauzéon [correspondent], Cécile Mazon, Cécile Delcourt.

The project "Cohorte LongitudinalE sur la Myopie et le développement oculaire dans l’ENfanCE" (CLEMENCE) is led by C. Delcourt from the Bordeaux Population Health lab (2M€). Hélène Sauzéon and Cécile Mazon participate in the research program by studying developmental changes in visual attention due to myopia.

10 Partnerships‌ and cooperations

10.1 International‌​‌ initiatives

10.1.1 Inria associate​​ team not involved in​​​‌ an IIL or an‌ international program

Participants: Hélène Sauzéon, Edith Law.

CuriousTECH
  • Title:
    Curiosity-Driven​​​‌ Learning Across the Lifespan‌
  • Duration:
    2023 -> 2025‌​‌
  • Coordinator:
    Edith Law (edith.law@uwaterloo.ca)​​
  • Partners:
    • University of Waterloo, Waterloo (Canada)
  • Inria contact:
    Hélène Sauzéon
  • Summary:
    For several years, the HCI lab and the cognitive neuroscience lab of the University of Waterloo (Canada) have been collaborating with researchers from the Bordeaux site, especially the Flowers team from Inria and the ACTIVE team from the BPH laboratory (Inserm, Univ. Bordeaux). This collaboration is motivated by a common desire to better understand the role of curiosity in lifelong learning, and to open a new multidisciplinary research avenue on the design of original interactive systems for (re)education. Several studies report that curiosity is beneficial not only to children and young adults but also to older adults and neurodiverse individuals. This field of study is in its infancy and deserves collaborative efforts to identify the underlying cognitive mechanisms and the learning situations that benefit them, in order to ultimately design and develop curiosity-driven (re)educational technologies (ETs), and then deploy them in natural environments (school, home) where they can be reliably and rigorously tested. For this multidisciplinary purpose, the consortium gathers expertise in AI, HCI, cognitive science, and psychology in order to cover the objectives of the proposed associate team, CuriousTECH. In addition to its scientific potential, the team also aims at a quick transfer of the ETs in France and Canada towards the socio-economic fields of EdTech and e-health.

10.2‌ European initiatives

10.2.1 Horizon‌​‌ Europe

Participants: Cédric Colas​​.

INTERACT

INTERACT project​​​‌ on cordis.europa.eu

  • Title:
    Help‌ Me Grow: Artificial Cognitive‌​‌ Development via Human-Agent Interactions​​ Supported by New Interactive,​​​‌ Intrinsically Motivated Program Synthesis‌ Methods.
  • Duration:
    From October‌​‌ 1, 2022 to August​​ 31, 2026
  • Partners:
    • INSTITUT​​​‌ NATIONAL DE RECHERCHE EN‌ INFORMATIQUE ET AUTOMATIQUE (INRIA),‌​‌ France
    • MASSACHUSETTS INSTITUTE OF​​ TECHNOLOGY (MIT), United States​​​‌
  • Inria contact:
    Cédric Colas
  • Coordinator:
  • Summary:
    Building machines that interact with their world, discover interesting interactions and learn open-ended repertoires of skills is a long-standing goal in AI. This project aims at tackling the limits of current AI systems by building on three families of methods: Bayesian program induction, intrinsically motivated learning and human-machine linguistic interactions. It targets three objectives: 1) building autonomous agents that learn to generate programs to solve problems with occasional human guidance; 2) studying linguistic interactions between humans and machines via web-based experiments (e.g. properties of human guidance, its impact on learning, human subjective evaluations); and 3) scaling the approach to the generation of constructions in Minecraft, guided by real players. The researcher will collaborate with scientific pioneers and experts in the key fields and methods supporting the project. This includes supervisors Joshua Tenenbaum (program synthesis, MIT) and Pierre-Yves Oudeyer (autonomous learning, Inria); diverse collaborators, and an advisory board composed of an entrepreneur and leading scientists in developmental psychology and human-robot interactions. The 3rd objective will be pursued via a secondment with Thomas Wolf (CSO) at HuggingFace, a world-leading company in the open source development of natural language processing methods and their transfer to the industry. By enabling users to participate in the training of artificial agents, the project aims to open research avenues for more interpretable, performant and adaptive AI systems. This will result in scientific (e.g. interactive program synthesis approaches), societal (e.g. democratized AI training) and economic impacts (e.g. adaptive AI assistants). The dissemination, communication and exploitation plans support these objectives by targeting scientific (AI, cognitive science), industrial (video games, smart homes) and larger communities (gamers, software engineers, large public).

10.2.2 Other European programs/initiatives

Participants: Hélène Sauzéon, Pierre-Yves Oudeyer, Mathias Grüber.

DEVCUR:

ORA​‌ project 2024-2027 - Open​​ Research Area (ORA) for​​​‌ the Social Sciences 8th​ call for proposals

  • Title:​‌
    How curiosity enhances learning​​ across childhood and adolescence:​​​‌ The role of metacognition​ and agency.
  • Duration:
    From September 1, 2025 to December 31, 2027
  • Partners:​​​‌
    • INSTITUT NATIONAL DE RECHERCHE​ EN INFORMATIQUE ET AUTOMATIQUE​‌ (INRIA), France
    • Cardiff University,​​ UK
    • Max Planck Institute, Berlin, Germany
  • Inria contact:
    Pierre-Yves Oudeyer and Hélène Sauzéon
  • Coordinator:
    Mathias Grüber, Brain and Imagery centre, Cardiff University, UK / Funds: 1,177 k€
  • Summary:
    This project investigates the bidirectional relationship between curiosity-based learning and metacognition during late childhood and adolescence, a critical period when both abilities develop. Using five experiments with behavioral, neuroimaging, training, and longitudinal methods, three research teams from Cardiff, Bordeaux, and Trier will examine how metacognition and agency enhance curiosity-driven learning. The study will explore both individual differences and developmental changes in how metacognitive awareness strengthens curiosity's learning benefits. Findings will be translated into classroom interventions to stimulate curiosity and metacognition in educational settings. This interdisciplinary collaboration aims to advance understanding of curiosity development with significant scientific and societal impact.

10.3​​ National initiatives

GAIMHE project​​​‌ (BPI France 2030):

GAIMHE​ is a strategic research​‌ and innovation project funded​​ by Bpifrance, coordinated by​​​‌ EvidenceB in partnership with​ the Flowers AI &​‌ Cognitive Science Laboratory at​​ Inria, Café pédagogique, and​​​‌ Association Class'Code. The project​ aims to develop next-generation​‌ intelligent tutoring systems that​​ combine the pedagogical rigor​​​‌ of traditional adaptive learning​ algorithms with the flexibility​‌ of generative AI. Current​​ educational AI technologies present​​​‌ a fundamental trade-off. Intelligent​ Tutoring Systems (ITS) offer​‌ pedagogically grounded, personalized curricula​​ through algorithms such as​​​‌ ZPDES, but require substantial​ manual content development. Conversely,​‌ generative AI provides interactional​​ flexibility yet lacks pedagogical​​​‌ structure, cannot support long-term​ curriculum personalization, and presents​‌ significant computational costs. GAIMHE​​ proposes a hybrid methodology​​​‌ structured around three axes:​ automated content generation leveraging​‌ generative AI for the​​ rapid creation of pedagogically​​​‌ validated exercises to populate​ ITS knowledge graphs; targeted​‌ generative assistance deploying optimally-sized​​ models to provide pedagogically​​ principled guidance at key​​​‌ learning moments; and advanced‌ personalization through compact student‌​‌ models capable of predicting​​ and adapting learning trajectories​​​‌ across extensive exercise spaces,‌ building on prior MAGELLAN‌​‌ research. The project benefits​​ from EvidenceB's existing infrastructure,​​​‌ which serves tens of‌ thousands of classrooms across‌​‌ primary and secondary education​​ in France. This enables​​​‌ large-scale evaluation with authentic‌ learning data. In partnership‌​‌ with Région Île-de-France, the​​ consortium will release annotated​​​‌ datasets, learning analytics tools,‌ and software components as‌​‌ open-source digital commons to​​ support France's educational technology​​​‌ ecosystem.

ANR Chaire Individuelle‌ Deep Curiosity

- PY‌​‌ Oudeyer continued to work​​ on the research program​​​‌ of this Chaire, funding‌ 2 PhDs and 3‌​‌ postdocs for five years​​ (until 2025).

ANR JCJC​​​‌ ECOCURL

- C. Moulin-Frier obtained an ANR JCJC grant. The project is entitled "ECOCURL: Emergent communication through curiosity-driven multi-agent reinforcement learning". The project started in February 2021 for a duration of 48 months. It funds a PhD student (36 months) and a Research Engineer (18 months), as well as 4 Master internships (one per year).

Projet AIxIA: "Analyse d’Interférences​​ par Intelligence Artificielle".

Pierre-Yves Oudeyer and Clément Moulin-Frier obtained a grant from the call for projects AIRSTRIP "L'intelligence Artificielle au service de l'IngénieRie des SysTèmes aéRonautIques et sPatiaux", in collaboration with the IRT Saint Exupéry. The project was accepted in 2023 and funds 18 months of a research engineer position starting in 2024.

Inria Exploratory‌ Action AIDE

- Didier Roy is a collaborator of the Inria Exploratory Action AIDE "Artificial Intelligence Devoted to Education", led by Frédéric Alexandre (Inria Mnemosyne Project-Team), Margarida Romero (LINE Lab) and Thierry Viéville (Inria Mnemosyne Project-Team, LINE Lab). The aim of this Exploratory Action is to explore to what extent approaches or methods from cognitive neuroscience, linked to machine learning and knowledge representation, could help to better formalize human learning as studied in educational sciences. AIDE is a four-year project that started in mid-2020 and ran until 2024.

Inria​​ Exploratory Action I'AM

- Hélène Sauzéon is co-PI with P. Dragicevic of the Inria Exploratory Action I'AM "Impact of Augmented Reality on Autobiographical Memory: Examining Involuntary Memories and False Memories" (174.5k€). Started last September, this Exploratory Action aims to explore to what extent augmented-reality devices can produce erroneous autobiographical memories, particularly in vulnerable people (children, older adults, or young adults with low source-monitoring memory abilities).

New collaboration with Maxime​​ Derex from IAST Toulouse​​​‌

for the co-direction of the PhD thesis of Jeremy Perez with Clément Moulin-Frier and Pierre-Yves Oudeyer on "Interactions between intrinsically motivated goal-exploration processes and cumulative cultural evolution" (see section 8.2.2).

France​​​‌ 2030 - PPR AUTONOMIE‌ : Vieillissement Et Situations‌​‌ De Handicap - Projet​​ INNOVCare (Lechevalier S., 3,5M€)​​​‌ (2023-26)

- Hélène Sauzéon and AS Rigaud will supervise WP5, dedicated to two care-led innovation experiments with assistive technologies (400k€ for Bordeaux).
- Hélène Sauzéon is responsible for WP3 "Digital technology for aging in place" (470k€/3.5M€), Défi 4 - Numérique, Innovcare (PPR Autonomie PIA2030, 2023-28).

VBHI project (Vascular Brain Health Institute - IHU, led by S. Debette, 5M€) (2023-26)

- Hélène Sauzéon will supervise WP4.3, dedicated to "Explore Digital Therapeutics To Slow Down Cognitive Decline In Covert Csvd" (150k€).

11​‌ Dissemination

11.1 Promoting scientific​​ activities

11.1.1 Scientific events:​​​‌ organisation

PY Oudeyer continued to be a member of the organization committee of the Life, Structure and Cognition symposium series at IHES, France. H. Sauzéon continued to be a member of the Technical Program Committee of the ACHI conference.

11.1.2 Reviewer​‌ - reviewing activities

Matisse Poupard has reviewed for Computers & Education, Education and Information Technologies, Frontiers in Psychology and Frontiers in Virtual Reality. Jeremy Perez has reviewed for the Judgment and Decision Making Journal and Topics in Cognitive Science. Hélène Sauzéon has reviewed for ACM CHI25, British Journal of Psychology, Computers in Human Behavior, and Journal of Research in Science Teaching. Cécile Mazon has reviewed for Nature Scientific Reports, Education and Information Technologies, and BMC Psychology.

PY Oudeyer was​‌ a reviewer for the​​ journals Developmental Science and​​​‌ Child Development.

11.1.3 Invited​ talks

Matisse Poupard gave​‌ 3 invited talks:

Clément Romac gave 2 invited talks:

  • (May​​​‌ 2025) Invited talk at​ the ISIR lab from​‌ Sorbonne University on “Grounding​​ LLMs through curiosity-driven online​​​‌ RL”.
  • (September 2025) Invited talk at the SMILES workshop at ICDL on “Grounding LLMs through curiosity-driven online RL”.

Marie-Sarah Desvaux​ gave 1 invited talk:​‌

Hélène Sauzéon gave 3 invited talks:

  • (October 2025) "Supporting digital accessibility of MOOC-based learning for individuals with cognitive impairments: The Aïana project", Intersections: Translation, Accessibility, Inclusion, Forum des Savoirs, MSH Dijon
  • (May 2025) "Les interventions de santé pour le bien-vieillir des personnes âgées à l'aide de technologies numériques", Journée IFRATH - Troubles sociocognitifs et technologies : Perspectives sur l'enfance et le vieillissement, Institut National de Jeunes Sourds de Paris, 254 Rue Saint-Jacques, 75005 Paris.
  • (November 2025) The​​​‌ intrinsic motivations as design​ principles of technologies for​‌ cognition : Examples about​​ Educational Technologies and Technologies​​​‌ for aging in place,​ Institut de Psychologie, Université​‌ Paris Cité, Paris

Loris​​ Gaven gave 2 invited​​​‌ talks:

  • (January 2026) "Toward​ Artificial Curiosity" at University​‌ of Padua (Online)
  • (August​​ 2025) "MAGELLAN: Metacognitive predictions​​​‌ of learning progress guide​ autotelic LLM agents in​‌ large goal spaces" at​​ the Metacognitive Satellite of​​ CCN in Amsterdam.

PY​​​‌ Oudeyer gave these invited‌ talks:

11.1.4 Scientific expertise

Hélène​​ Sauzéon was:

  • Vice-president of the Pluridisciplinary committee (Digital Sciences and Humanities) of the French National Research Agency (CES 38 - Interface, ANR), since 2025
  • Member of the Research Council of Finland POC panel (12 projects on applied computer sciences) in 2025
  • Member of the Scientific Committee of Calyxis, a center focused on research and development of technological solutions to prevent daily accidents through public laboratory-enterprise collaborations, since 2019.
  • Expert for grant applications: evaluation of 2 CIFRE-ANRT PhD proposals; evaluation of 1 GATES (Grenoble ATtractiveness and ExcellenceS) proposal for the SHS Cluster of Université Grenoble in 2025.
  • Member of the committee for a permanent Professor position in psychology at the University of Bordeaux
  • Member of the committee for a permanent Assistant Professor position in Occupational Science (91 section) at the University of Limoges

PY Oudeyer​​ was:

  • a reviewer and​​​‌ expert for ANR (National‌ Research Agency) as well‌​‌ as for the European​​ Research Council (ERC), the​​​‌ Cyprus Research Council and‌ for the Swedish Foundation‌​‌ for Strategic Research.
  • an​​ invited expert for the​​​‌ "Curiosity Convening" event organized‌ by the Scratch Foundation‌​‌ at OECD, Paris.
  • invited to be a member of the scientific council of the "La main à la pâte" foundation.
  • a member of the​​​‌ GT "IA et éducation"‌ at Conseil Scientifique de‌​‌ l'Education Nationale.

Cécile Mazon​​ reviewed one proposal for​​​‌ ANR-AAPG JCJC.

11.1.5 Research‌ administration

Hélène Sauzéon was:‌​‌

  • Member of the Research​​ Committee of IMT Atlantique​​​‌ since 2025, working to‌ promote Human and Social‌​‌ Sciences (SHS) in engineering​​ education.
  • Co-organizer (Inria) since​​​‌ 2024 of the annual‌ "JS & GT 'Handicap'"‌​‌ (Thematic Days & Working​​ Groups on Disability) and​​​‌ contributor to the consultation‌ for Inria's 2025 Disability‌​‌ Roadmap.
  • Member of the extended "BCP" of the BSO Inria centre, since 2020. Advisory roles for the centre's “surrounding” scientific policy and strategy, recruitment of permanent researchers, monitoring and assistance in setting up Inria teams, organization of internal scientific events, and writing support for communication staff on popularization content about AI, disability, health and education, etc.
  • PIQ referent for the Inria centre at the University of Bordeaux, covering 3 universities (Bordeaux, La Rochelle, Limoges), since 2024. The role is twofold: 1) to follow up and help site referents and applicants define and draft projects while ensuring compliance with PIQ program policy, in close dialogue with PIQ staff, and 2) to inform the centre's scientific management of applications in progress in Nouvelle-Aquitaine via a dedicated "pad", and of their positioning in relation to the PIQ program's national results.
  • Referent for the Education topic at the Inria centre at the University of Bordeaux (covering 3 teams: Bivouac, Flowers, Mnemosyne), and the centre's representative at RTP CNRS Éducation.
  • Head of the Inria Associate Team CuriousTech (Inria and Univ. of Waterloo, Canada), since 2023. The multi-disciplinary program (Prof. M. Fernandes' psychology lab and Edith Law's HCI lab) involves designing innovative interactive systems for education and cognitive health at all ages, with the particularity of leveraging intrinsic motivations (self-determination and curiosity) as reinforcers of human performance.
  • Member of the Direction Committee of IFR Handicap (Inserm, labelled Fedhra since 2023), since 2019
  • Member of the Direction Committee of BIND, the Bordeaux BIND centre of excellence, since 2019
  • Member of the Scientific Committee of SOUND, the Bordeaux TND centre of excellence, since 2025
  • Head of the research axis on Innovative Interventions in the ACTIVE Team (BPH Lab), since 2022

Cécile Mazon is:​

  • Co-lead of the Digital Tools work package of the PIA Atypie-Friendly
  • Local contact​​​‌ for Inria HandiTechLab
  • Member​ of the Digital Tools​‌ axis of the Bordeaux​​ Excellence center for Neurodevelopmental​​​‌ disorders (SOUND project)

PY​ Oudeyer was:

  • head of​‌ the Flowers AI &​​ CogSci lab
  • member of​​​‌ the piloting committee of​ the France 2030 BPI​‌ project GAIMHE
  • representative of​​ Inria in the piloting​​​‌ committee of the Nouvelle​ Aquitaine Research Network on​‌ Educational Technologies (R3NumEd)

11.2​​ Teaching - Supervision -​​​‌ Juries - Educational and​ pedagogical outreach

11.2.1 Teaching​‌

Cécile Mazon is responsible for:

  • Cognitive science curriculum​​​‌ in MIASHS bachelor (Mathematics​ and Computer Science applied​‌ to Social and Human​​ Sciences) - since 2024​​​‌
  • Technology, Ergonomy, Cognition, Disability​ curriculum in Cognitive Sciences​‌ master, since 2022
  • Apprenticeship​​ academic coordination for Technology,​​​‌ Ergonomy, Cognition, Disability curriculum​ in Cognitive Sciences master​‌ - since 2023

Leslie Tricoche, as ATER, gave the following courses:

  • L2 MIASHS - UFR​‌ Sciences and Technology, Bordeaux​​ University: Neurobiology (lectures and​​​‌ tutorials)
  • L3 MIASHS -​ UFR Sciences and Technology,​‌ Bordeaux University: Neuropathology (lectures​​ and tutorials)
  • M1 Cognitive​​​‌ Sciences - UFR Sciences​ and Technology, Bordeaux University:​‌ Cognitive functions in situations​​ and disabilities (lectures and​​​‌ tutorials)
  • M2 Cognitive Sciences​ - UFR Sciences and​‌ Technology, Bordeaux University: Multiple​​ forms of the profession​​​‌ (lectures and tutorials)

Marie-Sarah Desvaux, as Teaching Assistant, gave the following courses:

  • M2 Cognitive Sciences​​​‌ - UFR Sciences and​ Technology, Bordeaux University: Multiple​‌ forms of the profession​​ - Project management
  • L3​​ MIASHS - UFR Sciences​​​‌ and Technology, Bordeaux University:‌ Web Accessibility

Juliette Deyts, as Teaching Assistant, gave the following courses:

  • M2 Cognitive Sciences -‌ UFR Sciences and Technology,‌​‌ Bordeaux University: Disability, Autonomy,​​ Cognition and Technology

Matisse Poupard, as ATER, gave the following courses:

  • L1 MIASHS - UFR​​ Sciences and Technology, Bordeaux​​​‌ University: Introduction to Cognitive‌ Science
  • L2 MIASHS -‌​‌ UFR Sciences and Technology,​​ Bordeaux University: Neurological Foundations,​​​‌ Cognitive Fundamentals, and Learning‌
  • L3 MIASHS - UFR‌​‌ Sciences and Technology, Bordeaux​​ University: Knowledge and Representations,​​​‌ Language, and Natural Language‌ Processing
  • M1 Cognitive Sciences‌​‌ - UFR Sciences and​​ Technology, Bordeaux University:
    • Scientific​​​‌ Foundations
    • Cognitive Functions in‌ Situations and Disabilities
  • M2‌​‌ Cognitive Sciences - UFR​​ Sciences and Technology, Bordeaux​​​‌ University:
    • Disability, Activity, Cognition,‌ Technology
    • Multiple Forms of‌​‌ the Profession
    • Virtual Reality,​​ Interaction, and Health Applications​​​‌

Cécile Mazon, as assistant professor, gave lectures and tutorials (280hETD) in cognitive sciences to students in the MIASHS bachelor (L1-2-3) and the Cognitive Sciences master (M1-M2). Key teaching topics include introduction to cognitive sciences, cognitive psychology (main cognitive functions, experimental methods), cognitive sciences applied to disability and/or technology design, as well as methodology and statistics.

Hélène Sauzéon participated in the Inria mentoring program as mentor of one PhD student from the Inria Paris centre.

11.2.2 Supervision​​

PY Oudeyer (co-)supervised the​​​‌ following PhD students:

  • PhD defended in 2025: Grgur Kovac, "Developmental training of socio-cognitive abilities in AI systems" (supervisors: PF. Dominey and PY. Oudeyer)
  • PhD defended in 2025: Gautier Hamon, "Open-endedness in artificial life and artificial intelligence: an eco-evo-devo perspective" (supervisor: C. Moulin-Frier)
  • PhD defended in 2025: Nicolas Yax, "Studying cognitive and metacognitive skills in foundation models" (supervisors: S. Palminteri, PY. Oudeyer)
  • PhD defended in 2025: Clément Romac, "Grounding LLMs with online RL" (supervisors: T. Wolf and PY. Oudeyer)
  • PhD in progress: Julien Rosenberger, "Models and experimental study of the co-development of curiosity and metacognition in adolescents" (supervisors: H. Sauzéon, PY. Oudeyer)
  • PhD in progress: Paul Tabbara, "Autotelic generative AI systems for automated discovery in mathematics" (supervisors: G. Baudart, PY. Oudeyer)
  • PhD in progress: Julien Pourcel, "Autotelic LLMs that learn how to code" (supervisors: C. Moulin-Frier and PY. Oudeyer)
  • PhD in progress: Thomas Carta, "LLM-based Autotelic deep reinforcement learning agents" (supervisors: O. Sigaud, S. Lamprier and PY. Oudeyer)
  • PhD in progress: Jeremy Perez, "Studying mechanisms and roles of curiosity in socio-cultural contexts" (supervisors: C. Moulin-Frier, M. Derex, PY. Oudeyer)
  • PhD in progress: Timothé Boulet, "Controller synthesis for artificial agents in simulated environments using generative AI" (supervisors: C. Moulin-Frier, X. Hinault, N. Fijalkow)
  • PhD in progress: Marko Cvjetko, "Autotelic exploration algorithms for automated search of open-endedness in artificial life" (supervisors: C. Moulin-Frier, PY. Oudeyer)
  • PhD in progress: Loris Gaven, "Metacognitive prediction of learning progress for guiding autotelic agents" (supervisors: PY. Oudeyer and C. Moulin-Frier)

H.‌ Sauzéon (co-)supervised the following‌​‌ PhD students:

  • PhD defended in 2025: M. POUPARD, "Curious and thus not overloaded!" (supervisors: H. Sauzéon and A. Tricot; CIFRE with CATIE)
  • PhD in progress: L. PETIOT, "AR effect on memory distortions" (supervisors: H. Sauzéon and P. Dragicevic; AEx I'AM, 2023-25)
  • PhD in progress: C. DESVAUX, "Design and assessment of metacognitive interventions supporting curiosity and creativity at school" (Alloc. MESRI, ED SP2)
  • PhD in progress: J. DEYTS, "Self-determination driven technologies for healthy aging" (Alloc. from Projet ANR Innovcare)
  • PhD in progress: J. ROSENBERGER, "Curiosity-driven learning as a developmental function of metacognition in adolescents aged 12 to 16" (supervisors: H. Sauzéon and PY. Oudeyer; Alloc. from ORA funds, ED SP2 - Univ. Bordeaux)
  • PhD in progress: M. BOURDIL, "A neurotechnological approach using EEG for the characterisation and therapeutic treatment of small vessel syndrome" (supervisors: F. Lotte and H. Sauzéon; Alloc. from IHU-VBHI project)

11.2.3 Juries

PY Oudeyer​ was a member of:​‌

  • the selection committee of​​ the Inria Prizes from​​​‌ Académie des Sciences.
  • the​ PhD juries of Marie​‌ Martin (Université Interdisciplinaire de​​ Paris), Théo Cachet (Sorbonne​​​‌ University) and J. Daly​ (Univ. Texas, Austin)
  • the​‌ PhD "comité de suivi"​​ of Reem al Najjar​​​‌ (Sorbonne Université), Matthis Poupard​ (Univ. Bordeaux), Paul Pacaud​‌ (Université Paris Sciences Lettres)​​

Hélène Sauzéon was a member of 6 PhD juries:

  • "Conception, développement et évaluation d'un exergame en réalité augmentée pour la rééducation cognitivo-motrice d'enfants atteints de Paralysie Cérébrale ou de Lésions Cérébrales Acquises : le projet TERAPACE" by Maxime Balloufaud - Limoges
  • "Optimizing sensory feedback and manual interaction efficiency within XR experiments" by Julien Cauquis - Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire
  • "Careless or care-led innovation? : socio-ethnography of social robots and social ties in eldercare settings in France and Japan : tensions and contradictions in needs, temporalities and representations" by Yuko Tamaki - Paris, EHESS
  • "Neurocognitive mechanisms of self-referenced memory encoding: a naturalistic and embodied approach to episodic memory" by Sylvain Penaud - Université Paris Cité
  • "Intrinsic vs. Extrinsic Motivation: Computational Modelling, Neural Bases, and Clinical Applications" by Jade Seguin - Sorbonne Université
  • "Prise de décision lors de la planification d'itinéraires avec des applications : une approche cognitive pour la régulation des flux voyageurs dans les transports en commun" by Archana Prabhakar - Université Paris Cité

Cécile Mazon is a permanent member of the jury for Cognitive Sciences master thesis defenses (M1/M2) and for bachelor undergraduate projects (L3 MIASHS).

11.2.4 Support to public​​​‌ policies

PY. Oudeyer and​ H. Sauzéon and the​‌ whole team were involved​​ in several major actions​​​‌ to support public policies​ on the topic of​‌ AI and education. Members​​ of the team designed​​​‌ and conducted training sessions​ in different academies for​‌ supervisory staff and teachers,​​ e.g. ETAPP-IA day in​​​‌ Nouvelle-Aquitaine (January 2025); departmental​ training of CPE and​‌ documentary teachers of Nouvelle-Aquitaine​​ during a day at​​​‌ the Lycée Les Iris​ in Lormont (May 2025);​‌ Academic Days of Innovation​​ for teachers of Nouvelle-Aquitaine,​​​‌ Spring Days of Education​ Research at INSPEs, (June​‌ 2025); PhilosophIA Citizens' Convention​​ (April 2025), twin conference​​ of Cnesco/Cardie Charente-Maritime (January​​​‌ 2025), working group Education‌ and Cognitive Sciences of‌​‌ the academies of Créteil,​​ Versailles and Paris, scheduled​​​‌ for March 2026.

H.‌ Sauzéon and PY. Oudeyer‌​‌ were interviewed and wrote​​ reports to contribute to​​​‌ the report of French‌ Senate on AI and‌​‌ education.

PY Oudeyer was heard by the commission on cultural and educational affairs of the French parliament, to discuss the major challenges and opportunities of AI and education.

11.2.5 Educational and‌​‌ pedagogical outreach

Cécile Mazon participated in events promoting university programs in cognitive science: the Salon de l’Étudiant (January 2025), the University of Bordeaux Open Days (January 2025), and the Orientation Days (May 2025).

11.3 Popularization​​

11.3.1 Specific official responsibilities​​​‌ in science outreach structures‌

PY. Oudeyer collaborated with the Pix organization as main scientific and editorial design consultant for the Pix IA training modules, which will be disseminated to all French students in 4ème, 2nde and CAP in 2026.

11.3.2‌​‌ Productions (articles, videos, podcasts,​​ serious games, ...)

PY. Oudeyer gave several public talks on AI and education, available on a YouTube channel.

D. Roy and P-Y. Oudeyer wrote a popular science book to introduce generative AI (mechanisms, applications, societal dimensions) to adolescents, as well as to their teachers and families. It is entitled "C'est (pas) moi, c'est l'IA", and was published in September 2024 by Nathan. It was reviewed in widely distributed magazines (e.g. Magazine de l'APEL) and on radio stations (e.g. France Culture, RFI). The web page of the book is here: link.

A. Torres-Leguet, C.‌​‌ Romac, T. Carta and​​ PY. Oudeyer produced the​​​‌ pedagogical video series "ChatGPT‌ explained in 5 mn",‌​‌ aimed at training generative​​ AI literacy in a​​​‌ wide diversity of students‌ (e.g. high school), available‌​‌ here: link. They​​ are under a Creative​​​‌ Commons licence, CC-BY, enabling‌ open and free reuse.‌​‌ They were already integrated​​ in the MOOC AI4T​​​‌ (link), as‌ well as in an‌​‌ internal training platform of​​ "Académie du Numérique du​​​‌ Ministère de la défense",‌ in a mobile app‌​‌ made by Inria with​​ educational materials related to​​​‌ AI (link),‌ and are being adapted‌​‌ and integrated in a​​ training platform for the​​​‌ whole population of civil‌ servants in France, coordinated‌​‌ by DINUM.

PY Oudeyer​​ wrote a note for​​​‌ the French educational institutions‌ on "IA générative, société‌​‌ et éducation: En quoi​​ l’IA générative représente-elle un​​​‌ enjeu dans la formation‌ des citoyens ?",‌​‌ in the context of​​ the Conférence de Consensus​​​‌ on Nouveaux Savoirs et‌ Nouvelles Compétences des Jeunes‌​‌ of Cnesco, (Nov. 2024)​​

Hélène Sauzéon wrote a​​​‌ web article on the‌ following topic: "Why agency‌​‌ is a key ability​​ in the workplace"

Hélène Sauzéon participated in the "Mental Health and Technology" podcast organized by BPH - Inserm (October 2025) in Bordeaux

Marie-Sarah Desvaux was interviewed by Curieux! Live on Educational Technologies for learning

11.3.3 Participation​​​‌ in Live events

Jeremy‌ Perez and Clément Romac‌​‌ gave a presentation on​​​‌ Artificial Intelligence to high​ school teachers as part​‌ of the "Journée formation​​ IA pour les enseignant.es"​​​‌ at the Bordeaux INRIA​ Center on February 5th.​‌

Clément Romac gave a​​ talk on generative AIs​​​‌ to La main à​ la pâte, a French​‌ association promoting science in​​ classrooms.

Hélène Sauzéon, Cécile Mazon, Sophie Lepennetier, Julien Rosenberger, Loris Gaven, Paul Tabbara and Julien Pourcel hosted a stand at the Village des Sciences on October 11th and 12th. It was an opportunity to introduce the topic of curiosity to visitors of CapScience, especially children and parents.

Hélène Sauzéon and Marie-Sarah Desvaux led a workshop on curiosity-driven learning for teachers and trainers during the "Learning Show" 2025 (13th of October) in Rennes

Hélène Sauzéon gave a talk at the event organized by the "Science with and for Society" programme of the University of Bordeaux: Samedis Sciences #4 "Artificial Intelligence and Education: the future of learning?"

Hélène Sauzéon gave a talk at the event "Journée académique de l’expérimentation" organized by CARDIE Grenoble, Grenoble (14 May 2025)

Hélène Sauzéon​​ participated in "CoAnimation" for​​​‌ the Portes Fermées at​ INRIA, for a workshop​‌ to promote dialogue between​​ digital and social sciences​​​‌

Hélène Sauzéon participated in​ "Circuit scientifique Hors les​‌ murs" in October 2025​​

Hélène Sauzéon participated in​​​‌ the Chiche program, visiting​ 2 to 3 classrooms​‌

Marie-Sarah Desvaux gave a​​ talk on the use​​​‌ of Generative AI in​ classrooms to INSPE students​‌ of University of Bordeaux​​ (March 2025)

Marie-Sarah Desvaux led a workshop on the use of Generative AI in classrooms during the Journée Académique (August 2025) organized by CARDIE Poitiers

Marie-Sarah Desvaux gave an interactive talk on curiosity-driven learning in classrooms to teachers during the Cogni'Forum 2025 (October) organized by "Apprendre et Former avec les Sciences Cognitives"

PY Oudeyer participated in several live events:

11.3.4 Other relevant science outreach activities

Press:

PY. Oudeyer was​​​‌ interviewed, or the work​ of the team was​‌ discussed, in various newspapers,​​ magazines and radios/podcasts:

12 Scientific production

12.1​​​‌ Major publications

  • 1 inproceedings Rania Abdelghani, Edith Law, Chloé Desvaux, Pierre-Yves Oudeyer and Hélène Sauzéon. Interactive environments for training children’s curiosity through the practice of metacognitive skills: a pilot study. IDC 2023 - The 22nd annual ACM Interaction Design and Children Conference, Chicago IL, United States, ACM, November 2023, 495-501.
  • 2 article Rania Abdelghani, Pierre-Yves Oudeyer, Edith Law, Catherine de Vulpillières and Hélène Sauzéon. Conversational agents for fostering curiosity-driven learning in children. International Journal of Human-Computer Studies, 167, November 2022, 102887.
  • 3 article Rania Abdelghani, Yen-Hsiang Wang, Xingdi Yuan, Tong Wang, Pauline Lucas, Hélène Sauzéon and Pierre-Yves Oudeyer. GPT-3-driven pedagogical agents for training children's curious question-asking skills. International Journal of Artificial Intelligence in Education, June 2023.
  • 4 article Maxime Adolphe, Masataka Sawayama, Denis Maurel, Alexandra Delmas, Pierre-Yves Oudeyer and Hélène Sauzéon. An Open-Source Cognitive Test Battery to Assess Human Attention and Memory. Frontiers in Psychology, 13, June 2022.
  • 5 article Adrien Baranes and Pierre-Yves Oudeyer. Active Learning of Inverse Models with Intrinsically Motivated Goal Exploration in Robots. Robotics and Autonomous Systems, 61(1), January 2013, 69-73.
  • 6 inproceedings Thomas Carta, Clément Romac, Thomas Wolf, Sylvain Lamprier, Olivier Sigaud and Pierre-Yves Oudeyer. Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning. International Conference on Machine Learning 2023, 202, 3676-3713, Honolulu, Hawaii, United States, 2023.
  • 7 article Pierre-Antoine Cinquin, Pascal Guitton and Hélène Sauzéon. Towards Truly Accessible MOOCs for Persons with Cognitive Impairments: a Field Study. Human-Computer Interaction, 2021.
  • 8 inproceedings Cédric Colas, Pierre Fournier, Olivier Sigaud, Mohamed Chetouani and Pierre-Yves Oudeyer. CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning. International Conference on Machine Learning, Long Beach, United States, June 2019.
  • 9 inproceedings Cédric Colas, Tristan Karch, Nicolas Lair, Jean-Michel Dussoux, Clément Moulin-Frier, Peter Ford Dominey and Pierre-Yves Oudeyer. Language as a Cognitive Tool to Imagine Goals in Curiosity-Driven Exploration. NeurIPS 2020 - 34th Conference on Neural Information Processing Systems (contains main article and supplementaries), Vancouver / Virtual, Canada, December 2020.
  • 10 inproceedings Mayalen Etcheverry, Clément Moulin-Frier and Pierre-Yves Oudeyer. Hierarchically Organized Latent Modules for Exploratory Search in Morphogenetic Systems. NeurIPS 2020 - 34th Conference on Neural Information Processing Systems, Vancouver / Virtual, Canada, December 2020.
  • 11 article Mayalen Etcheverry, Clément Moulin-Frier, Pierre-Yves Oudeyer and Michael Levin. AI-driven Automated Discovery Tools Reveal Diverse Behavioral Competencies of Biological Networks. eLife, August 2024.
  • 12 article Sébastien Forestier, Rémy Portelas, Yoan Mollard and Pierre-Yves Oudeyer. Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning. Journal of Machine Learning Research, April 2022.
  • 13 inproceedings Loris Gaven, Thomas Carta, Clément Romac, Cédric Colas, Sylvain Lamprier, Olivier Sigaud and Pierre-Yves Oudeyer. MAGELLAN: Metacognitive predictions of learning progress guide autotelic LLM agents in large goal spaces. ICML 2025 - 42nd International Conference on Machine Learning, 267, Vancouver (BC), Canada, 2025.
  • 14 article Jacqueline Gottlieb and Pierre-Yves Oudeyer. Towards a neuroscience of active sampling and curiosity. Nature Reviews Neuroscience, 19(12), December 2018, 758-770.
  • 15 article Gautier Hamon, Mayalen Etcheverry, Bert Wang-Chak Chan, Clément Moulin-Frier and Pierre-Yves Oudeyer. Discovering Sensorimotor Agency in Cellular Automata using Diversity Search. Science Advances, 11(44), 2025.
  • 16 article Grgur Kovač, Rémy Portelas, Masataka Sawayama, Peter Ford Dominey and Pierre-Yves Oudeyer. Stick to your role! Stability of personal values expressed in large language models. PLoS ONE, 19(8), August 2024, e0309114.
  • 17 inproceedings Adrien Laversanne-Finot, Alexandre Péré and Pierre-Yves Oudeyer. Curiosity Driven Exploration of Learned Disentangled Goal Spaces. CoRL 2018 - Conference on Robot Learning, Zürich, Switzerland, October 2018.
  • 18 article Cécile Mazon, Benjamin Clément, Didier Roy, Pierre-Yves Oudeyer and Hélène Sauzéon. Pilot study of an intervention based on an intelligent tutoring system (ITS) for instructing mathematical skills of students with ASD and/or ID. Education and Information Technologies, 2022.
  • 19 inproceedings Eleni Nisioti, Katia Jodogne-del Litto and Clément Moulin-Frier. Grounding an Ecological Theory of Artificial Intelligence in Human Evolution. NeurIPS 2021 - Conference on Neural Information Processing Systems / Workshop: Ecological Theory of Reinforcement Learning, virtual event, France, December 2021.
  • 20 inproceedings Eleni Nisioti, Elías Masquil, Gautier Hamon and Clément Moulin-Frier. Autotelic Reinforcement Learning in Multi-Agent Environments. CoLLAs 2023, Conference on Lifelong Learning Agents, Montréal, Canada, August 2023.
  • 21 inproceedings Eleni Nisioti, Sebastian Risi, Ida Momennejad, Pierre-Yves Oudeyer and Clément Moulin-Frier. Collective Innovation in Groups of Large Language Models. ALIFE 2024 - The Conference on Artificial Life, Copenhagen, Denmark, MIT Press, 2024.
  • 22 inproceedings Alexandre Péré, Sébastien Forestier, Olivier Sigaud and Pierre-Yves Oudeyer. Unsupervised Learning of Goal Spaces for Intrinsically Motivated Goal Exploration. ICLR 2018 - 6th International Conference on Learning Representations, Vancouver, Canada, April 2018.
  • 23 inproceedings Jérémy Perez, Grgur Kovač, Corentin Léger, Cédric Colas, Gaia Molinaro, Maxime Derex, Pierre-Yves Oudeyer and Clément Moulin-Frier. When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings. The Thirteenth International Conference on Learning Representations (ICLR 2025), Singapore, 2025.
  • 24 inproceedingsE.‌​‌Erwan Plantec, G.​​Gautier Hamon, M.​​​‌Mayalen Etcheverry, P.-Y.‌Pierre-Yves Oudeyer, C.‌​‌Clément Moulin-Frier and B.-C.​​ W.Bert Wang-Chak Chan​​​‌. Flow-Lenia: Towards open-ended‌ evolution in cellular automata‌​‌ through mass conservation and​​ parameter localization.The​​​‌ 2023 Conference on Artificial‌ LifeTokyo, JapanMIT‌​‌ Press; MIT PressJuly​​ 2023HALDOI
  • 25​​​‌ inproceedingsR.Rémy Portelas‌, C.Cédric Colas‌​‌, L.Lilian Weng​​, K.Katja Hofmann​​​‌ and P.-Y.Pierre-Yves Oudeyer‌. Automatic Curriculum Learning‌​‌ For Deep RL: A​​ Short Survey.IJCAI​​​‌ 2020 - International Joint‌ Conference on Artificial Intelligence‌​‌Kyoto / Virtuelle, Japan​​January 2021HAL
  • 26​​​‌ articleM.Matisse Poupard‌, F.Florian Larrue‌​‌, M.Martin Bertrand​​, D.Dominique Liguoro​​​‌, A.Andre Tricot‌ and H.Hélène Sauzéon‌​‌. Using virtual reality​​ for enhancing neuroanatomy learning​​​‌ by optimizing cognitive load‌ and intrinsic motivation..‌​‌Computers and Education235​​October 2025, 105332​​​‌HALDOI
  • 27 inproceedings‌J.Julien Pourcel,‌​‌ C.Cédric Colas,​​ G.Gaia Molinaro,​​​‌ P.-Y.Pierre-Yves Oudeyer and‌ L.Laetitia Teodorescu.‌​‌ ACES: Generating diverse programming​​ puzzles with autotelic language​​​‌ models and semantic descriptors‌.NeurIPS 2024 -‌​‌ The 38th Annual Conference​​ on Neural Information Processing​​​‌ SystemsVancouver, Canada2024‌HAL
  • 28 articleJ.‌​‌Julien Pourcel, C.​​Cédric Colas and P.-Y.​​​‌Pierre-Yves Oudeyer. Self-Improving‌ Language Models for Evolutionary‌​‌ Program Synthesis: A Case​​ Study on ARC-AGI.​​​‌Proceedings of Machine Learning‌ Research2025HALDOI‌​‌
  • 29 inproceedings C.Chris Reinke, M.Mayalen Etcheverry and P.-Y.Pierre-Yves Oudeyer. Intrinsically Motivated Discovery of Diverse Patterns in Self-Organizing Systems. International Conference on Learning Representations (ICLR) Source code and videos at https://automated-discovery.github.io/ Addis Ababa, Ethiopia April 2020 HAL
  • 30 articleY.Yadurshana‌​‌ Sivashankar, M.Myra​​ Fernandes, P.-Y.Pierre-Yves​​​‌ Oudeyer and H.Hélène‌ Sauzéon. The beneficial‌​‌ role of curiosity on​​​‌ route memory in children​.Frontiers in Cognition​‌3March 2024HAL​​DOI
  • 31 articleA.​​​‌Alexandr Ten, P.​Pramod Kaushik, P.-Y.​‌Pierre-Yves Oudeyer and J.​​Jacqueline Gottlieb. Humans​​​‌ monitor learning progress in​ curiosity-driven exploration.Nature​‌ Communications121December​​ 2021HALDOI
  • 32​​​‌ inproceedingsZ.Ziang Xiao​, X.Xingdi Yuan​‌, Q. V.Q.​​ Vera Liao, R.​​​‌Rania Abdelghani and P.-Y.​Pierre-Yves Oudeyer. Supporting​‌ Qualitative Analysis with Large​​ Language Models: Combining Codebook​​​‌ with GPT-3 for Deductive​ Coding.IUI 2023​‌ - 28th International Conference​​ on Intelligent User Interfaces​​​‌Sydney, AustraliaACMMarch​ 2023, 75-78HAL​‌DOI
  • 33 inproceedingsN.​​Nicolas Yax, P.-Y.​​​‌Pierre-Yves Oudeyer and S.​Stefano Palminteri. PhyloLM​‌ : Inferring the Phylogeny​​ of Large Language Models​​​‌ and Predicting their Performances​ in Benchmarks.ICLR​‌ 2025Singapore, Singapore2025​​HAL

12.2 Publications of​​​‌ the year

International journals​

International​​​‌ peer-reviewed conferences

National peer-reviewed Conferences

  • 57​​​‌ inproceedingsS.Sofiya Kobylyanskaya​, C.Catherine de​‌ Vulpillères and P.-Y.Pierre-Yves​​ Oudeyer. A hybrid​​​‌ AI approach to educational​ technologies : augmenting ITS​‌ with generative AI.​​Actes de l'atelier Intelligence​​ Artificielle générative et ÉDUcation​​​‌ : Enjeux, Défis et‌ Perspectives de Recherche 2025‌​‌ (IA-ÉDU)20e Conférence en​​ Recherche d’Information et Applications​​​‌ (CORIA) 32ème Conférence sur‌ le Traitement Automatique des‌​‌ Langues Naturelles (TALN) 27ème​​ Rencontre des Étudiants Chercheurs​​​‌ en Informatique pour le‌ Traitement Automatique des Langues‌​‌ (RECITAL) Les 18e Rencontres​​ Jeunes Chercheurs en RI​​​‌ (RJCRI)Marseille, FranceATALA‌ et ARIA2025,‌​‌ 145-148HAL

Conferences without​​ proceedings

Doctoral dissertations‌ and habilitation theses

  • 61‌​‌ thesisG.Gautier Hamon​​. Towards open-ended dynamics​​​‌ in Artificial Life and‌ Artificial Intelligence : an‌​‌ eco-evo-devo perspective.Université​​ de BordeauxMarch 2025​​​‌HAL
  • 62 thesisG.‌Grgur Kovač. Building,‌​‌ evaluating and understanding socio-cultural​​ AI : leveraging concepts​​​‌ and methods from human‌ sciences.Université de‌​‌ BordeauxNovember 2025HAL​​
  • 63 thesis M.Matisse Poupard. Curious and therefore not overloaded: Towards an integrated understanding of curiosity and cognitive load in XR learning environments. Université de Bordeaux September 2025 HAL

Reports​​ & preprints

Scientific popularization​

12.3 Cited publications​

  • 68 inproceedingsR.Rania​‌ Abdelghani, E.Edith​​ Law, C.Chloé​​​‌ Desvaux, P.-Y.Pierre-Yves​ Oudeyer and H.Hélène​‌ Sauzéon. Interactive environments​​ for training children's curiosity​​​‌ through the practice of​ metacognitive skills : a​‌ pilot study.IDC​​ 2023 - The 22nd​​​‌ annual ACM Interaction Design​ and Children ConferenceChicago​‌ IL, United StatesACM​​June 2023, 495-501​​​‌HALDOIback to​ text
  • 69 articleR.​‌Rania Abdelghani, P.-Y.​​Pierre-Yves Oudeyer, E.​​​‌Edith Law, C.​Catherine de Vulpillières and​‌ H.Hélène Sauzéon.​​ Conversational agents for fostering​​​‌ curiosity-driven learning in children​.International Journal of​‌ Human-Computer Studies167November​​ 2022, 102887HAL​​​‌DOIback to text​back to text
  • 70 inproceedings R.Rania Abdelghani, H.Hélène Sauzéon and P.-Y.Pierre-Yves Oudeyer. Generative AI in the Classroom: Can Students Remain Active Learners? NeurIPS 2023 - GAIED Workshop - Conference on Neural Information Processing Systems New Orleans, United States arXiv December 2023 HAL DOI
  • 71 unpublishedM.Maxime​​ Adolphe, M.Marion​​​‌ Pech, M.Masataka​ Sawayama, D.Denis​‌ Maurel, A.Alexandra​​ Delmas, P.-Y.Pierre-Yves​​​‌ Oudeyer and H.Hélène​ Sauzéon. Exploring the​‌ Potential of Artificial Intelligence​​ in Individualized Cognitive Training:​​​‌ a Systematic Review.​December 2023, working​‌ paper or preprintHAL​​DOIback to text​​​‌
  • 72 inproceedingsM.Mehdi​ Alaimi, E.Edith​‌ Law, K. D.​​Kevin Daniel Pantasdo,​​​‌ P.-Y.Pierre-Yves Oudeyer and​ H.Hélène Sauzéon.​‌ Pedagogical Agents for Fostering​​ Question-Asking Skills in Children​​​‌.CHI '20 -​ CHI Conference on Human​‌ Factors in Computing Systems​​Honolulu / Virtual, United​​​‌ StatesApril 2020HAL​DOIback to text​‌
  • 73 articleM.Mark​​ Alfano, K.Kathryn​​​‌ Iurino, P.Paul​ Stey, B.Brian​‌ Robinson, M.Markus​​ Christen, F.Feng​​​‌ Yu and D.Daniel​ Lapsley. Development and​‌ validation of a multi-dimensional​​ measure of intellectual humility​​​‌.PloS one12​82017, e0182950​‌back to textback​​ to text
  • 74 inproceedings​​​‌A.Aurélien Appriou,​ J.Jessy Ceha,​‌ S.Smeety Pramij,​​ D.Dan Dutartre,​​​‌ E.Edith Law,​ P.-Y.Pierre-Yves Oudeyer and​‌ F.Fabien Lotte.​​ Towards measuring states of​​​‌ epistemic curiosity through electroencephalographic​ signals.IEEE SMC​‌ 2020 - IEEE International​​ conference on Systems, Man​​​‌ and CyberneticsToronto /​ Virtual, CanadaOctober 2020​‌HALback to text​​back to text
  • 75​​​‌ inproceedingsP.Paul Barde​, T.Tristan Karch​‌, D.Derek Nowrouzezahrai​​, C.Clément Moulin-Frier​​​‌, C.Christopher Pal​ and P.-Y.Pierre-Yves Oudeyer​‌. Learning to Guide​​ and to Be Guided​​​‌ in the Architect-Builder Problem​.International Conference on​‌ Learning RepresentationsVirtual, France​​April 2022HALback​​ to text
  • 76 inproceedings​​​‌J. C.Jonathan C.‌ Brant and K. O.‌​‌Kenneth O. Stanley.​​ Minimal Criterion Coevolution: A​​​‌ New Approach to Open-Ended‌ Search.Proceedings of‌​‌ the Genetic and Evolutionary​​ Computation ConferenceGECCO '17​​​‌2017, 67--74back‌ to text
  • 77 article‌​‌L.Levin Brinkmann,​​ F.Fabian Baumann,​​​‌ J.-F.Jean-François Bonnefon,‌ M.Maxime Derex,‌​‌ T. F.Thomas F.​​ Müller, A.-M.Anne-Marie​​​‌ Nussberger, A.Agnieszka‌ Czaplicka, A.Alberto‌​‌ Acerbi, T. L.​​Thomas L. Griffiths,​​​‌ J.Joseph Henrich,‌ J. Z.Joel Z.‌​‌ Leibo, R.Richard​​ McElreath, P.-Y.Pierre-Yves​​​‌ Oudeyer, J.Jonathan‌ Stray and I.Iyad‌​‌ Rahwan. Machine Culture​​.Nature Human Behaviour​​​‌711November 2023‌, 1855--1868DOIback‌​‌ to text
  • 78 article​​M.Mauricio Cantor,​​​‌ M.Michael Chimento,‌ S. Q.Simeon Q‌​‌ Smeele, P.Peng​​ He, D.Danai​​​‌ Papageorgiou, L. M.‌Lucy M Aplin and‌​‌ D. R.Damien R​​ Farine. Social network​​​‌ architecture and the tempo‌ of cumulative cultural evolution‌​‌.9back to​​ text
  • 79 articleT.​​​‌Thomas Carta, C.‌Clément Romac, T.‌​‌Thomas Wolf, S.​​Sylvain Lamprier, O.​​​‌Olivier Sigaud and P.-Y.‌Pierre-Yves Oudeyer. Grounding‌​‌ large language models in​​ interactive environments with online​​​‌ reinforcement learning.arXiv‌ preprint arXiv:2302.026622023back‌​‌ to text
  • 80 article J.Jessy Ceha, E.Edith Law, D.Dana Kulić, P.-Y.Pierre-Yves Oudeyer and D.Didier Roy. Identifying Functions and Behaviours of Social Robots for In-Class Learning Activities: Teachers' Perspective. International Journal of Social Robotics September 2021 HAL DOI
  • 81 proceedings Lenia and Expanded Universe. ALIFE 2020: The 2020 Conference on Artificial Life ALIFE 2021: The 2021 Conference on Artificial Life July 2020, 221-229 URL: https://doi.org/10.1162/isal_a_00297 DOI
  • 82 article​​​‌B.-C. W.Bert Wang-Chak‌ Chan. Lenia-biology of‌​‌ artificial life.Complex​​ Systems2832019​​​‌, 251-286back to‌ text
  • 83 miscF.‌​‌François Chollet. On​​ the Measure of Intelligence​​​‌.November 2019DOI‌back to text
  • 84‌​‌ articleJ.Junyi Chu​​ and L. E.Laura​​​‌ E. Schulz. Play,‌ Curiosity, and Cognition.‌​‌Annual Review of Developmental​​ Psychology212020​​​‌, 317-343URL: https://doi.org/10.1146/annurev-devpsych-070120-014806‌DOIback to text‌​‌
  • 85 articleJ.Junyi​​ Chu, J. B.​​​‌Joshua B. Tenenbaum and‌ L. E.Laura E.‌​‌ Schulz. In Praise​​ of Folly: Flexible Goals​​​‌ and Human Cognition.‌Trends in Cognitive Sciences‌​‌287July 2024​​, 628--642DOIback​​​‌ to text
  • 86 phdthesis‌B.Benjamin Clément.‌​‌ Adaptive Personalization of Pedagogical​​ Sequences using Machine Learning​​​‌.Université de Bordeaux‌December 2018HALback‌​‌ to textback to​​ text
  • 87 articleB.​​​‌Benjamin Clément, D.‌Didier Roy, P.-Y.‌​‌Pierre-Yves Oudeyer and M.​​Manuel Lopes. Multi-Armed​​​‌ Bandits for Intelligent Tutoring‌ Systems.Journal of‌​‌ Educational Data Mining (JEDM)​​​‌72June 2015​, 20--48HALback​‌ to textback to​​ text
  • 88 inproceedingsC.​​​‌Cédric Colas, T.​Tristan Karch, N.​‌Nicolas Lair, J.-M.​​Jean-Michel Dussoux, C.​​​‌Clément Moulin-Frier, P.​Peter Dominey and P.-Y.​‌Pierre-Yves Oudeyer. Language​​ as a Cognitive Tool​​​‌ to Imagine Goals in​ Curiosity Driven Exploration.​‌Advances in Neural Information​​ Processing Systems33Curran​​​‌ Associates, Inc.2020,​ 3761--3774URL: https://proceedings.neurips.cc/paper/2020/hash/274e6fcf4a583de4a81c6376f17673e7-Abstract.htmlback​‌ to text
  • 89 article​​C.Cédric Colas,​​​‌ T.Tristan Karch,​ C.Clément Moulin-Frier and​‌ P.-Y.Pierre-Yves Oudeyer.​​ Language and culture internalization​​​‌ for human-like autotelic AI​.412December​‌ 2022, 1068--1076URL:​​ https://doi.org/10.1038/s42256-022-00591-4DOIback to​​​‌ textback to text​
  • 90 articleC.Cédric​‌ Colas, T.Tristan​​ Karch, O.Olivier​​​‌ Sigaud and P.-Y.Pierre-Yves​ Oudeyer. Autotelic Agents​‌ with Intrinsically Motivated Goal-Conditioned​​ Reinforcement Learning: A Short​​​‌ Survey.Journal of​ Artificial Intelligence Research74​‌July 2022, 1159--1199​​URL: https://www.jair.org/index.php/jair/article/view/13554DOIback​​​‌ to text
  • 91 unpublished​C.Cédric Colas,​‌ T.Tristan Karch,​​ O.Olivier Sigaud and​​​‌ P.-Y.Pierre-Yves Oudeyer.​ Intrinsically Motivated Goal-Conditioned Reinforcement​‌ Learning: a Short Survey​​.January 2021,​​​‌ working paper or preprint​HALback to text​‌
  • 92 articleE. S.​​Enrico Sandro Colizzi,​​​‌ R. M.Renske MA​ Vroomans and R. M.​‌Roeland MH Merks.​​ Evolution of multicellularity by​​​‌ collective integration of spatial​ information.eLife9​‌oct 2020, e56349​​URL: https://doi.org/10.7554/eLife.56349DOIback​​​‌ to text
  • 93 article​G.Guy Davidson,​‌ G.Graham Todd,​​ J.Julian Togelius,​​​‌ T. M.Todd M.​ Gureckis and B. M.​‌Brenden M. Lake.​​ Goals as Reward-Producing Programs​​​‌.Nature Machine Intelligence​72February 2025​‌, 205--220DOIback​​ to textback to​​​‌ text
  • 94 articleM.​Maxime Derex and R.​‌Robert Boyd. Partial​​ connectivity increases cultural accumulation​​​‌ within groups.Proceedings​ of the National Academy​‌ of Sciences11311​​March 2016, 2982--2987​​​‌URL: http://www.pnas.org/lookup/doi/10.1073/pnas.1518798113DOIback​ to textback to​‌ text
  • 95 articleM.​​Maxime Derex and A.​​​‌Alex Mesoudi. Cumulative​ Cultural Evolution within Evolving​‌ Population Structures.Trends​​ in Cognitive Sciences24​​​‌82020, 654--667​DOIback to text​‌
  • 96 phdthesisM.Mayalen​​ Etcheverry. Curiosity-driven AI​​​‌ for Science : Automated​ Discovery of Self-Organized Structures​‌.Université de Bordeaux​​November 2023HALback​​​‌ to textback to​ text
  • 97 misc M.Mayalen Etcheverry. Intrinsically Motivated Discovery of Diverse Patterns in Self-Organizing Systems. Self-organisation occurs in many physical, chemical and biological systems, as well as in artificial systems like the Game of Life. Yet, these systems are still full of mysteries and we are far from fully grasping what structures can self-organize, how to represent and classify them, and how to predict their evolution. In this blog post, we present our recent paper which formulates the problem of automated discovery of diverse self-organized patterns in such systems. Using a continuous Game of Life as a testbed, we show how intrinsically-motivated goal exploration processes, initially developed for learning of inverse models in robotics, can efficiently be transposed to this novel application area. March 2020 HAL
  • 98‌ inproceedingsM.Mayalen Etcheverry‌​‌, C.Clément Moulin-Frier​​ and P.-Y.Pierre-Yves Oudeyer​​​‌. Hierarchically Organized Latent‌ Modules for Exploratory Search‌​‌ in Morphogenetic Systems.​​NeurIPS 2020 - 34th​​​‌ Conference on Neural Information‌ Processing SystemsVancouver /‌​‌ Virtual, CanadaDecember 2020​​HALback to text​​​‌
  • 99 articleM.Mayalen‌ Etcheverry, C.Clément‌​‌ Moulin-Frier, P.-Y.Pierre-Yves​​ Oudeyer and M.Michael​​​‌ Levin. AI-driven Automated‌ Discovery Tools Reveal Diverse‌​‌ Behavioral Competencies of Biological​​ Networks.eLifeAugust​​​‌ 2024HALDOIback‌ to textback to‌​‌ textback to text​​
  • 100 miscM.Maxence​​​‌ Faldor, J.Jenny‌ Zhang, A.Antoine‌​‌ Cully and J.Jeff​​ Clune. OMNI-EPIC: Open-endedness​​​‌ via Models of human‌ Notions of Interestingness with‌​‌ Environments Programmed in Code​​.2025, URL:​​​‌ https://arxiv.org/abs/2405.15568back to text‌
  • 101 articleM.Maxime‌​‌ Gasse, D.Damien​​ Grasset, G.Guillaume​​​‌ Gaudron and P.-Y.Pierre-Yves‌ Oudeyer. Using Confounded‌​‌ Data in Latent Model-Based​​ Reinforcement Learning.Transactions​​​‌ on Machine Learning Research‌ JournalAugust 2023HAL‌​‌back to text
  • 102​​ articleJ.Jacqueline Gottlieb​​​‌, P.-Y.Pierre-Yves Oudeyer‌, M.Manuel Lopes‌​‌ and A.Adrien Baranes​​. Information-seeking, curiosity, and​​​‌ attention: computational and neural‌ mechanisms.Trends in‌​‌ Cognitive Sciences1711​​November 2013, 585-93​​​‌HALDOIback to‌ text
  • 103 articleL.‌​‌Louise Goupil and J.​​Joëlle Proust. Curiosity​​​‌ as a Metacognitive Feeling‌.Cognition231February‌​‌ 2023, 105325DOI​​back to text
  • 104​​​‌ articleJ. P.J.‌ P. Guilford. Creativity:‌​‌ Yesterday, Today, and Tomorrow​​.The Journal of​​​‌ Creative Behavior11‌1967, 3--14DOI‌​‌back to text
  • 105​​ articleY.Yejia Guo​​​‌ and B.Baker Ayoun‌. What's in It‌​‌ for Them? The Role​​ of Social Curiosity and​​​‌ Social Needs in Motivating‌ and Retaining Hospitality Employees‌​‌.International Journal of​​ Hospitality Management1152023​​​‌, 1--12DOIback‌ to textback to‌​‌ text
  • 106 articleL.​​ P.Lydia Paine Hagtvedt​​​‌, K.Karyn Dossinger‌, S. H.Spencer‌​‌ H. Harrison and L.​​Li Huang. Curiosity​​​‌ Made the Cat More‌ Creative: Specific Curiosity as‌​‌ a Driver of Creativity​​.Organizational Behavior and​​​‌ Human Decision Processes150‌2019, 1--13DOI‌​‌back to text
  • 107​​ articleW. D.W.​​​‌ D. Hamilton. The‌ genetical evolution of social‌​‌ behaviour. I.Journal​​ of Theoretical Biology7​​​‌1July 1964,‌ 1--16URL: https://www.sciencedirect.com/science/article/pii/0022519364900384DOI‌​‌back to text
  • 108​​ articleW.W.D. Hamilton​​​‌. The genetical evolution‌ of social behaviour. II‌​‌.Journal of Theoretical​​ Biology711964​​​‌, 17-52URL: https://www.sciencedirect.com/science/article/pii/0022519364900396‌DOIback to text‌​‌
  • 109 articleR. A.​​Ryan A. Hargrove and​​​‌ J. L.John L.‌ Nietfeld. The Impact‌​‌ of Metacognitive Instruction on​​​‌ Creative Problem Solving.​Journal of Experimental Education​‌8332015,​​ 291--318DOIback to​​​‌ text
  • 110 articleF.-M.​Freda-Marie Hartung and B.​‌Britta Renner. Perceived​​ and Actual Social Discrimination:​​​‌ The Case of Overweight​ and Social Inclusion.​‌Frontiers in Psychology4​​2013DOIback to​​​‌ textback to text​
  • 111 articleJ.Joseph​‌ Henrich, R.Robert​​ Boyd, M.Maxime​​​‌ Derex, M. A.​Michelle A Kline,​‌ A.Alex Mesoudi,​​ M.Michael Muthukrishna,​​​‌ A. T.Adam T​ Powell, S. J.​‌Stephen J Shennan and​​ M. G.Mark G​​​‌ Thomas. Understanding cumulative​ cultural evolution.Proceedings​‌ of the National Academy​​ of Sciences11344​​​‌2016, E6724--E6725back​ to textback to​‌ text
  • 112 articleX.​​Xiaoyu Jia, W.​​​‌Weijian Li and L.​Liren Cao. The​‌ Role of Metacognitive Components​​ in Creative Thinking.​​​‌Frontiers in Psychology10​2019DOIback to​‌ text
  • 113 articleF.​​Frederic Kaplan and P.-Y.​​​‌Pierre-Yves Oudeyer. In​ Search of the Neural​‌ Circuits of Intrinsic Motivation​​.Frontiers in Neuroscience​​​‌11October 2007​, 225--236URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2518057/​‌DOIback to text​​
  • 114 articleT. B.​​​‌Todd B. Kashdan,​ D. J.David J.​‌ Disabato, F. R.​​Fallon R. Goodman and​​​‌ P. E.Patrick E.​ McKnight. The Five-Dimensional​‌ Curiosity Scale Revised (5DCR):​​ Briefer Subscales While Separating​​​‌ Overt and Covert Social​ Curiosity.Personality and​‌ Individual Differences157April​​ 2020, 109836DOI​​​‌back to textback​ to textback to​‌ text
  • 115 articleT.​​ B.Todd B Kashdan​​​‌, M. C.Melissa​ C Stiksma, D.​‌ J.David J Disabato​​, P. E.Patrick​​​‌ E McKnight, J.​John Bekier, J.​‌Joel Kaji and R.​​Rachel Lazarus. The​​​‌ five-dimensional curiosity scale: Capturing​ the bandwidth of curiosity​‌ and identifying four unique​​ subgroups of curious people​​​‌.Journal of Research​ in Personality732018​‌, 130--149back to​​ text
  • 116 articleH.​​​‌Hiroaki Kitano. Biological​ robustness.Nature Reviews​‌ Genetics5112004​​, 826--837back to​​​‌ text
  • 117 articleW.​Wilma Koutstaal, K.​‌Kara Kedrick and J.​​Joshua Gonzalez-Brito. Capturing,​​​‌ Clarifying, and Consolidating the​ Curiosity-Creativity Connection.12​‌1September 2022,​​ 15300DOIback to​​​‌ textback to text​
  • 118 unpublished G.Grgur Kovač, R.Rémy Portelas, K.Katja Hofmann and P.-Y.Pierre-Yves Oudeyer. SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents. October 2021, working paper or preprint HAL
  • 120 articleK.​​​‌ N.Kevin N Laland​, T.Tobias Uller​‌, M. W.Marcus​​ W Feldman, K.​​Kim Sterelny, G.​​​‌ B.Gerd B Müller‌, A.Armin Moczek‌​‌, E.Eva Jablonka​​ and J.John Odling-Smee​​​‌. The extended evolutionary‌ synthesis: its structure, assumptions‌​‌ and predictions.Proceedings​​ of the royal society​​​‌ B: biological sciences282‌18132015, 20151019‌​‌back to text
  • 121​​ articleD.David Lazer​​​‌ and A.Allan Friedman‌. The Network Structure‌​‌ of Exploration and Exploitation​​.Administrative Science Quarterly​​​‌524December 2007‌, 667--694URL: http://journals.sagepub.com/doi/10.2189/asqu.52.4.667‌​‌DOIback to text​​
  • 122 inproceedingsJ. Z.​​​‌Joel Z. Leibo,‌ V.Vinicius Zambaldi,‌​‌ M.Marc Lanctot,​​ J.Janusz Marecki and​​​‌ T.Thore Graepel.‌ Multi-Agent Reinforcement Learning in‌​‌ Sequential Social Dilemmas.​​Proceedings of the 16th​​​‌ Conference on Autonomous Agents‌ and MultiAgent SystemsAAMAS‌​‌ '17São Paulo, Brazil​​2017, 464–473back​​​‌ to textback to‌ text
  • 123 articleS.‌​‌Sheina Lew-Levy and D.​​Dorsa Amir. Children​​​‌ as Agents of Cultural‌ Adaptation.The Behavioral‌​‌ and Brain SciencesDecember​​ 2024, 1--68DOI​​​‌back to text
  • 124‌ articleW. A.Winter‌​‌ A. Mason, A.​​Andy Jones and R.​​​‌ L.Robert L. Goldstone‌. Propagation of innovations‌​‌ in networked groups.​​Journal of Experimental Psychology:​​​‌ General13732008‌, 422--433URL: http://doi.apa.org/getdoi.cfm?doi=10.1037/a0012798‌​‌DOIback to text​​
  • 125 articleC.Cécile​​​‌ Mazon, B.Benjamin‌ Clément, D.Didier‌​‌ Roy, P.-Y.Pierre-Yves​​ Oudeyer and H.Hélène​​​‌ Sauzéon. Pilot study‌ of an intervention based‌​‌ on an intelligent tutoring​​ system (ITS) for instructing​​​‌ mathematical skills of students‌ with ASD and/or ID‌​‌.Education and Information​​ Technologies2022HALDOI​​​‌back to text
  • 126‌ articleC.Cécile Mazon‌​‌, K.Kattalin Etchegoyhen​​, I.Isabeau Saint-Supery​​​‌, A.Anouck Amestoy‌, M.Manuel Bouvard‌​‌, C.Charles Consel​​ and H.Hélène Sauzéon​​​‌. Fostering parents-professional collaboration‌ for facilitating the school‌​‌ inclusion of students with​​ ASD: Design of the​​​‌ ''ToGather'' web-based prototype.‌Educational Technology Research and‌​‌ DevelopmentDecember 2021HAL​​DOIback to text​​​‌
  • 127 articleC.Cécile‌ Mazon, C.Charles‌​‌ Fage and H.Hélène​​ Sauzéon. Effectiveness and​​​‌ usability of technology-based interventions‌ for children and adolescents‌​‌ with ASD: A systematic​​ review of reliability, consistency,​​​‌ generalization and durability related‌ to the effects of‌​‌ intervention.Computers in​​ Human Behavior93April​​​‌ 2019HALDOIback‌ to text
  • 128 incollection C.Cécile Mazon and H.Hélène Sauzéon. Utilisation des technologies mobiles auprès des enfants avec TSA. Autisme et usages du numériques en éducation 2022 HAL
  • 129 inproceedings‌​‌E.Eric Meyer,​​ H.Hélène Sauzéon,​​​‌ I.Isabeau Saint-Supery and‌ C.Cécile Mazon.‌​‌ Systematic review of technologies​​ to collaborate and co-educate​​​‌ students with special educational‌ needs and supporting their‌​‌ schooling.IHIET 2023​​ - 10th International Conference​​​‌ on Human Interaction and‌ Emerging Technologies111Nice,‌​‌ FranceAHFE InternationalAugust​​ 2023, 1-12HAL​​​‌DOIback to text‌
  • 130 articleA. B.‌​‌Andrea B. Migliano,​​​‌ F.Federico Battiston,​ S.Sylvain Viguier,​‌ A. E.Abigail E.​​ Page, M.Mark​​​‌ Dyble, R.Rodolph​ Schlaepfer, D.Daniel​‌ Smith, L.Leonora​​ Astete, M.Marilyn​​​‌ Ngales, J.Jesus​ Gomez-Gardenes, V.Vito​‌ Latora and L.Lucio​​ Vinicius. Hunter-gatherer multilevel​​​‌ sociality accelerates cumulative cultural​ evolution.Science Advances​‌69February 2020​​, eaax5913DOIback​​​‌ to text
  • 131 phdthesis​C.Clément Moulin-Frier.​‌ The Ecology of Open-Ended​​ Skill Acquisition.Université​​​‌ de Bordeaux (UB)December​ 2022HALback to​‌ text
  • 132 articleK.​​Kou Murayama. A​​​‌ Reward-Learning Framework of Knowledge​ Acquisition: An Integrated Account​‌ of Curiosity, Interest, and​​ Intrinsic--Extrinsic Rewards.Psychological​​​‌ Review12912022​, 175--198DOIback​‌ to text
  • 133 article​​K.Kou Murayama,​​​‌ L.Lily FitzGibbon and​ M.Michiko Sakaki.​‌ Process Account of Curiosity​​ and Interest: A Reward-Learning​​​‌ Perspective.Educational Psychology​ Review314December​‌ 2019, 875--895URL:​​ http://link.springer.com/10.1007/s10648-019-09499-9DOIback to​​​‌ text
  • 134 techreport A.Arun Nair, P.Praveen Srinivasan, S.Sam Blackwell, C.Cagdas Alcicek, R.Rory Fearon, A.Alessandro De Maria, V.Vedavyas Panneershelvam, M.Mustafa Suleyman, C.Charles Beattie, S.Stig Petersen, S.Shane Legg, V.Volodymyr Mnih, K.Koray Kavukcuoglu and D.David Silver. Massively Parallel Methods for Deep Reinforcement Learning. arXiv:1507.04296 [cs] July 2015, URL: http://arxiv.org/abs/1507.04296
  • 135 articleX.Xinxiao​ Nie, Y.Yuan​‌ Tian, M.Mengjie​​ Liu, D.Di​​​‌ Wu and Y.Yunxiao​ Guo. The impact​‌ of generative artificial intelligence​​ on students' higher order​​​‌ thinking: Evidence from a​ three-level meta-analysis.Education​‌ and Information Technologies2025​​, 1--32back to​​​‌ text
  • 136 unpublishedE.​Eleni Nisioti, M.​‌Mateo Mahaut, P.-Y.​​Pierre-Yves Oudeyer, I.​​​‌Ida Momennejad and C.​Clément Moulin-Frier. Social​‌ Network Structure Shapes Innovation:​​ Experience-sharing in RL with​​​‌ SAPIENS.July 2022​, working paper or​‌ preprintHALback to​​ text
  • 137 miscE.​​​‌Eleni Nisioti, M.​Mateo Mahaut, P.-Y.​‌Pierre-Yves Oudeyer, I.​​Ida Momennejad and C.​​​‌Clément Moulin-Frier. Social​ Network Structure Shapes Innovation:​‌ Experience-sharing in RL with​​ SAPIENS.arXiv:2206.05060 [cs]​​​‌November 2022, URL:​ http://arxiv.org/abs/2206.05060DOIback to​‌ textback to text​​back to text
  • 138​​​‌ inproceedingsE.Eleni Nisioti​, S.Sebastian Risi​‌, I.Ida Momennejad​​, P.-Y.Pierre-Yves Oudeyer​​​‌ and C.Clément Moulin-Frier​. Collective Innovation in​‌ Groups of Large Language​​ Models.MIT Press​​​‌July 2024, URL:​ https://dx.doi.org/10.1162/isal_a_00730DOIback to​‌ textback to text​​
  • 139 inproceedingsE.Eleni​​​‌ Nisioti, S.Sebastian​ Risi, I.Ida​‌ Momennejad, P.-Y.Pierre-Yves​​ Oudeyer and C.Clément​​​‌ Moulin-Frier. Collective Innovation​ in Groups of Large​‌ Language Models.ALIFE​​ 2024 - The Conference​​​‌ on Artificial LifeCopenhagen,​ DenmarkMIT PressJuly​‌ 2024HALDOIback​​ to text
  • 140 misc​​K.Kenneth O. Stanley​​​‌, J.Joel Lehman‌ and L.Lisa Soros‌​‌. Open-endedness: The last​​ grand challenge you've never​​​‌ heard of.December‌ 2017, URL: https://www.oreilly.com/radar/open-endedness-the-last-grand-challenge-youve-never-heard-of/‌​‌back to text
  • 141​​ articleP.-Y.Pierre-Yves Oudeyer​​​‌, F.F. Kaplan‌ and V.V. Hafner‌​‌. Intrinsic Motivation Systems​​ for Autonomous Mental Development​​​‌.IEEE Transactions on‌ Evolutionary Computation112‌​‌2007, 265--286DOI​​back to text
  • 142​​​‌ articleP.-Y.Pierre-Yves Oudeyer‌, F.Frédéric Kaplan‌​‌ and V.Véréna Hafner​​. Intrinsic Motivation for​​​‌ Autonomous Mental Development.‌IEEE Transactions on Evolutionary‌​‌ Computation112January​​ 2007, 265-286HAL​​​‌DOIback to text‌
  • 143 article P.-Y.Pierre-Yves Oudeyer, J.Jacqueline Gottlieb and M.Manuel Lopes. Intrinsic motivation, curiosity, and learning: Theory and applications in educational technologies. Progress in brain research 229 2016, 257--284
  • 144 inproceedings J.Jérémy Perez, G.Grgur Kovač, C.Corentin Léger, C.Cédric Colas, G.Gaia Molinaro, M.Maxime Derex, P.-Y.Pierre-Yves Oudeyer and C.Clément Moulin-Frier. When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings. The Thirteenth International Conference on Learning Representations (ICLR 2025) Singapore, Singapore April 2025 HAL
  • 145 article​​​‌J.Julien Perolat,‌ J. Z.Joel Z‌​‌ Leibo, V.Vinicius​​ Zambaldi, C.Charles​​​‌ Beattie, K.Karl‌ Tuyls and T.Thore‌​‌ Graepel. A multi-agent​​ reinforcement learning model of​​​‌ common-pool resource appropriation.‌Advances in neural information‌​‌ processing systems302017​​back to textback​​​‌ to textback to‌ text
  • 146 articleR.‌​‌Richard Phillips. Curious​​ about Others: Relational and​​​‌ Empathetic Curiosity for Diverse‌ Societies.New Formations‌​‌8888March 2016​​, 123--142DOIback​​​‌ to text
  • 147 book J.J. Piaget. The Language and Thought of the Child. Oxford, England Harcourt, Brace 1926, xxiii, 246
  • 148 article P. R.Paul R. Pintrich, D. A.David A. F. Smith, T.Teresa Garcia and W. J.Wilbert J. McKeachie. A manual for the use of the Motivated Strategies for Learning Questionnaire (MSLQ). 1991
  • 149 inproceedingsE.​​​‌Erwan Plantec, G.‌Gautier Hamon, M.‌​‌Mayalen Etcheverry, P.-Y.​​Pierre-Yves Oudeyer, C.​​​‌Clément Moulin-Frier and B.-C.‌ W.Bert Wang-Chak Chan‌​‌. Flow-Lenia: Towards open-ended​​ evolution in cellular automata​​​‌ through mass conservation and‌ parameter localization.The‌​‌ 2023 Conference on Artificial​​ LifeTokyo, JapanMIT​​​‌ PressJuly 2023HAL‌DOIback to text‌​‌
  • 150 inproceedingsJ.Julien​​ Pourcel, C.Cédric​​​‌ Colas, G.Gaia‌ Molinaro, P.-Y.Pierre-Yves‌​‌ Oudeyer and L.Laetitia​​ Teodorescu. ACES: Generating​​​‌ diverse programming puzzles with‌ autotelic language models and‌​‌ semantic descriptors.NeurIPS​​ 2024 - The 38th​​​‌ Annual Conference on Neural‌ Information Processing SystemsVancouver,‌​‌ CanadaDecember 2024HAL​​​‌back to text
  • 151​ articleR.Rogelio Puente-Diaz​‌ and J.Judith Cavazos-Arroyo​​. Creative Metacognitive Feelings​​​‌ as a Source of​ Information for Creative Self-efficacy,​‌ Creativity Potential, Intrapersonal Idea​​ Selection, and Task Enjoyment​​​‌.The Journal of​ Creative Behavior543​‌2020, 499--507DOI​​back to text
  • 152​​​‌ articleJ. K.Justin​ K. Pugh, L.​‌ B.Lisa B. Soros​​ and K. O.Kenneth​​​‌ O. Stanley. Quality​ Diversity: A New Frontier​‌ for Evolutionary Computation.​​Frontiers in Robotics and​​​‌ AI32016,​ URL: https://www.frontiersin.org/articles/10.3389/frobt.2016.00040back to​‌ text
  • 153 incollection I.Isabeau Saint-Supery, C.Cécile Mazon, E.Eric Meyer and H.Hélène Sauzéon. Conception d'une application de soutien à la coéducation pour l'inclusion scolaire des élèves TSA. Éthiques inclusives en éducation. Recherches, contextes et pratiques (p. 145-160) Parentalité & Handicap Champs Social 2023, 260 HAL
  • 154 inproceedings I.Isabeau Saint-Supery, H.Hélène Sauzéon, C.Christelle Maillart, N.Nicolas Neu, E.Eric Meyer and C.Cécile Mazon. Cross-cultural evaluation of a web application to support communication and collaboration among stakeholders of the school inclusion of children with ASD. AAATE 2023 - The 17th International Conference of the Association for the Advancement of Assistive Technology in Europe AAATE Paris, France August 2023 HAL
  • 155​​ unpublishedI.Isabeau Saint-Supery​​​‌, H.Hélène Sauzéon​, E.Eric Meyer​‌ and C.Cécile Mazon​​. ToGather, an interactive​​​‌ website for the stakeholders​ of school inclusion of​‌ children with ASD: an​​ iterative design including user​​​‌ testing.2022,​ working paper or preprint​‌HALback to text​​
  • 156 articleN. S.​​​‌Nicola S. Schutte and​ J. M.John M.​‌ Malouff. Connections between​​ Curiosity, Flow and Creativity​​​‌.152January 2020​, 109555DOIback​‌ to textback to​​ text
  • 157 articleM.​​​‌Michael Shulman. Strange​ new universes: Proof assistants​‌ and synthetic foundations.​​Bulletin of the American​​​‌ Mathematical Society612​2024, 257--270back​‌ to text
  • 158 misc​​O.Olivier Sigaud,​​​‌ G.Gianluca Baldassarre,​ C.Cedric Colas,​‌ S.Stephane Doncieux,​​ R.Richard Duro,​​​‌ P.-Y.Pierre-Yves Oudeyer,​ N.Nicolas Perrin-Gilbert and​‌ V. G.Vieri Giuliano​​ Santucci. A Definition​​​‌ of Open-Ended Learning Problems​ for Goal-Conditioned Agents.​‌June 2024DOIback​​ to text
  • 159 unpublished​​​‌Y.Yadurshana Sivashankar,​ M.Myra Fernandes,​‌ P.-Y.Pierre-Yves Oudeyer and​​ H.Hélène Sauzéon.​​​‌ The Beneficial Role of​ Curiosity on Route memory​‌ in Children.January​​ 2024, working paper​​​‌ or preprintHALDOI​back to textback​‌ to text
  • 160 article​​J. M.J. Maynard​​​‌ Smith. Group selection​ and kin selection.​‌Nature2011964,​​ 1145-1147URL: https://doi.org/10.1038/2011145a0back​​​‌ to text
  • 161 article​K. O.Kenneth O.​‌ Stanley, J.Jeff​​ Clune, J.Joel​​​‌ Lehman and R.Risto​ Miikkulainen. Designing Neural​‌ Networks through Neuroevolution.​​Nature Machine Intelligence1​​1January 2019,​​​‌ 24--35DOIback to‌ text
  • 162 articleA.‌​‌Adam Tapal, E.​​Ela Oren, R.​​​‌Reuven Dar and B.‌Baruch Eitam. The‌​‌ sense of agency scale:​​ A measure of consciously​​​‌ perceived control over one's‌ mind, body, and the‌​‌ immediate environment.Frontiers​​ in psychology82017​​​‌, 1552back to‌ text
  • 163 incollectionA.‌​‌Alexandr Ten, P.-Y.​​Pierre-Yves Oudeyer and C.​​​‌Clément Moulin-Frier. Curiosity-Driven‌ Exploration: Diversity of Mechanisms‌​‌ and Functions.The​​ Drive for Knowledge: The​​​‌ Science of Human Information‌ Seeking2022DOIback‌​‌ to text
  • 164 article​​D. R.Daryl R​​​‌ Van Tongeren, V.‌Vincent Ng, L.‌​‌Louis Hickman and L.​​Louis Tay. Behavioral​​​‌ measures of humility: Part‌ 1. Theoretical and methodological‌​‌ review.The Journal​​ of Positive Psychology18​​​‌52023, 711--721‌back to text
  • 165‌​‌ articleJ. P.John​​ P Veillette, L.​​​‌Letitia Ho and H.‌ C.Howard C Nusbaum‌​‌. Metacognition bridges experiences​​ and beliefs in sense​​​‌ of agency.Consciousness‌ and Cognition1242024‌​‌, 103745back to​​ text
  • 166 articleS.​​​‌Stéphan Vincent-Lancrin and R.‌Reyer Van der Vlies‌​‌. Trustworthy artificial intelligence​​ (AI) in education: Promises​​​‌ and challenges.OECD‌ education working papers218‌​‌2020, 0_1--17back​​ to text
  • 167 article S.Sophie Von Stumm and P. L.Phillip L Ackerman. Investment and intellect: a review and meta-analysis. Psychological bulletin 139 4 2013, 841
  • 168 bookL.​​​‌ S.Lev Semenovich Vygotsky‌ and M.Michael Cole‌​‌. Mind in society:​​ Development of higher psychological​​​‌ processes.Harvard university‌ press1978back to‌​‌ text
  • 169 articleD.​​Dennis Whitcomb, H.​​​‌Heather Battaly, J.‌Jason Baehr and D.‌​‌Daniel Howard-Snyder. Intellectual​​ humility.Philosophy and​​​‌ Phenomenological Research943‌2017, 509--539back‌​‌ to text
  • 170 inproceedings​​Z.Ziang Xiao,​​​‌ X.Xingdi Yuan,‌ Q. V.Q. Vera‌​‌ Liao, R.Rania​​ Abdelghani and P.-Y.Pierre-Yves​​​‌ Oudeyer. Supporting Qualitative‌ Analysis with Large Language‌​‌ Models: Combining Codebook with​​ GPT-3 for Deductive Coding​​​‌.IUI 2023 -‌ 28th International Conference on‌​‌ Intelligent User InterfacesSydney,​​ AustraliaACMMarch 2023​​​‌, 75-78HALDOI‌back to text
  1. Spatially-localized pattern: a pattern existing within some (fuzzy) boundary, i.e. with a limited range in space, as opposed to patterns with unbounded growth.
  2. Moving pattern: a spatially-localized pattern that moves and propagates information in space.