FLOWERS

2025 Activity Report, Project-Team FLOWERS

RNSR: 200820949R‌

Research center Inria Centre‌ at the University of‌‌ Bordeaux
In partnership with: Ecole nationale supérieure des techniques avancées - Institut polytechnique de Paris, Université de Bordeaux
Team name: FLOW in Exploration, leaRning,‌ and diScovery

Creation of‌ the Project-Team: 2025 March‌‌ 01

Keywords

Computer‌‌ Science and Digital Science

A5.1.1. Engineering of interactive‌ systems
A5.1.2. Evaluation of‌ interactive systems
A5.1.4. Brain-computer‌‌ interfaces, physiological computing
A5.1.5. Body-based interfaces
A5.1.6. Tangible‌ interfaces
A5.1.7. Multimodal interfaces‌
A5.8. Natural language processing‌‌
A5.10.5. Robot interaction (with the environment, humans, other‌ robots)
A5.10.8. Cognitive robotics‌ and systems
A6.3.1. Inverse‌‌ problems
A9.2. Machine learning
A9.4. Natural language processing‌
A9.5. Robotics and AI‌
A9.7. AI algorithmics
A9.9.‌‌ Distributed AI, Multi-agent
A9.10. Hybrid approaches for AI‌
A9.11. Generative AI
A9.12.1.‌ Object recognition
A9.12.2. Activity‌‌ recognition
A9.13. Agentic AI‌
A9.14. Evaluation of AI models
A9.16. Societal impact‌ of AI

1 Team‌ members, visitors, external collaborators

Research Scientists

Pierre-Yves Oudeyer‌ [Team leader, INRIA, Senior Researcher‌, HDR]
Clément Moulin-Frier [INRIA,‌ Researcher, until Apr 2025]
Hélène Sauzéon‌ [INRIA, Professor Detachement, HDR]‌

Faculty Member

Cécile Mazon [UNIV BORDEAUX,‌ Associate Professor]

Post-Doctoral Fellows

Olivier Clerc [‌INRIA, Post-Doctoral Fellow]
Cedric Colas [‌INRIA, Post-Doctoral Fellow]
Sina Khajehabdollahi [‌INRIA, Post-Doctoral Fellow, from Apr 2025‌ until Jul 2025]
Marion Pech [INRIA‌, Post-Doctoral Fellow, until Mar 2025]‌
Leslie Tricoche [UNIV BORDEAUX, from Oct‌ 2025]

PhD Students

Timothe Boulet [INRIA‌]
Thomas Carta [INRIA, from Feb‌ 2025 until Nov 2025]
Thomas Carta [‌UNIV BORDEAUX, until Jan 2025]
Marko‌ Cvjetko [UNIV BORDEAUX]
Marie-Sarah Desvaux [‌UNIV BORDEAUX]
Juliette Deyts [UNIV BORDEAUX‌]
Loris Gaven [INRIA]
Gautier Hamon‌ [INRIA, until Apr 2025]
Sina‌ Khajehabdollahi [INRIA, until Mar 2025]‌
Grgur Kovac [INRIA, from Aug 2025‌ until Nov 2025]
Grgur Kovac [INRIA‌, until Jul 2025]
Jeremy Perez [‌UNIV BORDEAUX]
Matisse Poupard [UNIV BORDEAUX‌, from Sep 2025]
Matisse Poupard [‌INRIA, from Apr 2025 until Jul 2025‌]
Matisse Poupard [CATIE, CIFRE,‌ until Mar 2025]
Julien Pourcel [INRIA‌]
Clément Romac [HUGGING FACE SAS,‌ CIFRE]
Julien Rosenberger [INRIA, from‌ Oct 2025]
Julien Rosenberger [UCL,‌ from May 2025 until Sep 2025]
Isabeau‌ Saint-Supery [UNIV BORDEAUX, until Apr 2025‌]
Paul Tabbara [INRIA, from Dec‌ 2025]
Nicolas Yax [ENS Paris]‌

Technical Staff

Camille Anthounet [UNIV BORDEAUX,‌ Engineer, from Nov 2025]
Zacharie Bugaud‌ [INRIA, Engineer, until Jan 2025‌]
Ludovic Matar [INRIA, Engineer,‌ from Feb 2025]

Interns and Apprentices

Hana‌ Al Mrayati [UNIV BORDEAUX, Intern,‌ from Mar 2025 until Jun 2025]
Loic‌ Blouin [UNIV BORDEAUX, Intern, from‌ May 2025 until Aug 2025]
Sophie Lepennetier‌ [UNIV PADOVA, Intern, from Sep‌ 2025 until Nov 2025]
Eliott Poisson [‌UNIV BORDEAUX, Intern, from May 2025‌ until Jul 2025]
Paul Tabbara [INRIA‌, Intern, from May 2025 until Nov‌ 2025]
Kan Yao [INRIA, Intern, from Mar 2025‌ until Jul 2025]‌

Administrative Assistants

Fabienne Cuyollaa‌‌ [INRIA]
Nathalie Robin [INRIA]‌

External Collaborators

Eleni Nisioti‌ [UNIV COPENHAGUE]‌‌
Didier Roy [EPFL]

2 Overall objectives‌

Abstract: This project-team aims‌ to study the fundamental‌‌ mechanisms that can enable open-ended learning and development‌ in humans and machines,‌ i.e. how individuals, or‌‌ groups of individuals, can continuously discover and learn‌ novel skills of increasing‌ complexity. We also aim‌‌ to leverage this fundamental understanding for human-centered real-world‌ applications in education and‌ in assisted scientific discovery.‌‌

In particular, we focus on studying mechanisms enabling Autotelic and Aligned Intelligence in humans and machines. A first key ingredient of open-ended learning is curiosity-driven autotelic learning, which is the ability of individuals to set and pursue their own goals (from the Greek ‘telos’/goal, and ‘auto’/self), a form of intrinsic motivation pushing organisms to continuously seek new knowledge and skills, self-organizing their own learning curriculum, using meta-cognition and leading to creative exploration.

To enable abstraction, collective intelligence, and alignment of autotelic systems on human cultures (values, preferences), we also aim to study how language and social interaction, both as a communication system and as a cognitive tool, can guide autotelic exploration. Symmetrically, using multi-scale models, we aim to study how curiosity-driven autotelic exploration could self-organize at the group level. We also aim to study the ecosystemic and evolutionary origins of autotelic systems.

Context: Humans continuously explore, learn and discover novel skills and knowledge through open-ended processes. In fact, humans, and some other life forms, are equipped with intrinsic motivation systems (“curiosity”) pushing them to spontaneously explore and actively seek new knowledge [102], setting and pursuing their own goals (they are autotelic [90]), ranging from the most concrete (e.g. stack cubes) to the most abstract (e.g. invent new maths problems). This happens at the level of individuals, starting with children who eagerly and spontaneously explore their bodies and their environment as they develop, up to adults of all ages and all backgrounds. This autotelic exploration process also benefits from social and collective dynamics, leveraging past discoveries and being guided to align with the culture (values, preferences, ethics) of a given group [111]. The iteration of this collective intelligence process, accumulating and transmitting discoveries over generations, gives rise to open-ended cultural evolution, and to autotelic exploration at the level of collectives. Understanding the mechanisms that enable the origins and functionalities of autotelic learning in interaction with social groups and culture, giving rise to open-endedness, remains a major open challenge for science.

We study‌‌ these mechanisms from four complementary scientific perspectives, structured‌ into three main objectives:‌

Objective 1: Improve understanding‌‌ of human autotelic and aligned intelligence.
A first objective of this project is to advance our fundamental understanding of the origins, mechanisms and functionalities of autotelic learning and exploration, and how this interacts and is aligned with collective dynamics. This will involve a combination of computational models for developing new theories and hypotheses, as well as the design of new human experimental paradigms analysed with these computational models. Particular scientific questions we target include studying links between autotelic learning, metacognition (one’s own ability to know and control one’s own knowledge and cognitive functions) and creativity in humans, e.g. how do these skills develop in childhood and which internal and external factors influence them? How do they link to language and processes of social interaction?
Objective 2: Building curiosity-driven autotelic‌ and aligned AI systems.
Our second major objective will be to build and study curiosity-driven autotelic artificial agents that learn by interacting with external environments and within socio-cultural collectives. To do this, we leverage and extend state-of-the-art deep reinforcement learning algorithms and transform them into autotelic RL systems. Crucially, we will also study how algorithms for autotelic learning can be made better aligned (teachable, driveable), more robust and more creative using language both as a tool for social interaction, and as a cognitive tool to make abstractions and leverage knowledge acquired by others. To achieve this, we will use pre-trained generative AI models as a cognitive tool bootstrap, enabling us to address the poor sample efficiency (i.e. the need for large amounts of environment interaction) and poor generalisation of classical (autotelic) RL algorithms. In addition, autotelic architectures will enable us to achieve incremental grounding and alignment of generative AI models with external physical and social dynamics. To further improve abstraction and generalisation, we aim to establish links between autotelic learning and program synthesis techniques, whereby 1) autotelic generative models will self-improve their coding abilities by setting learnable coding problems of increasing complexity; 2) we will use code for autotelic procedural self-generation of environments, tasks and policies. Because of the expressivity and abstractness of code, we believe working in code spaces will open new perspectives on open-endedness.
We also aim to study how autotelic algorithms, using the learning progress theory, will enable frugal adaptation of generative models thanks to automatic curriculum learning. Finally, we aim to study how groups of autotelic language-augmented agents can work together and give rise to higher-order forms of autotelic collective intelligence, using multi-agent autotelic reinforcement learning techniques and measurement tools from the field of cultural evolution to track collective innovations.
Objective 3: Applications in‌ education and assisted scientific discovery.

Objective 3.1: Train curiosity-driven autotelic learning in humans across the lifespan. We aim to develop educational technologies and interventions that help children and adults across the lifespan to learn in ways that are more motivating and more efficient, for example by stimulating curiosity and meta-cognition and using both models of curiosity and generative AI. Our approach combines 1) an interdisciplinary perspective drawing on cognitive science, educational sciences and machine learning; 2) a user-centric approach with real-world field studies, in particular in real classrooms in the French educational system, or with adult or ageing populations; 3) consideration of both neurotypical and neurodiverse populations. Beyond directly training curiosity and meta-cognition, and given their transversal role, we also aim to study how personalised training techniques (e.g. using adaptive curricula with algorithms that maximize learning progress measures) can enable more efficient and more motivating training of disciplinary skills (e.g. maths, languages) and other cognitive dimensions (attention, working memory, etc.). Beyond showing effective impact in RCTs (randomized controlled trials), our objective is that our techniques and interventions be used at large scale in the real world. To achieve this, we will combine focused collaboration with educational institutions and the edTech industry with user-centered design of open-licence pedagogical material that aims to be directly and easily reusable by teachers. We also aim to support public action in this domain, through interaction with and advice to national and European public institutions.

Objective 3.2: Assisted‌‌ scientific discovery with autotelic exploration algorithms. Some of‌ the greatest scientific challenges‌ include the study and‌‌ design of novel materials, molecules or networks with‌ complex dynamics, where the‌ space of possible self-organized‌‌ behaviour is often initially mostly unknown, and the‌ space of parameters very‌ large, making exploration and‌‌ discoveries very costly and difficult for physicists, chemists‌ or biologists. We aim‌ to study and show‌‌ how autotelic aligned exploration algorithms can be used‌ as powerful discovery assistants‌ in these contexts. We‌‌ believe they have specific capabilities making them highly‌ relevant for this application:‌ they are made to‌‌ explore and discover in a sample efficient manner‌ a high diversity of‌ behaviours in complex systems‌‌ (autotelic), while being driveable so that scientists can‌ drive them in directions‌ of interest (aligned). To‌‌ maximize diversity, we aim to develop methods learning‌ a diversity of goal‌ representations (autotelic and quality-diversity‌‌ exploration using meta-diversity search). To enable abstractness and‌ high-level guidance from human‌ scientists (e.g. to provide‌‌ feedback on measures of interestingness), we aim to‌ leverage language and multimodal‌ generative models. To make‌‌ fast progress in this direction, we first aim‌ to use artificial life‌ environments, such as continuous‌‌ cellular automata, as an experimental domain, aiming to‌ use autotelic exploration algorithms‌ to help discover the‌‌ origins of autopoietic systems (and even autotelic systems‌ self-organised from the ground‌ up) as well as‌‌ study how evolutionary processes themselves could self-organise. For‌ further real world impact,‌ we aim to develop‌‌ collaborations with physics/chemistry/biology academic labs, as well as‌ various industrial companies working‌ on the design of‌‌ new physical or biomolecular systems.

Beyond core scientific‌ questions across disciplines, this‌ project addresses two key‌‌ societal challenges: 1) How can we build AI‌ systems that serve humans‌ and human societies in‌‌ their diversity, helping their curiosity and cultures to‌ bloom? 2) How can‌ we provide educational opportunities‌‌ for all children, and adults across the lifespan,‌ in a world with‌ many challenges, to become‌‌ intrinsically motivated learners, critical‌ thinkers, autotelic explorers?

3 Research program

3.1 Background

Around the mid-20th century, psychologists started studying the hypothesis that humans, and some other animals, are endowed with mechanisms of intrinsic motivation, also called “curiosity” in everyday language, leading them to spontaneously explore novel activities for their own sake. Such curiosity-driven exploration processes were hypothesised to play important roles in learning, both in cognitive and educational sciences. However, until the start of the 21st century, research on the underlying mechanisms remained very scarce. This also explains why such mechanisms were overlooked in machine learning and robotics.

In the first years of the 2000s, several labs in the world began studying these mechanisms by proposing various computational theories and hypotheses. Among these groups, Pierre-Yves Oudeyer and his colleagues, first at Sony CSL Paris and then at Inria Bordeaux, proposed several theoretical ideas and techniques to build some of the foundations of a new emerging field studying curiosity at the crossroads of AI, machine learning, cognitive sciences, psychology and neuroscience. In particular, one major contribution has been the development of the Learning Progress Hypothesis (LPH), proposing that human brains are intrinsically motivated to explore activities with high learning progress, leveraging meta-cognitive processes and leading to the self-organisation of efficient learning curricula [113, 143]. A second major contribution has been the development of a theoretical framework to account for autotelic learning, a form of learning where individuals learn to represent, sample and pursue their own goals [141, 89].

Based on several proof-of-concept studies of these computational theories [142], the Flowers team was founded in 2011 by Pierre-Yves Oudeyer (joining Inria) and David Filliat (ENSTA ParisTech), with a research program aiming at scaling up these theories along two main dimensions: 1) showing how the LPH could account for key properties of sensorimotor development in human infants; 2) showing how it was possible to develop curiosity-driven autotelic learning algorithms that would enable high-dimensional real-world robots to acquire complex sensorimotor skills in a human-like way. Several major results were achieved in the 2011-2016 period along these lines.

In the 2017-25 period, we made a strategic scientific and applied pivot: while keeping curiosity-driven autotelic learning in humans and machines as our core research activity, we 1) started projects testing our theoretical predictions in human psychology experiments, and articulated links between curiosity and metacognition; 2) integrated modern Deep RL techniques with autotelic algorithms, and shifted from the developmental robotics community to the machine learning community as the target of our contributions to the design of more open, flexible and robust learning machines; 3) shifted from sensorimotor autotelic learning to language-based abstract yet grounded autotelic learning, and built synergetic bridges with recent advances in generative AI; 4) scaled up our research in educational technology by taking a translational approach and developing industrial collaborations, with actions to support public policies; 5) started the new application domain of automated scientific discovery. These constitute the pillars of our current research program, structured as follows:

3.2 Understanding Autotelic‌ Learning in Humans

3.2.1 Curiosity, meta-cognition and agency across the lifespan

The Learning Progress hypothesis, as well as other theories of curiosity-driven learning, all assume meta-cognitive competencies (e.g. ability to evaluate one’s own uncertainty, knowledge gaps or learning progress) as well as forms of agency. However, experimental studies of human curiosity have so far mostly overlooked the influence of meta-cognition and agency, let alone measured them together with various dimensions of curiosity [132]. Another major limitation of current models and experimental studies of human curiosity is that they have not studied how curiosity develops across the lifespan. In fact, the scientific community knows very little about how various forms of curiosity change across childhood, adolescence, and up to ageing populations.

We will aim to address some‌ of these limitations by‌ collaborating with various international‌‌ groups, including M. Gruber (Univ. Cardiff) and Y.‌ Fandakova (Max Planck Institute‌ for Human Development) with‌‌ whom we just submitted a major ANR/DFG/ESRC project‌ on this topic. In‌ particular, we propose an‌‌ interdisciplinary approach to make new breakthroughs in understanding‌ how metacognition contributes to‌ the development of curiosity-based‌‌ learning, and set the stage for educational interventions‌ that could help children‌ develop their curiosity. Given‌‌ the links between curiosity and metacognition, and the‌ fact that metacognition continues‌ to improve across childhood‌‌ and adolescence, we formulate the hypothesis that the‌ efficiency of curiosity-based learning,‌ i.e. the ability to‌‌ inquire about and prioritise learning of information associated‌ with high curiosity, improves‌ across child and adolescent‌‌ development.

3.2.2 Experimental paradigms for studying autotelic learning in humans

Another limitation of existing experimental studies of human curiosity, including the ones mentioned above, is that most of them have so far focused on how humans choose to explore one of several pre-existing stimuli or learning activities [163]. However, as shown in our theoretical and AI work described above, and as argued in complementary arguments from Laura Schulz and Junyi Chu [84], exploration of self-generated goals, including arbitrary goals or games, may be key in accounting for human development, and further in accounting for human innovation and cultural evolution [85]. Only very few exploratory experimental protocols have started to be investigated in the literature [93]: we aim to further develop this form of experimental protocol, informed by predictions made by our theoretical models, in collaboration with researchers such as G. Molinaro and A. Collins (Univ. Berkeley, both on a six-month research visit at Inria Flowers and Mnemosyne in 2024), J. Chu (Harvard Univ, US), L. Rat-Fischer (Univ. Nanterre) and A. Ruggeri (TU Munich).

3.2.3 Links between curiosity and creativity for autotelic learning in children

The ability to imagine abstract and new goals is essential for creative discovery and open-ended learning throughout life. Children achieve this by using the compositionality of language as a tool to imagine situations they have never experienced before, targeting them as goals during play [147, 168]. Echoing the IMAGINE architecture [88], an intrinsically motivated deep reinforcement learning architecture modelling compositional imagination (the creation of new linguistic associations for new goals), we aim to investigate the links between curiosity and creativity in humans, focusing on the metacognitive role of language in guiding autonomous learning behaviours. Although the nature of the links between curiosity and creativity is currently not well defined, a recent meta-analysis shows that higher levels of curiosity are significantly associated with higher levels of creativity [156]. Divergent thinking mechanisms are said to be the cognitive resource common to both skills [104, 117], and some authors even identify curiosity as a facilitator, a trigger for creativity [106]: high curiosity states induce better ideation and greater idea associations conducive to problem solving. Also, both creativity and curiosity are governed by metacognitive processes of self-regulation of learning [151], enabling the identification of information gaps, problem situations or uncertainties, the generation of ideas and paths to resolution, and the monitoring and evaluation of the value of ideas as creative output or as added knowledge [109, 112]. We aim to investigate developmental differences in curiosity-based learning and problem-solving tasks while studying their relationships and their dependence on intrapersonal factors (especially metacognitive skills and personality dimensions such as epistemic curiosity, creativity or intellectual humility traits) in late childhood (from 6 to 11 years old).
To run the experiments needed to address these topics, we will leverage an educational Léa-Ifé collaboration network established with 10 primary schools around Bordeaux. As a whole, in this part of the project, we aim to demonstrate that curiosity as a process to seek knowledge in the face of self-generated goals of knowledge gaps, or as a metacognitive feeling [103], leads to better initiation of the creative process.

3.2.4 Curiosity to learn about others and social interaction

Social curiosity is defined as the desire to acquire knowledge about others in society, encompassing an interest in their emotions, thoughts, and behaviours. This type of curiosity can be divided into two forms [146]: 1) empathetic curiosity (the desire to acquire knowledge about others), and 2) relational curiosity (the desire to interact with others). Like other types of curiosity, social curiosity motivates people to engage in exploratory behaviours directed toward the social world, seeking novel information about how people think, behave, and feel. Reference [110] proposes three functions of social curiosity: 1) acquiring information useful for learning and development, 2) establishing interpersonal relationships and increasing a sense of social belonging, and 3) controlling the social world by making it more predictable and manageable. Thus, social curiosity enhances social functioning and has been linked to improved social behaviour adaptation, the ability to establish and maintain social relationships, and better social judgement abilities [110]. Recently, another distinction has been proposed in social curiosity [114]: 1) overt social curiosity, an explicit interest in understanding other people, which motivates direct communication with others; 2) covert social curiosity, a “hidden” interest that motivates more indirect and furtive behaviours to understand others, such as discreetly observing people, listening to others’ conversations, and reading tabloids and human-interest stories. Covert curiosity is often associated with negative outcomes like gossiping or spying [114], but it can also drive the understanding of the social world through observation and ultimately motivate interactions with others [105]. On the other hand, overt social curiosity has been linked with open-mindedness, extraversion, and sociability [114], and was associated with better job performance [105].

3.2.5 Autotelic game invention and cultural transmission

Leveraging the theoretical ideas on the interaction between autotelic learning and cultural evolution as described in the previous section, we also aim to study these interactions experimentally in chains of humans incentivized to transmit to each other games or artefacts of their own intrinsically motivated invention (either physical or video games, e.g. using experimental setups like [93]). We aim to design new experimental protocols and run them both across various age ranges in European populations and in populations from non-Western cultures, leveraging associated collaborations with Maxime Derex at IAST, Toulouse, Sheina Lew-Levy at Durham University, and Sarah Pope-Caldwell at Georgia State University.

3.3‌‌ Building Curiosity-Driven Autotelic and Aligned AI

3.3.1 Language-Augmented‌ Autotelic Agents with Foundational‌ Models

We will develop architectures where LLMs function as cognitive tools for autotelic RL agents across five dimensions:
(1) LLM-based agents with environmental alignment: extending our work on grounding LLMs through online RL [79], where LLMs generate goals, evaluate achievement, relabel experiences, and provide natural language interfaces. We will extend goal generation to creative, time-extended, and learning-oriented goals including self-generated causal questions and hypotheses [101], while correcting hallucinations through incremental LLM updates via environment interaction.
(2) Multimodal grounding and social environments: extending our SocialAI School framework [118] to incorporate theory of mind, joint intentionality, and social norms, investigating whether social curiosity can drive efficient acquisition of complex social skills.
(3) Real-time human-in-the-loop learning: enabling agents to interpret instructions, respond to feedback, explain exploration processes, and adapt to user preferences for education and discovery applications.
(4) Learning to use cognitive tools: agents will learn when to invoke APIs, generate/execute code, query knowledge bases, or request human assistance, including chain-of-thought and self-reflection mechanisms.
(5) Metacognitive curriculum learning with coordinated interestingness measures: leveraging our MAGELLAN architecture [47], which enables LLM agents to learn metacognitive predictions of their own competence and learning progress across large language-defined goal spaces. By capturing semantic relationships between goals, MAGELLAN enables sample-efficient progress estimation and dynamic adaptation to evolving goal spaces. We will extend this to develop meta-diversity search algorithms [97] leveraging LLMs to generate novel conceptual dimensions, enabling exploration across objective (learning progress, novelty) and subjective, culturally-contextualized criteria [158].
Long-term objectives will include studying how agents pursue goals across extended timescales using cultural artifacts for long-term planning.

3.3.2 Program Synthesis for Abstract and Verifiable Intelligence‌

We will explore autotelic learning in formal language spaces where goals and policies are represented as programs. Our ACES architecture demonstrates autotelic LLMs self-improving coding skills by iteratively generating diverse problems and solutions using code interpreters [150]. This addresses three limitations: environments are not truly open-ended (code is), LLMs lack grounding (interpreters provide it), and code LLMs struggle beyond training distributions (autotelic learning enables self-improvement). We will train small models with advanced coding capabilities and extend to mathematical problem invention and theorem proving. Within Inria LLM4Code, we will develop autotelic LLMs interacting with proof assistants like Coq and Lean [157]. This approach provides compact interpretable representations, formal verification, and compositional generalization. Progress on benchmarks like ARC [83] will validate these methods.
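As an illustration of how a code interpreter can ground self-generated problems, the sketch below checks candidate problem/solution pairs by executing them. It is a minimal example under stated assumptions, not the ACES implementation: the puzzle format (a checker `f(answer)` paired with a solver `g()`), the function names `verify_pair` and `filter_verified`, and the hard-coded candidate strings standing in for LLM output are all hypothetical.

```python
def verify_pair(problem_src: str, solution_src: str) -> bool:
    """Return True only if running the solution satisfies the problem's checker."""
    namespace = {}
    try:
        exec(problem_src, namespace)    # expected to define f(answer) -> bool
        exec(solution_src, namespace)   # expected to define g() -> answer
        return namespace["f"](namespace["g"]()) is True
    except Exception:
        # Syntax errors, missing names and runtime errors all count as failures.
        return False


def filter_verified(candidates):
    """Keep only the (problem, solution) pairs confirmed by the interpreter."""
    return [pair for pair in candidates if verify_pair(*pair)]


# Hypothetical candidates, standing in for text produced by a generative model.
problem = "def f(x):\n    return x * x == 49 and x > 0"
candidates = [
    (problem, "def g():\n    return 7"),     # correct solution
    (problem, "def g():\n    return -7"),    # wrong solution
    ("def f(x) return", "def g():\n    return 1"),  # malformed problem
]
verified = filter_verified(candidates)
```

Only verified pairs would be kept as training or archive material in such a loop. Note that `exec` runs arbitrary code: a real system would need sandboxing and timeouts, which this sketch omits.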

3.3.3 Curiosity in Cultural Evolution,‌ Collective Intelligence, and AI Science Teams

Understanding how groups coordinate curiosity-driven exploration and self-organize collective intelligence represents both a fundamental scientific question and a path toward transformative applications. This research direction addresses several interconnected challenges. First, how individual curiosity combines with social transmission and collective innovation [111, 95] is still poorly understood and modeled. Second, as generative AI increasingly participates in human cultural production, understanding cultural evolution in hybrid human-AI groups becomes essential for anticipating societal impacts. Third, many scientific and creative challenges require coordinated teams leveraging complementary expertise and perspectives, motivating our vision of autotelic AI science teams collaborating with human researchers.

Near-term work will investigate coordination when self-generated goals conflict in shared environments. Some goals require collaboration with agents possessing complementary skills: agents must negotiate joint goals serving individual curiosity. We will study how network topology influences innovation dynamics [137], how agents develop communication protocols [75], and whether groups display curiosity at collective levels, pursuing structured exploration maximizing diversity of learned goals through simple individual-level mechanisms. The increasing role of generative AI in cultural production necessitates understanding these dynamics in hybrid settings, the emerging field of "Machine Culture" [77]. We will systematically investigate how interaction protocols, social structures, and model capabilities shape cultural dynamics in LLM populations and mixed human-AI groups (collaboration with Derex, IAST).

3.4 Applications in Education and Scientific Discovery‌

3.4.1 Training Curiosity and Metacognition Across the Lifespan‌

Addressing 21st century educational challenges—inclusive education, cross-disciplinary skills‌ (attention, curiosity, learning to learn), and digital transformation—requires‌ technologies fostering curiosity and metacognition. Our ZPDES algorithm‌ personalizes curricula by maximizing learning progress 87.‌ RCTs with >1,000 children demonstrated enhanced learning efficiency‌ and motivation versus expert-designed curricula. We will extend‌ ZPDES to new domains (attention training, language learning)‌ and populations: aging adults, neurodiverse learners, professional contexts‌ (sports, gaming). To address ZPDES's requirement for expert-formatted‌ content, we will leverage generative AI to automate‌ exercise generation from textbooks, training smaller LLMs for lightweight systems avoiding foreign-hosted‌ dependencies. We will scale‌ up metacognitive skills and‌‌ curious question training 69, studying transfer to‌ creativity and including pioneering‌ work using GPT-3 conversational‌‌ agents in real classrooms 70.
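To make the learning-progress principle behind ZPDES more concrete, the following Python sketch may help. It is an illustration under simplifying assumptions, not the ZPDES algorithm itself: the class name `LearningProgressSelector`, the sliding `window`, the `epsilon` exploration rate, and the way learning progress is estimated (absolute difference between recent and older success rates) are all hypothetical choices made for this example.

```python
import random
from collections import deque


class LearningProgressSelector:
    """Toy activity selector (hypothetical, not ZPDES): favours activities
    whose success rate has changed the most over a recent window."""

    def __init__(self, activities, window=8, epsilon=0.2, seed=0):
        self.epsilon = epsilon          # probability of picking a random activity
        self.rng = random.Random(seed)  # local RNG for reproducibility
        self.history = {a: deque(maxlen=window) for a in activities}

    def update(self, activity, success):
        """Record the outcome (True/False) of one exercise."""
        self.history[activity].append(1.0 if success else 0.0)

    def learning_progress(self, activity):
        """Absolute difference between recent and older success rates."""
        outcomes = list(self.history[activity])
        if len(outcomes) < 4:
            return 0.0  # not enough data to estimate progress
        half = len(outcomes) // 2
        older, recent = outcomes[:half], outcomes[half:]
        return abs(sum(recent) / len(recent) - sum(older) / len(older))

    def choose(self):
        """Sample an activity proportionally to its learning progress."""
        activities = list(self.history)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(activities)
        scores = [self.learning_progress(a) for a in activities]
        if sum(scores) == 0:
            return self.rng.choice(activities)  # no signal yet: explore
        return self.rng.choices(activities, weights=scores, k=1)[0]
```

In this sketch, an activity that is already mastered (constant success) or currently out of reach (constant failure) has zero estimated learning progress, so the selector concentrates on activities where the learner's performance is changing. The random exploration term keeps re-testing the other activities so that their estimates do not go stale.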

A distinctive‌ objective involves metacognitive empowerment:‌ developing interventions helping children‌‌ understand their own learning progress to self-generate curricula‌ independently—addressing limited technology contexts‌ while fostering autonomy. Long-term‌‌ work will include teacher training programs embedding curiosity-fostering‌ practices (Peterson, 2020) and‌ unplugged activity versions. Partnerships‌‌ with educational institutions (Académie de Bordeaux), industry (EvidenceB,‌ Ubisoft), and NGOs (France‌ IOI) will enable deployment‌‌ and policy influence.

3.4.2 Assisted Scientific Discovery with‌ Autotelic Exploration

Scientists studying‌ complex systems face challenges‌‌ mapping behavioral spaces when lacking models and representations‌ with scarce experimental resources.‌ Autotelic algorithms offer sample-efficient‌‌ diverse behavior discovery while remaining steerable through natural‌ language. Proof-of-concept work efficiently‌ maps spaces in cellular‌‌ automata 98 and gene regulatory networks (99‌, with Levin, Harvard),‌ independently adopted by physics‌‌ researchers (U. Washington). We will maximize diversity through‌ meta-diversity search and leverage‌ language/multimodal models for abstract‌‌ guidance.
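As a rough illustration of how goal-directed exploration can map a behavioral space, the sketch below implements a very small population-based goal exploration loop in Python. It is a simplified sketch under stated assumptions, not the team's software: `toy_system` is a hypothetical stand-in for a real experiment or simulator, goals are sampled uniformly in a hand-defined 2-D behavior space, and `coverage` is an illustrative diversity measure (number of occupied grid cells).

```python
import math
import random


def toy_system(params):
    """Hypothetical black-box system: 2 parameters -> 2-D behaviour descriptor."""
    x, y = params
    return (math.tanh(3 * x), x * y)


def imgep(system, n_bootstrap=10, n_iterations=200, sigma=0.1, seed=0):
    """Minimal goal exploration loop; returns a list of (params, behaviour)."""
    rng = random.Random(seed)
    history = []

    def run(params):
        history.append((params, system(params)))

    # 1) Bootstrap with random parameters.
    for _ in range(n_bootstrap):
        run((rng.uniform(-1, 1), rng.uniform(-1, 1)))

    # 2) Repeatedly sample a goal, reuse the closest known experiment, mutate it.
    for _ in range(n_iterations):
        goal = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        params, _ = min(history, key=lambda pb: math.dist(pb[1], goal))
        mutated = tuple(
            min(1.0, max(-1.0, p + rng.gauss(0, sigma))) for p in params
        )
        run(mutated)
    return history


def coverage(history, bins=10):
    """Number of distinct grid cells reached in the behaviour space [-1, 1]^2."""
    cells = set()
    for _, behaviour in history:
        cells.add(tuple(min(bins - 1, int((b + 1) / 2 * bins)) for b in behaviour))
    return len(cells)
```

A call such as `coverage(imgep(toy_system))` gives one number summarizing how much of the behavior space was reached for a given experimental budget. In real settings the behavior descriptors would typically be learned or supplied by the scientist rather than hand-coded as here.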

Near-term work will leverage artificial life as‌ testbeds, studying origins of‌ autopoietic systems and self-organizing‌‌ evolutionary processes in cellular automata 34, 149‌ (2023 Best Paper ALife).‌ Long-term objectives will transition‌‌ to real-world systems through collaborations: Levin (synthetic biology),‌ Murugan (soft condensed matter,‌ U. Chicago), Aymonier (chemistry,‌‌ ICMCB Bordeaux), with applications to power networks, neuromuscular‌ models, and artistic domains.‌ We will explore autotelic‌‌ algorithms for mathematical problem/proof exploration within LLM4Code. Success‌ will establish autotelic exploration‌ as methodology for materials‌‌ science, systems biology, and mathematical discovery, with industrial‌ translation potential (e.g. Solvay/Syensqo).‌

4 Application domains

Neuroscience, Developmental Psychology and Cognitive Sciences. Being primarily experts in curiosity and its links with open-ended learning, our aim has been to build and grow internationally an integrated science of curiosity. By leveraging and integrating concepts and techniques commonly used in AI, psychology and education, we aim to reinforce our existing contributions in this direction, ranging from building theories and experiments that add to the corpus of scientific knowledge on curiosity, to leading the organisation of international events dedicated to this integrated science. As an example, we co-lead the organisation of a Gordon Research Conference series entitled “The New Science of Curiosity”. Complementarily, we take part in a European ORA project on the cognitive science study of curiosity and metacognition (with M. Gruber and Y. Fandakova). Other examples include the study of the role of intrinsic motivation in the adoption of technologies fostering autonomy in ageing populations, with a view to assessing its value as a protective factor against cognitive aging. This includes: the CuriousTECH associate team with M. Fernandes from the Cognitive Neuroscience Lab of the University of Waterloo, the InnovCare project (with S. Lechevalier) within the PPR Autonomie-France 2030 (and with Fondation France-Japan of EHESS), and the VBHI - France 2030 project (IHU, S. Debette), with F. Lotte and F. Wagner from Inria.

Development and open-endedness in generative AI. There have been revolutionary advances in AI in the last few years, especially around generative systems such as multi-modal foundation models. However, as described above, these systems are still strongly limited in several key dimensions: they are not proactive agents interacting with external environments, and they lack grounding, metacognition and curiosity. One of our goals is to make fundamental scientific and technological contributions to adapt and extend current generative AI systems by integrating forms of curiosity, metacognition and grounding, for which we recently produced proofs of concept, and conversely to take advantage of the powerful capabilities of foundation models to build new kinds of curiosity-driven learning systems capable of creative and abstract exploration, learning and discovery.

Machine culture. Beyond technological advances, generative AI is also starting to have a major influence on human cultural evolution. Generative AI systems are now massively used as intermediation platforms between individuals and existing corpora of knowledge and culture, conveying multiple forms of biased cultural perspectives that they can amplify. This phenomenon has recently become massive as social networks are pervaded by bots powered by generative AI systems, playing the roles of humans with particular opinions or backgrounds, and increasingly interacting directly with each other, beyond interaction with humans. While generative AI offers unique potential in enabling humans to make discoveries and to know and understand each other's cultures, these properties have also been leveraged by diverse organisations to influence, in unfair and dangerous ways, what populations think and do. Even though this poses major societal issues, this evolution has been so rapid that basic scientific understanding of cultural evolution in hybrid human-machine groups is strongly lacking. Thus, we believe the parts of our project which aim at modelling cultural evolution in groups of generative AI agents, or hybrid groups, as well as its links with properties of curiosity-driven learning at the level of individuals, have the potential to make very useful contributions to these high-stakes issues.

Translational educational technologies that foster curiosity-driven and critical minds. We live in a world that is evolving fast: global factors such as climate change and geopolitical processes make the context children live in more fragile. New technologies, such as generative AI, are profoundly impacting economic dynamics, democracy and cultural evolution. Yet, in most educational contexts, including in Europe, what is taught in classrooms is very similar to what was taught 50 years ago. And even for so-called “fundamental knowledge”, studies such as PISA show a worrying decrease of skills and motivation in children. As mentioned in a recent report from the OECD 166, we believe it is essential to train children to become autonomous lifelong learners, by fostering and training their curiosity and their critical minds, their ability to search for new information by themselves and to question the validity of the information they collect, as well as to question their own knowledge and opinions. Thus, our research program, which aims to train curiosity and the associated metacognitive skills that underlie the critical mindset, has the potential to contribute in this perspective. We aim to leverage our fundamental research in translational projects where we will work directly with major educational stakeholders from the start (e.g. students, teachers, parents, educational institutions such as individual schools and Académie de Bordeaux, edTech companies such as EvidenceB, and government, in particular the ministry of education) to build educational interventions that are efficient, adapted to the needs and constraints of real-world educational contexts, and designed for large-scale adoption and use (a first step in this direction is the AdaptivMaths and MIA Seconde educational software, now deployed in all French primary schools and supported by the French ministry of education).

Generative AI and education: scientific understanding of stakes and opportunities in support of public policies. One particular topic we focus on is the study of the opportunities and challenges of generative AI in education. While very recent (ChatGPT was introduced only 1.5 years ago), generative AI has already had a major impact on the educational world in the last few months. More than 50% of children in the 12-18 age range have already used generative AI systems for their homework, and this tendency is quickly rising, including in Europe. Associated challenges include uses of generative AI by students that may harm their abilities to learn and understand, and their motivation to put in effort and be actively engaged in these processes. It also profoundly impacts the way teachers design homework, for which students are already massively using these tools. On the other hand, generative AI opens unique opportunities for rich personalised tutoring, ranging from obtaining tailored explanations and feedback to discussing and training in foreign languages. Such opportunities may be particularly magnified for countries where the educational system is underdeveloped 135. Key aspects of our research program are geared towards studying these opportunities and challenges, for example by running field studies in middle and high schools to understand how students currently (mis)understand and (mis)use generative AI tools. In complement, we continue working on outreach, especially developing educational tools to improve generative AI literacy in students, teachers and parents, for example by further developing and disseminating our pedagogical video series “ChatGPT explained in 5mn”, which has been integrated in various tools from DNE and in the European MOOC made for introducing teachers to AI (the AI4Teacher MOOC).
Participating in popular science events, visiting middle and high schools and welcoming students in the lab, writing popular science books (such as “C'est pas moi, c'est l'IA”, published by Nathan), and participating in discussions on these topics in wide-audience media constitute another application axis. Given the high societal stakes associated with this line of work, we also aim to strongly develop our activities in informing and supporting public policies: a key vector for such public support is actively participating in interactions and discussions with public bodies that analyze current stakes and propose new actions and laws. In this perspective, we recently supported Inria in writing notes on generative AI and its societal dimensions for the cabinet of E. Macron, and we participated in interviews with senators preparing a report on AI and education. We made presentations of the stakes associated with training curiosity and metacognition using AI technologies at the Conseil Scientifique de l'Education Nationale and at an annual scientific event organized by DNE (Direction du Numérique Educatif), and were invited by BPI to participate in the evaluation and monitoring of projects related to education/edTech by this institution. We are also working to develop collaborations between Inria and the UK AI safety institute, towards building a French institution similar to the UK one. This includes developing a collaboration with Chris Summerfield on field studies to assess the current state of use of generative AI in middle and high schools, to inform public policies on this topic. Lastly, our activities aim at sharing AI models and data that are fully open-source (open weights and open data) and trained on data associated with appropriate rights (here we are also collaborating with the Hugging Face company to distribute these open models and data on their platform).
For example, we recently built a project with the EvidenceB company, in collaboration with Région Ile-de-France, to build an open model trained on data from free manuals, for which authors will be remunerated in an appropriate manner: this kind of model will enable wider and legally compliant access to AI models by the edTech ecosystem in France.

Automated discovery in science. Machine learning algorithms integrating intrinsically-motivated goal exploration processes (IMGEPs) with flexible modular representation learning are very promising directions to help human scientists discover novel structures in complex dynamical systems, in fields ranging from biology to physics. The automated discovery project aims to boost the efficiency of these algorithms to empower discovery in science and engineering. This entails real-world applications with high societal stakes, such as helping scientists make new discoveries that may, for example, help build more sustainable materials, generate cleaner energy or save energy, find molecules with medical applications, design accessible and efficient educational tools, or help design more sustainable forms of plant growing in agriculture. In many cases, the complexity of self-organising materials or biological systems involves significant scientific and engineering challenges for understanding, controlling and inventing. Following several of our recent proof-of-concept projects 96, 99, we aim to do translational research also in this domain, enabling chemists, physicists and biologists, in both academia and industry, to efficiently use our tools for curiosity-driven exploration to help them make new discoveries.
In particular, we are now starting to explore several new collaborations in these fields. With Solvay/Syensqo, we have started several discussions to develop collaborations on using autotelic exploration algorithms to efficiently explore and map the space of material design and properties, with the aim of helping scientists at Syensqo discover new materials with strong environmental and functional properties. With IRT Saint Exupery, we have an ongoing consortium collaboration around the project AIxIA, where we study the use of autotelic exploration algorithms to map the space of interference behaviours on embedded software and hardware.

Building self-organising AI from the ground up. When using continuous cellular automata as a playground for designing and evaluating our algorithms for curiosity-driven automated discovery for the sciences, we are also making direct contributions to the domain of Artificial Life. We believe the tools and approach we are taking, in particular exploring the self-organisation of sensorimotor agency and open-ended evolutionary processes, have the potential to have significant impact in this domain. This has been attested recently by our Best Paper award at the Alife 2023 conference (and also by wider impact, e.g. through > 2 million views of the popular science videos of Sciences Etonnantes and EGO presenting, in part, our work on this topic). As we are aiming to study the self-organisation of basic forms of memory, learning, and even autotelic learning in such environments, this may also constitute a foundational approach to build AI systems from the ground up, possibly opening new possibilities in terms of robustness, adaptivity and generalisation.

5 Social and‌‌ environmental responsibility

5.1 Footprint of research activities

AI‌ is a field of‌ research that currently requires‌‌ a lot of computational resources, which is a‌ challenge as these resources‌ have an environmental cost.‌‌ In the team we try to address this‌ challenge in two ways:‌

by working on developmental machine learning approaches that model how humans manage to learn open-ended and diverse repertoires of skills under severe limits of time, energy and compute: for example, curiosity-driven learning algorithms can be used to guide agents' exploration of their environment so that they learn a world model in a sample-efficient manner, i.e. by minimizing the number of runs and computations they need to perform in the environment;
by monitoring the number of CPU and GPU hours required to carry out our experiments. For instance, our work 9 used a total of 2.5 CPU years. More globally, our work uses large-scale computational resources, such as the Jean Zay supercomputer platform, on which we use several hundred thousand hours of GPU and CPU each year.

5.2 Impact of research‌‌ results

Our research activities are organized along two fundamental research axes (models of human learning and algorithms for developmental machine learning) and one application research axis (involving multiple domains of application, see the Application Domains section). This entails different dimensions of potential societal impact:

Towards autonomous agents that can be shaped to human preferences and be explainable. We work on reinforcement learning architectures where autonomous agents interact with a social partner to explore a large set of possible interactions and learn to master them, using language as a key communication medium. As a result, our work contributes to facilitating human intervention in the learning process of agents (e.g. digital assistants, video game characters, robots), which we believe is a key step towards more explainable and safer autonomous agents.
Reproducibility of research: By releasing the code of our research papers, we believe that we help efforts in reproducible science and allow the wider community to build upon and extend our work in the future. In that spirit, we also provide clear explanations of the statistical testing methods used when reporting results.
Digital transformation and competence challenges facing schools in the 21st century. We expect our findings to inform the broader societal challenges inherent to the School of the 21st Century, ranging from helping children (and their teachers) develop cross-domain learning skills such as curiosity and meta-cognition, to improving inclusivity in schools (learners with disabilities, especially cognitive disabilities) and promoting lifelong learning in older adults (successful aging), using cognitive-based research findings.
AI and personalized educational technologies to reduce inequalities due to neurodiversity. The Flowers team develops AI technologies aiming to personalize sequences of educational activities in digital educational apps: this entails the central challenge of designing systems which can have equitable impact over a diversity of students and reduce inequalities in academic achievement. Using models of curiosity-driven learning to design AI algorithms for such personalization, we have been working to enable them to be positively and equitably impactful across several dimensions of diversity: for young learners or for aging populations; for learners with low initial levels as well as for learners with high initial levels; for "normally" developing children and for children with developmental disorders; and for learners of different socio-cultural backgrounds (e.g. we could show in the KidLearn project that the system is equally impactful along these various kinds of diversities).
Health: Bio-printing. The Flowers team is studying the use of curiosity-driven exploration algorithms in the domain of automated discovery, enabling scientists in physics/chemistry/biology to efficiently explore and build maps of the possible structures of various complex systems. One particular domain of application we are studying is bio-printing, where a challenge consists in exploring and understanding the space of morphogenetic structures self-organized by bio-printed cell populations. This could facilitate the design and bio-printing of personalized skins or organoids for people who need transplants, and thus could have a major impact on their health.
Tools‌ for human creativity and the arts Curiosity-driven exploration‌ algorithms could also in principle be used as‌ tools to help human users in creative activities‌ ranging from writing stories to painting or musical‌ creation, which are domains we aim to consider‌ in the future, and thus this constitutes another‌ societal and cultural domain where our research could have impact.
Education to AI. As artificial intelligence takes a greater role in human society, it is of foremost importance to empower individuals with an understanding of these technologies. For this purpose, the Flowers lab has been actively involved in educational and popularization activities, in particular by designing educational robotics kits that form a motivating and tangible context to understand basic concepts in AI: these include the Inirobot kit (used by >30k primary school students in France) and the Poppy Education kit, now supported by the Poppy Station educational consortium.
Health: optimization of intervention strategies during pandemic events. Modelling the dynamics of epidemics helps in proposing control strategies based on pharmaceutical and non-pharmaceutical interventions (contact limitation, lockdown, vaccination, etc). Hand-designing such strategies is not trivial because of the number of possible interventions and the difficulty of predicting long-term effects. This task can be cast as an optimization problem where state-of-the-art machine learning algorithms, such as deep reinforcement learning, might bring significant value. However, the specificity of each domain (epidemic modelling or solving optimization problems) requires strong collaborations between researchers from different fields of expertise. Due to its fundamentally multi-objective nature, the problem of optimizing intervention strategies can benefit from the goal-conditioned reinforcement learning algorithms we develop at Flowers. In this context, we have developed EpidemiOptim, a Python toolbox that facilitates collaborations between researchers in epidemiology and optimization.

6 Highlights of the‌ year

Renewal of the team: After a decade of research and applications, the team was renewed and is now named the Flowers AI & CogSci Lab. This new name highlights our activities at the crossroads of AI and cognitive sciences, studying curiosity and its roles in open-ended learning in humans and machines, from individuals to collectives. Our new detailed research program is available online. The team is associated with both Inria and the University of Bordeaux, France.
Understanding human curiosity and metacognition. We started a new European project in collaboration with the cognitive neuroscience labs of M. Gruber (Univ. Cardiff) and Y. Fandakova (University of Trier), aiming to study the joint development of curiosity and metacognition in adolescents through a set of behavioural and neuro-imaging studies. This project also aims to leverage new insights to be applied in educational technologies. We collaborated with Alexandr Ten, Michiko Sasaki and Kou Murayama (Univ. Tuebingen) in producing a theoretical framework that integrates multiple theories of curiosity developed across the literature and relates them to the well-known "Curious U" effect: this framework was published in the Open Mind journal 42. Collaborating with A. Tricot (Univ. Montpellier), we developed a theoretical perspective to study the links between intrinsic motivation and cognitive load in the context of extended-reality educational interventions 39, and published an associated study about the use of virtual reality for optimizing cognitive load and intrinsic motivation in educational technologies 38. Finally, we collaborated with M. Derex (IAST, Toulouse), as well as with Sheina Lew-Levy (Durham University) and Sarah Pope-Caldwell (Georgia State University), in the design and implementation of a study of cross-cultural similarities and differences in curiosity-driven exploration, conducted in Congo with Bayaka and Bandongo populations.
Autotelic curiosity and open-ended learning in agentic generative AI. We continued building the foundations of a new generation of genAI systems that are open-ended, curious, autotelic, grounded and continuously self-improving. To do so, we leveraged GLAM, an approach we designed 3 years ago to turn LLMs into agents that learn to solve goals in interactive environments through online RL (not to produce texts that humans like, but to achieve practical goals!), as the basis for building curious agents that sample their own goals. We designed MAGELLAN (published at ICML 2025), a method enabling genAI agents to navigate very large spaces of goals, where millions of them may be either too easy or too difficult 47. MAGELLAN makes this possible by leveraging the learning progress hypothesis, which we developed to account for human curiosity-driven learning: goals that are sampled in priority are those with high expected learning progress. Achieving this requires advanced metacognitive skills, which LLMs lacked so far: MAGELLAN learns these metacognitive skills, making it possible to predict learning progress on goals that were never sampled, using semantic information in embedding spaces. Curious LLM agents can also act as artificial scientists that explore the environment to hypothesize, experiment, test, confirm or revise abstract rules in order to build human-readable world models. First steps in this direction were made in WorldLLM 59. Imagining new abstract goals that maximize learning progress is another challenge for autotelic genAI systems. One approach is to formulate them directly as code, as in the AutotelicLLM agent (40), enabling open-ended exploration in Crafter (a 2D Minecraft). These projects involved collaborations with S. Aissi, O. Sigaud, L. Soulier, N. Tome (Sorbonne Université), S. Lamprier (Univ. Angers), T. Wolf (Hugging Face) and G. Pourcel (Univ. Amsterdam).
Autotelic Generative AI for Self-Improving Program Synthesis and ARC-Prize. In the context of the LLM4Code Inria challenge, the team started collaborating on projects at the intersection of generative AI, program synthesis and AI-assisted discovery in mathematics. In particular, we developed collaborations with N. Fijalkow (Labri, CNRS), X. Hinault (Mnemosyne), G. Baudart (PiCube). This year we continued leveraging the ACES method, which enables large language models to self-generate diverse and challenging programming puzzles (using autotelic exploration), in order to transpose and adapt it to the domain of mathematics. In the domain of program synthesis, we developed a new approach, called SOAR, enabling continuous self-improvement of LLMs as operators of evolutionary algorithms. This new method, published at ICML 2025, advanced the state of the art on the ARC-AGI 1 benchmark (category of approaches based on open-source models and program synthesis), and was awarded 2nd place at the ARC-Prize (paper category). We also started to explore how full-fledged generative AI agents can be used to search and optimize program controllers for simulated agents 44.
Collective intelligence and social learning in AI systems. We continued exploring key questions at the crossroads of AI and society, studying how methods from human sciences may (or may not) be used to understand socio-cultural properties of genAI. In particular, this year we focused on studying fundamental properties of GenAI systems as cultural transmission technologies: they massively (re)produce cultural artifacts (e.g. texts), which are in turn viewed by, and influence, both humans and other GenAI systems. It is important to understand the dynamics of the evolution of cultural artefacts when GenAI systems are part of the transmission chains. As first steps in this direction, we adapted the so-called "iterative chain design" from the cultural evolution community, where LLMs basically play a version of the telephone game. This allowed us to identify dynamical properties like collapse or attractors that depend on various properties of data. This work resulted in one paper published at ICLR 2025 60, and another at EMNLP 2025 49. This involved collaborations with Maxime Derex (IAST Toulouse), Cédric Colas (MIT), Gaia Molinaro (UC Berkeley), Eleni Nisioti and Sebastian Risi (ITU Copenhagen), Peter-Ford Dominey (INSERM), Ida Momennejad (Microsoft Research), Remy Portelas (Ubisoft).
Education, generative AI and cognitive training. We started a large-scale collaborative project, called GAIMHE, to study the design of educational technologies that combine the power of pedagogically grounded ITS for cross-exercise personalization with the flexibility of generative AI for pre-generation of exercises and within-exercise personalization. This project, funded by BPI, involves collaboration with the EvidenceB company, as well as the ClassCode and Café Pédagogique educational NGOs.

We also continued working on developing and evaluating in classrooms various pedagogical interventions training curiosity and metacognition (both conceptually and procedurally), and focused on studying the comparative impact of interventions when made by teachers themselves as opposed to researchers. Furthermore, we continued conducting our series of experiments in middle schools to study whether schoolchildren understand and know how to use generative AI tools in the context of educational exercises, showing strong limits and pointing to two needs: training their metacognition and their AI literacy (a first series of results is available online). Also, we developed a software library (the LLM4humanities library) that uses LLMs to partially automate qualitative analysis methods in social sciences, leveraging our prior work in this direction 170 and opening new perspectives for the qualitative study of large text corpora or verbal data from psychology or educational experiments. We also continued working on evaluating the use of adaptive personalization algorithms (in particular ZPDES, based on the learning progress theory) for cognitive training, and with diverse populations. This was associated with a review of AI-based approaches to cognitive training, published in PLOS ONE. This involved collaborations with R. Abdelghani and Kou Murayama (University of Tuebingen), C. Kidd (Univ. Berkeley).

We also continued working‌ on developing frameworks and tools to support social‌ curiosity among the stakeholders working with children with‌ ASD 41, 35, 58. Finally,‌ we continued to develop and adapt AI-based personalization‌ technologies for supporting learning and well-being in aging‌ populations, for example through cognitive training 36 and‌ monitoring of daily activities 46.
Curiosity-driven AI for assisted scientific discovery: We continued studying how curiosity-driven AI algorithms can enable scientists (physicists, chemists, biologists, etc) to explore and map the space of self-organized behaviours in diverse complex systems 96. We published a milestone article (34) in Science Advances presenting the results of our multi-year projects using autotelic AI algorithms to investigate the possibilities for guided self-organization of robust sensorimotor agents from low-level interactions in continuous cellular automata. We also started to explore how generative AI semantic models can be used to drive open-ended exploration of self-organized patterns in cellular automata (Alife 2025 paper 48), how autotelic algorithms can explore full ecosystems (Alife 2025 paper 51), how autotelic reinforcement learning techniques can be used to control and grow such self-organized patterns in an online manner (ALife 2025 paper 45), and how human guidance can be integrated in these loops 52. On this line of research, we continued developing collaborations with M. Levin at Tufts University. In particular, we studied how autotelic AI systems (IMGEP algorithms) can enable cost-effective discovery of diverse, sophisticated and robust behaviors in gene regulatory networks, resulting in a milestone paper published in eLife 99.
Clément Moulin-Frier recently moved for personal reasons to Inria Lyon, but we are still strongly collaborating with him. He joined the BioTiC Inria team, which focuses on fundamental research in theoretical and computational biology, with a specific interest in modeling evolutionary processes. We are exploring two main research directions in collaboration with BioTiC: (1) large-scale evo-evolutionary simulations in cellular automata, which is an important topic in both teams (from a Computational Biology perspective in BioTiC 92 vs. an Artificial Life perspective in Flowers 55) and (2) studying the similarities and differences between biological and cultural evolution, both in the natural world and in computer simulation (see Section 8.2.6).
Reverse engineering large‌ language models and cheaply predicting their performances in‌ benchmarks. In collaboration with N. Yax and Stefano‌ Palminteri (ENS), we developed a new algorithmic method,‌ called PhyloLM and published at ICLR 2025 56‌, that aims at reverse-engineering generative AI model‌ origins from only black-box access — which models‌ derive from which (e.g. reusing data or algorithm‌ or architecture or other features). It proved remarkably‌ powerful at reconstructing model evolutionary trees and predicting‌ benchmark performance cheaply. This approach opens new possibilities‌ for safety applications as hundreds of new models‌ appear daily.
Software: The team continued to develop several key software libraries: Lamorel, enabling LLMs to be used as agents in interactive environments; ADTool, enabling easy use of autotelic exploration algorithms for automated discoveries in physics/chemistry/ALife; Vivarium, for building and running multi-agent simulations using JAX, with a focus on educational use; LLM4Humanities, enabling researchers in human sciences to leverage generative AI models for tasks like annotation or analysis of text corpora, using a solid methodological approach.
Outreach: The team participated in multiple events such as the science festival at Cap Sciences (Bordeaux), AI days for teachers (Bordeaux), Main à la pâte foundation, Learning Show (Rennes), Science with and for society (Bordeaux), the Chiche program (Nouvelle Aquitaine), Academic days of Poitiers academy, and CogniForum, and welcomed several middle and high-school students for their internships. The team also continued to produce the pedagogical video series "ChatGPT explained in 5 mn", aimed at training generative AI literacy in a wide diversity of students (e.g. high school), available here. The videos are released under a Creative Commons licence, CC-BY, enabling open and free reuse. They have already been integrated in the MOOC AI4T (see here), in an internal training platform of "Académie du Numérique du Ministère de la défense", and in a mobile app made by Inria with educational materials related to AI (see here). They are also being adapted and integrated in a training platform for the whole population of civil servants in France, coordinated by DINUM.
Support to public policy: The team was involved in several major actions to support public policies on the topic of AI and education. Members of the team designed and conducted training sessions in different academies for supervisory staff and teachers, e.g. ETAPP-IA day in Nouvelle-Aquitaine (January 2025); departmental training of CPE and documentary teachers of Nouvelle-Aquitaine during a day at the Lycée Les Iris in Lormont (May 2025); Academic Days of Innovation for teachers of Nouvelle-Aquitaine, Spring Days of Education Research at INSPEs (June 2025); PhilosophIA Citizens' Convention (April 2025); twin conference of Cnesco/Cardie Charente-Maritime (January 2025); working group Education and Cognitive Sciences of the academies of Créteil, Versailles and Paris, scheduled for March 2026. H. Sauzéon and PY. Oudeyer were interviewed and wrote reports contributing to the French Senate's report on AI and education. PY Oudeyer was heard by the commission on cultural and educational affairs in the French parliament, to discuss the major challenges and opportunities of AI and education.
PY Oudeyer was selected by the French National Research Agency (ANR) as one of 20 researchers across all disciplines to highlight research projects funded in the last 20 years, on the occasion of the 20th anniversary of ANR. He was also invited to give a keynote talk on curiosity-driven learning in humans: learning progress, autotelic exploration and open-ended development, at the Budapest Conference on Cognitive Development (see video).

6.1 Awards‌

Julien Pourcel, Cédric Colas and Pierre-Yves Oudeyer were‌ awarded the 2nd place ARC-Prize in the paper‌ category, for their article and method SOAR,‌ published at ICML 2025. This method introduced a‌ novel approach to enable self-improvement of LLMs when‌ used as operators of evolutionary search algorithms in‌ general program synthesis, and pushed the frontier of‌ state-of-the-art results on the ARC-AGI 1 benchmark (in‌ the category of approaches using open-source models and‌ program synthesis).
We received two Best Paper Awards at the Evostar 2025 conference for our paper Emergent kin selection of altruistic feeding via non-episodic neuroevolution 55 (Best paper of the EvoApp track + Best student paper award to the first author Max Taylor-Davis).
Didier Roy, Pierre-Yves Oudeyer (authors) and Clémentine Latron (illustrator) obtained the 38th Prize Roberval, for the best popular science book in the youth category, for their book C'est pas (moi), c'est l'IA (Nathan), and were selected among the 3 finalists of the Prize "Goût des Sciences" organized by the French Ministry of Higher Education, Science and Space. D. Roy and P-Y. Oudeyer gave many general public presentations in the context of this book.
Matisse‌ Poupart was awarded the Best PhD prize from‌ R3NumEd, the research network on educational technologies‌ in Nouvelle-Aquitaine, for his PhD entitled Curious and‌ therefore not overloaded: Towards an integrated understanding of‌ curiosity and cognitive load in XR learning environments‌.
Leana Petitot, Hélène Sauzéon and Pierre Dragicevic‌ obtained an Honorable mention at the ACM-CHI2 conference‌ for their paper entitled "The Effect of Augmented‌ Reality on Involuntary Autobiographical Memory", on co-design of‌ an augmented reality (AR) application simulating a museum‌ visit in the context of the I-am Associated‌ Team, 2023, integrated with an evaluation of involuntary‌ and uncontrollable memory revival. This study confirmed our‌ hypothesis: AR enhances this type of memory compared‌ to 3D images 53, suggesting potential cognitive‌ manipulations.

6.2 PhD defenses

Grgur Kovac defended his‌ PhD entitled Building, evaluating and understanding socio-cultural AI:‌ leveraging concepts and methods from human sciences (see‌ also the video of his talk, as‌ well as slides and outlines).
Gautier Hamon defended his PhD entitled Towards open-ended dynamics in Artificial Life and Artificial Intelligence: an eco-evo-devo perspective.
Matisse Poupart defended his PhD entitled Curious‌ and therefore not overloaded: Towards an integrated understanding‌ of curiosity and cognitive load in XR learning‌ environments (see also the video of his PhD‌ defense talk).
Thomas Carta defended his PhD‌ entitled Language as a cognitive tool for open-ended‌ agents (see also the video of his PhD‌ defense talk and slides and outline).
Nicolas Yax defended his PhD entitled Are we smart enough to understand Large Language Models? Tools for studying and improving LLMs' cognition (see also the slides and outline).

7 Latest software developments, platforms,‌ open data

7.1 Latest software developments

7.1.1 SocialAI‌

Name:
SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents
Keywords:
Artificial‌ intelligence, Deep learning, Reinforcement‌ learning, Large Language Models‌‌
Functional Description:

Source code for the paper https://arxiv.org/abs/2107.00956.‌

A suite of environments‌ for testing socio-cognitive abilities‌‌ of artificial agents. Environments can be used in‌ the multimodal setting (suitable‌ for RL agents) and‌‌ in the pure text setting (suitable for Large‌ Language Model-based agents). Also‌ contains RL and LLM‌‌ baselines.
URL:
https://gitlab.inria.fr/gkovac/act-and-speak
Contact:
Grgur Kovac

7.1.2 AutoDisc‌

Keyword:
Complex Systems
Functional‌ Description:
AutoDisc is software built for automated scientific discovery in complex systems (e.g. self-organizing systems). It can be used as a tool to experiment with automated discovery in various systems using exploration algorithms (e.g. curiosity-driven ones). Our software is fully open source and allows users to add their own systems, exploration algorithms or visualization methods.
URL:
https://gitlab.inria.fr/cromac/AutomatedDiscoveryTool
Contact:
Clément Romac

7.1.3‌ ADTool

Keywords:
Machine learning,‌ Python, Cellular automaton, Physical‌‌ simulation, Pattern discovery, Exploration
Functional Description:

ADTool is‌ a versatile and open-source‌ Python framework designed to‌‌ explore complex parametric systems using IMGEP algorithms (Intrinsic‌ Motivation for Goal Exploration‌ Processes) as described in‌‌ https://arxiv.org/pdf/1708.02190. This curiosity-driven approach enables automatic exploration and‌ the discovery of new‌ behaviors across a wide‌‌ range of domains, offering a novel way to‌ study complex systems.

With‌ ADTool, users can explore‌‌ cellular automata such as Lenia, Particle Lenia, and‌ Flowlenia to uncover patterns‌ and emergent behaviors. Its‌‌ capabilities extend to drug discovery, exploring chemical spaces‌ to identify promising protein-ligand‌ affinity profiles. The framework‌‌ also ventures into physics, with applications such as‌ searching for trajectories in‌ the N-body problem, simulating‌‌ the Kuramoto model, exploring the Gray-Scott reaction-diffusion system,‌ and studying hypergraph rewriting‌ systems for Wolfram physics.‌‌ In digital art, ADTool fosters creativity by exploring‌ processes like subtractive sound‌ synthesis and other artistic‌‌ methods.

The framework is designed to be flexible‌ and extensible, allowing users‌ to define their own‌‌ systems and integrate custom exploration strategies. It includes‌ mechanisms for saving discoveries‌ to disk, making it‌‌ easier to resume experiments or share results with‌ collaborators. Additionally, an integrated‌ visualization tool provides a‌‌ user-friendly interface to track exploration progress, enhancing the‌ understanding and analysis of‌ results.

The scientific foundation‌‌ of ADTool lies in "curiosity-search" algorithms, which autonomously‌ explore behavioral spaces to‌ identify interesting phenomena without‌‌ predefined objectives. These algorithms, initially developed for robotic‌ learning, are now applied‌ to the study of‌‌ emergent behaviors in various systems.
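To make the goal-exploration idea concrete, here is a minimal sketch of an IMGEP-style loop in plain Python. It is not ADTool's code or API: the `toy_system` function, the two-dimensional behavior space, the goal-sampling box and the Gaussian mutation are all assumptions chosen for illustration.

```python
import random


def toy_system(params):
    """Hypothetical stand-in for a complex system: maps 2 parameters to a 2-D behavior."""
    x, y = params
    return (x * x - y, x + 0.5 * y * y)


def imgep(system, n_random=10, n_iters=90, sigma=0.1, seed=0):
    """Goal-exploration loop: sample a goal, reuse the closest past run, mutate it."""
    rng = random.Random(seed)
    history = []  # list of (params, behavior) pairs

    # Bootstrap phase: purely random parameters.
    for _ in range(n_random):
        params = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        history.append((params, system(params)))

    for _ in range(n_iters):
        # 1. Sample a goal in a box slightly larger than the behaviors seen so far.
        xs = [behavior[0] for _, behavior in history]
        ys = [behavior[1] for _, behavior in history]
        goal = (rng.uniform(min(xs) - 0.1, max(xs) + 0.1),
                rng.uniform(min(ys) - 0.1, max(ys) + 0.1))

        # 2. Select the past parameters whose behavior was closest to the goal.
        params, _ = min(
            history,
            key=lambda pb: (pb[1][0] - goal[0]) ** 2 + (pb[1][1] - goal[1]) ** 2,
        )

        # 3. Mutate those parameters, run the system, and store the outcome.
        new_params = tuple(p + rng.gauss(0, sigma) for p in params)
        history.append((new_params, system(new_params)))

    return history


history = imgep(toy_system)
print(len(history), "runs stored")
```

No external objective is optimized in this loop: the only driver is the self-generated goal, which is what lets the search spread over the behavior space rather than converge to a single solution.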

Whether you are‌ a physicist, chemist, biologist,‌ or digital artist, ADTool‌‌ can help you explore and understand complex systems.‌

Reproducibility is guaranteed with a predefined Python environment, and experiments can be launched with a simple command line: python3 run.py --config_file examples/grayscott/gray_scott.json
Contact:
Zacharie‌‌ Bugaud

7.1.4 Kids Ask

Keywords:
Human Computer Interaction,‌ Cognitive sciences
Functional Description:‌
Kids Ask is a‌‌ web-based educational platform that involves an interaction between‌ a child and a‌ conversational agent. The platform‌‌ is designed to teach children how to generate‌ curiosity-based questions and use‌ them in their learning‌‌ in order to gain‌ new knowledge in an autonomous way.
URL:
https://github.com/RaniaAbdelghani/KidAsk‌
Contact:
Rania Abdelghani

7.1.5 ToGather

Keywords:
Education, Handicap,‌ Environment perception
Scientific Description:
With participatory design methods,‌ we have designed an interactive website application for‌ educational purposes. This application aims to provide interactive‌ services with continuously updated content for the stakeholders‌ of school inclusion of children with specific educational‌ needs.
Functional Description:
Website gathering information on middle school students with neurodevelopmental disorders. Authentication is required to access the site's content. Each user can only access the student file(s) of the young person(s) they are accompanying. A student file contains 6 tabs, in which each type of user can add, edit or delete information:
1. Profile: to quickly get to know the student
2. Skills: evaluation at a given moment and evolution over time
3. Compendium of tips: includes psycho-educational tips
4. Meetings: manager and reports
5. News: share information over time
6. Contacts: contact information for stakeholders
The student only has the right to view information about him/her.
Publication:
hal-03436355
Contact:‌
Cécile Mazon
Participant:
4 anonymous participants

7.1.6 mc_training‌

Name:
Platform for metacognitive training
Keywords:
Human Computer‌ Interaction, Education
Functional Description:

This is a web platform for children between 9 and 11 years old, designed to help children practice 4 metacognitive skills that are thought to be involved in curiosity-driven learning:
- the ability to identify uncertainties
- the ability to generate informed hypotheses
- the ability to ask questions
- the ability to evaluate the value of a preconceived inference.

Children work on a reading-comprehension task and, for each of these skills, the platform offers help through a "conversation" with conversational agents that give instructions to perform the task, with respect to every skill, and can give suggestions if the child asks for them.
Contact:
Rania Abdelghani

7.1.7‌ Evolution of adaptation mechanisms in complex environments

Name:‌
Plasticity and evolvability under environmental variability: the joint‌ role of fitness-based selection and niche-limited competition
Keywords:‌
Evolution, Ecology, Dynamic adaptation
Functional Description:

This is the code accompanying our paper "Plasticity and evolvability under environmental variability: the joint role of fitness-based selection and niche-limited competition", which is to be presented at the GECCO 2022 conference.

In this‌ work we have studied the evolution of a‌ population of agents in a world where the‌ fitness landscape changes with generations based on climate‌ function and a latitudinal model that divides the‌ world in different niches. We have implemented different‌ selection mechanisms (fitness-based selection and niche-limited competition).

The‌ world is divided into niches that correspond to‌ different latitudes and whose state evolves based on‌ a common climate function.

We model the plasticity of an individual using tolerance curves originally developed in ecology. Plasticity curves have the form of a Gaussian that captures the benefits and costs of plasticity when comparing a specialist (left) with a generalist (right) agent.
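The trade-off can be illustrated with a short sketch. It assumes a normalized Gaussian tolerance curve, so that a wider curve (more plasticity) necessarily has a lower peak; the exact parametrization used in the repository may differ, and the names `tolerance`, `preferred` and `sigma` are illustrative.

```python
import math


def tolerance(env_state, preferred, sigma):
    """Normalized Gaussian tolerance curve (assumed form).

    The area under the curve is fixed, so widening the curve (larger sigma)
    lowers its peak: this is the cost of plasticity.
    """
    return math.exp(-((env_state - preferred) ** 2) / (2 * sigma ** 2)) / (
        sigma * math.sqrt(2 * math.pi)
    )


# Hypothetical agents sharing the same preferred environmental state.
specialist = dict(preferred=2.0, sigma=0.2)  # narrow curve, high peak
generalist = dict(preferred=2.0, sigma=1.0)  # wide curve, low peak

# Fitness when the environment matches the preferred state...
at_optimum = (tolerance(2.0, **specialist), tolerance(2.0, **generalist))
# ...and when it has drifted away from it.
far_away = (tolerance(3.5, **specialist), tolerance(3.5, **generalist))

print("at optimum (specialist, generalist):", at_optimum)
print("far away   (specialist, generalist):", far_away)
```

Under this assumption the specialist wins when the environment stays close to its preferred state, while the generalist wins once the environment drifts.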

The repo contains the‌ following main elements :

- folder source contains the main functionality for running a simulation
- scripts/run/reproduce_gecco.py can be used to rerun all simulations in the paper
- scripts/evaluate contains scripts for reproducing figures. reproduce_figures.py will produce all figures (provided you have already run scripts/run/reproduce_gecco.py to generate the data)
- folder projects contains data generated from running a simulation

How to run

To install all package dependencies you can create a conda environment as:

conda env create -f environment.yml

All script executions need to be run from folder source. Once there, you can use simulate.py, the main interface of the codebase, to run a simulation. For example:

python simulate.py --project test_stable --env_type stable --num_gens 300 --capacity 1000 --num_niches 10 --trials 10 --selection_type NF --climate_mean_init 2

will run a simulation with an environment whose climate function is constantly 2, consisting of 10 niches, for 300 generations and 10 independent trials. The maximum population size will be 1000*2, and selection will be fitness-based (higher fitness means higher chances of reproduction) and niche-limited (individuals reproduce independently in each niche and compete only within a niche).
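As an illustration of combining the two selection mechanisms, the sketch below samples parents with probability proportional to fitness, separately within each niche. It is not the repository's implementation: the function name, the dictionary representation of individuals and the `offspring_per_niche` parameter are assumptions.

```python
import random


def niche_limited_selection(population, offspring_per_niche, rng):
    """Pick parents niche by niche, with probability proportional to fitness.

    `population` is a list of dicts with keys "niche" and "fitness"
    (fitness values are assumed to be strictly positive).
    """
    # Group individuals by niche so that competition stays local.
    by_niche = {}
    for individual in population:
        by_niche.setdefault(individual["niche"], []).append(individual)

    parents = {}
    for niche, members in by_niche.items():
        weights = [individual["fitness"] for individual in members]
        # Fitness-based: higher fitness means a higher chance of being drawn.
        parents[niche] = rng.choices(members, weights=weights, k=offspring_per_niche)
    return parents


rng = random.Random(0)
# Hypothetical population: 3 niches with 5 individuals each.
population = [
    {"niche": niche, "fitness": rng.uniform(0.1, 1.0)}
    for niche in range(3)
    for _ in range(5)
]
parents = niche_limited_selection(population, offspring_per_niche=4, rng=rng)
print({niche: len(chosen) for niche, chosen in parents.items()})
```

Because sampling is restricted to each niche, a highly fit individual in one niche never displaces individuals living in another one.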

You can also take a look‌ at scripts/run/reproduce_gecco.py to see‌ which flags were used‌‌ for the simulations presented in the paper.

Running‌ all simulations requires some‌ days. You can instead‌‌ download the data produced by running scripts/run/reproduce_gecco.py from‌ this google folder and‌ unzip them under the‌‌ projects directory.
URL:
https://github.com/eleninisioti/ClimateAndLearning
Contact:
Eleni Nisioti

7.1.8‌ SAPIENS

Name:
SAPIENS: Structuring‌ multi-Agent toPology for Innovation‌‌ through ExperieNce Sharing
Keywords:
Reinforcement learning, Multi-agent
Functional‌ Description:

SAPIENS is a‌ reinforcement learning algorithm where‌‌ multiple off-policy agents solve the same task in‌ parallel and exchange experiences‌ on the go. The‌‌ group is characterized by its topology, a graph‌ that determines who communicates‌ with whom.

All agents are DQNs, and the exchanged experiences take the form of transitions from their replay buffers.

Using SAPIENS we can define groups of agents that are connected with others based on a) a fully-connected topology, b) a small-world topology, c) a ring topology or d) a dynamic topology.
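The role of the topology can be sketched in a few lines of Python. This is not the SAPIENS codebase: the helper names, the tuple format of transitions and the fixed batch size are assumptions, and only the fully-connected and ring topologies are shown.

```python
import random
from collections import deque


def fully_connected(n):
    """Every agent is a neighbor of every other agent."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def ring(n):
    """Each agent is connected to its two neighbors on a circle."""
    return {i: sorted({(i - 1) % n, (i + 1) % n} - {i}) for i in range(n)}


def share_experiences(buffers, topology, batch_size, rng):
    """Each agent sends a random batch of its own transitions to its neighbors."""
    # Snapshot what each agent sends first, so that transitions received in
    # this round are not forwarded again within the same round.
    outgoing = {
        agent: rng.sample(list(buffer), min(batch_size, len(buffer)))
        for agent, buffer in buffers.items()
    }
    for sender, neighbors in topology.items():
        for receiver in neighbors:
            buffers[receiver].extend(outgoing[sender])


rng = random.Random(0)
n_agents = 4
# Hypothetical transitions: (agent_id, state, action, reward, next_state).
buffers = {
    agent: deque([(agent, s, 0, 0.0, s + 1) for s in range(3)], maxlen=100)
    for agent in range(n_agents)
}
share_experiences(buffers, ring(n_agents), batch_size=2, rng=rng)
print({agent: len(buffer) for agent, buffer in buffers.items()})
```

With a ring, agent 0 only receives transitions from agents 1 and 3; replacing `ring` with `fully_connected` lets every agent receive from all others, which changes how quickly experiences spread through the group.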

Install required packages

You can install all required Python packages by creating a new conda environment containing the packages in environment.yml:

conda env create -f environment.yml

And then activating‌ the environment:

conda activate‌ sapiens

Example usages

Under notebooks there is a Jupyter notebook that will guide you through setting up simulations with a fully-connected and a dynamic social network structure for solving Wordcraft tasks. It also explains how you can access visualizations of the metrics produced during th$

Reproducing the paper results

Scripts under the scripts directory are useful for reproducing results and figures appearing in the paper.

With scripts/reproduce_runs.py you‌‌ can run all simulations presented in the paper‌ from scratch.

This file is useful for looking at how the experiments were configured, but it is better to avoid running it: simulations will run locally and sequentially and will take months to complete.

Instead,‌ you can access the‌ data files output by‌‌ simulations on this online‌ repo.

Download this zip file and uncompress it‌ under the projects directory. This should create a‌ projects/paper_done sub-directory.

You can now reproduce all visualizations presented in the paper. Run:

python scripts/reproduce_visuals.py

This‌ will save some general plots under visuals, while‌ project-specific plots are saved under the corresponding project‌ in projects/paper_done
URL:
https://github.com/eleninisioti/SAPIENS
Contact:
Eleni Nisioti

7.1.9‌ architect-builder-abig

Name:
Architect-Builder Iterated Guiding
Keyword:
Artificial intelligence‌
Functional Description:

Codebase for the paper Learning to‌ guide and to be guided in the Architect-Builder‌ Problem

ABIG stands for Architect-Builder Iterated Guiding and‌ is an algorithmic solution to the Architect-Builder Problem.‌ The algorithm leverages a learned model of the‌ builder to guide it while the builder uses‌ self-imitation learning to reinforce its guided behavior.
URL:‌
https://github.com/flowersteam/architect-builder-abig
Contact:
Tristan Karch

7.1.10 EAGER

Name:
Exploit‌ question-Answering Grounding for effective Exploration in language-conditioned Reinforcement‌ learning
Keywords:
Reinforcement learning, Language, Question Generation, Question Answering, Reward shaping
Functional Description:
A novel QG/QA framework for RL called EAGER. In EAGER, an agent reuses the initial language goal sentence to generate a set of questions (QG): each of these self-generated questions defines an auxiliary objective. Here, generating a question consists in masking a word of the initial language goal. Then the agent tries to answer these questions (guess the missing word) only by observing its trajectory so far. When it manages to answer a question correctly (QA) it obtains an intrinsic reward proportional to its confidence in the answer. The QA module is trained using a set of successful example trajectories. If the agent follows a path too different from correct ones at some point in its trajectory, the QA module will not answer the question correctly, resulting in zero intrinsic reward. The sum of all the intrinsic rewards measures the quality of a trajectory in relation to the given goal. In other words, maximizing this intrinsic reward incentivizes the agent to produce behaviour that unambiguously explains various aspects of the given goal.
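The question-generation and reward steps described above can be sketched as follows. This is a simplified illustration, not the EAGER codebase: `toy_qa_model` is a hypothetical stub standing in for the trained QA module, the set of maskable words is hand-picked, and each question is rewarded only the first time it is answered correctly.

```python
def generate_questions(goal, maskable_words):
    """QG: build one question per maskable word by masking it in the goal."""
    tokens = goal.split()
    questions = []
    for i, token in enumerate(tokens):
        if token in maskable_words:
            masked = tokens[:i] + ["<mask>"] + tokens[i + 1:]
            questions.append((" ".join(masked), token))
    return questions


def intrinsic_reward(questions, trajectory, qa_model, answered):
    """QA: reward each newly answered question by the model's confidence."""
    reward = 0.0
    for question, answer in questions:
        if question in answered:
            continue  # already rewarded earlier in the episode
        guess, confidence = qa_model(question, trajectory)
        if guess == answer:
            reward += confidence
            answered.add(question)
    return reward


def toy_qa_model(question, trajectory):
    """Hypothetical QA stub: guesses a word of the right type seen so far."""
    seen = {word for observation in trajectory for word in observation.split()}
    vocabulary = ("ball", "key") if question.endswith("<mask>") else ("red", "blue")
    for candidate in vocabulary:
        if candidate in seen:
            return candidate, 0.9
    return None, 0.0


questions = generate_questions("pick up the red ball", {"red", "ball"})
answered = set()
r1 = intrinsic_reward(questions, ["a red door"], toy_qa_model, answered)
r2 = intrinsic_reward(questions, ["a red door", "a red ball"], toy_qa_model, answered)
print(questions, r1, r2)
```

In this toy episode the agent is first rewarded for a trajectory that reveals the color, then for one that reveals the object, so the summed intrinsic reward grows as the trajectory explains more of the goal.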
URL:
https://github.com/flowersteam/EAGER
Contact:
Thomas Carta

7.1.11 Flow-Lenia‌

Name:
Flow Lenia: Mass conservation for the study‌ of virtual creatures in continuous cellular automata
Keywords:‌
Cellular automaton, Self-organization
Functional Description:

This repo contains the code to run the Flow Lenia system, which is a continuous parametrized cellular automaton with mass conservation. This work extends the classic Lenia system with mass conservation and makes it possible to implement new features such as local parameters, environment components, etc.

Several variants of the system (one or several channels, etc.) are available.

Please refer to‌ the associated paper for the details of the‌ system

Implemented in JAX
URL:
https://github.com/erwanplantec/FlowLenia
Contact:
Gautier‌ Hamon

7.1.12 Kidlearn: money game application

Functional Description:‌
The game is instantiated in a browser environment where students are offered exercises in the form of money/token games (see Figure 1). For an exercise type, one object is presented with a given tagged price and the learner has to choose which combination of bank notes, coins or abstract tokens needs to be taken from the wallet to buy the object, with various constraints depending on exercise parameters. The games have been developed using web technologies: HTML5, JavaScript and Django.

Figure 1:‌ Four principal regions are‌ defined in the graphical‌‌ interface. The first is the wallet location where‌ users can pick and‌ drag the money items‌‌ and drop them on the repository location to‌ compose the correct price.‌ The object and the‌‌ price are present in the object location. Four‌ different types of exercises‌ exist: M : customer/one‌‌ object, R : merchant/one object, MM : customer/two‌ objects, RM : merchant/two‌ objects.
URL:
https://flowers.inria.fr/research/kidlearn/
Contact:‌‌
Benjamin Clement

7.1.13 cognitive-testbattery

Name:
Cognitive test battery‌ of human attention and‌ memory
Keywords:
Open Access,‌‌ Cognitive sciences
Scientific Description:
Cognitive test batteries are widely used in diverse research fields, such as cognitive training, cognitive disorder assessment, or brain mechanism understanding. Although they need flexibility according to the objectives of their usage, most test batteries are not available as open-source software and cannot be tuned by researchers in detail. The present study introduces an open-source cognitive test battery to assess attention and memory, using a JavaScript library, p5.js. Because of the ubiquitous nature of dynamic attention in our daily lives, it is crucial to have tools for its assessment or training. For that purpose, our test battery includes seven cognitive tasks (multiple-objects tracking, enumeration, go/no-go, load-induced blindness, task-switching, working memory, and memorability), common in cognitive science literature. Using the test battery, we conducted an online experiment to collect benchmark data. Results collected on two separate days showed high cross-day reliability. Specifically, task performance did not change substantially across days. Besides, our test battery captures diverse individual differences and can evaluate them based on the cognitive factors extracted from latent factor analysis. Since we share our source code as open-source software, users can expand and manipulate experimental conditions flexibly. Our test battery is also flexible in terms of the experimental environment, i.e., it is possible to experiment either online or in a laboratory environment.
Functional Description:
The evaluation battery consists of 7 cognitive activities (serious games: multi-object tracking, enumeration, go/no-go, Corsi, load-induced blindness, task-switching, memorability). Easily deployable as a web application, it can be re-used and modified for new experiments. The tool is documented in order to facilitate the deployment and the analysis of results.
URL:
https://github.com/flowersteam/cognitive-testbattery
Publication:‌
hal-03723887
Contact:
Maxime Adolphe‌
Participant:
4 anonymous participants‌‌

7.1.14 Sensorimotor-lenia

Keywords:
Cellular automaton, Gradient descent, Curriculum‌ Learning
Functional Description:
Source code for the search of sensorimotor agency in cellular automata, associated with this blogpost: https://developmentalsystems.org/sensorimotor-lenia/. The code makes it possible to find rules in the cellular automaton Lenia (through gradient descent, curriculum learning and diversity search) that lead to the self-organization of moving agents robust to perturbation by obstacles.
URL:‌
https://github.com/flowersteam/sensorimotor-lenia-search
Contact:
Gautier Hamon‌‌

7.1.15 Lamorel

Keywords:
Large‌ Language Models, Reinforcement learning, Distributed computing
Scientific Description:‌

Lamorel allows for seamless scaling of LLMs when‌ using embodied artificial agents such as Reinforcement Learning‌ agents. One can use and modify the LLM‌ in any part of such agents (policy, goal‌ sampler, social peer...). Lamorel is particularly useful when‌ performing large-scale experiments on clusters.

It has already been used in several papers, notably leading to the first paper performing online RL on an LLM-based agent in an embodied environment (Carta et al., 2023).
Functional Description:

Lamorel was initially designed to easily use LLMs in interactive environments. It is especially made for high throughput using a distributed architecture. The philosophy of Lamorel is to be very permissive and allow as many uses of LLMs as possible while maintaining scaling: the application should run with 1 or N LLMs.

For this reason, it is specialised neither in RL nor, in particular, in RLHF. Our examples illustrate how Lamorel can be used for various applications including RLHF-like finetuning. However, one must understand that Lamorel's philosophy means that users must themselves implement what they want to do with the LLM(s).

This is why we advise users knowing‌ in advance they want to do RLHF, especially‌ without any modification of classic implementations, to use‌ libs specialised in RLHF that already come with‌ RL implementations (e.g. RL4LMs, TRL). On the other‌ hand, users more inclined to experiment with implementations‌ or looking for an LLM lib they can‌ use in different projects may prefer Lamorel.

Here are Lamorel's key features:
1. Abstracts the use of LLMs (e.g. tokenization, batches) into simple calls
2. Provides a method to compute the log probability of token sequences (e.g. action commands) given a prompt
3. Is made for scaling up your experiments by deploying multiple instances of the LLM and dispatching the computation thanks to a simple configuration file
4. Provides access to open-sourced LLMs from the Hugging Face hub along with Model Parallelism to use multiple GPUs for an LLM instance
5. Allows one to give their own PyTorch modules to compute custom operations (e.g. to add new heads on top of the LLM)
6. Allows one to train the LLM (or part of it) thanks to a Data Parallelism setup where the user provides their own update method
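To illustrate what computing the log probability of action commands given a prompt means (feature 2), here is a library-independent sketch. It does not use Lamorel's API: `toy_token_logprob` is a hypothetical stand-in for an LLM, whitespace splitting replaces real tokenization, and the normalization over candidate actions is one common way to turn such scores into a policy.

```python
import math


def sequence_logprob(token_logprob, prompt, action):
    """Log-probability of `action` given `prompt`, summed token by token (chain rule)."""
    context = prompt.split()
    total = 0.0
    for token in action.split():
        total += token_logprob(context, token)
        context = context + [token]  # each token conditions the next one
    return total


def action_distribution(token_logprob, prompt, actions):
    """Normalize sequence log-probabilities over a finite set of candidate actions."""
    scores = [sequence_logprob(token_logprob, prompt, action) for action in actions]
    highest = max(scores)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(score - highest) for score in scores]
    normalizer = sum(exps)
    return {action: value / normalizer for action, value in zip(actions, exps)}


def toy_token_logprob(context, token):
    """Hypothetical stand-in for an LLM: favors tokens already present in the context."""
    return math.log(0.5) if token in context else math.log(0.1)


prompt = "Goal: go to the red door. Action:"
probs = action_distribution(toy_token_logprob, prompt, ["go forward", "turn left"])
print(probs)
```

With a real model, the per-token scores would come from a forward pass of the LLM, and the resulting distribution could serve as the agent's policy over action commands.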
URL:
https://github.com/flowersteam/lamorel
Publications:
hal-03970122, hal-04844089‌, hal-04844077
Contact:
Clément Romac

7.1.16 GLAM

Name:‌
Grounding LAnguage Models
Keywords:
Large Language Models, Reinforcement‌ learning
Scientific Description:
Recent works successfully leveraged Large‌ Language Models' (LLM) abilities to capture abstract knowledge‌ about world's physics to solve decision-making problems. Yet,‌ the alignment between LLMs' knowledge and the environment‌ can be wrong and limit functional competence due‌ to lack of grounding. In this paper, we‌ study an approach (named GLAM) to achieve this‌ alignment through functional grounding: we consider an agent‌ using an LLM as a policy that is‌ progressively updated as the agent interacts with the environment, leveraging online Reinforcement‌ Learning to improve its‌ performance to solve goals.‌‌ Using an interactive textual environment designed to study‌ higher-level forms of functional‌ grounding, and a set‌‌ of spatial and navigation tasks, we study several‌ scientific questions: 1) Can‌ LLMs boost sample efficiency‌‌ for online learning of various RL tasks? 2)‌ How can it boost‌ different forms of generalization?‌‌ 3) What is the impact of online learning?‌ We study these questions‌ by functionally grounding several‌‌ variants (size, architecture) of FLAN-T5.
Functional Description:
GLAM‌ is a new approach‌ to achieve alignment between‌‌ a Large Language Model (LLM) and a considered‌ environment/world through functional grounding:‌ we consider an agent‌‌ using an LLM as a policy that is‌ progressively updated as the‌ agent interacts with the‌‌ environment, leveraging online Reinforcement Learning to improve its‌ performance to solve goals.‌
URL:
https://github.com/flowersteam/Grounding_LLMs_with_online_RL
Publication:
hal-03970122‌‌
Contact:
Clément Romac

7.1.17 SBMLtoODEjax

Keywords:
SBML, JAX,‌ Python, Numerical simulations, Numerical‌ optimization, Automatic differentiation, Ordinary‌‌ differential equations, Biomedical data
Scientific Description:
Advances in‌ bioengineering and biomedicine demand‌ a deep understanding of‌‌ the dynamic behavior of biological systems, ranging from‌ protein pathways to complex‌ cellular processes. Biological networks‌‌ like gene regulatory networks and protein pathways are‌ key drivers of embryogenesis‌ and physiological processes. Comprehending‌‌ their diverse behaviors is essential for tackling diseases,‌ including cancer, as well‌ as for engineering novel‌‌ biological constructs. Despite the availability of extensive mathematical‌ models represented in Systems‌ Biology Markup Language (SBML),‌‌ researchers face significant challenges in exploring the full‌ spectrum of behaviors and‌ optimizing interventions to efficiently‌‌ shape those behaviors. Existing tools designed for simulation‌ of biological network models‌ are not tailored to‌‌ facilitate interventions on network dynamics nor to facilitate‌ automated discovery. Leveraging recent‌ developments in machine learning‌‌ (ML), this paper introduces SBMLtoODEjax, a lightweight library‌ designed to seamlessly integrate‌ SBML models with ML-supported‌‌ pipelines, powered by JAX. SBMLtoODEjax facilitates the reuse‌ and customization of SBML-based‌ models, harnessing JAX's capabilities‌‌ for efficient parallel simulations and optimization, with the‌ aim to accelerate research‌ in biological network analysis.‌‌
Functional Description:
SBMLtoODEjax extends SBMLtoODEpy, a python library‌ developed in 2019 for‌ converting SBML files into‌‌ python files written in Numpy/Scipy. The chosen conventions‌ for the generated variables‌ and modules are slightly‌‌ different from the standard SBML conventions (used in‌ the SBMLtoODEpy library) with‌ the aim here to‌‌ accommodate for more flexible manipulations while preserving JAX-like‌ functional programming style.
URL:‌
https://developmentalsystems.org/sbmltoodejax/
Publication:
hal-04317246
Contact:‌‌
Mayalen Etcheverry
Partner:
Tufts University

7.1.18 Vivarium

Name:‌
Large-scale simulator for research‌ and teaching in Artificial‌‌ Intelligence and Artificial Life
Keywords:
Simulation, Artificial intelligence,‌ Artificial Life, Multi-Agents System,‌ Teaching of programming, Research‌‌
Functional Description:

This project aims to seize these‌ opportunities through the design‌ and implementation of a‌‌ software platform providing an integrated simulation environment for‌ research, teaching, and dissemination‌ in the fields of‌‌ Artificial Intelligence (AI) and Artificial Life (AL). The‌ project is titled The‌ Vivarium, which reflects a‌‌ fundamental aspect of the‌ convergence between these two domains: the emergence of‌ complex behaviors, whether in the natural or artificial‌ world, necessarily relies on a need for adaptation‌ to a complex environment in which many autonomous‌ entities interact.

It will be used as educational software in a course of the CISC Master at UPF-Barcelona in January 2025.
Release Contributions:

This release corresponds to the state of the repository after all fixes made following the SDIC course at Universitat Pompeu Fabra in Barcelona (CSIM Master) in January 2025.

This version mostly‌ focuses on educational purposes, with ready-to-use practical sessions‌ in notebooks/sessions. Corentin Léger was the main contributor‌ over the last year.
News of the Year:‌
Corentin Léger, a research engineer recruited on the ANR JCJC ECOCURL project (led by Clément Moulin-Frier), carried out substantial development work on the software during 2024. Its teaching applications are now validated. In particular, the software was used for 10 hours of practical sessions in the CSIC Master at Universitat Pompeu Fabra in Barcelona, Spain.
URL:
https://github.com/flowersteam/vivarium
Contact:
Clément Moulin-Frier
Participant:‌
3 anonymous participants

7.1.19 LLM_Culture

Keywords:
LLM, Multi-Agents‌ System, Natural language processing
Functional Description:

Code for‌ the 'Cultural evolution in populations of Large Language‌ Models' paper. This repository provides a comprehensive framework‌ for studying the cultural evolution of linguistic content‌ in populations of Large Language Models (LLM).

It allows organizing LLM agents into networks in which each agent interacts with neighboring agents by exchanging texts. Each agent can be assigned a specific personality and transmission instructions, which serve as prompts for generating new texts from its neighbors' narratives. Once the network structure and agent characteristics are defined, you can simulate the cultural evolution of texts across generations of agents. We also provide built-in metrics and visualizations to analyze the results.
URL:
https://github.com/flowersteam/LLM-Culture
Contact:‌
Jeremy Perez
Participant:
2 anonymous participants

7.1.20 TelephoneGameLLMs‌

Keywords:
Large Language Models, Multi-Agents System, Cultural Evolution‌
Functional Description:
Code for the paper "When LLMs Play the Telephone Game: Cumulative Changes and Attractors in Iterated Cultural Transmissions" (https://arxiv.org/abs/2407.04503). In this paper, we introduce conceptual and methodological tools for evaluating Large Language Models in multi-turn settings. These tools are inspired by cultural evolutionary theory, and in particular by the concept of cultural attractors.
URL:‌
https://github.com/flowersteam/TelephoneGameLLM
Publication:
hal-04714994
Contact:
Jeremy Perez

7.1.21 styr‌

Name:
Stick To Your Role
Keywords:
LLM, Cognitive‌ sciences
Functional Description:

Code for our paper https://arxiv.org/abs/2402.14846‌ and leaderboard https://huggingface.co/spaces/flowers-team/StickToYourRoleLeaderboard.

Enables evaluating LLMs using personal value questionnaires (PVQ, SVS). More precisely, it instructs the LLM to simulate various personas and exposes it to different contexts (e.g. long Reddit posts). It then evaluates the value stability of the simulated population between those contexts. Additionally, it computes confirmatory factor analysis fit indices (CFI, SRMR, RMSEA) and the structure of expressed values (stress metric).
URL:
https://github.com/flowersteam/value_stability‌
Contact:
Grgur Kovac

7.1.22 transformerXL_PPO_JAX

Keywords:
Reinforcement learning,‌ Transformer
Functional Description:

This repository provides a JAX implementation of TransformerXL with PPO in an RL setup, following "Stabilizing Transformers for Reinforcement Learning" by Parisotto et al. (https://arxiv.org/abs/1910.06764).

The code uses the PureJaxRL template for PPO and adapts some code from the Hugging Face TransformerXL repository, porting it to JAX. We also took inspiration from the PyTorch code in https://github.com/MarcoMeter/episodic-transformer-memory-ppo, which simplifies gradient propagation and positional encoding compared to TransformerXL as described in the original paper (https://arxiv.org/abs/1901.02860). The training code handles [Gymnax](https://github.com/RobertTLange/gymnax) environments.

We also tested it on Craftax, where it beats the baselines presented in the paper (https://arxiv.org/abs/2402.16801), including PPO-RNN, training with unsupervised environment design, and intrinsic motivation. Notably, we reach the 3rd level (the sewer) and obtain several advanced achievements, which was not achieved by the methods presented in the paper. See Craftax Results for more information.

The training of a 5M transformer on Craftax for 1e9 steps (with 1024 environments) takes about 6h30 on a single A100.
Contact:
Gautier Hamon‌‌

7.1.23 ER-MRL

Keywords:
Reinforcement learning, Evolutionary Algorithms, Recurrent‌ network
Functional Description:

Code‌ for the "Evolving-Reservoirs-for-Meta-Reinforcement-Learning" (ER-MRL)‌‌ paper (https://arxiv.org/abs/2312.06695).

We adopt a computational framework based‌ on meta reinforcement learning,‌ modeling the interplay between‌‌ evolution and development. At the evolutionary scale, we‌ evolve reservoirs, a family‌ of recurrent neural networks‌‌ generated from hyperparameters. These evolved reservoirs are then‌ utilized to facilitate the‌ learning of a behavioral‌‌ policy through reinforcement learning. This is done by‌ encoding the environment state‌ through the reservoir before‌‌ providing it to the agent's policy.
Contact:
Corentin Léger

7.1.24 LLM4Humanities

Keywords:‌
LLM, Python, Data Generator,‌‌ Generative AI
Scientific Description:
Qualitative research in experimental‌ psychology and the humanities‌ often relies on manual‌‌ annotation of textual data using defined codebooks. This‌ process is indispensable but‌ time-consuming and costly. Moreover,‌‌ best practices require at least two independent annotators‌ in order to compute‌ inter-rater reliability (IRR), which‌‌ further increases the required resources. IRR is crucial‌ to distinguish variance due‌ to coder subjectivity from‌‌ variance due to the phenomenon under study, yet‌ in practice it is‌ frequently omitted, misreported, or‌‌ computed using inadequate metrics (e.g., raw percentage agreement‌ or simple correlations). The‌ objective of the LLM4Humanities‌‌ project is to design an open-source, Python-based toolkit‌ and web application that‌ leverages large language models‌‌ (LLMs) to support, accelerate, and improve the methodological‌ rigor of qualitative annotation‌ workflows. In addition to‌‌ annotation assistance, LLM4Humanities includes a generation mode designed‌ to support the creation‌ of experimental material. In‌‌ this mode, users can select one or several‌ template items (e.g., a‌ mathematics exercise) and specify‌‌ a set of constraints. The system then generates‌ multiple new variants of‌ the item. These generated‌‌ items can subsequently be passed through the same‌ annotation and evaluation pipeline,‌ providing a first automated‌‌ assessment of the quality and consistency of the‌ generated content.
Functional Description:‌
LLM4Humanities is an open-source, Python-based toolkit and web application that integrates LLM-assisted annotation, inter-rater reliability analysis, and the generation of controlled variants of experimental material within a single end-to-end workflow.
URL:
https://github.com/flowersteam/LLM4Humanities‌
Contact:
Olivier Clerc
Participant:
Olivier Clerc
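
To illustrate why the LLM4Humanities description above treats raw percentage agreement as an inadequate reliability metric, the following minimal sketch compares it with Cohen's kappa, a standard chance-corrected IRR coefficient for two annotators. The source does not state which coefficients the toolkit computes; the function names and the toy labels below are hypothetical.

```python
from collections import Counter

def percent_agreement(a, b):
    """Fraction of items on which the two annotators chose the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: probability that both annotators pick the same label
    # if each labels independently according to their own label frequencies.
    p_e = sum(counts_a[l] * counts_b[l] for l in set(a) | set(b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)  # undefined when p_e == 1

# Toy annotations (hypothetical): both annotators use "pos" 80% of the time.
annotator_1 = ["pos"] * 8 + ["neg"] * 2
annotator_2 = ["pos"] * 7 + ["neg", "pos", "neg"]

agreement = percent_agreement(annotator_1, annotator_2)  # 0.8
kappa = cohen_kappa(annotator_1, annotator_2)            # about 0.375
```

With skewed label frequencies, an 80% raw agreement corresponds to a much lower chance-corrected value, which is the kind of discrepancy a reliability analysis is meant to expose.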

7.2 New‌ platforms

7.2.1 ToGather application

Participants: Cécile Mazon,‌ Hélène Sauzéon, Eric Meyer, Isabeau Saint-Supery‌.

Name:
Application for Specialized education
Keywords:
Parent-professional‌ relationships; user-centered design; school inclusion; autism spectrum disorder;‌ ecosystemic approach
Participants:
Isabeau Saint-Supery, Cécile Mazon, Hélène Sauzéon, Agilonaute
Scientific Description:
Using participatory design methods, we have designed an interactive web application for educational purposes. This application aims to provide interactive services with continuously updated content for the stakeholders involved in the school inclusion of children with specific educational needs. Specifically, the services provide: 1) the student's profile with strengths and weaknesses; 2) an evaluation and monitoring over time of the student's repertoire of acquired, emerging or targeted skills; 3) a shared notebook of effective psycho-educational solutions for the student; 4) a shared messaging system for exchanging "news" about the student and his/her family; and 5) a meeting manager allowing updates of evaluations (student progress). This application is currently being assessed in a field study, after which it will be transferred to the Academy of Nouvelle-Aquitaine-Bordeaux of the National Education Ministry.
URL:
The website is not online yet, but all information, such as tutorials, is here.
Publication:
hal-03436355

8 New results‌

The team's research program, within the domain of‌ developmental artificial intelligence, aims to study mechanisms of‌ open-ended learning, and in particular the role of‌ curiosity-driven autotelic learning and the role of language‌ as a cognitive tool. We study these topics‌ both in humans and AI systems, both at‌ the level of individuals and at the level‌ of cultural groups, and both at the fundamental‌ and application levels.

Here, we present our recent‌ results along the following research dimensions:

Open-ended learning and autotelic AI with large language models;
Models of cultural evolution in humans and AI systems;
An Eco-Evo-Devo perspective on Artificial Intelligence;
Generative AI and educational technologies;
Theories of human curiosity-driven learning;
Curiosity-driven learning in educational technologies;
Curiosity-driven AI for assisted scientific discovery.

8.1 Open-ended learning and autotelic‌ AI with large language models

The team continued to lay the foundations of autotelic AI [89], i.e. the science studying mechanisms enabling artificial agents to learn to represent and sample their own goals and achieve open-ended learning.

8.1.1 ACES:‌ Generating a Diversity of Challenging Programming Puzzles with‌ Autotelic Generative Models

Participants: Julien Pourcel [correspondant],‌ Cédric Colas, Gaia Molinaro, Pierre-Yves Oudeyer‌, Laetitia Teodorescu.

Motivation.

In this project, we examine how to generate an interesting diversity of programming puzzles (the same domain as Codeplay). This is an important case study for linguistic autotelic agents because it is a first step towards generalist agents that invent their own problems. Inspired by the Evolution Through Large Models (ELM) method, in which the authors evolve robot morphologies expressed as Sodarace programs using a Large Language Model as a mutation operator, we aim to develop an evolutionary method that creates a diverse population of problems using pretrained language models. Diversity-producing methods (such as MAP-Elites) need a Behavioral Characterization (BC) space in which to measure the diversity of their evolved populations; such a space is feasible to define for virtual creatures but much harder to define for programming puzzles. We thus introduce the notion of a Semantic BC space, composed of abstract categories, in which labelling is done through LLM responses. In our case, we introduce 10 programming descriptors:

0 - Sorting and‌ Searching
1 - Counting‌‌ and Combinatorics
2 - Trees and Graphs
3‌ - Mathematical Foundations
4‌ - Bit Manipulation
5‌‌ - String Manipulation
6 - Geometry and Grid‌ Problems
7 - Recursion‌ and Dynamic Programming
8‌‌ - Stacks and Queues
9 - Optimization Algorithms‌

We then define an‌ archive of generated programming‌‌ puzzles and their solutions, and the position of‌ a puzzle in the‌ archive is given by‌‌ the combination of descriptors that the puzzle-solution pair‌ belongs to (the semantic‌ representation of a puzzle‌‌ thus being a 10-dimensional vector). The semantic archive‌ is used to store‌ puzzles.
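
As a minimal sketch of the archive described above, the snippet below maps a set of descriptor indices to a 10-dimensional vector and uses that vector as the archive cell key. The binary encoding, the function names, and the example puzzle are assumptions made for illustration; the source only states that a cell is a combination of descriptors.

```python
from collections import defaultdict

DESCRIPTORS = [
    "Sorting and Searching", "Counting and Combinatorics", "Trees and Graphs",
    "Mathematical Foundations", "Bit Manipulation", "String Manipulation",
    "Geometry and Grid Problems", "Recursion and Dynamic Programming",
    "Stacks and Queues", "Optimization Algorithms",
]

def semantic_vector(labels):
    """Binary 10-dimensional vector: 1 where the descriptor index is in `labels`."""
    return tuple(1 if i in labels else 0 for i in range(len(DESCRIPTORS)))

# The archive maps each cell (a semantic vector) to the puzzle-solution pairs in it.
archive = defaultdict(list)

def add_to_archive(puzzle_src, solution_src, labels):
    """Store a puzzle-solution pair in the cell given by its descriptors."""
    cell = semantic_vector(set(labels))
    archive[cell].append((puzzle_src, solution_src))
    return cell

# Hypothetical puzzle labelled with descriptor 0 (Sorting and Searching).
cell = add_to_archive(
    "def f(x): return sorted(x) == x and len(x) == 2",
    "def g(): return [1, 2]",
    {0},
)
```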

We then perform‌‌ experiments with the following algorithms:

ACES: our‌ proposed method samples a‌ target cell (combination of‌‌ descriptors) in the archive at random and populates‌ a few-shot prompt for‌ the language model with‌‌ puzzles from neighboring cells in the archive. See‌ Figure 2 for an‌ illustration.
ELM Semantic:‌‌ based on ELM, example puzzles and solutions are‌ given as few-shot in-context‌ examples and a puzzle‌‌ sampled from the archive is then mutated.
ELM: same as the previous one, except that we do not use the semantic archive for sampling; instead, we build an archive with centroidal Voronoi tessellations from the embeddings of puzzles in the latent space of a language model. This baseline allows us to compare the semantic archive with a more classical one;
Static Gen: in this method, puzzles are sampled from the train set and added as few-shot examples in the prompt.

For all experiments we seed the archive‌ with the P3 train‌ set.

Results.

We report the results of our runs in Figure 3. Overall, the methods based on semantic archives, ACES and ELM Semantic, achieve the highest diversity in the semantic space. We report diversity measures inside the embedding spaces of various smaller language models in Figure 4. In these figures, we see that ACES overall outperforms the other methods on this measure of diversity. We additionally test the suitability of the generated puzzles as finetuning data for smaller LMs. For all methods, we finetune a smaller model (OpenLlama-3b) on the generated set and measure the pass@k metric for different values of k on the P3 test set; we report the scores in Figure 5. That figure shows a tradeoff between how diverse the data is and how useful it is for obtaining a high score on the P3 test set. Further work is needed to obtain data that is both diverse and useful.

Figure 2: Overview of ACES. ACES maintains an archive of discovered puzzles grouped into cells indexed by their semantic representation (skill combination). ACES runs in several steps: 1) sample a target semantic goal and relevant examples from the archive. 2) given these, generate a puzzle f and its solution g with the puzzle generator. 3) test the validity of that pair by running assert f(g()) in the interpreter. 4) if the pair is valid, obtain its semantic representation with the puzzle labeler. 5) add the new pair to its corresponding cell in the archive.
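
The five steps of the Figure 2 caption can be sketched as follows. This is an illustrative outline only: `generate` and `label` stand in for the LLM-based puzzle generator and puzzle labeler, the stub implementations are hypothetical, and examples are drawn from the whole archive rather than from neighboring cells to keep the sketch short.

```python
import random

def is_valid(puzzle_src, solution_src):
    """Step 3: run the pair in the interpreter and check that f(g()) is True."""
    namespace = {}
    try:
        exec(puzzle_src, namespace)    # defines f
        exec(solution_src, namespace)  # defines g
        return namespace["f"](namespace["g"]()) is True
    except Exception:
        return False

def aces_step(archive, all_cells, generate, label, rng, n_examples=3):
    """One iteration; returns the cell of the new pair, or None if it is invalid."""
    target = rng.choice(all_cells)                          # 1) sample a target cell
    pool = [pair for pairs in archive.values() for pair in pairs]
    examples = rng.sample(pool, min(n_examples, len(pool)))
    puzzle_src, solution_src = generate(target, examples)   # 2) puzzle generator
    if not is_valid(puzzle_src, solution_src):              # 3) validity test
        return None
    cell = label(puzzle_src, solution_src)                  # 4) puzzle labeler
    archive.setdefault(cell, []).append((puzzle_src, solution_src))  # 5) store
    return cell

# Hypothetical stubs replacing the two LLM calls.
def fake_generate(target, examples):
    return "def f(x): return x + 1 == 3", "def g(): return 2"

def fake_label(puzzle_src, solution_src):
    return (0, 0, 0, 1, 0, 0, 0, 0, 0, 0)  # "Mathematical Foundations"

seed_cell = (1, 0, 0, 0, 0, 0, 0, 0, 0, 0)
archive = {seed_cell: [("def f(x): return x == 1", "def g(): return 1")]}
new_cell = aces_step(archive, [fake_label("", "")], fake_generate, fake_label,
                     random.Random(0))
```

Running generated code with `exec` is only acceptable in a sketch; untrusted model output should be executed in a sandbox.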

Figure 3 (panels a and b): Diversity of generated puzzles in semantic space. We report the evolution of several diversity metrics computed in the semantic space as a function of the number of puzzle-solution pairs generated by the puzzle generator. Semantic algorithms (ACES and ELM Semantic) achieve higher diversity in the semantic space.

Figure 4 (panels a and b): Diversity of generated puzzles in embedding spaces. We report the evolution of the pairwise distance between puzzle-solution pair embeddings as a function of the number of generated puzzle-solution pairs, for three different embedding representation spaces (average across seeds).

Figure 5: Downstream performance on the P3 test set. Pass@k is the fraction of puzzles solved after $k$ attempts ($k \in [1:10]$). Green overlaps with yellow.
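
The pass@k definition in the caption above (fraction of puzzles solved within $k$ attempts) can be computed directly from per-attempt outcomes. The sketch below uses made-up outcomes for four puzzles; it is not the team's evaluation code.

```python
def pass_at_k(attempts, k):
    """attempts: one list of booleans per puzzle, in attempt order.

    Returns the fraction of puzzles with at least one success in the first k attempts.
    """
    solved = sum(1 for results in attempts if any(results[:k]))
    return solved / len(attempts)

# Hypothetical outcomes for four puzzles with three attempts each.
results = [
    [False, True, False],
    [False, False, False],
    [True, False, False],
    [False, False, True],
]
scores = {k: pass_at_k(results, k) for k in (1, 2, 3)}
# scores == {1: 0.25, 2: 0.5, 3: 0.75}
```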

8.1.2 MAGELLAN: Metacognitive Generalization of Learning Progress for‌ Online RL in LLM agents

Participants: Loris Gaven‌ [correspondant], Thomas Carta, Clément Romac,‌ Cédric Colas, Pierre-Yves Oudeyer, Olivier Sigaud‌ [ISIR Sorbonne Université, Paris, France], Sylvain Lamprier [Univ Angers, LERIA].‌

We are developing MAGELLAN [47], a method designed to enable LLM-based reinforcement learning (RL) agents to estimate their own Learning Progress (LP) and use it to dynamically organize their training curriculum. By leveraging the LLM's rich semantic representations, MAGELLAN allows agents to generalize LP estimations to unseen, language-defined goals, overcoming limitations of classical methods that require direct evaluation of each goal.

MAGELLAN uses‌ the LLM to generate‌ latent representations of goals‌‌ and tasks, capturing their semantic relationships. It continuously‌ monitors the agent's performance‌ over time, estimating LP‌‌ as the change in success rates for specific‌ goals. This approach enables‌ the agent to identify‌‌ goals where it is making progress and focus‌ its training on those‌ areas. MAGELLAN's integration ensures‌‌ that the LLM-based agent can simultaneously refine its‌ policy and competence estimations,‌ adapting both to new‌‌ tasks in real time.
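
For reference, the notion of LP as a change in success rate can be sketched with a classical windowed estimator that requires direct evaluation of each goal. This is the kind of baseline MAGELLAN is meant to generalize, not MAGELLAN's LLM-based estimator; the class name, window size, and exploration rate `eps` are hypothetical.

```python
from collections import deque

class LearningProgress:
    """Per-goal LP as |recent success rate - older success rate|."""

    def __init__(self, window=10):
        self.window = window
        self.history = {}

    def update(self, goal, success):
        # Keep only the last 2 * window outcomes for each goal.
        buf = self.history.setdefault(goal, deque(maxlen=2 * self.window))
        buf.append(float(success))

    def lp(self, goal):
        h = list(self.history.get(goal, []))
        if len(h) < 2 * self.window:
            return 0.0  # not enough evaluations yet
        old, recent = h[: self.window], h[self.window:]
        return abs(sum(recent) - sum(old)) / self.window

    def sampling_probs(self, goals, eps=0.1):
        """Sample goals proportionally to LP, mixed with uniform exploration."""
        lps = [self.lp(g) for g in goals]
        total, n = sum(lps), len(goals)
        if total == 0:
            return [1.0 / n] * n
        return [eps / n + (1 - eps) * v / total for v in lps]

tracker = LearningProgress(window=10)
for step in range(20):
    tracker.update("improving goal", step >= 10)  # fails, then succeeds
    tracker.update("stuck goal", False)           # never succeeds
probs = tracker.sampling_probs(["improving goal", "stuck goal"])
```

Because this estimator needs repeated evaluations of every goal, it returns zero for any goal it has not yet tried, which is the limitation the LLM-based generalization addresses.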

Our experiments in the‌ Little-Zoo environment, which‌ features hierarchical and commonsense-driven‌‌ tasks, demonstrate that MAGELLAN effectively prioritizes high-LP goals,‌ even when faced with‌ novel or unseen tasks.‌‌ Unlike traditional LP estimation methods, which rely on‌ direct evaluations and struggle‌ with generalization, MAGELLAN enables‌‌ the agent to quickly identify meaningful learning opportunities.‌ This results in faster‌ adaptation, improved sample efficiency,‌‌ and more effective curriculum organization, paving the way‌ for truly autonomous agents‌ capable of navigating vast‌‌ and complex goal spaces.

8.1.3 When goals are‌ beyond reach: Metacognitive monitoring‌ guides autonomous discovery of‌‌ frugal assistance-seeking in LLMs

Participants: Clément Romac [correspondant]‌, Pierre-Yves Oudeyer.‌

Enhancing LLMs with metacognitive capabilities has been identified as a key challenge for improving the trustworthiness and interpretability of these models. In this work, we investigate how such metacognitive abilities can be leveraged to trigger external assistance when the model's own capabilities are insufficient. While improving LLMs' capabilities is essential, it is equally critical that models learn to recognize their own limitations, and to seek or rely on external support in real-world settings where functional competence may be partial or underdeveloped. This ability forms a crucial part of a broader learning loop: requesting help when needed, then internalizing the knowledge or skills acquired through that assistance.

Augmenting LLMs with external‌ assistance and, in particular,‌ what has been named‌‌ "tools", has become a well-established practice. These augmentations‌ range from calculators and‌ retrieval systems to code‌‌ interpreters, and even other LLMs. This shift has‌ led to a rethinking‌ of the role of‌‌ LLMs—not as general-purpose solvers, but as assistants (often‌ referred to as action‌ models) that must learn‌‌ to orchestrate the use of external resources and‌ integrate their outputs into‌ coherent, human-readable responses.

This‌‌ reframing introduces a new class of decision-making problems:‌ LLMs must determine when‌ and which external assistance‌‌ to invoke. However, the optimal assistance strategy is‌ not known in advance.‌ Some tasks may be‌‌ solvable independently by the LLM, while others may‌ require external help. Additionally,‌ the tools themselves may‌‌ be fallible—for instance, even‌ large or specialized LLMs can return suboptimal results.‌ To address this, most prior approaches rely on‌ supervised learning, fine-tuning LLMs on curated datasets containing‌ examples of effective tool use. More recently, several‌ works have begun exploring how RL can be‌ used to learn assistance-seeking strategies from scratch, without‌ requiring predefined tool-use demonstrations.

While both RL and more conventional supervised learning approaches have shown promise, an important dimension of the assistance-seeking problem remains largely understudied: external assistance comes at a cost. This cost may take the form of increased latency in the LLM's response, financial charges for calling APIs, or computational overhead. Although early work on tool use acknowledged this issue, it has received limited attention since. A recent exception introduced a first approach to this multi-objective problem: maximizing task performance while minimizing assistance costs. That method involves a multi-stage learning pipeline: (1) an estimator of LLM performance is trained using interaction data between the LLM and the task space; (2) a separate model is trained to simulate the outputs of both the LLM and its assistance sources; and (3) given a predefined cost budget, Dynamic Programming is used to derive the optimal assistance strategy. While effective, this method is computationally intensive and requires extensive data collection and training across multiple stages.

In this work, we propose a fully online approach (see Figure 6) based on multi-objective contextual multi-armed bandits. Given a task, we frame the decision of whether to keep the task with the LLM or delegate it to external assistance as the selection of an arm. We consider the dual objective of maximizing the answer's performance $R$ while minimizing its cost $C$, and we adopt scalarization, i.e., combining the two objectives into a single weighted sum $U = \beta R + (1 - \beta) C$ that our approach aims to maximize. Crucially, our method naturally adapts to any user-specified budget by treating the budget as the scalarization weight $\beta$ that balances the two objectives. A central challenge of this approach lies in efficiently estimating the performance and cost associated with each option (i.e., the LLM and all available assistance sources), using as few interactions as possible. To address this, we draw inspiration from MAGELLAN and leverage the LLM itself to learn these estimations.
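
A minimal sketch of the arm selection under this scalarization is given below. It assumes that the cost term $C$ is expressed as a negated cost (0 for no assistance, more negative for costlier options), so that maximizing $U$ penalizes assistance; this sign convention, the arm names, and the numeric estimates are assumptions made for illustration, and the estimates would in practice come from the learned LLM-based estimator.

```python
def utility(perf, cost_score, beta):
    """Scalarized objective U = beta * R + (1 - beta) * C."""
    return beta * perf + (1.0 - beta) * cost_score

def choose_arm(estimates, beta):
    """estimates: dict mapping each arm to (estimated R, estimated C)."""
    return max(estimates, key=lambda arm: utility(*estimates[arm], beta))

# Hypothetical per-task estimates; C is a negated cost (assumption).
estimates = {
    "llm_alone": (0.40, 0.0),    # weaker answer, no assistance cost
    "calculator": (0.95, -0.5),  # stronger answer, costly call
}

performance_first = choose_arm(estimates, beta=0.9)  # "calculator"
cost_first = choose_arm(estimates, beta=0.3)         # "llm_alone"
```

Changing the weight alone flips the decision, which is how a single estimator can serve different budgets without retraining.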

We first‌ evaluate our method on a set of carefully‌ designed math problems with calculator tools as assistance,‌ for which the optimal strategy is known. This‌ setting enables us to investigate how the strategy‌ discovered by our method compares to the optimal‌ one, as well as the sample efficiency of‌ our approach (i.e., the number of interactions required‌ to converge). We notably show that our LLM-based‌ estimation of performance and cost reaches similar or‌ even better performance than a classic moving average‌ approach which has access to privileged information—namely, the‌ problem category. Finally, we demonstrate the broader applicability of our method by‌ applying it to real-world‌ problems faced by LLMs.‌‌ In particular, we apply it to a standard‌ question-answering benchmark: MMLU-Pro. The‌ results show that our‌‌ approach is scalable to complex natural language tasks‌ without access to any‌ external expert knowledge.

Figure‌‌ 6: We frame the decision of whether‌ an LLM should trigger‌ external assistance as a‌‌ multi-objective contextual multi-armed bandit problem. For each task,‌ the LLM estimates the‌ performance and cost of‌‌ all available options. These estimates are combined into‌ a single utility score‌ using a user-defined scalarization‌‌ weight, which specifies the trade-off between maximizing performance‌ and minimizing assistance cost.‌ The estimator is continuously‌‌ updated through online interactions.

8.1.4 LLM-based goal generation for autotelic exploration with goal-conditioned RL

Participants: Guillaume‌‌ Pourcel [correspondant], Grgur Kovač, Thomas Carta‌, Cédric Colas,‌ Pierre-Yves Oudeyer.

Designing autotelic agents capable of autonomously generating and pursuing their own goals is a promising direction for open-ended learning and skill acquisition in reinforcement learning. This challenge is especially difficult in open worlds that require inventing new, previously unobserved goals. In this work, we propose an architecture where a single generalist autotelic agent is trained on an automatic curriculum of goals. We leverage large language models (LLMs) to generate goals as code for reward functions, based on learnability and difficulty estimates. The goal-conditioned RL agent is trained on these goals, which are sampled based on learning progress. We compare our method to an adaptation of OMNI-EPIC to goal-conditioned RL. Our preliminary experiments indicate that our method generates a higher proportion of learnable goals, suggesting better adaptation to the goal-conditioned learner. This work is described in this technical report.

8.1.5 Self-Improving Language Models for Evolutionary Program‌ Synthesis: A Case Study‌ on ARC-AGI

Participants: Julien‌‌ Pourcel [correspondant], Cédric Colas, Pierre-Yves Oudeyer‌.

In our work‌ on SOAR (Self-improving Operators‌‌ for Automated program Refinements), published at ICML 2025,‌ we address a fundamental‌ limitation in program synthesis:‌‌ while large language models struggle to solve complex‌ tasks in single attempts,‌ traditional evolutionary approaches are‌‌ constrained by the fixed capabilities of their underlying‌ generative models. We developed‌ a framework that integrates‌‌ language models into a self-improving evolutionary loop, enabling‌ continuous performance enhancement through‌ experience rather than relying‌‌ on static model capabilities.

Our‌ method operates through an‌‌ iterative two-phase process that we designed to create‌ a virtuous cycle of‌ improvement. First, an evolutionary‌‌ search phase employs a language model to sample‌ and refine candidate program‌ solutions. Second, a hindsight‌‌ learning phase converts these search attempts—both successful and‌ unsuccessful—into valid problem-solution pairs‌ that we use to‌‌ fine-tune the LLM's sampling and refinement capabilities. This‌ approach leverages positive transfer‌ between the sampling and‌‌ refinement fine-tuning tasks, allowing‌ the system to bootstrap its own improvement without‌ requiring human-engineered training data.
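
One way to turn an unsuccessful attempt into a valid problem-solution pair is hindsight relabeling: the program is paired with the task defined by its own outputs, which it solves by construction. The sketch below illustrates this idea under that assumption; the description above does not spell out the exact conversion used in SOAR, and the function names, the `transform` entry point, and the toy grids are hypothetical.

```python
def run(program_src, grid):
    """Execute a candidate program that defines `transform(grid)`."""
    namespace = {}
    exec(program_src, namespace)
    return namespace["transform"](grid)

def hindsight_relabel(program_src, inputs, expected_outputs):
    """Turn one search attempt into a (task, program) training pair."""
    try:
        produced = [run(program_src, x) for x in inputs]
    except Exception:
        return None  # a crashing program yields no usable pair
    return {
        # The program is a correct solution of this relabeled task.
        "task": list(zip(inputs, produced)),
        "program": program_src,
        "solved_original": produced == expected_outputs,
    }

# The original task asks for a vertical flip; the candidate flips horizontally.
candidate = "def transform(g): return [row[::-1] for row in g]"
inputs = [[[1, 2], [3, 4]]]
expected = [[[3, 4], [1, 2]]]
pair = hindsight_relabel(candidate, inputs, expected)
```

The failed attempt still produces a consistent training example for the task "flip horizontally", so no search effort is discarded.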

We evaluated SOAR on‌ the challenging ARC-AGI benchmark, which tests abstract reasoning‌ and program induction capabilities. Our framework solves 52%‌ of the public test set, establishing state-of-the-art results‌ for program synthesis using open-source language models. These‌ improvements compound through iterations, with models showing enhanced‌ abilities to both generate initial program ideas and‌ refine existing solutions. Notably, the gains carry over‌ to test-time adaptation, enabling continuous improvement on target‌ problems even after deployment.

Our research demonstrates that‌ program synthesis systems can transcend the limitations of‌ their base models through self-improvement, opening new possibilities‌ for autonomous AI development. By showing how iterative‌ model improvement can overcome performance plateaus inherent to‌ search methods, SOAR provides a drop-in upgrade for‌ existing systems like AlphaEvolve or ShinkaEvolve, transforming their‌ fixed LLM operators into continuously improving ones.

8.1.6‌ WorldLLM: Improving LLMs' world modeling using curiosity-driven theory-making‌

Participants: Guillaume Levy, Cédric Colas, Pierre-Yves Oudeyer, Thomas Carta, Clément Romac [correspondant].

Large Language Models (LLMs) possess broad knowledge‌ about the world, but leveraging this knowledge for‌ precise dynamics modeling remains challenging. While LLMs can‌ engage in general reasoning, they struggle to make‌ accurate predictions in specific domains with structured observations‌ and dynamics, such as physics simulations or video‌ games. This limitation stems from the gap between‌ their general capabilities and the need for grounded,‌ domain-specific understanding.

In this paper, we present WorldLLM [59], a framework for autonomous improvement of an LLM's world modeling abilities. Our approach combines 1) probabilistic theory induction to produce hypotheses that are given in our LLM's prompt to improve its predictions and 2) curiosity-driven RL to explore the environment and collect transitions poorly predicted by the current hypotheses (see Figure 9). Formally, our LLM's world model is the conditional probability $P(s_{t+1} \mid s_t, a_t, H)$, where $s_t$ represents a state, $a_t$ an action, and $H$ a set of natural language hypothesized theories. The LLM computes this probability by receiving $s_t$, $a_t$, and $H$ in its prompt and taking the probability that $s_{t+1}$ follows this prompt. Our key insight is that natural language theories can help ground an LLM's broad knowledge into precise predictive power by providing domain-specific rules. Our approach consists of three interacting components: (1) our LLM, which computes $P(s_{t+1} \mid s_t, a_t, H)$ by conditioning its predictions on both a state-action pair and the current hypotheses, (2) a theory generator that updates natural language hypotheses using Bayesian inference, and (3) a curiosity-driven reinforcement learning agent trained to collect evidence against the current hypotheses. Inspired by how humans, from children to scientists, actively update their internal world model by performing experiments, our agent's exploration provides new evidence for hypothesis refinement, creating a virtuous cycle of improvement.
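
The interplay between the three components can be sketched with a toy stand-in for the LLM. In the snippet below, `toy_log_prob` replaces the LLM's $\log P(s_{t+1} \mid s_t, a_t, H)$, hypotheses are small rule tables rather than natural language, and the theory update is reduced to picking the maximum-likelihood candidate (equivalent to a uniform prior); all of these are simplifying assumptions and do not describe the actual WorldLLM implementation.

```python
import math

def toy_log_prob(s_next, s, a, hypothesis):
    """Toy stand-in for log P(s_next | s, a, H), normally computed by the LLM."""
    predicted = hypothesis.get((s, a))
    if predicted is None:
        return math.log(0.5)  # no rule applies: uninformed prediction
    return math.log(0.9 if predicted == s_next else 0.1)

def log_likelihood(transitions, hypothesis, log_prob):
    """Total log-likelihood of the collected transitions under one hypothesis."""
    return sum(log_prob(s_next, s, a, hypothesis) for s, a, s_next in transitions)

def select_hypothesis(transitions, candidates, log_prob):
    """Keep the candidate that best explains the data (uniform prior)."""
    return max(candidates, key=lambda h: log_likelihood(transitions, h, log_prob))

def curiosity_reward(transition, hypothesis, log_prob):
    """Reward the agent for transitions the current hypothesis predicts poorly."""
    s, a, s_next = transition
    return -log_prob(s_next, s, a, hypothesis)

transitions = [("water", "heat", "steam"), ("seed", "water", "plant")]
h_good = {("water", "heat"): "steam"}
h_bad = {("water", "heat"): "ice"}

best = select_hypothesis(transitions, [h_bad, h_good], toy_log_prob)
explained = curiosity_reward(transitions[0], best, toy_log_prob)    # low reward
unexplained = curiosity_reward(transitions[1], best, toy_log_prob)  # higher reward
```

The transition that the selected hypothesis does not cover receives the larger reward, which steers exploration toward evidence that can refine the hypotheses.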

We demonstrate our approach‌ in a video game‌‌ environment where agents manipulate and combine objects, showing‌ that WorldLLM successfully learns‌ accurate predictive models while‌‌ generating human-interpretable theories about environment dynamics. This work‌ contributes to a growing‌ body of research on‌‌ improving LLMs' world modeling capabilities and grounding their‌ knowledge in specific domains.‌ By combining ideas from‌‌ theory-based RL, Bayesian inference, and active exploration, we‌ provide a framework for‌ learning structured, interpretable world‌‌ models that leverage both the broad knowledge of‌ LLMs and domain-specific experiences‌ without any costly gradient-based‌‌ learning.

8.1.7 HERAKLES: Hierarchical Skill Compilation for Open-ended‌ LLM Agents

Participants: Thomas Carta [correspondant], Clément Romac, Loris Gaven, Pierre-Yves Oudeyer, Olivier Sigaud, Sylvain Lamprier.

In our‌‌ work on HERAKLES (HiERarchicAl sKill compiLation for open-Ended‌ llm agentS), we address‌ a fundamental challenge in‌‌ open-ended AI: as goal spaces expand, increasingly complex‌ goals require composing multiple‌ elementary actions, leading to‌‌ combinatorial explosion that impedes learning progress. While existing‌ hierarchical reinforcement learning approaches‌ rely on expert-defined skill‌‌ spaces and pre-trained low-level policies, such designs are‌ inadequate for open-ended scenarios‌ where goal spaces naturally‌‌ diversify across a broad spectrum of difficulties. We‌ developed a framework that‌ enables continuous skill compilation,‌‌ dynamically expanding the agent's capabilities through experience rather‌ than relying on fixed,‌ predefined abstractions.

Our method‌‌ operates through a two-level hierarchical architecture designed to‌ create a virtuous cycle‌ of skill acquisition. A‌‌ high-level policy, instantiated as a Large Language Model,‌ decomposes complex goals into‌ subgoals and selects skills‌‌ from an evolving skill space. A low-level policy,‌ implemented as lightweight neural‌ networks, executes these skills‌‌ through primitive actions. Crucially, as the hierarchical agent‌ masters a goal, the‌ complete trajectory is compiled‌‌ into the low-level policy as a new reusable‌ skill. A competence estimator‌ predicts the low-level policy's‌‌ success probability for each skill, ensuring the high-level‌ policy only invokes skills‌ that can be reliably‌‌ executed. This approach leverages language's compositional and combinatorial‌ properties to structure the‌ skill space, enabling generalization‌‌ across semantically related goals.
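
The compilation mechanism can be sketched as follows. This is only an illustration under strong assumptions: the high-level LLM policy is replaced by a fixed decomposition table, the competence estimator by a dictionary of success probabilities, and the reliability threshold and all goal names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class SkillLibrary:
    # Skill name -> estimated success probability of the low-level policy.
    competence: dict = field(default_factory=dict)
    threshold: float = 0.8  # hypothetical reliability cut-off

    def reliable_skills(self):
        return {s for s, p in self.competence.items() if p >= self.threshold}

    def compile(self, goal, success_rate):
        """Register a mastered goal as a new reusable low-level skill."""
        self.competence[goal] = success_rate

def plan(goal, decomposition, library):
    """Stand-in for the high-level policy: expand a goal into subgoals until
    every step is a skill that the low-level policy executes reliably."""
    if goal in library.reliable_skills():
        return [goal]
    steps = []
    for subgoal in decomposition[goal]:
        steps.extend(plan(subgoal, decomposition, library))
    return steps

# Hypothetical goal hierarchy and initial skills.
decomposition = {
    "make wood pickaxe": ["collect wood", "place table", "craft pickaxe"],
    "collect stone": ["make wood pickaxe", "mine stone"],
}
library = SkillLibrary({"collect wood": 0.95, "place table": 0.9,
                        "craft pickaxe": 0.85, "mine stone": 0.9})

before = plan("collect stone", decomposition, library)
library.compile("make wood pickaxe", 0.9)   # goal mastered, now a single skill
after = plan("collect stone", decomposition, library)
print(before)
print(after)
```

After compilation the plan for the harder goal becomes shorter, since the mastered goal is invoked as one skill instead of a sequence of elementary ones.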

We evaluated HERAKLES in‌ the Crafter environment, a‌ procedurally generated Minecraft-like world‌‌ designed to assess agent capabilities within a unified‌ open-ended framework. Our framework‌ achieves Crafter scores above‌‌ 70, while baselines plateau below 30. More importantly,‌ HERAKLES scales near-linearly with‌ goal difficulty, whereas non-hierarchical‌‌ methods exhibit exponential growth in learning time. The‌ framework also demonstrates strong‌ generalization: when tested on‌‌ synonymous goal formulations, HERAKLES shows only a 16%‌ performance drop compared to‌ 24-27% for baselines, and‌‌ maintains robust performance on compositional variants requiring repeated‌ skill execution.

Our research‌ demonstrates that open-ended agents‌‌ can transcend the limitations of fixed skill spaces‌ through continuous compilation, opening‌ new possibilities for lifelong‌‌ learning systems. By showing how mastered behaviors can‌ be recursively encoded at‌ lower levels for rapid‌‌ reuse—mirroring how humans overcome‌ complexity barriers through hierarchical learning—HERAKLES provides a principled‌ approach for building agents that autonomously expand their‌ competencies over time.

8.1.8 Software Engineering Agents for Embodied Controller Generation: A Study in Minigrid Environments

Participants: Timothé Boulet [correspondant], Xavier Hinaut‌ [Mnemosyne, Inria Bordeaux], Clément Moulin-Frier, Nathanaël‌ Fijalkow.

Motivation.

Software Engineering Agents (SWE-Agents) have‌ proven effective for traditional software engineering tasks with‌ accessible codebases, but their performance for embodied tasks‌ requiring well-designed information discovery remains unexplored. In this‌ paper 44, we present the first extended‌ evaluation of SWE-Agents on controller generation for embodied‌ tasks, adapting Mini-SWE-Agent (MSWEA) to solve 20 diverse‌ embodied tasks from the Minigrid environment. Our experiments‌ compare agent performance across different information access conditions:‌ with and without environment source code access, and‌ with varying capabilities for interactive exploration. We quantify‌ how different information access levels affect SWE-Agent performance‌ for embodied tasks and analyze the relative importance‌ of static code analysis versus dynamic exploration for‌ task solving. This work establishes controller generation for‌ embodied tasks as a crucial evaluation domain for‌ SWE-Agents and provides baseline results for future research‌ in efficient reasoning systems.

This work investigates a fundamental question: how do SWE-Agents perform in controller generation for embodied tasks? Our approach involves a code-agent (the SWE-Agent interacting with a code-environment involving codebases and terminals) that generates controller-agents (Python programs) to solve tasks in an embodied setup, creating a two-level agency structure that differs from the direct LLM-environment interaction approaches common in embodied AI. Figure 10 illustrates this two-level agency structure. The agent can evaluate its proposed solutions by executing them in the environment and receiving feedback in the form of success/failure and reward. A task terminates either when the agent validates its solution with a special command or when the maximum number of steps or cost is reached.

Figure 10: Two-level agency structure: a code-agent interacts with a code-environment to generate controller-agents (Python programs) to solve the embodied task.

We evaluate the‌ challenge of controller generation for embodied tasks by‌ adapting Mini-SWE-Agent (MSWEA) to solve diverse Minigrid tasks‌ under different information access conditions:

Source Code Access‌: When the agent can read Minigrid environment‌ code, it can analyze environment mechanics, constraints, and‌ object interactions to inform controller design.
Interactive Exploration: When the agent can write and execute scripts to probe the environment, it can discover dynamics through exploration, observing the outcomes of actions in various states.

Results.

The best@5 success rates‌ of MSWEA across different tasks and information access‌ conditions are summarized in Figure 11. We‌ display standard deviation as error bars in all‌ our plots.
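
To make the evaluation protocol concrete, the sketch below enumerates the four information access conditions obtained by toggling source code access and interactive exploration, and aggregates a best@k score. The definition of best@k used here (the best success rate among k independent runs) is an assumption, and the task names and numbers are made up for illustration.

```python
from dataclasses import dataclass
from itertools import product
from statistics import mean

@dataclass(frozen=True)
class AccessCondition:
    source_code: bool   # agent may read the environment source code
    interactive: bool   # agent may write and run probing scripts

# Full access, code only, interactive only, and the fully ablated agent.
CONDITIONS = [AccessCondition(c, i) for c, i in product([True, False], repeat=2)]

def best_at_k(success_rates, k=5):
    """Assumed definition: best controller success rate among the first k runs."""
    return max(success_rates[:k])

def mean_best_at_k(runs_by_task, k=5):
    """Average the best@k score over tasks."""
    return mean(best_at_k(rates, k) for rates in runs_by_task.values())

# Made-up success rates of five runs on two hypothetical tasks.
runs = {
    "task_a": [0.0, 1.0, 0.8, 1.0, 0.6],
    "task_b": [0.0, 0.0, 0.2, 0.0, 0.1],
}
print(len(CONDITIONS), mean_best_at_k(runs))
```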

Partially Observable Minigrid (Minigrid PO) was very hard for MSWEA to solve, with many tasks remaining unsolved even with full access. In Fully Observable Minigrid (Minigrid FO), however, all tasks except one are solved, at least by MSWEA with full access. Partial observability, as a component of embodied tasks, is thus a major hurdle for SWE-Agents.

Figure 11: Best@5‌‌ success rate of MSWEA across different tasks and‌ information access conditions in‌ Fully Observable Minigrid.

To identify how the type of task influences performance under the different information access conditions, we grouped the average best@5 success rate into four categories (navigation, manipulation, hazard, and memory) and also report the overall average across all tasks. The results are shown in Figures 12 and 13.

Figure‌‌ 12: Mean-by-category best@5 success rate in Fully‌ Observable Minigrid

Figure 13: Mean-by-category best@5 success‌ rate in Partially Observable‌ Minigrid

In the Fully Observable benchmark, comparing MSWEA (blue bars) with its fully ablated version (red bars), which has neither source code read access nor interactive exploration, we observe that performance drops dramatically. An agent with only the Test-Access capability (i.e. able to test its controller to obtain its success rate on the task) obtains much worse results, but surprisingly still manages to solve some tasks through iterated submissions.

If we try to recover the MSWEA performance level by adding only code access (cyan bars), we see very limited improvement, which means that reading the code only partially helps the agent and that the difficulty lies elsewhere. If we instead add only the interactive execution capability (orange bars), performance returns to a level comparable to that of MSWEA. This pattern is consistent across all task categories and is particularly marked for manipulation tasks, where exact knowledge of how the environment operates is required to solve the task. This systematic pattern indicates that interactive access is an essential capability of SWE-Agents, allowing them to perform significantly better in embodied tasks.

In the Partially Observable benchmark, performance is much lower than in Minigrid FO, in particular for the complex manipulation tasks. We note that patterns differ across task categories, but we do not attempt to interpret them, as they may arise either from statistical variability, given the relatively high standard errors, or from subtle, hard-to-infer, task-specific factors that bias the agent's behavior in ways not observed in similar tasks. The overall performance does not vary significantly with the information access conditions. We interpret this as the PO tasks being inherently too hard for MSWEA, such that the agent only solves the simplest tasks, such as the easiest navigation tasks, and can make little use of the different information access conditions to increase performance. This leads us to believe that strongly embodied tasks such as Minigrid PO tasks represent a good benchmark for SWE-Agents: they perform decently on some tasks, but on others, even with good LLMs and with access to source code and execution, they still have significant room for improvement in understanding how the environment functions. These results encourage the use of embodied tasks in future benchmarks for software engineering agents.

8.2 Models of cultural evolution in‌ humans and AI systems

As generative AI systems‌ become powerful cultural transmission technologies that influence human‌ cultural evolution in important ways, and can also‌ have their own cultural processes through machine-machine large‌ scale interaction, the study of the dynamics of‌ cultural processes in populations of AI systems/humans becomes‌ crucial.

Participants: Eleni Nisioti [correspondant], Mateo Mahaut, Pierre-Yves Oudeyer, Ida Momennejad, Sebastian Risi, Clément Moulin-Frier.

Innovations are a central component of open-ended skill acquisition: they denote the emergence of new solutions by the recombination of existing ones, and their presence is necessary to ensure a continuous complexification of an agent's cultural repertoire. While we often tend to attribute discoveries to certain innovative individuals, if we take a broad perspective on the history of our species, we see that human innovation is primarily a collective process. Fields such as psychology and anthropology have been studying the ability of human groups to innovate for some time, with studies indicating that the social network structure has a significant impact: fully-connected structures are better suited for quick convergence in easy problems with clear global optima, while partially-connected structures perform best in difficult tasks where local optima may lure agents away from the globally optimal solution 94. At the same time, a parallel story is unfolding in reinforcement learning (RL): distributed RL is a sub-field where multiple agents solve a task collectively 134. Compared to the single-agent paradigm, distributed RL algorithms converge quicker and often achieve superior performance. However, these algorithms have only considered full connectivity. In this inter-disciplinary project, we presented a novel learning framework that augments distributed RL with the notion of a social network structure and employed it to study the hypothesis from human studies that partial connectivity performs best in innovation tasks.

Cultural evolution‌ in populations of RL agents.

We implemented such innovation tasks using Wordcraft, a recently introduced RL playground inspired by the Little Alchemy 2 game (see left of Figure 14 for an illustration of how this task works). We considered a wide diversity of social network structures: static structures that remain constant throughout learning (fully-connected, ring, small-world) and a dynamic structure where the group oscillates between phases of low and high connectivity (we illustrate this dynamic structure on the right of Figure 14). Each agent in our implementation employs the DQN learning algorithm and exchanges experiences, in the form of sequences of state-action combinations, with its neighbors.
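
The social network structures can be sketched as neighbour maps over agent indices. The code below is an illustration rather than the project's implementation: the small-world topology is omitted, the high-connectivity phase of the dynamic structure is assumed to be fully connected, and the alternation schedule (`period`) and the sharing rule are hypothetical.

```python
def fully_connected(n):
    return {i: {j for j in range(n) if j != i} for i in range(n)}

def ring(n):
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

def clustered(n, cluster_size):
    """Low-connectivity phase: agents only share within their own cluster."""
    return {i: {j for j in range(n)
                if j != i and j // cluster_size == i // cluster_size}
            for i in range(n)}

def dynamic(n, cluster_size, step, period):
    """Alternate between clustered and fully-connected phases.
    Assumption: phases have equal length `period`."""
    low_phase = (step // period) % 2 == 0
    return clustered(n, cluster_size) if low_phase else fully_connected(n)

def share(buffers, graph):
    """Each agent appends a copy of its neighbours' latest experience."""
    latest = {i: buf[-1] for i, buf in buffers.items()}
    for i, neighbours in graph.items():
        buffers[i].extend(latest[j] for j in sorted(neighbours))
    return buffers

# Three agents on a ring, each holding one experience.
buffers = share({0: ["a"], 1: ["b"], 2: ["c"]}, ring(3))
print(buffers)
```

With such a representation, comparing topologies amounts to swapping the graph passed to the sharing step while keeping the learning algorithm of each agent unchanged.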

Figure 14: (Left) Illustration of an innovation task, consisting of an initial set of elements (Earth, Water) and a recipe book indicating which combinations create new elements. Upon creating a new element the player moves up an innovation level and receives a reward that increases monotonically with levels. (Right) Dynamic social network structures oscillate between phases of low connectivity, where experience sharing takes place within clusters, and high connectivity, where experiences spread between clusters.

A central conclusion of our empirical analysis was that the dynamic social network structure performs best. In addition to the performance that groups achieve, we measured behavioral and mnemonic metrics such as behavioral conformity and mnemonic diversity. Such metrics were inspired by human studies and helped us further analyze the behavior of groups. For example, one empirical observation was that sharing experiences did not help the group learn quicker in a very simple innovation task; instead, the fully-connected group was the slowest. By looking at the diversity in the memories of the agents, we observed that the fully-connected structure had the highest individual diversity (left of Figure 15) and the lowest group diversity (right of Figure 15): sharing experiences with others diversifies an individual's experiences but also homogenizes the group, which is bad for its performance.

Figure 15: (Left) Individual diversity and (Right) group diversity in the memories of the agents for the different social network structures.

We see the contribution of this project as two-fold. From the perspective of fields studying human intelligence, we have shown that using RL algorithms as a computational tool is a promising direction towards increasing the verisimilitude of simulations and analyzing both behavior and memory. From the perspective of RL, we have shown that distributed RL algorithms should move beyond the fully-connected architecture and explore groups with dynamic topologies. This work is currently a preprint 136 and is about to be submitted to PNAS. We open-source the code at this link.

Cultural evolution in populations of‌ LLM agents.

In 2024, we extended this framework with agents equipped with Large Language Models (LLMs) playing Little Alchemy 2, a creative video game originally developed for humans (Figure 16). We first study an LLM in isolation and discover that it exhibits both useful skills and crucial limitations. We then study groups of LLMs that share information related to their behaviour and focus on the effect of social connectivity on collective performance. In agreement with previous human and computational studies (including the one described above), we observe that groups with dynamic connectivity out-compete fully-connected groups. Our work reveals opportunities and challenges for future studies of collective innovation that are becoming increasingly relevant as Generative Artificial Intelligence algorithms and humans innovate alongside each other. We published this work at the ALife 2024 conference 139.

Figure 16: Studying collective innovation in groups of LLMs. A) We experiment with Little Alchemy 2, a game where players combine real-world items to create new ones. A knowledge graph describes the possible combinations (we only present a small sub-part of the graph, which contains 720 items in total). B) Alice-LLM and Bob-LLM are two LLMs playing the game together. They are provided with the same intro prompt, explaining the rules of the game, and the same task (they start with the same set of items). Alice-LLM and Bob-LLM have identical weights but behave differently because the state prompt depends on their crafting history. They are informed about the actions of others through their prompt. In this paper, we study how groups of such LLM agents are able to efficiently explore a knowledge graph, focusing in particular on the effect of different social structures specifying with whom and when they can share information.

8.2.2 When LLMs‌‌ Play the Telephone Game: Cultural Attractors as Conceptual‌ Tools to Evaluate LLMs‌ in Multi-turn Settings

Participants:‌‌ Jérémy Perez [correspondant], Grgur Kovač, Corentin‌ Léger, Cédric Colas‌, Gaia Molinaro,‌‌ Maxime Derex, Pierre-Yves Oudeyer, Clément Moulin-Frier‌.

As large language‌ models (LLMs) start interacting‌‌ with each other and generating an increasing amount‌ of text online, it‌ becomes crucial to better‌‌ understand how information is transformed as it passes‌ from one LLM to‌ the next. While significant‌‌ research has examined individual LLM behaviors, existing studies‌ have largely overlooked the‌ collective behaviors and information‌‌ distortions arising from iterated LLM interactions. Small biases,‌ negligible at the single‌ output level, risk being‌‌ amplified in iterated interactions, potentially leading the content‌ to evolve towards attractor‌ states.

In this project,‌‌ we ran a series of telephone game experiments,‌ applying a transmission chain‌ design borrowed from the‌‌ human cultural evolution literature: LLM agents iteratively receive,‌ produce, and transmit texts‌ from the previous to‌‌ the next agent in the chain.
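
A transmission chain can be sketched in a few lines. In this illustration the LLM call is replaced by a toy agent with a small, consistent bias (it drops the last word until three words remain), and the tracked text property is simply the word count; both are hypothetical stand-ins for the models and properties studied in the paper.

```python
def run_chain(seed_text, agent, n_generations):
    """Each agent receives the previous text and transmits its own output."""
    texts = [seed_text]
    for _ in range(n_generations):
        texts.append(agent(texts[-1]))
    return texts

def toy_agent(text):
    """Placeholder for an LLM call: a small bias towards shorter texts."""
    words = text.split()
    return " ".join(words[:-1]) if len(words) > 3 else text

def word_count(text):
    return len(text.split())

chain = run_chain("the quick brown fox jumps over the lazy dog", toy_agent, 10)
trajectory = [word_count(t) for t in chain]
print(trajectory)
print(chain[-1])
```

The bias is barely visible after one step, yet after several generations the property settles on a fixed value, which is the kind of attractor state that single-turn evaluations would miss.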

Figure 17:‌‌ Method for estimating attractor strength and position.

Our‌ main contributions are:

We propose that there might be a gap in current LLM evaluation methods: single-turn evaluations might not be suited to assess the properties of multi-turn interactions.
We empirically confirm‌ this hypothesis by showing‌ that multi-turn interactions indeed‌‌ often lead to distributions of text properties that‌ are significantly different from‌ what is observed after‌‌ a single interaction.
We introduce novel conceptual and‌ methodological tools to fill‌ this gap, grounded in‌‌ research in cultural evolution, and in particular the‌ concept of cultural attractor.‌
We showcase the potential‌‌ of this method by applying it to compare‌ the effect of different‌ tasks, of different models,‌‌ of temperature, and of fine-tuning on the properties‌ of multi-turn interactions.
We find several robust effects, such as the fact that less constrained tasks lead to stronger attractors, that some properties possess stronger attractors than others, and that fine-tuning can shift the position and modify the strength of attractors.

Figure 18: Attractor strength and position.

These findings highlight the importance of accounting‌ for multi-step transmission dynamics and represent a first‌ step towards a more comprehensive understanding of LLM‌ cultural dynamics.

This work was presented in a 15-minute talk at the 2024 Cultural Evolution Society conference, and was accepted as a conference paper at the International Conference on Learning Representations 2025 (ICLR 2025) 144. The code is available here. We also created a website featuring a Data Explorer tool, which allows direct inspection of the texts generated during our experiments.

8.2.3 Recursive Training Loops in LLMs: How training‌ data properties modulate distribution shift in generated data?‌

Participants: Grgur Kovač [correspondant], Jérémy Perez [correspondant]‌, Remy Portelas, Peter Ford Dominey,‌ Pierre-Yves Oudeyer.

Large language models (LLMs) are increasingly used in the creation of online content, creating feedback loops as subsequent generations of models will be trained on this synthetic data. Such loops were shown to lead to distribution shifts: models misrepresenting the true underlying distributions of human data (also called model collapse). However, how human data properties affect such shifts remains poorly understood. In this paper, we provide the first empirical examination of the effect of such properties on the outcome of recursive training. We first confirm that using different human datasets leads to distribution shifts of different magnitudes. Through exhaustive manipulation of dataset properties combined with regression analyses, we then identify a set of properties associated with distribution shift magnitudes. Lexical diversity is found to amplify these shifts, while semantic diversity and data quality mitigate them. Furthermore, we find that these influences are highly modular: data scraped from a given internet domain has little influence on the content generated for another domain. Finally, experiments on political bias reveal that human data properties affect whether the initial bias will be amplified or reduced. Overall, our results portray a novel view, where different parts of the internet may undergo different types of distribution shift.

The main contributions of‌ this work are:

We propose and experimentally confirm the hypothesis that different training datasets lead to different distribution shift dynamics, motivating an investigation into the underlying causes.
Through an extensive set of experiments (four datasets over three domains), we identify several data properties that influence distribution shift dynamics.
We reveal that these influences are highly modular,‌ with generated content being mostly influenced by human‌ data properties from the same domain.
We find‌ that distribution shifts also occur in terms of‌ political lean, and that the type of shift‌ (bias amplification, reduction or inversion) depends on the‌ political lean of the human data.

Figure 19: Iterative chain. In each generation, a fresh base model is fine-tuned on texts sampled from the accumulated data pool (except generation 0, where it is trained only on human posts). The model generates posts, which are added to the pool alongside some newly sampled human posts.
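
The iterative chain described in this caption can be sketched as a simple loop. The code is illustrative only: fine-tuning and generation are replaced by toy functions (`toy_train`, `toy_generate`), and the sample sizes and human posts are hypothetical.

```python
import random

def iterative_chain(human_posts, train, generate, *, n_generations,
                    n_train, n_generated, n_new_human, seed=0):
    rng = random.Random(seed)
    pool = []  # accumulated data pool
    for generation in range(n_generations):
        if generation == 0:
            # Generation 0 is trained only on human posts.
            train_set = rng.sample(human_posts, n_train)
        else:
            train_set = rng.sample(pool, min(n_train, len(pool)))
        model = train(train_set)                  # fresh base model each time
        pool.extend(generate(model, n_generated))
        pool.extend(rng.sample(human_posts, n_new_human))
    return pool

def toy_train(texts):
    """Stand-in for fine-tuning: the 'model' is the sorted training vocabulary."""
    return sorted(set(" ".join(texts).split()))

def toy_generate(vocabulary, n):
    """Stand-in for sampling: emit n copies of a short phrase."""
    return [" ".join(vocabulary[:3]) for _ in range(n)]

human_posts = ["cats sleep all day", "dogs chase the ball",
               "birds sing at dawn", "fish swim in ponds"]
pool = iterative_chain(human_posts, toy_train, toy_generate,
                       n_generations=3, n_train=2, n_generated=2, n_new_human=1)
print(len(pool))
```

The ratio between `n_generated` and `n_new_human` controls how quickly synthetic text dominates the pool, which is one natural knob to vary when studying distribution shift.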

This work was published as a conference paper at the EMNLP 2025 conference 49.

8.2.4 Intrinsic motivation‌‌ is key to understanding peer cultures

Participants: Jérémy‌ Perez [correspondant], Maxime‌ Derex, Pierre-Yves Oudeyer‌‌, Clément Moulin-Frier.

This paper 64 is a commentary on 123, as part of the call for open-peer commentary on this target article in Behavioral and Brain Sciences, 1–68. In the target paper, the authors make an intriguing case that peer cultures could play a key role in cultural adaptation by generating qualitatively different cultural variation compared to adult cultures. However, the mechanisms responsible for this distinction remain unclear. In our commentary, we discuss how accounting for the role of intrinsic motivation in shaping the content of peer cultures may help explain their evolutionary dynamics.

8.2.5 Cultural variation and regularities in intrinsically‌ motivated exploration: investigating autonomous‌ goal selection in BaYaka‌‌ foragers and Bandongo fisher-farmers

Participants: Jérémy Perez [correspondant]‌, Sarah Pope-Caldwell,‌ Sheina Lew-Levy, Pierre-Yves‌‌ Oudeyer, Maxime Derex, Clément Moulin-Frier.‌

TLDR: This study investigates how recent performance and recent progress influence autonomous goal selection in children and adults from two cultural groups in the Congo Basin. All data necessary for this project were collected during Jérémy Perez's mission in Congo in July-August 2025, and analyses are still ongoing. This project was made possible through a collaboration with Sheina Lew-Levy from Durham University and Sarah Pope-Caldwell from Georgia State University. An abstract for this project will be submitted to the 2026 conference of the European Human Behaviour & Evolution Association.

Objective:

By influencing which goals individuals set for themselves, intrinsic motivation plays a central role in structuring autonomous learning trajectories. Grounded in theoretical work, recent empirical studies have uncovered the features that make an activity intrinsically motivating. For instance, having experienced recent progress towards a goal was found to influence the probability of selecting it, reflecting curiosity-driven exploration. However, these studies have exclusively focused on humans from Western cultures. The role of the cultural environment in determining the strategies used during intrinsically-motivated goal exploration thus remains unclear.

Method:

In‌ the present study, we‌‌ investigated how recent performance and recent progress influence‌ autonomous goal selection in‌ 60 Congolese BaYaka foragers‌‌ (30 children, 30 adults) and 57 Bandongo fisher-farmers‌ (29 children, 28 adults).‌ To do so, we‌‌ adapted the free-choice paradigm‌ used in the previous studies, in which participants‌ are free to select, and switch between, learning‌ activities of different difficulties. Pre-registered analyses were used‌ to uncover how recent performance and recent progress‌ predict activity choices.

Preliminary results:

Preliminary results indicate that the strategies used by participants in the present study are qualitatively similar to those previously observed in Western participants. Specifically, many Bandongo and BaYaka participants rely on recent progress to guide their activity choices. However, clear cross-cultural differences exist: for instance, recent performance had a greater influence on goal choices in Bandongo participants than in BaYaka participants. Our results also indicate noticeable heterogeneity within cultural groups with respect to the strategies guiding self-directed learning.

Conclusion:

By taking a cross-cultural‌ perspective on intrinsic motivation, this study highlights the‌ role of the cultural niche in shaping the‌ mechanisms underlying self-directed learning, and contributes to building‌ a more representative picture of human curiosity-driven exploration.‌

8.2.6 The cultural evolution of human goals: How‌ individuals generate, select, and transmit goals

Participants: Jérémy‌ Perez [correspondant], Cédric Colas, Gaia Molinaro‌, Pierre-Yves Oudeyer, Maxime Derex, Clément‌ Moulin-Frier.

This work has been submitted to‌ the special issue on Goal Dynamics in Cognition‌ of the journal Topics in Cognitive Science, and‌ is currently under review.

Abstract:

Humans pursue goals‌ that are remarkably diverse and vary over time‌ and cultures. These goals shape which behaviors are‌ explored, valued, and socially transmitted, yet most theories‌ of cultural evolution focus on how behaviors evolve‌ while leaving the origins of goals unexamined. We‌ argue that a complete understanding of cultural evolution‌ requires explaining how goals themselves emerge, vary, and‌ persist across generations. Building on studies of motivation‌ and curiosity in cognitive science and artificial intelligence,‌ we introduce the notion of cultural autotelic agents—individuals‌ who actively generate, select, and transmit their own‌ goals within social environments. By highlighting the cognitive‌ and motivational mechanisms that drive goal formation and‌ selection, this framework extends existing models of cultural‌ evolution and helps explain the open-ended, self-propelling character‌ of human culture.

Figure 20: Humans are cultural autotelic agents. We introduce the notion of cultural autotelic agents, i.e. agents that combine individual and social learning to represent, generate, select, and transmit their own goals. This model departs from the historical conceptualization of agents as problem-solvers (left column). This standard perspective focuses on how agents optimize behaviors toward goals that are externally imposed. This view is largely present in research on individual cognition (top-left) and has inspired most experimental paradigms in cultural evolution (bottom-left, e.g. transmission chains). Research on motivation, developmental psychology, and developmental artificial intelligence has extended this view toward the concept of autotelic agents, i.e. agents able to self-generate and pursue their own goals (top-right). This conceptualization affords a more complete understanding of proactively exploratory behaviors in humans, in particular how their past behavior influences their goal generation and selection mechanisms. Here, we propose integrating such insights from cultural evolution and autotelic learning to introduce the concept of cultural autotelic agents (bottom-right). Under this view, agents are active in the generation and selection of the goals they pursue. These goal generation and selection mechanisms are influenced both by social information and individually collected information. We argue that this conceptualization is necessary to think about the cultural evolutionary dynamics of goals.

8.2.7 Evolving Interaction Protocols‌ for Open-Ended Collective Innovation‌

Participants: Akhi Mocherla,‌‌ Jérémy Perez, Eleni Nisioti, Cédric Colas‌.

In exploratory domains such as science, art, and design, progress emerges not from achieving predefined objectives but from accumulating novel and meaningful discoveries 140. Lab and field studies of human collective innovation have shown that a group's exploration and, thus, innovation abilities critically depend on how individuals communicate with each other 94, 130, 124. For example, increasing group connectivity speeds up innovation in the short term but reduces diversity within the collective, negatively impacting long-term innovation. Partially connected groups thus accumulate the most innovations in deceptive search spaces. Computational studies have confirmed this in groups of evolving agents 121, 78, reinforcement learning agents 137, and Large Language Models (LLMs) 138, highlighting the key role of collective dynamics in engineered multi-agent systems. Despite this, systematic approaches for optimising how groups interact remain underdeveloped. In this work, we propose an approach for designing interaction protocols (IPs) that govern who communicates with whom, what is communicated, and when. Similarly to past computational studies 137, 138, we use the text-based game Little Alchemy 2 (LA2) as a test-bed of collective innovation. To explore the IP space systematically, we employ a Quality-Diversity (QD) algorithm 152 that discovers repertoires with high performance and behavioral diversity. We maintain an archive of IPs, each evaluated via multiple trials. Similarly to previous works 100, our approach follows the Novelty Search with Local Competition (NS-LC) paradigm and employs LLMs within QD for solution generation and novelty estimation.
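
The archive logic of a Novelty Search with Local Competition scheme can be sketched as follows. This is a generic illustration, not the system described above: behaviors are assumed to be numeric descriptors compared with a Euclidean distance (whereas novelty is estimated with LLMs in this work), and the threshold, neighbourhood size, candidate names and fitness values are hypothetical.

```python
import math

def nearest(archive, behavior, k):
    """The k archive entries closest to `behavior` in descriptor space."""
    return sorted(archive, key=lambda e: math.dist(e["behavior"], behavior))[:k]

def novelty(archive, behavior, k=2):
    """Mean distance to the k nearest archived behaviors."""
    neighbours = nearest(archive, behavior, k)
    if not neighbours:
        return float("inf")
    return sum(math.dist(e["behavior"], behavior) for e in neighbours) / len(neighbours)

def local_competition(archive, behavior, fitness, k=2):
    """Number of behavioral neighbours that the candidate outperforms."""
    return sum(fitness > e["fitness"] for e in nearest(archive, behavior, k))

def update_archive(archive, candidate, novelty_threshold=1.0, k=2):
    """Add novel candidates; otherwise replace the closest entry if it is beaten."""
    if novelty(archive, candidate["behavior"], k) >= novelty_threshold:
        archive.append(candidate)
        return "added"
    closest = nearest(archive, candidate["behavior"], 1)[0]
    if candidate["fitness"] > closest["fitness"]:
        archive[archive.index(closest)] = candidate
        return "replaced"
    return "rejected"

# Hypothetical candidate protocols with made-up descriptors and fitness.
candidates = [
    {"name": "ip_a", "behavior": (1.0, 0.0), "fitness": 10},
    {"name": "ip_b", "behavior": (0.3, 0.8), "fitness": 14},
    {"name": "ip_c", "behavior": (0.35, 0.8), "fitness": 17},
    {"name": "ip_d", "behavior": (0.95, 0.0), "fitness": 5},
]
archive = []
outcomes = [update_archive(archive, c) for c in candidates]
print(outcomes, [e["name"] for e in archive])
```

Because competition is local, a protocol only has to beat its behavioral neighbours to stay in the archive, which preserves diverse protocols rather than a single global optimum.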

This work has been presented‌ as a poster at the 2025 workshop on‌ Intrinsically Motivated Open-ended Learning (IMOL 2025).

Figure 21: (Left) Overview of the‌ framework used to evolve Interaction Protocols for groups‌ of agents playing the Little Alchemy 2 game.‌ The system iteratively generates new IPs using a‌ language model (LLM), evaluates their performance, and maintains‌ an archive of candidate solutions. Candidate IPs are‌ debugged and tested, then evaluated for fitness and‌ novelty relative to archived solutions. The archive is‌ updated based on fitness, and novel or improved‌ protocols are used to guide further LLM generations,‌ enabling continual improvement and diversity in discovered solutions.‌ (Right) Comparison of the performance of an evolved‌ IP to dynamic and fully-connected IPs from past‌ studies.

8.2.8 Inferring the Phylogeny of Large Language‌ Models and Predicting their Performances in Benchmarks

Participants:‌ Nicolas Yax [correspondant], Pierre-Yves Oudeyer, Stefano‌ Palminteri.

In recent months, the number of Large Language Models (LLMs) released has reached an unprecedented level. On the one hand, multiple private companies such as OpenAI, Anthropic, Google, and Mistral are building cutting-edge models that are highly visible in society and in science. However, as the number of LLMs rises, training methods are becoming more secretive, making the field increasingly opaque to science. On the other hand, a few hundred open-access language models are uploaded to the Hugging Face Hub every day, far too many to keep track of the evolution of LLMs in the field. Since not all of these open models are fully transparent about their training methods, and only very few of them are benchmarked (due to the high cost of benchmarking), there is an increasing need for methods that help keep track of the progress and evolution of these models.

We developed an algorithm, named PhyloLM, inspired by phylogenetics, to compute evolutionary trees of LLMs. We show that this method is effective at reconstructing the evolutionary history of LLMs within families 22, at discriminating between families, and at finding similarities between these families. Additionally, the genetic information can be used to predict LLM capabilities such as benchmark scores, with a very significant correlation between predicted and true scores. These advances could be instrumental in navigating the field of LLMs by making it more transparent at a very low cost. This work was published at ICLR 2025.
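To make the phylogenetic analogy concrete, the sketch below computes a Nei-style genetic distance between two models from the tokens they produce for a shared set of contexts. It is an illustration under assumptions rather than the PhyloLM implementation: contexts play the role of "genes" and sampled tokens the role of "alleles", and the function names and the exact similarity formula are hypothetical choices made for this example.

```python
import math
from collections import Counter


def allele_frequencies(samples):
    """Map each context ("gene") to the empirical distribution of sampled tokens ("alleles")."""
    freqs = {}
    for gene, tokens in samples.items():
        counts = Counter(tokens)
        total = len(tokens)
        freqs[gene] = {tok: c / total for tok, c in counts.items()}
    return freqs


def nei_distance(freq_a, freq_b):
    """Nei-style distance: -log of the normalised overlap between allele distributions."""
    shared = 0.0
    norm_a = 0.0
    norm_b = 0.0
    for gene, pa in freq_a.items():
        pb = freq_b.get(gene, {})
        shared += sum(p * pb.get(tok, 0.0) for tok, p in pa.items())
        norm_a += sum(p * p for p in pa.values())
    for pb in freq_b.values():
        norm_b += sum(p * p for p in pb.values())
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("both models need at least one sampled token")
    similarity = shared / math.sqrt(norm_a * norm_b)
    return -math.log(similarity) if similarity > 0 else math.inf
```

The resulting pairwise distance matrix is the kind of input that a standard tree-building method (for example neighbor joining) takes to produce a tree.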

Figure 22: Phylogenetic tree reconstruction. Left: the ground truth for the relations among some LLMs of the Mistral family. Right: the tree reconstructed by PhyloLM for the five latest models of this family (the "leaves" of the phylogenetic tree). The numerical labels (0:3) map the true common ancestors ("ground truth") to the inferred ones ("reconstructed"). The true and the reconstructed trees are topologically equivalent.

8.3 An Eco-Evo-Devo perspective on Artificial Intelligence‌

8.3.1 Research perspective: The Ecology of Open-Ended Skill Acquisition

Participants: Clément Moulin-Frier [correspondant], Eleni Nisioti‌, Pierre-Yves Oudeyer.‌

An intriguing feature of the human species is our ability to continuously invent new problems and to proactively acquire new skills in order to solve them: what is called Open-Ended Skill Acquisition (OESA). Understanding the mechanisms underlying OESA is an important scientific challenge in both cognitive science (e.g. by studying infant cognitive development) and artificial intelligence (aiming at computational architectures capable of open-ended learning). Both fields, however, mostly focus on cognitive and social mechanisms at the scale of an individual’s life. It is rarely acknowledged that OESA, an ability that is fundamentally related to the characteristics of human intelligence, has necessarily been shaped by ecological, evolutionary and cultural mechanisms interacting at multiple spatiotemporal scales.

Figure 23: The ORIGINS framework identifies central components (boxes) and their interactions (arrows) driving Open-Ended Skill Acquisition, both in terms of its evolution from environmental complexity (roughly: left to right arrows) as well as its open-ended aspect through feedback mechanisms (right to left arrows). The employed terminology reflects a diversity of mechanisms considered in both Artificial Intelligence and Human Behavioral Ecology.

We have recently initiated a new research direction aiming at understanding, modeling and simulating the dynamics of OESA in artificial systems, grounded in theories studying its eco-evolutionary bases in the human species. To this end, we have proposed a conceptual framework, called ORIGINS (illustrated in Fig. 23 and developed in 131), expressing the complex interactions between environmental, adaptive, multi-agent and cultural dynamics. This framework raises three main research questions:

What‌ are the ecological conditions favoring the evolution of‌ autotelic agents?
How to bootstrap the formation of‌ a cultural repertoire in populations of adaptive agents?‌
What is the role of cultural feedback effects‌ in the open-ended dynamics of human skill acquisition?‌

The contributions described below address some aspects of these research questions. Note that there might be a thematic overlap between the last two research questions outlined above and the previous section on Models of Cultural Evolution 8.2, where we also present related results.

8.3.2 Eco-evolutionary Dynamics‌ of Non-episodic Neuroevolution in Large Multi-agent Environments

Participants:‌ Gautier Hamon [correspondant], Eleni Nisioti, Clément‌ Moulin-Frier.

This work was published in 2023, but we keep it in this report because it introduces a general computational framework, called non-episodic neuroevolution, that forms the basis of the next two contributions.

This contribution focuses on eco-evolutionary dynamics‌ where "organisms are not solely products but, by‌ modifying their niche and therefore its associated fitness‌ landscape, are also causes of evolution" 120.‌ The main objective of this paper is to‌ propose a method for studying large-scale eco-evolutionary dynamics‌ in agent-based simulations with a reasonable level of‌ biological and ecological plausibility. For this aim, we‌ implement a system with the following properties (see‌ Fig. 24 for illustration):

Non-episodic simulation environment with complex intrinsic dynamics. We model our environment after common-pool resource (CPR) appropriation problems, where a group of agents competes for finite resources. We extend an existing environment of CPR appropriation 145 with the presence of multiple niches, where resources regrow proportionally to the density of nearby resources at different rates in different regions of the environment (Fig. 24). We prevent any environment or population reset during a whole simulation run, enabling coupled environmental and population dynamics leading to complex eco-evolutionary feedback effects.
Continuous neuroevolution in a large, size-varying agent population. The environment contains thousands of agents, each controlled by a neural network whose weights are optimized using neuroevolution 161.
Physiology-driven death and reproduction. There is no notion of reward; agents are instead equipped with a physiological system that modulates their energy level according to the resources they consume, in a non-linear way. At the evolutionary scale, agents reproduce as long as they are able to maintain their energy level within a reasonable range, and die if this level falls below a minimum threshold. This is a departure from the notion of fitness-based selection and is more in line with minimal criterion selection 76. Note that the population size can vary with time.
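The properties listed above can be illustrated with a toy sketch of density-dependent regrowth and of physiology-driven life events. It is a simplified illustration, not the simulator used in this work: the grid encoding, the per-row `growth_rate` (rows standing in for niches), the thresholds `e_death` and `e_repro`, and the function names are all assumptions made for this example, and the sparse spontaneous growth is omitted.

```python
import random


def regrow(grid, growth_rate, rng):
    """Return the grid after one regrowth step.

    grid[y][x] is 1 if the cell holds a resource, else 0.
    An empty cell regrows with a probability proportional to the fraction of
    its 8 neighbours that hold a resource, scaled by the rate of its row.
    """
    height, width = len(grid), len(grid[0])
    new_grid = [row[:] for row in grid]
    for y in range(height):
        for x in range(width):
            if grid[y][x]:
                continue  # cell already holds a resource
            neighbours = sum(
                grid[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if rng.random() < growth_rate[y] * neighbours / 8:
                new_grid[y][x] = 1
    return new_grid


def life_event(energy, e_death=0.0, e_repro=5.0):
    """Minimal-criterion rule: no fitness score, only energy thresholds (hypothetical values)."""
    if energy < e_death:
        return "die"
    if energy >= e_repro:
        return "reproduce"
    return "survive"
```

Because regrowth depends on neighbouring resources and the grid is never reset, a region that is fully depleted stays empty in this sketch, which is the pressure that makes over-consumption costly for later generations.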

Figure‌ 24: Our simulation‌‌ environment (Left) is an extension of the Common‌ Pool Resource (CPR) environment‌ 145, 122 :‌‌ a two-dimensional grid-world where some cells contain resources‌ (in green) that the‌ agents (in black) can‌‌ collect. Resources grow depending on the presence of‌ other resources around them‌ (local growth, Middle) with‌‌ an additional very sparse spontaneous growth, which means‌ that over-consumption may lead‌ to their local depletion.‌‌ We introduce a latitudinal model of resource regrowth.‌ We prevent any environment‌ and population reset during‌‌ a whole simulation, enabling continual eco-evolutionary dynamics to‌ take place. Each agent‌ may reproduce or die‌‌ according to a physiological model modulating its energy‌ level as a function‌ of life time and‌‌ resource consumption (Top-Right). The population size varies during‌ the simulation according to‌ the current amount of‌‌ available resources and the current ability of agents‌ to collect them. Evolution‌ occurs through the mutation‌‌ of a parent's network weights when it produces‌ an offspring.

In addition to experiments conducted in the large environment presented above, we also conduct experiments in a "lab environment" (as opposed to the "natural environment") to isolate the study of certain behaviors (which are often intertwined with many other dynamics in the natural environment).

One interesting result of these simulations is the emergence of sustainable foragers which, as shown in the lab environment (Fig. 25), tend not to overconsume when there are enough resources in their neighbourhood. This leaves a certain amount of resources to spread, which benefits their future survival as well as the survival of their offspring (as there is no reset of the environment).

Figure 25: Greediness of a sustainable forager agent across evaluation environments that differ in the amount of resources. Sustainable agents are far less greedy in environments where a certain amount of resources is available. This strategy preserves resources so that they can spread, and avoids their over-depletion.

This work was published at the Genetic and Evolutionary Computation Conference (GECCO) 2023. The computational framework it introduced led to the next two contributions.

8.3.3 Emergent kin selection of altruistic feeding via‌ non-episodic neuroevolution

Participants: Max Taylor-Davies, Gautier Hamon, Timothé Boulet, Clément Moulin-Frier [correspondant].

This work extends the project presented in the previous contribution (Sec. 8.3.2). It results from the visit to the team of Max Taylor-Davies, a PhD student at the School of Informatics, University of Edinburgh, Scotland. It has been accepted at the International Conference on the Applications of Evolutionary Computation (part of EvoStar) 55.

At first glance, it seems difficult‌ to square the phenomenon of purely altruistic behaviour‌ (acts which confer a benefit to the recipient‌ at a cost to the actor) with the‌ basic principle of natural selection: how can a‌ gene be selected for when it decreases, rather‌ than increases, the fitness of its host? One‌ plausible account can be made through the theory‌ of inclusive fitness. Key to this theory is‌ the recognition that individual organisms within a social‌ environment are not isolated from their conspecifics in‌ terms of fitness. Whether a given gene is‌ selected for is thus determined by its effect(s)‌ on the fitness of any bearers of copies‌ of that gene. Under this view, we can‌ think of an altruistic act as an exchange‌ of fitness from one agent to another. If‌ the exchange is positive-sum and both sides are‌ bearers of the gene in question, then from‌ the gene's perspective the behaviour confers a fitness‌ benefit–even while it decreases the fitness of the‌ acting individual.

Kin selection theory 160 has proven‌ to be a popular and widely accepted account‌ of how altruistic behaviour can evolve under natural‌ selection. Hamilton's rule, first published in 1964 107‌, 108, has since been experimentally validated‌ across a range of different species and social‌ behaviours. In contrast to this large body of‌ work in natural populations, however, there has been‌ relatively little study of kin selection in silico‌. In the current work, we offer what‌ is to our knowledge the first demonstration of kin selection emerging naturally‌ within a population of‌ agents undergoing continuous neuroevolution.‌‌ Specifically, we find that zero-sum transfer of resources‌ from parents to their‌ infant offspring evolves through‌‌ kin selection in environments where it is hard‌ for offspring to survive‌ alone. In an additional‌‌ experiment, we show that kin selection in our‌ simulations relies on a‌ combination of kin recognition‌‌ and population viscosity. We believe that our work‌ may contribute to the‌ understanding of kin selection‌‌ in minimal evolutionary systems, without explicit notions of‌ genes and fitness maximisation.‌
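Hamilton's rule is commonly stated as r·B > C, where r is the genetic relatedness between actor and recipient, B the fitness benefit to the recipient, and C the fitness cost to the actor. The small sketch below only restates this inequality; the function name and the example values are illustrative and are not taken from the simulations.

```python
def hamilton_rule_favours(relatedness, benefit, cost):
    """Return True if an altruistic act is favoured under Hamilton's rule (r * B > C)."""
    return relatedness * benefit > cost


# Illustrative values: with r = 0.5, the act is favoured only if the
# benefit to the recipient is more than twice the cost to the actor.
favoured = hamilton_rule_favours(relatedness=0.5, benefit=3.0, cost=1.0)
not_favoured = hamilton_rule_favours(relatedness=0.5, benefit=1.5, cost=1.0)
```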

Figure 26: The relationship between the estimated benefit to infants of being fed and both the amount and selectivity of feeding observed, shown separately for each of the three experimental parameters we varied (and combined in the rightmost column). Each scatterplot point represents a single 500k-timestep simulation run (with values averaged over the final 50k timesteps); regression lines (with 95% confidence intervals) are shown in green. Note that the $y$-axis shows $\log$(measure) for both amount and selectivity.

This paper was‌ accepted at The International‌‌ Conference on the Applications of Evolutionary Computation (EvoAPPS)‌ 2025 (part of EvoStar).‌

8.3.4 Evolving large populations‌‌ of adaptive neural agents in ecologically plausible environments‌

Participants: Timothé Boulet [correspondant]‌, Gautier Hamon,‌‌ Clément Moulin-Frier.

This work continues the project presented in the previous section, with a focus on the ability of agents to develop adaptive behaviors. Specifically, we extend the framework by adding fruits, a spatially variable resource, and by giving agents a memory of the value of each type of fruit. The goal is to observe whether the agents manage to exploit their knowledge of fruit values to decide which fruit to exploit.

Results: the agents were able to exploit the fruit value information to optimize their behavior. There were also some results that we were not necessarily expecting and that come from our choice of model. Notably, it seems that the agents' choice to exploit a cluster is heavily influenced by social criteria (the number of agents already exploiting it) and cultural criteria (whether the cluster is empty or full of fruits). This effect exceeds the adaptability effect in the latest stages of the simulations.

8.4 Theories and experiments‌‌ on human curiosity-driven learning

8.4.1 DevCur Project: studying‌ the co-development of curiosity,‌ metacognition and agency in‌‌ adolescents

Participants: Julien Rosenberger [correspondant], Pierre-Yves Oudeyer, Hélène Sauzéon.

Within the DevCur project, the PhD of Julien Rosenberg was started on the following topic: “How curiosity enhances learning across childhood and adolescence: Models and experimentation of the role of metacognition and agency”. After a review of the literature, a specific project was defined that aims to compare personality constructs related to the intellect. The investigated constructs are metacognitive skills (e.g., 148), curiosity traits (e.g., 115), sense of agency (e.g., 162) and intellectual humility (e.g., 73). Intellectual humility is about correctly appraising one’s cognitive limitations 169. The self-report scale of Alfano et al. 73 ties intellectual humility to other intellectual traits: open-mindedness (recognizing one’s cognitive limitations and having an appetite for knowledge without concern for social status), intellectual modesty (having low concern for being deemed smart), engagement (being able to confront what one does not understand or what differs from one’s perspective) and corrigibility (being emotionally stable when one is intellectually challenged). The labels are slightly odd but emphasize the diversity of intellectual traits one could consider in learning.

A first axis is to understand the organization of those constructs. For instance, intellectual humility has been linked to greater general knowledge and to a tendency to underestimate one’s cognitive ability 119. These two variables are in turn related, respectively, to trait curiosity 167 and to low self-esteem and low metacognition 165. A second axis is to obtain behavioral markers of those constructs. This need for situationally bounded measures is crucial for intellectual humility 164, which is currently measured through self-reports or other-reports. Yet self-reports pose an issue because being intellectually humble is socially desirable 119, requires some recall to form that self-referenced attribute, and faces the paradox of self-attribution (i.e., some humility is required to say whether one is humble). Other-reports, in turn, are resource-intensive and introduce additional factors (context, relationship, etc.).

8.5 Generative‌ AI and educational technologies

8.5.1 Investigating the use of LLMs in middle school

Participants: Pierre-Yves‌ Oudeyer, Hélène Sauzéon [correspondant], Rania Abdelghani‌.

ChatGPT, one of the most widely used‌ generative AI (LLM) tools, has made accessing mass‌ and personalized information easy and straightforward, even for‌ users without expertise in AI. More particularly, recent‌ reports indicate that the majority of surveyed students‌ aged nine and older have already used this‌ tool for school-related tasks. However, while we know‌ that students are using ChatGPT, there is limited‌ understanding of how they use it and its‌ effects on their learning processes and outcomes, particularly‌ among middle and high school students and in‌ subjects outside programming.

Investigating these patterns of use is a critical step toward identifying the necessary educational interventions to mitigate risks associated with misuse or harmful interactions with ChatGPT, which are particularly likely among non-expert users. To address this, we recruited 63 students aged 14 to 15 and asked them to solve science problems using ChatGPT. We examined their prompt choices, evaluations of ChatGPT's responses, and final problem-solving outcomes. Overall, our results indicate that students are still inefficient users of AI tools such as ChatGPT and are vulnerable to incorporating its misinformation, even when they report high domain knowledge and previous experience with generative AI. This highlights potential misconceptions about these tools’ capabilities and the skills required to use them effectively. Furthermore, domain knowledge alone appears insufficient to shield students from adopting misinformation generated by ChatGPT. Implementing formal educational interventions to correct these misconceptions and train students for informed usage thus seems both timely and essential, given the growing reliance on generative AI tools in education. In the longer term, fostering metacognitive skills may further promote responsible and effective use of such tools (paper in preparation).

8.5.2 Studying the impact of a pedagogical intervention on GenAI use in middle school students

Participants: Pierre-Yves Oudeyer, Hélène Sauzéon, Olivier Clerc [correspondant], Chloé Desvaux, Rania Abdelghani, Eliott Poisson, Kan Yao, Didier Roy.

Context and Objective.‌

Generative AI (GenAI) systems‌ such as ChatGPT are‌‌ increasingly used by students, including for schoolwork. A‌ pilot study conducted in‌ 2024 by the Flowers‌‌ team with 63 students aged 14–15 showed that‌ students experience major difficulties‌ in formulating effective prompts‌‌ and in evaluating the quality of AI-generated answers,‌ which negatively impacts their‌ performance in scientific problem-solving‌‌ tasks. Building on this work, we evaluated the‌ impact of a short‌ pedagogical intervention (2 hours)‌‌ aimed at improving students’ ability to formulate and‌ critically evaluate prompts before‌ querying a large language‌‌ model (LLM).

Task.

Students had to solve six‌ middle-school science problems using‌ ChatGPT (or a similar‌‌ system such as DuckDuckAI). Each problem included a‌ statement, an image, a‌ question, and a suggested‌‌ prompt. Students could choose to use or ignore‌ the suggested prompt. Two‌ types of prompts were‌‌ provided: valid prompts (clear context and precise instructions)‌ and invalid prompts (insufficient‌ or vague context).

Pedagogical Intervention.

For the experimental group,‌ the study took place‌ in two phases. Two‌‌ days before the task session, students participated in‌ a two-hour classroom workshop‌ designed to strengthen their‌‌ theoretical and practical understanding of GenAI. The workshop‌ consisted of three parts:‌

an introduction explaining how‌‌ generative AI systems work,
a discussion of their‌ limitations, risks, and biases,‌
a practical session in‌‌ which students trained to‌ analyze and reformulate prompts.

Main Results.

Overall, students‌ who benefited from the pedagogical intervention achieved higher‌ performance than those in the control group. In‌ quantitative terms, the mean score (out of 20)‌ was approximately 10.3 in the control group—comparable to‌ the 2024 pilot study—and approximately 11.4 in the‌ experimental group. This difference is statistically significant (p‌ < .05). The experimental group not only obtained‌ significantly higher scores, but also demonstrated more strategic‌ use of the AI system. In particular, they‌ rejected invalid prompts more frequently and were more‌ likely to reformulate or refine their queries when‌ the initial answer was unsatisfactory. Moreover, formulating their‌ own prompts tended to maintain or improve performance,‌ even in cases where the suggested prompt was‌ already valid. Importantly, self-reported prior knowledge about AI‌ was not associated with better performance, suggesting that‌ explicit instruction and practice played a more decisive‌ role than familiarity with the technology alone.

Figure 28: Workshop effects on performance and prompt acceptance. (A) Students in the experimental group achieved higher scores than those in the control group on the science exercises. (B) Sensitivity ($d'$) was positively associated with performance, accounting for differences between groups. (C) Predictors of prompt acceptance from a generalized linear model, showing the effects of condition, confidence, and prompt validity.
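In signal detection theory, sensitivity is defined as $d' = z(\text{hit rate}) - z(\text{false-alarm rate})$, where $z$ is the inverse of the standard normal cumulative distribution function. The sketch below shows one way to compute it for prompt acceptance. It assumes, for illustration only, that accepting a valid prompt counts as a hit and accepting an invalid prompt as a false alarm, and it applies a log-linear correction to avoid infinite values; neither choice is stated in the report.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' with a log-linear correction.

    hits: valid prompts accepted; misses: valid prompts rejected;
    false_alarms: invalid prompts accepted; correct_rejections: invalid prompts rejected.
    """
    # Adding 0.5 to counts keeps rates strictly between 0 and 1,
    # so the inverse normal CDF stays finite.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)
```

A student who accepts every valid prompt and rejects every invalid one gets a positive $d'$, while a student who accepts prompts regardless of validity gets a value near zero.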

Future Research.‌

Future work will aim to extend and consolidate‌ these findings in several directions. First, longitudinal studies‌ will be needed to assess whether the strategies‌ acquired during the workshop are retained over time‌ and whether students continue to apply them beyond‌ the immediate post-intervention period. Second, the present study‌ focused on science problem solving with middle-school students.‌ Future research should examine whether similar short pedagogical‌ interventions yield comparable benefits in other subjects (e.g.,‌ mathematics, history, writing) and with learners of different‌ age groups.

8.5.3 LLM4Humanities: An Open-Source Toolkit for‌ LLM-Assisted Qualitative Research

Participants: Olivier Clerc [correspondant],‌ Grgur Kovač, Chloé Desvaux, Gaia Molinaro‌, Pierre-Yves Oudeyer.

Context and Objective.

Qualitative‌ research in experimental psychology and the humanities often‌ relies on manual annotation of textual data using‌ defined codebooks. This process is indispensable but time-consuming‌ and costly. Moreover, best practices require at least‌ two independent annotators in order to compute inter-rater‌ reliability (IRR), which further increases the required resources.‌ IRR is crucial to distinguish variance due to‌ coder subjectivity from variance due to the phenomenon‌ under study, yet in practice it is frequently‌ omitted, misreported, or computed using inadequate metrics (e.g.,‌ raw percentage agreement or simple correlations). The objective‌ of the LLM4Humanities project is to design an‌ open-source, Python-based toolkit and web application that leverages‌ LLMs to support, accelerate, and improve the methodological‌ rigor of qualitative annotation workflows.

System and Workflow.‌

LLM4Humanities provides an end-to-end pipeline combining manual annotation,‌ automated classification, and statistical evaluation. In a typical‌ workflow, researchers first manually annotate a small subset of the dataset. An‌ LLM is then used‌ to automatically classify the‌‌ remaining data. The system subsequently compares the model’s‌ predictions to the human-annotated‌ subset using appropriate IRR‌‌ metrics, confidence intervals, and decision guidance, allowing researchers‌ to assess both annotation‌ reliability and model performance.‌‌
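As an illustration of the comparison step, the sketch below computes Cohen's kappa, a common chance-corrected IRR metric, between human and LLM labels on the manually annotated subset. It is a minimal example and not the LLM4Humanities API: the function name is hypothetical, and the toolkit's actual choice of metrics is not specified here.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independence, from each annotator's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

Unlike raw percentage agreement, kappa is zero when the two annotators agree only as often as their label frequencies would predict by chance.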

Generation Mode.

In addition to annotation assistance, LLM4Humanities‌ includes a generation mode‌ designed to support the‌‌ creation of experimental material. In this mode, users‌ can select one or‌ several template items (e.g.,‌‌ a mathematics exercise) and specify a set of‌ constraints. The system then‌ generates multiple new variants‌‌ of the item. These generated items can subsequently‌ be passed through the‌ same annotation and evaluation‌‌ pipeline, providing a first automated assessment of the‌ quality and consistency of‌ the generated content.

8.5.4‌‌ GAIMHE: Generative AI and Hybrid Models for Education‌

Participants: Pierre-Yves Oudeyer,‌ Olivier Clerc, Hélène‌‌ Sauzéon, EvidenceB .

Context and Objective.

Recent‌ advances in generative AI‌ have opened new possibilities‌‌ for personalized education, but fully LLM-based educational systems‌ raise major concerns in‌ terms of cost, scalability,‌‌ robustness, pedagogical control, and environmental impact. At the‌ same time, classical Intelligent‌ Tutoring Systems (ITS) offer‌‌ strong pedagogical structure and efficiency, but lack flexibility‌ for open-ended interaction and‌ content generation. The GAIMHE‌‌ project aims to design and evaluate hybrid educational‌ architectures that combine the‌ strengths of both approaches:‌‌ frugal and pedagogically robust ITS for macro-level orchestration,‌ and generative AI models‌ (LLMs/SLMs) for micro-level personalization,‌‌ feedback, and content generation. Beyond technical integration, the‌ project also aims to‌ structure an open ecosystem‌‌ of methods, data, and benchmarks to support reproducible‌ and scalable uses of‌ generative AI in education.‌‌

Project Architecture.

The proposed architecture is organized around‌ two complementary modes of‌ use of generative models.‌‌ First, large banks of pedagogical exercises are pre-generated‌ using large language models‌ and then validated by‌‌ human experts and orchestrated by structured teaching algorithms‌ within ITS. This content‌ is stored and reused‌‌ in order to minimize live calls to large‌ models during learner interactions.‌ Second, smaller language models,‌‌ or external APIs to larger proprietary models when‌ needed, are used in‌ real time to provide‌‌ specific feedback. This design ensures personalized support while‌ preserving computational efficiency, pedagogical‌ control, and scalability.
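The two modes of use can be sketched as follows. This is a hypothetical illustration of the design principle, not the GAIMHE implementation: the `HybridTutor` class, its fields, the round-robin exercise selection (a placeholder for the ITS teaching algorithm) and the `echo_model` stand-in for a small language model are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class HybridTutor:
    # Pre-generated, expert-validated exercises, indexed by skill.
    bank: Dict[str, List[str]]
    # Small model (or external API) used only for real-time feedback.
    small_model: Callable[[str, str], str]
    live_calls: int = 0

    def next_exercise(self, skill: str, step: int) -> str:
        """Serve stored content: no model call during learner interaction."""
        exercises = self.bank[skill]
        return exercises[step % len(exercises)]

    def feedback(self, exercise: str, answer: str) -> str:
        """Personalised feedback is the only step that calls a model live."""
        self.live_calls += 1
        return self.small_model(exercise, answer)


def echo_model(exercise: str, answer: str) -> str:
    """Stand-in for a small language model."""
    return f"Check your answer '{answer}' to: {exercise}"


tutor = HybridTutor(
    bank={"fractions": ["1/2 + 1/4 = ?", "2/3 - 1/3 = ?"]},
    small_model=echo_model,
)
```

Counting `live_calls` makes the cost trade-off visible: serving exercises from the bank leaves the counter unchanged, and only feedback requests increment it.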

Data‌‌ Generation and Evaluation Strategy.

A central component of‌ the project concerns the‌ large-scale generation, structuring, and‌‌ validation of pedagogical datasets. This work relies on‌ two existing software tools:‌ Sphinx, an internal platform‌‌ used for the annotation and creation of pedagogical‌ content, and LLM4Humanities, an‌ open-source toolkit providing similar‌‌ functionalities through a Streamlit-based interface. In parallel, the‌ project is developing unified‌ data structures for representing‌‌ exercises and real student learning trajectories collected from‌ educational platforms, with the‌ goal of sharing these‌‌ resources as digital commons through open repositories such‌ as GitHub and Hugging‌ Face. We are also‌‌ developing a web-based visualization platform for exploring learning‌ trajectories and learner profiles,‌ aimed at both researchers‌‌ and non-technical stakeholders. A‌ first prototype of this platform has already been‌ implemented.

8.6 Curiosity-driven learning in educational technologies

Since 2019 (Idex cooperation fund between the University of Bordeaux and the University of Waterloo, Canada), and with the creation of the CuriousTECH associate team in 2022 (led by the Flowers team and involving F. Lotte from the Potioc team and M. Fernandes and E. Law from the University of Waterloo), we have continued our work on the development of new curiosity-driven interaction systems. Substantial progress has been made in this area of application of the FLOWERS team's work (see the website of the CuriousTECH team).

8.6.1 New digital approaches for studying curiosity-driven learning‌

Participants: Hélène Sauzéon [correspondant], Pierre-Yves Oudeyer [correspondant], Rania Abdelghani, Mehdi Alaimi, Fabien Lotte, Aurélien Appriou, Myra Fernandes, Edith Law, Yadurshana Sivashankar.

As curiosity is a recent research topic, we studied some basic mechanisms of curiosity-based learning through three completed studies.

The first study concerns a new interactive educational application to foster curiosity-driven question-asking in children. To improve children’s curiosity, we developed a new interactive system aiming to foster curiosity-related question-asking from texts, as well as children's perception of curiosity. To assess its efficiency, we conducted a study with 95 fifth-grade students from Bordeaux elementary schools. Two types of interventions were designed, one focusing children on the construction of low-level (i.e., convergent) questions and one focusing them on high-level (i.e., divergent) questions, with the help of prompts or question-starter models. We observed that both interventions increased the number of divergent questions and question fluency, but did not significantly improve curiosity perception, despite the high intrinsic motivation scores they elicited in children. The curiosity-trait score positively impacted the divergent question score under the divergent condition, but not under the convergent condition. Overall, the results support the efficiency and usefulness of digital applications for fostering children’s curiosity, which we need to explore further. These results are published at CHI'20 72. In parallel to these first experimental works, we wrote this year a review of the existing work on the subject 80.

The second study investigates the neurophysiological underpinnings of curiosity and the opportunities of their use for brain-computer interaction 74. Understanding the neurophysiological mechanisms underlying curiosity, and therefore being able to identify a person's curiosity level, would provide useful information for researchers and designers in numerous fields such as neuroscience, psychology, and computer science. A first step towards uncovering the neural correlates of curiosity is to collect neurophysiological signals during states of curiosity, in order to develop signal processing and machine learning (ML) tools that distinguish curious states from non-curious ones. We therefore ran an experiment in which we used electroencephalography (EEG) to measure the brain activity of participants as they were induced into states of curiosity using trivia question-and-answer chains. We used two ML algorithms, namely Filter Bank Common Spatial Pattern (FBCSP) coupled with Linear Discriminant Analysis (LDA), and a Filter Bank Tangent Space Classifier (FBTSC), to classify curious EEG signals versus non-curious ones. Global results indicate that both algorithms performed better with 3-to-5 s time windows, suggesting an optimal time window length of 4 seconds for estimating curiosity states from EEG signals. These results have been published 74.

Using a virtual reality device, a third study investigates the role of intrinsic motivation in spatial learning in children 159. In this study, state curiosity is manipulated as a preference for a level of uncertainty during the exploration of new virtual environments. To this end, a series of virtual environments was created and presented to children. During encoding, participants explore routes in environments at three levels of uncertainty (low, medium, and high), using a virtual reality headset and controllers, and are later asked to retrace the routes they travelled. The exploration area and the wayfinding score, i.e., the route overlap between the encoding and retrieval phases (an indicator of spatial memory accuracy), are measured. Neuropsychological tests are also performed. The results showed better performance under the medium uncertainty condition in terms of exploration area and wayfinding score. These first results support the idea that curiosity states are a learning booster. In the study by Sivashankar et al., 10-year-old children (20 females; 22 males) with low to high trait curiosity actively explored virtual environments (Fig. 29) containing varying levels of uncertainty (low, medium, high) (Fig. 30), after which memory for the route travelled was assessed 159.
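One simple way to quantify route overlap between encoding and retrieval is the fraction of encoded route segments that are retraced. The sketch below is an assumed measure for illustration only: the report does not detail how the wayfinding score was computed, and the representation of a route as an ordered list of waypoint labels is hypothetical.

```python
def route_overlap(encoded_route, retraced_route):
    """Fraction of directed segments of the encoded route that appear in the retraced route."""
    # A segment is a pair of consecutive waypoints, in travel order.
    encoded_segments = set(zip(encoded_route, encoded_route[1:]))
    retraced_segments = set(zip(retraced_route, retraced_route[1:]))
    if not encoded_segments:
        raise ValueError("the encoded route needs at least two waypoints")
    return len(encoded_segments & retraced_segments) / len(encoded_segments)
```

With this definition, a perfectly retraced route scores 1.0, and a route that shares only its first segment with a three-segment encoded route scores one third.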

Figure 29‌‌: First-person view and bird’s-eye view of the‌ three styles of virtual‌ environments. Participants only experienced‌‌ the environments from a first-person perspective.

Figure 30: From left to right: Condition‌ 1 with Low Uncertainty‌ (1 character); Condition 2‌‌ with Medium Uncertainty (3 characters); and Condition 3‌ with High uncertainty (7‌ characters)

As trait curiosity increased (Fig. 31), so did memory performance in the high uncertainty condition, suggesting that children with high levels of curiosity can better recruit cognitive resources within such environments. Children with high compared to low curiosity also had higher feelings of presence during the immersive experience. Importantly, in environments with medium uncertainty, children with low trait curiosity were able to perform as well as those with high curiosity. Results suggest that individual differences in trait curiosity influence route memory in environments with varying levels of uncertainty.

Figure 31:‌ Route Memory Score (black circles) and Intrinsic Motivation‌ Score (white circles) in Low-and High-curiosity Groups as‌ a Function of the Three Uncertainty Conditions (Low,‌ Medium and High)

8.6.2 Fostering curiosity and metacognition‌ in classrooms

Participants: Pierre-Yves Oudeyer, Hélène Sauzéon‌ [correspondant], Rania Abdelghani, Chloé Desvaux.‌

Promoting curiosity by supporting divergent thinking

Previous work‌ aimed to propose new educational technologies driven by‌ epistemic curiosity. A central question of this work‌ was to specify the impact of self-questioning aroused‌ by states of curiosity (i.e., the identification of‌ knowledge gaps and formulation of learning goals) on‌ student performance. To this end, a web platform‌ called "Kids Ask" was designed, developed, and tested‌ in primary schools. The tool offered an interaction‌ with a conversational agent that trained children's abilities‌ to generate curiosity-driven questions and use these questions‌ to explore a learning environment and acquire new‌ knowledge. Results from this study suggested that the‌ configuration helped enhance children's questioning and exploratory behaviors;‌ they also showed that learning progress differences in‌ children can be explained by differences in their‌ curiosity-driven behaviors 69.

Figure 32: Illustration of a conversational‌ agent's strategies in the different work spaces of‌ the "Kids Ask" platform

The ability to formulate curiosity-driven questions (i.e., new learning goals) likely relies upon divergent thinking mechanisms, as suggested by literature highlighting links between curiosity and creativity 117, 156. In this regard, a novel version of the Kids Ask training was proposed and tested in a field study involving a total of 130 children aged 9 to 11 years. These experiments aimed to further assess the interplay between curiosity and creativity in question-asking behaviors. Drawing from the creativity literature, we examined the process of question formulation through the associative thinking involved in creativity. To do so, the conversational agent's behavior in "Kids Ask" was modified to prompt children to identify important keywords from a text, then generate free associations based on their prior knowledge. Given the intricate interplay between curiosity and creativity, it was hypothesized that this associative guidance would further enhance children's ability to formulate divergent, curiosity-driven questions (as shown in Figure 33).

Figure 33: Illustration of a conversational agent's behavior in the question-asking workspace of the "Kids Ask" platform

Promoting curiosity and‌ metacognition in authentic settings

Curiosity-driven learning is crucial‌ for academic achievement and autonomous learning, yet remains‌ scarce in primary classrooms. Building on our previous work with the IGSA‌ framework (Identify-Guess-Seek-Assess) introduced in‌ 68, we developed‌‌ a training paradigm that teaches curiosity-driven learning through‌ metacognitive skills training. This‌ approach leverages Murayama's framework‌‌ 133 by personifying the four basic metacognitive skills‌ as animated characters: the‌ referee (identify knowledge gaps),‌‌ the detective (formulate predictions), the explorer (seek information),‌ and the second referee‌ (assess information quality).

Figure 34: Curiosity-driven learning framework and link with the metacognitive skills we propose to train as facilitators during our IGSA-based intervention

The two-part intervention combined‌ declarative knowledge about curiosity‌‌ and metacognition with procedural training of the four‌ metacognitive strategies. The first‌ step consisted of animated‌‌ videos explaining key concepts related to curiosity, metacognition,‌ and the four skills‌ through 2D characters. The‌‌ second step involved the "Kids Reflect" web-based platform,‌ where conversational agents with‌ the same appearance and‌‌ roles as the video characters prompted children to‌ use these skills appropriately‌ during reading-comprehension tasks (see‌‌ figure below).

Figure 35: Screenshot of the "Kids Reflect" platform during the training, given one text

Our earlier pilot‌‌ studies with small classroom samples demonstrated the accessibility‌ and positive impact of‌ this training on metacognitive‌‌ efficiency, curiosity-driven question-asking, and learning outcomes. These promising‌ initial results motivated a‌ larger-scale validation study to‌‌ assess both the intervention's effectiveness and its scalability‌ in authentic educational settings.‌

Study design and implementation‌‌

Assessing scalability implies evaluating the intervention's effectiveness when teachers implement it themselves in their classrooms. Therefore, in a field study conducted with 159 students aged 9-10 years and 4 teachers across five elementary schools in Bordeaux Métropole, the multimedia-based metacognitive intervention was tested using a pseudo-RCT design in collaboration with the Académie de Bordeaux. Three main experimental conditions were compared: intervention led by researchers, intervention led by trained in-service teachers, and a control group. Additionally, complete and partial versions of the intervention were contrasted. Prior to the intervention, teachers attended short training sessions that covered the concepts of curiosity and metacognition and familiarized them with the format and content, enabling them to implement the intervention autonomously in their classrooms during regular school hours.

Main findings

Results‌ demonstrated that intervention groups‌‌ significantly improved their divergent question-asking abilities and developed‌ more positive perceptions of‌ curiosity compared to the‌‌ control group. Importantly, this was the case in‌ the ecological setting of‌ classrooms where teachers managed‌‌ the intervention themselves, but also with a lighter‌ easy-to-implement version of the‌ training (see figure below).‌‌

Figure 36: Post-intervention scores on a divergent question-asking fluency test

However, nuanced findings emerged regarding teacher delivery conditions. Teacher-led groups showed lower performance during the intervention and poorer learning outcomes, alongside higher cognitive load, compared to researcher-led groups. This suggests that while the intervention can be effectively scaled to teacher-led implementations, there is room for improvement. This point was further informed by qualitative interviews conducted with volunteer teachers who had facilitated the intervention in the study. Teachers rated the intervention highly on acceptability and usefulness, recognizing its pedagogical value. However, they provided lower ratings on usability, citing the complexity of metacognitive concepts and digital interface challenges as primary obstacles. The impact of these lower usability reports on students' performance highlights critical considerations for scaling educational interventions. While teachers appreciated the theoretical foundations and goals of the training, the cognitive demands of simultaneously managing complex pedagogical concepts and digital tools during classroom implementation appeared to affect their delivery quality, which in turn influenced student outcomes.

Implications and future directions‌

Together, these findings demonstrate that the metacognitive intervention‌ can enhance curiosity-driven learning in authentic classroom settings.‌ However, successful scaling requires strengthened teacher training. Future‌ iterations of this work will focus on simplifying‌ the intervention, providing more comprehensive teacher training programs,‌ and developing materials increasing perceived usability for teachers‌ as a way to favor adoption of such‌ workshops. In response to these identified needs, we‌ initiated in 2025 the creation of comprehensive resources‌ for teachers around metacognitive interventions, motivation, and curiosity-driven‌ learning. This development work focuses on providing teachers‌ with accessible, evidence-based materials that bridge the gap‌ between research findings and classroom practice. The resources‌ include short, evidence-based exercises designed for direct implementation‌ in the classroom, accompanied by detailed recommendations and‌ pedagogical guidance. These materials aim to reduce the‌ cognitive load on teachers by providing ready-to-use activities‌ while maintaining the theoretical rigor and pedagogical effectiveness‌ demonstrated in our research. The exercises are structured‌ to be modular and adaptable to different classroom‌ contexts, addressing the complexity concerns raised by teachers‌ in our scalability study.

This latter point contributes‌ to a broader research agenda of developing practical‌ teacher resources on curiosity-driven learning in educational settings‌ as a way to bridge the research-to-practice gap‌ in educational interventions focused on curiosity and metacognition.‌

8.6.3 Machine Learning for Adaptive Personalization in Intelligent‌ Tutoring Systems

Participants: Pierre-Yves Oudeyer [correspondant], Hélène Sauzéon [correspondant], Benjamin Clément, Didier Roy, Cécile Mazon.

The Kidlearn project.

Kidlearn is a research project studying how machine learning can be applied to intelligent tutoring systems (ITS). It aims at developing methodologies and software which adaptively personalize sequences of learning activities to the particularities of each individual student. Our systems aim at proposing to the student the right activity at the right time, concurrently maximizing their learning progress and motivation. In addition to contributing to the efficiency of learning and motivation, the approach is also designed to reduce the time needed to design ITS.

We continued to develop an approach to Intelligent Tutoring Systems which adaptively personalizes sequences of learning activities to maximize the skills acquired by students, taking into account their limited time and motivational resources. At a given point in time, the system proposes to each student the activity that makes them progress fastest. We introduced two algorithms that rely on the empirical estimation of learning progress: RiARiT, which uses information about the difficulty of each exercise, and ZPDES, which uses much less knowledge about the problem.

The system is based on the combination of three approaches. First, it leverages recent models of intrinsically motivated learning by transposing them to active teaching, relying on empirical estimation of the learning progress provided by specific activities to particular students. Second, it uses state-of-the-art Multi-Armed Bandit (MAB) techniques to efficiently manage the exploration/exploitation challenge of this optimization process. Third, it leverages expert knowledge to constrain and bootstrap the initial exploration of the MAB, while requiring only coarse guidance from the expert and allowing the system to deal with didactic gaps in its knowledge. The system was evaluated in several large-scale experiments relying on a scenario where 7-8 year old schoolchildren learn how to decompose numbers while manipulating money 87. Systematic experiments with simulated students were also presented.
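The idea of selecting activities according to empirically estimated learning progress can be illustrated with a minimal sketch. It is not the RiARiT or ZPDES implementation: the class name, the window size, the exploration rate and the way learning progress is computed (absolute change in success rate between the older and the newer half of a sliding window) are all assumptions made for illustration, and the expert-defined constraints on exploration mentioned above are omitted.

```python
import random
from collections import deque


class LearningProgressBandit:
    """Toy bandit that favours activities with high recent learning progress.

    Hypothetical illustration only; not the ZPDES or RiARiT code.
    """

    def __init__(self, activities, window=6, exploration=0.2, seed=0):
        self.activities = list(activities)
        self.exploration = exploration  # share of probability spread uniformly
        # Keep only the most recent `window` answers for each activity.
        self.history = {a: deque(maxlen=window) for a in self.activities}
        self.rng = random.Random(seed)

    def learning_progress(self, activity):
        """Absolute change in success rate between older and newer answers."""
        answers = list(self.history[activity])
        half = len(answers) // 2
        if half == 0:
            return 0.0
        older, newer = answers[:half], answers[-half:]
        return abs(sum(newer) / half - sum(older) / half)

    def probabilities(self):
        """Mix sampling proportional to learning progress with uniform exploration."""
        progress = [self.learning_progress(a) for a in self.activities]
        total = sum(progress)
        n = len(self.activities)
        if total == 0:
            return [1.0 / n] * n
        return [(1 - self.exploration) * p / total + self.exploration / n
                for p in progress]

    def choose(self):
        """Draw the next activity to propose."""
        return self.rng.choices(self.activities, weights=self.probabilities(), k=1)[0]

    def update(self, activity, success):
        """Record whether the student succeeded on the proposed activity."""
        self.history[activity].append(1 if success else 0)


bandit = LearningProgressBandit(["easy", "medium", "hard"])
for outcome in [1, 1, 1, 1, 1, 1]:   # already mastered: no progress
    bandit.update("easy", outcome)
for outcome in [0, 0, 0, 1, 1, 1]:   # currently improving: high progress
    bandit.update("medium", outcome)
print(bandit.probabilities())
```

In this toy run, the activity on which the simulated student is improving receives most of the probability mass, while mastered or untried activities are still sampled occasionally through the uniform exploration term.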

Kidlearn‌ Experiments 2018-2019: Evaluating the‌‌ impact of ZPDES and choice on learning efficiency‌ and motivation.

An experiment was held between March 2018 and July 2019 to test the Kidlearn framework in classrooms in Bordeaux Métropole, with 600 participating students. This study had several goals. The first goal was to evaluate the impact of the Kidlearn framework on motivation and learning compared to an Expert Sequence without machine learning. The second goal was to observe the impact of using learning progress (LP) to select exercise types within the ZPDES algorithm compared to a random policy. The third goal was to observe the impact of combining ZPDES with the ability to let children make different kinds of choices while using the ITS. The last goal was to use the psychological and contextual measures to see whether correlations can be observed between the evolution of students' psychological state, their profile, their motivation and their learning. We first show that LP-based personalization improves learning performance (reproducing and solidifying previous results) while producing a positive and motivating learning experience. We then show that the addition of self-choice as a playful feature triggers intrinsic motivation in the learner and reinforces the learning effectiveness of LP-based personalization. In doing so, it strengthens the links between intrinsic motivation and performance progress during the serious game. Conversely, deleterious effects of the playful feature are observed for hand-designed linear paths. Thus, the intrinsic motivation elicited by a playful feature is beneficial only if the curriculum personalization is effective for the learner. Such a result deserves great attention given the increased use of playful features in non-adaptive educational technologies available on the market. Details of these new results, as well as the overall results of this project, are presented in Benjamin Clément's PhD thesis 86 and are being prepared for publication.

Kidlearn and Adaptiv'Math.

The algorithms developed during the Kidlearn project and Benjamin Clément's thesis 86 are being used in an innovation partnership for the development of an artificial-intelligence-based pedagogical assistant intended for teachers and students of cycle 2. The algorithms are being written in TypeScript for the needs of the project. The team's expertise in creating the pedagogical graph and defining the graph parameters used by the algorithms is also a crucial part of its role in the project. One of the team's main goals here is to transfer technologies it has developed into a project with the prospect of industrial scaling, and to assess the impact and feasibility of such scaling.

Kidlearn for numeracy skills with‌ individuals with autism spectrum disorders.

Few digital interventions targeting numeracy skills have been evaluated with individuals with autism spectrum disorder (ASD) 128, 127. Yet some children and adolescents with ASD have learning difficulties and/or a significant academic delay in mathematics. While ITS have been successfully developed for typically developing students to personalize the learning curriculum and thereby foster the motivation-learning coupling, they are rarely, if ever, offered to students with specific needs. The objective of this pilot study is to test the feasibility of a digital intervention using an ITS with high school students with ASD and/or intellectual disability (ID). This application (KidLearn) provides calculation training through currency exchange activities, with a dynamic exercise sequence selection algorithm (ZPDES). 24 students with ASD and/or ID enrolled in specialized classrooms were recruited and divided into two groups: 14 students used the KidLearn application, and 10 students received a control application. Pre-post evaluations show that students using KidLearn improved their calculation performance and had a higher level of motivation at the end of the intervention than the control group. These results encourage the use of an ITS with students with specific needs to teach numeracy skills, but need to be replicated on a larger scale. Adjustments to the interface and teaching method are suggested to improve the impact of the application on students with autism 125.

8.6.4 Machine learning for adaptive cognitive training‌

Participants: Pierre-Yves Oudeyer, Hélène Sauzéon [correspondant],‌ Masataka Sawayama, Benjamin Clément, Maxime Adolphe‌, Marion Pech, Juliette Deyts.

Because it cuts across all cognitive activities, including learning tasks, attention is a hallmark of good cognitive health throughout life, all the more so in the current context of a societal crisis of attention. Recent work has shown the great potential of computerized attention training, with efficient transfer of training to other cognitive activities across a wide spectrum of individuals (children, older adults, individuals with cognitive pathologies such as Attention Deficit Hyperactivity Disorder). Despite these promising results, a major hurdle remains: the high inter-individual variability in response to such interventions. Some individuals are good responders (significant improvement), others respond variably, and some respond poorly, occasionally, or not at all. A central limitation of computerized attention training systems is that the training sequences operate in a linear, non-personalized manner: difficulty increases in the same way and along the same dimensions for all subjects. However, different subjects in principle require progression at a different, personalized pace along the different dimensions that characterize attentional training exercises.

To tackle the issue of inter-individual variability, the present project proposes to apply some principles from intelligent tutoring systems (ITS) to the field of attention training. In this context, we have already developed automatic curriculum learning algorithms, such as those of the KidLearn project, which customize the learner's path according to his/her progress and thus optimize his/her learning trajectory while stimulating his/her motivation through the progress made. ITS are widely identified in intervention research as a successful way to address the challenge of personalization, but to date no studies have been conducted for attention training. Thus, whether ITS, and in particular personalization algorithms, can increase the number of responders to an attention training program remains an open question.

Grounded‌‌ state-of-the-art.

To investigate this question, we first conducted a systematic review exploring existing methods in computerized cognitive training (CT) and analyzing their outcomes in terms of learning mechanics (intra-training performance) and effectiveness (near, far and everyday-life transfer effects of CT) 71. A search of multiple databases up to June 2023 identified 19 computerized CT studies; only two emphasized the favorable influence of individualization on CT effectiveness, while five underscored its capacity to enhance the training experience by boosting motivation, engagement, and offering diverse learning pathways. In sum, despite promising results in this new research avenue, more research is needed to fully understand and empirically support individualized techniques in cognitive training.

Figure 37‌: Distribution of AI‌ techniques depending on type‌‌ of CT studied (multi or single domain) from‌ Adolphe et al., 2024‌

Complementing the study of adaptive methods applied to cognitive training, we have sought, through a review of the literature, to gain a better understanding of the Multiple Object Tracking (MOT) task, which seems to yield the best results in terms of attentional training efficiency in young and older adults. Our investigation pursues three main objectives: (1) identifying the cognitive processes influenced by each adjustable parameter of the MOT task; (2) determining which parameters, when progressively adapted during repeated MOT practice, produce the greatest enhancements in task performance; and (3) evaluating how improvements in MOT performance translate into effective transfer effects, including practical, real-world outcomes. The evidence suggests that the MOT task involves a nuanced interplay of visual processing, attentional resources, and working memory, shaped by the intrinsic properties of the objects and the task conditions. The results of this work highlight that: (1) multiple cognitive mechanisms are identified as active in the task (divided and sustained attention; foveal and peripheral attention; automatic and controlled inhibition, etc.); (2) a limited number of studies have actually implemented the MOT task in computer-assisted cognitive training; and (3) near (attention tasks) and far (other cognitive tasks) transfer effects are well documented as positive outcomes of MOT-based training, while little research has thoroughly analyzed the ecological effects of attentional training, namely the potential transfer effects in everyday life (paper in progress).

ZPDES calibration for MOT‌ training (Young participants).

In parallel, a web platform has been designed for planning and implementing remote behavioural studies. This tool provides means for registering recruited participants remotely and executing complete experimental protocols: from presenting instructions and obtaining informed consent, to administering behavioural tasks and questionnaires, potentially across multiple sessions spanning days or weeks. In addition to this platform, a cognitive test battery composed of seven classical behavioural tasks has been developed. This battery aims to evaluate the evolution of participants' cognitive performance before and after training. Fully open-source, it mainly targets attention and memory. A preliminary study on a large sample of 50 healthy participants showed that the developed tasks reproduced the results of previous studies, that there were large differences between individuals (no ceiling effect), and that results were reliable between two measurements taken on two days separated by one night 4.

Randomized controlled trial in young and older adults: predefined vs. ZPDES condition.

Using these tools, a pilot study campaign was conducted to evaluate the impact of our AI-based personalized cognitive training program. The first pilot experiment involved n=27 participants and aimed to compare the effectiveness of a cognitive training program using a linear difficulty management procedure (staircase procedure) to a program using an ITS for difficulty manipulation. The online training lasted for 10 hours over a period of 2 weeks. The results indicated that the ITS-based intervention produced diverse learning trajectories compared to the linear procedure (Figure 38), leading to broader improvements in pre-post cognitive assessment. However, no significant differences were observed in subjective measures of motivation and engagement between the two groups. Following this initial experiment, two pilot studies (n=11 and n=10, respectively) were conducted with the goal of enhancing motivation and engagement in the game. The first study implemented gamified components such as scores and feedback, while the second study examined hyperparameter updates to the ITS. The analysis of learning trajectories, learning outcomes, and subjective measures yielded promising results in favor of the AI-based personalized procedure.
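For readers unfamiliar with staircase procedures, the following sketch shows one generic variant: difficulty rises by one level after a fixed number of consecutive correct answers and drops by one level after any error. The report does not specify the rule used in the control condition, so the step rule, the level bounds and the function name are assumptions made for illustration.

```python
def staircase(responses, start=2, lowest=1, highest=7, correct_to_step_up=2):
    """Return the difficulty level presented on each trial.

    Generic rule (an assumption, not the study's exact procedure):
    step up after `correct_to_step_up` consecutive correct answers,
    step down after any error, staying within [lowest, highest].
    """
    level, streak, levels = start, 0, []
    for correct in responses:
        levels.append(level)  # level shown on this trial
        if correct:
            streak += 1
            if streak == correct_to_step_up:
                level = min(highest, level + 1)
                streak = 0
        else:
            level = max(lowest, level - 1)
            streak = 0
    return levels


print(staircase([True, True, True, False, True, True]))
# → [2, 2, 3, 3, 2, 2]
```

Every participant follows the same rule along a single difficulty dimension, which is what makes the procedure linear and non-personalized compared with an ITS that chooses among several parameters.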

Figure 38: Different‌‌ learning trajectories for a selected participant in the‌ staircase group (left) and‌ the ITS group (right).‌‌ The color of a dot indicates the initial‌ presentation of the parameter‌ value, while the size‌‌ of the dot represents the frequency of the‌ parameter value.

Building on‌ the preliminary findings, we‌‌ expanded our research scope with a more comprehensive‌ experimental setup involving two‌ distinct studies. The first‌‌ study encompassed 64 young adults, sourced through the‌ Prolific platform, while the‌ second study consisted of‌‌ 50 older adults, recruited from the "Université du‌ temps libre". Our experimental‌ methodology mirrored that of‌‌ our initial pilot studies, with a notable enhancement:‌ the integration of new‌ gamified elements (including mini-story‌‌ creation and new visual content) aimed at boosting‌ participant motivation and engagement.‌

Figure 39: (a) The MOT task. (b) Several visual snapshots of the intervention. (c) Schedule proposed to participants

The data analysis‌‌ encompassed three primary dimensions: initially, an exploratory phase‌ to delineate learning trajectories‌ between control and intervention‌‌ groups; subsequently, a comparative analysis of pre- and‌ post-test performance on the‌ cognitive battery; and lastly,‌‌ an examination of participants' self-reported experiences during training,‌ providing insights into their‌ subjective perceptions of the‌‌ experiment.

The pilot studies' preliminary outcomes were corroborated‌ in these larger sample‌ groups. Notably, learning trajectories‌‌ exhibited greater diversity in the group undergoing the‌ intervention procedure. This group‌ also demonstrated a more‌‌ pronounced improvement across a wider range of cognitive‌ assessment tasks. Although participants‌ engaging in the personalized‌‌ cognitive training reported a higher cognitive load via‌ questionnaires, the levels of‌ engagement and frustration did‌‌ not significantly differ between the two groups.

The‌ results showed that ZPDES‌ could be more effective‌‌ than a control condition, with improved performance on‌ trained tasks in both‌ studies, underlining the benefits‌‌ of individualized training paths. However, motivation and engagement‌ were lower in the‌ groups using ZPDES, probably‌‌ due to cognitive load and metacognitive factors. Overall,‌ individualizing cognitive training through‌ systems like ZPDES provides‌‌ a promising direction for future research by providing‌ automatic methods for taking‌ individual differences into account‌‌ in CT programs while respecting methodological standards for‌ evaluating the effectiveness of‌ CT. As a result,‌‌ our work contributes to the growing body of‌ knowledge in both ITS‌ and CT domains while‌‌ stressing the crucial role of challenges related to‌ motivation and engagement to‌ optimize the effectiveness of‌‌ these individualized approaches for cognitive and educational outcomes.‌

As part of the‌ creation of the new‌‌ University Hospital Institute (UHI) VBHI (VASCULAR BRAIN HEALTH‌ INSTITUTE), we aim to‌ develop and test a‌‌ personalized, multimodal digital therapeutic‌ approach to slow down the functional consequences of‌ small vessel disease. More specifically:

Evaluate the impact‌ of personalized cognitive training compared to non-personalized conditions‌ (comparative efficacy).
Identify potential ElectroEncephaloGraphic (EEG) biomarkers that‌ reflect cognitive activity impacted by small vessel disease‌ and could later (in a subsequent study) be‌ used as targets for exploratory EEG neurofeedback therapy.‌
Identify brain areas to target for delivering non-invasive‌ HD-tACS electrical stimulation, using previously acquired MRI data.‌
Evaluate the impact of this stimulation on brain‌ activity, neural synchronization, and cognitive performance.

To achieve this, 80 participants from the SHIVA cohort will be divided into two subgroups according to the severity of the disease:

Severe group: presenting‌ multiple lesions on MRI
Non-severe group: presenting a‌ few lesions

These groups will then be further divided based on the type of training: personalized training (ZPDES) versus standard training.
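The resulting two-by-two design (severity by training type) can be sketched as follows. The report does not state how participants are assigned to a training arm, so this example assumes a balanced random allocation within each severity group; the function name, the participant identifiers and the 30/50 severity split are purely hypothetical.

```python
import random
from collections import Counter


def allocate(participants, seed=0):
    """Assign each participant to a training arm, balanced within severity group.

    `participants` maps a participant id to "severe" or "non-severe".
    Balanced random allocation is an assumption made for illustration.
    """
    rng = random.Random(seed)
    allocation = {}
    for severity in ("severe", "non-severe"):
        ids = sorted(pid for pid, s in participants.items() if s == severity)
        rng.shuffle(ids)
        for rank, pid in enumerate(ids):
            arm = "ZPDES" if rank % 2 == 0 else "standard"
            allocation[pid] = (severity, arm)
    return allocation


# Hypothetical cohort of 80 participants with an illustrative 30/50 split.
cohort = {f"P{i:02d}": ("severe" if i < 30 else "non-severe") for i in range(80)}
cells = Counter(allocate(cohort).values())
for cell, size in sorted(cells.items()):
    print(cell, size)
```

Alternating arms after shuffling keeps the two training conditions the same size within each severity group, whatever the actual split between severe and non-severe participants turns out to be.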

Figure 40: SHIVA study protocol and materials (panels a and b).

During the pre- and post-training‌ sessions, participants will perform cognitive tests on a‌ computer. Participants will be equipped with an EEG‌ headset, which, combined with a tACS stimulator, will‌ allow for both brain activity recording and stimulation.‌

We are carrying out an ancillary study with Myra Fernandez's laboratory in Canada, through our participation in the Inria Curiositytech international associate team. We have proposed to collaboratively analyze certain data and dimensions of interest to our respective laboratories (e.g. physical activity) associated with the cognitive training proposed in the SHIVA-DTX-COG project.

Qualitative Analysis with‌ LLMs:

As dropouts are known to be more frequent among older adults than among young ones, we aimed to better understand the learning experience of trainees through analyses of their feedback. To this end, we designed a new method relying on several Large Language Models (LLMs) to extract the main topics, or the main reasons for dropping out, from verbatim comments relating to the pragmatic, hedonic and/or aesthetic dimensions of cognitive training. The results obtained with the various LLMs are encouraging (paper in progress). To support this new approach, we are exploring different prompts on other data corpora in order to ultimately propose a tutorial accessible to anyone wishing to carry out an LLM-based thematic qualitative analysis.
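As an illustration of what a prompt for such a thematic analysis might look like, the sketch below assembles participant comments into a single instruction text. It does not reproduce the prompts used in the study, which are not given in this report, and it calls no model: the wording, the function name and the two sample comments are invented for the example. Only the three dimension labels come from the text above.

```python
DIMENSIONS = ("pragmatic", "hedonic", "aesthetic")


def build_prompt(verbatims, dimensions=DIMENSIONS):
    """Assemble a plain-text prompt asking a model to code each comment.

    Hypothetical wording; the actual study prompts are not published here.
    """
    lines = [
        "You are assisting a thematic qualitative analysis of feedback "
        "from participants in a cognitive training program.",
        "For each comment, list the main topics and label each topic with "
        "one of these dimensions: " + ", ".join(dimensions) + ".",
        "Also state whether the comment expresses a reason for dropping out.",
        "",
        "Comments:",
    ]
    # Number the comments so that the model's answers can be matched back.
    for index, text in enumerate(verbatims, start=1):
        lines.append(f"{index}. {text.strip()}")
    return "\n".join(lines)


# Invented sample comments, for illustration only.
print(build_prompt(["The sessions felt too long.", "I liked the visuals."]))
```

Keeping the prompt construction in a small function makes it easy to compare several prompt variants, or several models, on the same set of comments.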

8.6.5 ToGather: Interactive website to foster collaboration among stakeholders of school inclusion for pupils with neurodevelopmental disorders

Participants: Hélène Sauzéon [correspondant], Cécile‌ Mazon, Eric Meyer, Isabeau Saint-Supery,‌ Christelle Maillart [Uni. Liège, Belgium], Kamélia Belassel‌, Mathieu Périé, Valentin Strahm.

Sustaining and supporting the follow-up of the school inclusion of children with neurodevelopmental disorders (e.g., autism, attention disorders, intellectual disabilities) has become urgent: the higher the school level, the lower the number of pupils with cognitive disabilities in school.

Technology-based interventions to improve school inclusion of children with neurodevelopmental disorders have mostly been individual-centered, focusing on their socio-adaptive and cognitive impairments and implying that they have to adapt themselves in order to fit our society's expectations. Although this approach centered on the normalization of the person has some advantages (reduction of clinical symptoms), it carries social stereotypes and misconceptions of cognitive disability that are not respectful of the cognitive diversity and intrinsic motivations of the person, and in particular of the student's wishes in terms of school curriculum to achieve his or her future life project 129.

The "ToGather" project aims to shed new light on the field of educational technologies for special education by proposing an approach centered on the educational needs of the students and bringing a concerted and informed answer from all the stakeholders, including the student and all their support spheres (family, school, medico-social care). To this end, the ToGather project, which emanates from participatory design methods, has primarily consisted of developing a pragmatic tool (an interactive website) to help students with cognitive disabilities and their caregivers formalize and visualize the student's repertoire of academic skills, and make it evolve according to his or her zone of proximal development (in the sense of Vygotsky) on the one hand, and to the intrinsic motivations of the student (his or her own educational and life project) on the other 126.

This project is conducted in partnership with the School Academy of Bordeaux of the French Ministry of Education, the ARI association, and the Centre of Autism of Aquitaine. It is funded by FIRAH (foundation) and the Nouvelle-Aquitaine Region (see the dedicated webpages).

First, usability studies were conducted to evaluate the ergonomic qualities of the ToGather website, yielding positive results in French and Belgian contexts. Then, we conducted a large field study to assess the effectiveness of the tool in helping stakeholders support children with neurodevelopmental disorders (NDD) 155, 153, 154.

The study protocol consisted of a longitudinal non-randomized controlled trial, with baseline, 3-month, and 6-month follow-up assessments. Recruitment was conducted across the entire French territory. Our local partners facilitated the dissemination of the call for participation in Gironde and provided us with contacts to extend it to other regions. Additionally, a recruitment campaign was carried out on social media to communicate about the study and encourage participants to test the ToGather tool.

As the tool was designed to support the co-educational process between parents and professionals, a support team had to consist of at least two stakeholders, including at least one of the parents. Initially, 157 participants were recruited in 37 support teams, but 30 individuals did not complete the baseline questionnaire, leading to the exclusion of 11 support teams. After baseline assessment, 13 support teams were allocated to the experimental condition (ToGather app) and 11 to the control condition (usual follow-up).

Primary outcome measures covered stakeholders’ relationships, self-efficacy, and attitudes towards inclusive education, while secondary outcome measures were related to stakeholders’ burden and quality of life, as well as children’s school well-being and quality of life.

As the study ended recently, data analysis is still ongoing. Preliminary results after 3 months of use are encouraging, showing an improvement in communication between stakeholders and in their respective quality of life (paper in progress).

8.6.6 Curious and therefore not overloaded: Study of the links between curiosity and cognitive load in learning mediated by immersive technologies

Participants: Hélène Sauzéon [correspondant]‌, Matisse Poupard, André Tricot [Cosupervisor -‌ Univ. Montpellier], Florian Larrue [Industrialist - Le‌ Catie].

Conducted in collaboration with CATIE (industrial partner) and the EPSYLON laboratory at the University of Montpellier (under the supervision of Prof. André Tricot), this PhD research program was initiated in April 2022, and the resulting thesis was defended on September 11th, 2025. It pursued two main objectives:

To establish‌ theoretical links between cognitive load theory and models‌ of curiosity-driven learning.
To experimentally examine how the‌ choice of educational technology modulates the relationship between‌ pedagogical approaches (guided instruction vs. exploration) and learner‌ expertise.

To address these objectives, the thesis was‌ structured into three main phases.

Literature Review.

A‌ systematic review examining the contributions and limitations of‌ Virtual Reality (VR) and Augmented Reality (AR) for‌ learning was conducted, with a specific focus on‌ their effects on cognitive load and intrinsic motivation.‌ This review identified both the pedagogical potential of‌ immersive technologies and persistent methodological limitations in the‌ field, particularly regarding the measurement of motivation and‌ cognitive processes. The results were published in the‌ British Journal of Educational Technology (BJET) 39.‌

Experimental Research in XR-Based Anatomy Learning.

Two experimental‌ studies were conducted in 2023 with 131 second-year‌ medical students and replicated in 2024 with 164‌ medical students from the second to fifth years.‌

The first experiment investigated whether supporting students’ drawing‌ activity during lectures using augmented and mixed reality‌ could reduce cognitive load and enhance motivation. Participants‌ followed a 20-minute neuroanatomy video lecture while simultaneously‌ reproducing drawings demonstrated by the instructor. Four experimental‌ conditions were compared:

Spatial Augmented Reality (SAR): A‌ digital overlay of the anatomical structure was projected‌ onto paper, allowing learners to trace it using‌ a projector and tracking system.
Mixed Reality (MR):‌ The digital overlay was displayed through a HoloLens‌ 2 headset.
Mixed Reality with 3D Model (MR+3D):‌ In addition to the digital overlay, learners could‌ manipulate a 3D anatomical model.
Control Condition: No‌ digital overlay was provided.

Figure 41: Experimental conditions for experiment 1: Support Drawing with Augmented Reality

Results from the 2023 dataset showed that both AR- and MR-supported drawing conditions significantly reduced extraneous cognitive load, increased intrinsic motivation, and improved drawing accuracy. However, no significant differences in knowledge acquisition were observed between conditions. Notably, in the stereoscopic 3D visualization condition, learners with higher intrinsic motivation exhibited poorer learning outcomes, possibly due to increased attentional focus on system interaction rather than conceptual understanding. Visuospatial ability and prior knowledge moderated the effectiveness of AR and MR interventions, with more experienced learners benefiting the most. These results are reported in a manuscript currently under review in the Journal of Computing in Higher Education 65.

The second‌ experiment explored a different‌‌ learning paradigm using virtual reality (VR), manipulating levels‌ of interactivity and instructional‌ guidance. This design enabled‌‌ the examination of how exploration and embodied interaction‌ with a 3D anatomical‌ model affect learning outcomes,‌‌ cognitive load, and curiosity.

Figure 42: Experimental conditions for experiment 2: Embodied learning in virtual reality, effect of interactivity

Analyses, published in Computers & Education 38, showed that VR conditions led to superior learning performance, particularly in the passive and active interaction conditions. These conditions were associated with higher intrinsic motivation and a more optimized cognitive load profile. Moreover, intrinsic motivation was positively correlated with germane cognitive load (i.e., cognitive resources devoted to learning) and negatively correlated with extraneous cognitive load. In other words, highly motivated learners experienced fewer irrelevant cognitive demands, allowing them to allocate more resources to meaningful learning processes.

Following the systematic review, which highlighted the lack of reliable and context-sensitive measures of intrinsic motivation in XR research, a third study leveraged a key affordance of VR: the continuous collection of behavioral data. By analyzing head and hand movements during the neuroanatomy learning task, this study aimed to identify implicit behavioral indicators of curiosity and cognitive engagement. Results showed that increased hand movement was associated with lower intrinsic motivation, whereas greater head movement was positively associated with both germane cognitive load and intrinsic motivation, suggesting deeper cognitive engagement. Additionally, movement entropy emerged as a significant predictor of curiosity-driven learning, highlighting its potential as an implicit marker of learning-related behaviors in immersive environments. These findings are presented in a manuscript currently under review in the International Journal of Human–Computer Studies 37.

Figure 43: Illustration‌ of movement entropy calculations‌ in the virtual environment.‌‌ The top section depicts the three spatial axes‌ used to track hand‌ and head positions, as‌‌ well as head rotation along the X, Y,‌ and Z axes. The‌ bottom section presents heatmaps‌‌ of time spent by participants' heads in different‌ spatial regions, visualizing entropy‌ patterns in 2D space.‌‌
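
As a rough illustration of the kind of computation depicted in Figure 43, the sketch below estimates a spatial entropy from tracked 2D head positions by binning them into a grid and taking the Shannon entropy of the occupancy distribution. The function name, grid bounds, and number of bins are illustrative assumptions; the exact entropy definition used in the study is not specified in this report.

```python
import math
from collections import Counter

def movement_entropy(positions, bounds=(-1.0, 1.0), bins=8):
    """Shannon entropy (in bits) of the 2D occupancy histogram of positions.

    positions: iterable of (x, y) samples taken at a fixed sampling rate, so
    that the count in each cell is proportional to the time spent there.
    bounds and bins are hypothetical settings chosen for illustration only.
    """
    lo, hi = bounds
    width = (hi - lo) / bins

    def cell(v):
        # Clamp to the grid so samples on or beyond the border stay in range.
        return min(bins - 1, max(0, int((v - lo) / width)))

    counts = Counter((cell(x), cell(y)) for x, y in positions)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Under these assumptions, a head that never leaves one cell yields an entropy of 0 bits, while time spread evenly over four cells yields 2 bits; higher values correspond to more dispersed exploration of the space.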

Toward a Unified Model: Cognitive Load, Motivation, and‌ Expertise.

Building on the‌ empirical results of the‌‌ previous studies, which revealed dynamic interactions between cognitive‌ load and intrinsic motivation,‌ this final phase addressed‌‌ the second overarching objective of the thesis: the‌ empirical integration of cognitive‌ load theory and the‌‌ Learning Progress Hypothesis. Using structural equation modeling (SEM),‌ this study tested a‌ comprehensive model describing the‌‌ relationships between XR technologies, cognitive load, intrinsic motivation,‌ perceived learning progress, and‌ learning outcomes.

Results indicated‌‌ that both AR and‌ VR significantly reduced extraneous cognitive load and increased‌ intrinsic motivation. However, intrinsic motivation did not directly‌ predict immediate learning performance. Instead, extraneous cognitive load‌ negatively affected perceived learning progress and autonomy, which‌ in turn predicted intrinsic motivation, revealing a key‌ mediating pathway.

Figure‌ 44: Resulting model from the SEM.

Overall, these findings demonstrate that unnecessary cognitive demands not only hinder learning efficiency but also disrupt learners’ perceived progress and sense of control, thereby undermining curiosity and intrinsic motivation. This work contributes to a unified theoretical framework by showing how optimizing extraneous cognitive load in XR environments supports both cognitive efficiency and curiosity-driven learning. The results are presented in a manuscript currently under review in Educational Psychology Review 66.

Effect of XR‌ technology displays on everyday memory

In addition to this work, an augmented reality (AR) application simulating a museum visit was co-designed (co-led with P. Dragicevic, Bivouac, under the I-am AEx project, 2023) and combined with an evaluation of involuntary and uncontrollable memory revival. This work provided the original finding that AR enhances this type of memory compared to 3D images 53, suggesting potential cognitive manipulations with AR (Honorable Mention at CHI 2025).

Display features and personal factors (e.g., intellectual curiosity/humility) are being studied 54 to develop robust usage recommendations (L. Petiot’s PhD thesis, co-supervised by H. Sauzéon).

8.6.7 Self-determination-driven digital services for supporting aging in place and well-being: a study of relationships between longitudinal data from smart home and clinical data

Participants:‌ Hélène Sauzéon, Juliette Deyts, Lucile Dupuy‌, Rafik Belloum.

This work relies on‌ longitudinal data collected from frail older adults living‌ alone at home who used the HomeAssist ambient‌ assisted living platform for up to 24 months.‌ HomeAssist was designed according to a self-determination and‌ user-centered approach, covering three domains of need: daily‌ activities, home safety, and social participation. The objective‌ of this research is to analyze relationships between‌ clinical data (e.g., cognitive assessments, frailty, autonomy, self-determination)‌ and usage-related data (user experience questionnaires, usage diaries,‌ and actimetric data derived from environmental sensors), in‌ order to both assess the benefits of assistive‌ and monitoring services and explore the predictive value‌ of sensor-based data for explaining clinical outcomes.

A‌ first study focused on identifying factors influencing user‌ experience (UX) and long-term adoption of HomeAssist, based‌ on data from 131 participants. Despite a user-centered‌ design, long-term adoption remained limited, with only 18‌ users continuing after 24 months and 38 requesting‌ removal within the first six months. Regression analyses‌ showed that UX dimensions were mainly predicted by‌ other UX dimensions rather than by individual health‌ or psychosocial characteristics. In contrast, long-term adoption was‌ weakly predicted by level of education and computer‌ ownership, suggesting that while user-centered design may reduce‌ the impact of individual characteristics on user experience,‌ adoption remains influenced by digital literacy and social‌ inequalities.

Overall, these activities contribute to the design of a user-centered visualization‌ tool intended for clinicians‌ (psychologists and physicians), enabling‌‌ them to better understand the links between long-term‌ usage data and clinical‌ evolution, and to detect‌‌ early “weak signals” of decline (e.g., changes in‌ sleep patterns), thereby facilitating‌ timely and targeted interventions.‌‌

8.7 Curiosity-driven AI for assisted scientific discovery

8.7.1‌ Design of an Interactive‌ Software for Automated Discovery‌‌ in Complex Systems

Participants: Clément Romac [correspondant],‌ Zacharie Bugaud, Clément‌ Moulin-Frier, Pierre-Yves Oudeyer‌‌.

We further developed our Automated Discovery software‌ and particularly focused on‌ adding and experimenting with‌‌ new systems.

Our public software now features more‌ than ten examples ranging‌ from artificial life, to‌‌ physics or protein docking. The software was publicly‌ released in 2024: presentation‌ thread.

Figure 45: Technical architecture‌ of our software.

8.7.2‌ Discovering Sensorimotor Agency in‌‌ Cellular Automata using Diversity Search

Participants: Gautier Hamon‌ [correspondant], Mayalen Etcheverry‌, Bert Chan,‌‌ Clément Moulin-Frier, Pierre-Yves Oudeyer.

As a continuation of the previous projects in Automated Discovery in Self-Organizing Systems, we have been working on expanding the set of discoveries of possible structures in continuous CAs such as Lenia 82, 81, and in particular we have been interested in searching for emerging agents with sensorimotor capabilities. Understanding what has led to the emergence of life and sensorimotor agency as we observe them in living organisms is a fundamental question. In our work, we initially only assume environments made of low-level elements of matter (called atoms, molecules or cells) locally interacting via physics-like rules. There is no predefined notion of agent embodiment, and yet we aim to answer the following scientific question: is it possible to find environments in which a subpart exists or emerges that could be called a sensorimotor agent?

We use the Lenia continuous cellular automaton as our artificial "world" 81. We introduce a novel method based on gradient descent and curriculum learning, combined within an intrinsically-motivated goal exploration process (IMGEP), to automatically search for parameters of the CA rule that can self-organize spatially localized 1 and moving patterns 2 within Lenia. The IMGEP defines an outer exploratory loop (generation of training goal/loss) and an inner optimization loop (goal-conditioned). We use a population-based version of IMGEP 12, 91 but introduce two novel elements compared to previous papers in the IMGEP literature. First, whereas previous work in 29 and 10 used a very basic nearest-neighbor goal-achievement strategy, our work relies on gradient descent for the local optimization of the (sensitive) parameters of the complex system, which has proven very powerful. To do so, we made a differentiable version of the Lenia framework, which is also a contribution of this work. Second, we propose to control subparts of the environmental dynamics with functional constraints (through predefined channels and kernels in Lenia) to build a curriculum of tasks, and to integrate this stochasticity in the inner optimization loop. This has proven central to training the system so that the sensorimotor agents that emerge are robust to stochastic perturbations in the environment. In particular, we focus on modeling obstacles in the environment physics and propose to probe the agent's sensorimotor capability as its performance in moving forward under a variety of obstacle configurations. We also provide in this work tests and metrics to measure the robustness of the obtained agents.
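
The two-loop structure described above can be sketched on a toy problem. This is a minimal illustration under strong assumptions, not the method's actual implementation: `system` is a hypothetical, deliberately simple stand-in for the differentiable Lenia simulator, gradients are obtained by finite differences rather than automatic differentiation, goals are sampled uniformly, and the curriculum over obstacle configurations is omitted.

```python
import random

def system(theta):
    """Hypothetical stand-in for a differentiable simulator: parameters -> behavior."""
    x, y = theta
    return (x + 0.5 * y, y - 0.5 * x)

def loss(theta, goal):
    """Squared distance between the reached behavior and the sampled goal."""
    bx, by = system(theta)
    return (bx - goal[0]) ** 2 + (by - goal[1]) ** 2

def grad(theta, goal, eps=1e-5):
    """Central finite differences, used here in place of automatic differentiation."""
    g = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += eps
        down[i] -= eps
        g.append((loss(up, goal) - loss(down, goal)) / (2 * eps))
    return g

def inner_optimize(theta, goal, steps=50, lr=0.1):
    """Inner loop: goal-conditioned gradient descent on the system parameters."""
    theta = list(theta)
    for _ in range(steps):
        theta = [t - lr * d for t, d in zip(theta, grad(theta, goal))]
    return theta

def imgep(n_goals=30, seed=0):
    """Outer loop: sample a goal, restart from the closest known outcome, optimize, store."""
    rng = random.Random(seed)
    theta0 = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    history = [(theta0, system(theta0))]
    for _ in range(n_goals):
        goal = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        # Select the stored parameters whose behavior is nearest to the goal.
        start = min(history, key=lambda entry: loss(entry[0], goal))[0]
        theta = inner_optimize(start, goal)
        history.append((theta, system(theta)))
    return history
```

Calling `imgep()` returns the population of (parameters, behavior) pairs accumulated over the run. In this toy, selecting from the history plays the role of the nearest-neighbor strategy of earlier IMGEP work, and `inner_optimize` is the gradient-based refinement added on top of it.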

Figure 46 (panels a and b): Robustness test to harder/unseen obstacle configurations: straight wall, bigger obstacle, dead ends.

Figure 47 (panels a and b): Change of scale obtained by changing the kernel size and initialization; the grid is the same size in both.

While many complex behaviors have already been observed in Lenia, some of which could qualify as sensorimotor behaviors, they have so far been discovered "by chance" as the result of time-consuming manual search or with simple evolutionary algorithms. Our method provides a more systematic way to automatically learn the CA rules leading to the emergence of basic sensorimotor structures, as shown in Figure 48. Moreover, we investigated and provided ways to measure the (zero-shot) generalization of the discovered sensorimotor agents to several out-of-distribution perturbations that were not encountered during training. Impressively, even though the agents still fail to preserve their integrity in certain configurations, they show very strong robustness to most of the tested variations. The agents are able to navigate in unseen and harder environmental configurations while self-maintaining their individuality (Figure 46). The agents recover their individuality not only when subjected to external perturbations but also when subjected to internal perturbations: they resist variations of the morphogenetic processes such as less frequent cell updates, quite drastic changes of scale, and changes of initialization (Figure 47). Furthermore, when tested in a multi-entity initialization and despite having been trained alone, the agents not only preserve their individuality but also show forms of coordinated interactions (attractiveness and reproduction). Our results suggest that, contrary to the (still predominant) mechanistic view on embodiment, biologically-inspired embodiment could pave the way toward agents with strong coherence and generalization to out-of-distribution changes, mimicking the remarkable robustness of living systems to maintain specific functions despite environmental and body perturbations 116. Searching for rules at the cell level in order to give rise to higher-level cognitive processes at the level of the organism and at the level of the group of organisms opens many exciting opportunities for the development of embodied approaches in AI in general.

Figure 48: Scatter plot of the agents according to their measured robustness to obstacles (y axis) and speed in obstacles (x axis), obtained by IMGEP (red), random search with the same compute resources as IMGEP (blue), and the one from the original Lenia paper (green).

The work was released in 2022 as a Distill-like article, which is currently hosted at this link. This article contains an interactive demo in WebGL and JavaScript, as well as many videos and animations of the results. A Colab notebook with the source code of the work is publicly available.

In 2024, additional quantitative experiments and ablations were conducted. This work was published in 2025 in the journal Science Advances.

8.7.3‌ Semantic Open-Endedness in Flow-Lenia‌ using Vision Language Models‌‌ and IMGEP

Participants: Sina Khajehabdollahi [correspondent], Gautier‌ Hamon, Marko Cvjetko‌, Cédric Colas,‌‌ Pierre-Yves Oudeyer, Clément Moulin-Frier.

Discovering diverse‌ visual patterns in continuous‌ cellular automata (CA) is‌‌ challenging due to the vastness and redundancy of‌ high-dimensional behavioral spaces. Traditional‌ exploration methods like Novelty‌‌ Search (NS) expand locally by mutating known novel‌ solutions but often plateau‌ when local novelty is‌‌ exhausted, failing to reach distant, unexplored regions. We‌ introduce Expedition & Expansion‌ (E&E), a hybrid strategy‌‌ where exploration alternates between local novelty-driven expansions and‌ goal-directed expeditions. During expeditions,‌ E&E leverages a Vision-Language‌‌ Model (VLM) to generate linguistic goals—descriptions of interesting‌ but hypothetical patterns that‌ drive exploration toward uncharted‌‌ regions. By operating in semantic spaces that align‌ with human perception, E&E‌ both evaluates novelty and‌‌ generates goals in conceptually meaningful ways, enhancing the‌ interpretability and relevance of‌ discovered behaviors. Tested on‌‌ Flow Lenia, a continuous CA known for its‌ rich, emergent behaviors, E&E‌ consistently uncovers more diverse‌‌ solutions than existing exploration methods. A genealogical analysis‌ further reveals that solutions‌ originating from expeditions disproportionately‌‌ influence long-term exploration, unlocking new behavioral niches that‌ serve as stepping stones‌ for subsequent search. These‌‌ findings highlight E&E's capacity to break through local‌ novelty boundaries and explore‌ behavioral landscapes in human-aligned,‌‌ interpretable ways, offering a promising template for open-ended‌ exploration in artificial life‌ and beyond. The project‌‌ was published at the Artificial Life 2025 conference‌ 48. A summary‌ and the result visualization‌‌ are available on the project website.
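
The alternation between expansions and expeditions can be illustrated on a toy search space. The sketch below is a simplified illustration under stated assumptions, not the published E&E implementation: solutions are 2D points, `embed` is a placeholder for a Vision-Language Model embedding, `propose_goal` replaces VLM-generated linguistic goals with a random target in a wider region, and the fixed schedule (a few expansions, then one expedition) is an arbitrary choice.

```python
import math
import random

def embed(solution):
    """Placeholder for a VLM embedding of the rendered pattern (identity here)."""
    return solution

def novelty(point, archive, k=3):
    """Mean distance to the k nearest archive members in embedding space."""
    nearest = sorted(math.dist(point, embed(s)) for s in archive)[:k]
    return sum(nearest) / len(nearest)

def mutate(solution, rng, scale=0.1):
    return tuple(v + rng.gauss(0.0, scale) for v in solution)

def propose_goal(rng, radius=5.0):
    """Placeholder for a VLM-generated goal: a random target in a wider region."""
    return (rng.uniform(-radius, radius), rng.uniform(-radius, radius))

def expansion_step(archive, rng, n_children=5):
    # Local novelty search: mutate known solutions, keep the most novel child.
    children = [mutate(rng.choice(archive), rng) for _ in range(n_children)]
    archive.append(max(children, key=lambda c: novelty(embed(c), archive)))

def expedition_step(archive, rng, n_steps=20):
    # Goal-directed search: hill-climb from the closest known solution to the goal.
    goal = propose_goal(rng)
    current = min(archive, key=lambda s: math.dist(embed(s), goal))
    for _ in range(n_steps):
        candidate = mutate(current, rng, scale=0.5)
        if math.dist(embed(candidate), goal) < math.dist(embed(current), goal):
            current = candidate
    archive.append(current)

def expedition_and_expansion(n_cycles=10, expansions_per_cycle=5, seed=0):
    rng = random.Random(seed)
    archive = [(0.0, 0.0)]
    for _ in range(n_cycles):
        for _ in range(expansions_per_cycle):
            expansion_step(archive, rng)
        expedition_step(archive, rng)
    return archive
```

In this toy, expansions grow the archive in small steps around known solutions, while each expedition can add a point far from them, which later expansions may then build on.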

8.7.4‌ Exploring Flow-Lenia Universes with‌ a Curiosity-driven AI Scientist:‌‌ Discovering Diverse Ecosystem Dynamics

Participants: Thomas Michel [correspondent]‌, Marko Cvjetko [correspondent]‌, Gautier Hamon,‌‌ Pierre-Yves Oudeyer, Clément Moulin-Frier.

We present a method for the automated discovery of system-level dynamics in Flow-Lenia, a continuous cellular automaton with mass conservation and parameter localization, using a curiosity-driven AI scientist. This method aims to uncover processes leading to self-organization of evolutionary and ecosystemic dynamics in CAs. We build on previous work which uses diversity search algorithms in Lenia to find self-organized individual patterns, and extend it to large environments that support distinct interacting patterns. We adapt Intrinsically Motivated Goal Exploration Processes (IMGEPs) to drive exploration of diverse Flow-Lenia environments using simulation-wide metrics, such as evolutionary activity, compression-based complexity, and multi-scale entropy. We test our method in two experiments, showcasing its ability to illuminate significantly more diverse dynamics compared to random search. We show qualitative results illustrating how ecosystemic simulations enable self-organization of complex collective behaviors not captured by previous individual pattern search and analysis. We complement automated discovery with an interactive exploration tool, creating an effective human-AI collaborative workflow for scientific investigation. Though demonstrated specifically with Flow-Lenia, this methodology provides a framework potentially applicable to other parameterizable complex systems where understanding emergent collective properties is of interest.
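
To give an idea of what a compression-based complexity metric can look like, the sketch below scores a 2D state by how well its quantized contents compress with zlib. This is one common way to approximate such a metric and is only an assumption here; the function name, the quantization to bytes, and the use of zlib are illustrative choices, not necessarily those of the published work.

```python
import random
import zlib

def compression_complexity(grid):
    """Ratio of compressed size to raw size for a 2D grid of values in [0, 1].

    Values are quantized to single bytes. A ratio near 0 indicates a highly
    regular state; a ratio near 1 indicates a noise-like, incompressible one.
    """
    raw = bytes(min(255, max(0, int(v * 255))) for row in grid for v in row)
    if not raw:
        return 0.0
    return len(zlib.compress(raw, 9)) / len(raw)

# Two extreme states for comparison: an empty grid and a random one.
rng = random.Random(0)
uniform = [[0.0] * 64 for _ in range(64)]
noise = [[rng.random() for _ in range(64)] for _ in range(64)]
```

Under these assumptions, `compression_complexity(uniform)` is close to 0 and `compression_complexity(noise)` is close to 1, so structured simulation states are expected to fall between these two extremes.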

This work was published at the‌ Artificial Life 2025 conference 51, with a‌ companion website containing videos of the discoveries, the‌ interactive exploration tool and source code.

Figure 49 (panels a and b): A showcase of discovered diversity while searching for ecosystemic dynamics.

8.7.5 Discovering and‌ Controlling Diverse Self-Organized Patterns in Cellular Automata Using‌ Autotelic Reinforcement Learning

Participants: Marko Cvjetko [correspondent],‌ Gautier Hamon, Pierre-Yves Oudeyer, Clément Moulin-Frier‌.

Autotelic AI algorithms, which pursue self-generated goals,‌ have proven to be effective as automated discovery‌ assistants in cellular automata. Previous work in this‌ domain focused on algorithms which produce diverse behaviors‌ by setting the automaton’s initial conditions. Here, we‌ extend these methods beyond initial-condition search and adapt‌ them to systems that support sequences of closed-loop‌ interventions. Using Lenia (a continuous cellular automaton) as‌ a test environment, we train goal-conditioned reinforcement learning‌ agents to perform targeted interventions during the system’s‌ evolution, guiding it towards desired states. The resulting‌ agent behaviors are robust and diverse, demonstrating the‌ potential of closed-loop interaction for discovery and control.‌ Furthermore, we show that goal-conditioned RL agents performing‌ interventions can discover novel self-organising patterns and generalize‌ to previously unseen and noisy environments. The project‌ was presented as a late-breaking abstract at the‌ Artificial Life 2025 conference 45, and disseminated‌ through a website.

9 Bilateral contracts and‌ grants with industry

9.1 Bilateral contracts with industry‌

CATIE: CIFRE PhD grant of Matisse Poupard with‌ CATIE and EPSYLON Lab (Univ. Montpellier) until April‌ 2025.
Hugging Face: PhD of Clément Romac with Hugging Face on "Augmenting curiosity-driven exploration with very large language models in deep reinforcement learning agents".

We received a 70 k€ grant from Google, as a PhD fellowship for Julien Pourcel.

9.2 Bilateral Grants with Foundation

CLEMENCE Cohort (Fondation de France‌ and Théa Pharma)

Participants: Hélène Sauzéon [correspondant],‌ Cécile Mazon, Cécile Delcourt.

The project "Cohorte LongitudinalE sur la Myopie et le développement oculaire dans l’ENfanCE" (CLEMENCE) is led by C. Delcourt from the Bordeaux Population Health lab (2M€). Hélène Sauzéon and Cécile Mazon participate in the research program by studying developmental changes in visual attention due to myopia.

10 Partnerships‌ and cooperations

10.1 International‌‌ initiatives

10.1.1 Inria associate team not involved in‌ an IIL or an‌ international program

Participants: Hélène Sauzéon, Edith Law.

CuriousTECH

Title:
Curiosity-Driven‌ Learning Across the Lifespan‌
Duration:
2023 -> 2025‌‌
Coordinator:
Edith Law (edith.law@uwaterloo.ca)
Partners:
- University of Waterloo, Waterloo (Canada)
Inria contact:‌
Hélène Sauzéon
Summary:
For several years, the HCI lab and the cognitive neuroscience lab of the University of Waterloo (Canada) have been collaborating with researchers from the Bordeaux site, especially the Flowers team from Inria, as well as the ACTIVE team from the BPH laboratory (Inserm-Univ. Bordeaux). This collaboration is motivated by a common desire to better understand the role of curiosity in lifelong learning, and to open a new multidisciplinary research avenue on the design of original interactive systems for (re)education. Several studies report that curiosity is beneficial not only to children and young adults but also to older adults and neurodiverse individuals. This field of study is in its infancy and deserves collaborative efforts to identify the underlying cognitive mechanisms and the learning situations that benefit them, in order to ultimately design and develop curiosity-driven (re)educational technologies (ETs), and then deploy them in natural environments (school, home) where they can be reliably and rigorously tested. For this multidisciplinary purpose, the consortium gathers competences in AI, HCI, cognitive science, and psychology in order to cover the objectives of the proposed associate team, i.e. the CuriousTech team. In addition to its scientific potential, this team structure also aims at a quick transfer of the ETs in France and Canada towards the socio-economic fields of Ed Tech and e-health.

10.2‌ European initiatives

10.2.1 Horizon‌‌ Europe

Participants: Cédric Colas.

INTERACT

INTERACT project‌ on cordis.europa.eu

Title:
Help‌ Me Grow: Artificial Cognitive‌‌ Development via Human-Agent Interactions Supported by New Interactive,‌ Intrinsically Motivated Program Synthesis‌ Methods.
Duration:
From October‌‌ 1, 2022 to August 31, 2026
Partners:
- INSTITUT‌ NATIONAL DE RECHERCHE EN‌ INFORMATIQUE ET AUTOMATIQUE (INRIA),‌‌ France
- MASSACHUSETTS INSTITUTE OF TECHNOLOGY (MIT), United States‌
Inria contact:
Cedric Colas‌
Coordinator:
Summary:
Building machines‌‌ that interact with their world, discover interesting interactions‌ and learn open-ended repertoires‌ of skills is a‌‌ long-standing goal in AI. This project aims at‌ tackling the limits of‌ current AI systems by‌‌ building on three families of methods: Bayesian program‌ induction, intrinsically motivated learning‌ and human-machine linguistic interactions.‌‌ It targets three objectives: 1) building autonomous agents‌ that learn to generate‌ programs to solve problems‌‌ with occasional human guidance; 2) studying linguistic interactions‌ between humans and machines‌ via web-based experiments (e.g.‌‌ properties of human guidance, its impact on learning,‌ human subjective evaluations); and‌ 3) scaling the approach‌‌ to the generation of constructions in Minecraft, guided‌ by real players. The‌ researcher will collaborate with‌‌ scientific pioneers and experts in the key fields‌ and methods supporting the‌ project. This includes supervisors‌‌ Joshua Tenenbaum (program synthesis,‌ MIT) and Pierre-Yves Oudeyer (autonomous learning, Inria); diverse‌ collaborators, and an advisory board composed of an‌ entrepreneur and leading scientists in developmental psychology and‌ human-robot interactions. The 3rd objective will be pursued‌ via a secondment with Thomas Wolf (CSO) at‌ HuggingFace, a world-leading company in the open source‌ development of natural language processing methods and their‌ transfer to the industry. By enabling users to‌ participate in the training of artificial agents, the‌ project aims to open research avenues for more‌ interpretable, performant and adaptive AI systems. This will‌ result in scientific (e.g. interactive program synthesis approaches),‌ societal (e.g. democratized AI training) and economic impacts‌ (e.g. adaptive AI assistants). 
The dissemination, communication and‌ exploitation plans support these objectives by targeting scientific‌ (AI, cognitive science), industrial (video games, smart homes)‌ and larger communities (gamers, software engineers, large public).‌

10.2.2 Other European programs/initiatives

Participants: Hélène Sauzéon, Pierre-Yves Oudeyer, Mathias Grüber.

DEVCUR:

ORA‌ project 2024-2027 - Open Research Area (ORA) for‌ the Social Sciences 8th call for proposals

Title:‌
How curiosity enhances learning across childhood and adolescence:‌ The role of metacognition and agency.
Duration:
From‌ Sept 1, 2025 to December 31, 2027
Partners:‌
- INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET AUTOMATIQUE‌ (INRIA), France
- Cardiff University, UK
- Max Planck Institute, Berlin, Germany
Inria contact:
Pierre-Yves Oudeyer and Hélène Sauzéon
Coordinator:
Mathias Grüber, Brain and Imagery centre, Cardiff University, UK / Funds: 1,177 k€
Summary:
This project investigates the bidirectional relationship between curiosity-based learning and metacognition during late childhood and adolescence, a critical period when both abilities develop. Using five experiments with behavioral, neuroimaging, training, and longitudinal methods, three research teams from Cardiff, Bordeaux, and Trier will examine how metacognition and agency enhance curiosity-driven learning. The study will explore both individual differences and developmental changes in how metacognitive awareness strengthens curiosity's learning benefits. Findings will be translated into classroom interventions to stimulate curiosity and metacognition in educational settings. This interdisciplinary collaboration aims to advance understanding of curiosity development with significant scientific and societal impact.

10.3 National initiatives

GAIMHE project‌ (BPI France 2030):

GAIMHE is a strategic research‌ and innovation project funded by Bpifrance, coordinated by‌ EvidenceB in partnership with the Flowers AI &‌ Cognitive Science Laboratory at Inria, Café pédagogique, and‌ Association Class'Code. The project aims to develop next-generation‌ intelligent tutoring systems that combine the pedagogical rigor‌ of traditional adaptive learning algorithms with the flexibility‌ of generative AI. Current educational AI technologies present‌ a fundamental trade-off. Intelligent Tutoring Systems (ITS) offer‌ pedagogically grounded, personalized curricula through algorithms such as‌ ZPDES, but require substantial manual content development. Conversely,‌ generative AI provides interactional flexibility yet lacks pedagogical‌ structure, cannot support long-term curriculum personalization, and presents‌ significant computational costs. GAIMHE proposes a hybrid methodology‌ structured around three axes: automated content generation leveraging‌ generative AI for the rapid creation of pedagogically‌ validated exercises to populate ITS knowledge graphs; targeted‌ generative assistance deploying optimally-sized models to provide pedagogically principled guidance at key‌ learning moments; and advanced‌ personalization through compact student‌‌ models capable of predicting and adapting learning trajectories‌ across extensive exercise spaces,‌ building on prior MAGELLAN‌‌ research. The project benefits from EvidenceB's existing infrastructure,‌ which serves tens of‌ thousands of classrooms across‌‌ primary and secondary education in France. This enables‌ large-scale evaluation with authentic‌ learning data. In partnership‌‌ with Région Île-de-France, the consortium will release annotated‌ datasets, learning analytics tools,‌ and software components as‌‌ open-source digital commons to support France's educational technology‌ ecosystem.

ANR Chaire Individuelle‌ Deep Curiosity

- PY‌‌ Oudeyer continued to work on the research program‌ of this Chaire, funding‌ 2 PhDs and 3‌‌ postdocs for five years (until 2025).

ANR JCJC‌ ECOCURL

- C. Moulin-Frier obtained an ANR JCJC grant for the project "ECOCURL: Emergent communication through curiosity-driven multi-agent reinforcement learning". The project started in Feb 2021 for a duration of 48 months. It funds a PhD student (36 months) and a Research Engineer (18 months), as well as 4 Master internships (one per year).

Projet AIxIA: "Analyse d’Interférences par Intelligence Artificielle".

Pierre-Yves Oudeyer and Clément Moulin-Frier obtained a grant from the call for projects AIRSTRIP "L'intelligence Artificielle au service de l'IngénieRie des SysTèmes aéRonautIques et sPatiaux", in collaboration with the IRT Saint Exupéry. The project was accepted in 2023 and will fund 18 months of a research engineer position starting in 2024.

Inria Exploratory‌ Action AIDE

- Didier Roy is a collaborator in the Inria Exploratory Action AIDE "Artificial Intelligence Devoted to Education", led by Frédéric Alexandre (Inria Mnemosyne Project-Team), Margarida Romero (LINE Lab) and Thierry Viéville (Inria Mnemosyne Project-Team, LINE Lab). This Exploratory Action aims to explore to what extent approaches or methods from cognitive neuroscience, linked to machine learning and knowledge representation, could help to better formalize human learning as studied in educational sciences. AIDE is a four-year project that started in mid-2020 and ran until 2024.

Inria Exploratory Action I'AM

- Hélène Sauzéon is co-PI with P. Dragicevic of the Inria Exploratory Action I'AM "Impact of Augmented Reality on Autobiographical Memory: Examining Involuntary Memories and False Memories" (174.5k€). Started last September, this Exploratory Action aims to explore to what extent augmented-reality-based devices can produce erroneous autobiographical memories, particularly in vulnerable people (children and older adults, or young adults with low source-monitoring memory abilities).

New collaboration with Maxime Derex from IAST Toulouse‌

for the co-direction of the PhD thesis of Jeremy Perez with Clément Moulin-Frier and Pierre-Yves Oudeyer on "Interactions between intrinsically motivated goal-exploration processes and cumulative cultural evolution" (see section 8.2.2).

France‌ 2030 - PPR AUTONOMIE‌ : Vieillissement Et Situations‌‌ De Handicap - Projet INNOVCare (Lechevalier S., 3,5M€)‌ (2023-26)

- Hélène Sauzéon and AS Rigaud will supervise WP5, dedicated to two care-led innovation experiments with assistive technologies (400k€ for Bordeaux).
- Hélène Sauzéon is responsible for WP3 « Digital technology for aging in place » (470k€/3,5M€), Défi 4 - Numérique, Innovcare (PPR Autonomie PIA2030, 2023-28).

VBHI project (Vascular Brain Health Institute - IHU, led by S. Debette, 5M€) (2023-26)

- Hélène Sauzéon will supervise WP4.3, dedicated to "Explore Digital Therapeutics To Slow Down Cognitive Decline In Covert CSVD" (150k€)

11‌ Dissemination

11.1 Promoting scientific activities

11.1.1 Scientific events:‌ organisation

PY Oudeyer continued to be a member of the organization committee of the Life, Structure and Cognition symposium series at IHES, France. H. Sauzéon continued to be a member of the Technical Program Committee of the ACHI conference.

11.1.2 Reviewer‌ - reviewing activities

Matisse Poupard has reviewed for Computers & Education, Education and Information Technologies, Frontiers in Psychology and Frontiers in Virtual Reality. Jeremy Perez has reviewed for the Judgment and Decision Making Journal and Topics in Cognitive Science. Hélène Sauzéon has reviewed for ACM-CHI25, British Journal of Psychology, Computers in Human Behavior, and Journal of Research in Science Teaching. Cécile Mazon has reviewed for Nature Scientific Reports, Education and Information Technologies, and BMC Psychology.

PY Oudeyer was‌ a reviewer for the journals Developmental Science and‌ Child Development.

11.1.3 Invited talks

Matisse Poupard gave‌ 3 invited talks:

(December 2025) "Technologies immersives pour l’enseignement : étude des relations entre charge cognitive et curiosité des apprenants", ReSCi “Ma Recherche j’en parle - Education, outils numériques et intelligence artificielle”, ANRT, Paris
(November 2025) Round-table discussion to‌ mark the end of the Dem’UP project,‌ University of Poitiers
(October 2025) "Curieux et cognitivement‌ engagé", Séminaire « Numérique pour l’Education »,‌ R3NumED

Clément Romac gave 2 invited talks:

(May‌ 2025) Invited talk at the ISIR lab from‌ Sorbonne University on “Grounding LLMs through curiosity-driven online‌ RL”.
(September 2025) Invited talk at the SMILES workshop at ICDL on “Grounding LLMs through curiosity-driven online RL”.

Marie-Sarah Desvaux gave 1 invited talk:‌

(November 2025) "Curiosity-driven learning as a ZPD window for self-regulated learning" at the International Society of Cultural-historical Activity Research Southern Europe and Middle Eastern Conference 2025, Barcelona

Hélène Sauzéon gave 3 invited talks:

(October 2025) "Supporting digital accessibility of MOOC‌ based-learning for individuals with cognitive impairments: The Aïana‌ project" Intersections: Translation, Accessibility, Inclusion, Forum des Savoirs,‌ MSH Dijon
(May 2025) "Les interventions de santé pour le bien-vieillir des personnes âgées à l'aide de technologies numériques", Journée IFRATH - Troubles sociocognitifs et technologies : Perspectives sur l'enfance et le vieillissement, Institut National de Jeunes Sourds de Paris, 254 Rue Saint-Jacques, 75005 Paris.
(November 2025) The‌ intrinsic motivations as design principles of technologies for‌ cognition : Examples about Educational Technologies and Technologies‌ for aging in place, Institut de Psychologie, Université‌ Paris Cité, Paris

Loris Gaven gave 2 invited‌ talks:

(January 2026) "Toward Artificial Curiosity" at University‌ of Padua (Online)
(August 2025) "MAGELLAN: Metacognitive predictions‌ of learning progress guide autotelic LLM agents in‌ large goal spaces" at the Metacognitive Satellite of CCN in Amsterdam.

PY‌ Oudeyer gave these invited‌ talks:

(Jan 2025) Curiosity-driven‌‌ learning in humans: learning progress, autotelic exploration and‌ open-ended development, Keynote lecture‌ at the Budapest Conference‌‌ on Cognitive Development, see video.
(Jan 2025) IA générative et éducation: enjeux sociétaux, for the "Conférence jumelle du Cnesco" co-organized by Cnesco, Canope and CARDIE Charente et Charente-Maritime.
(Jan‌‌ 2025) Les enjeux sociétaux de l'IA dans l'éducation‌, at ETAPP-IA conference‌ on AI and education‌‌ organized by Nouvelle Aquitaine academy.
(May 2025) IA générative, société et éducation: les enjeux de la formation des futurs citoyens, for the conference "Printemps de la recherche en éducation", organized by INSPE, Paris.
(Oct 2025) How curiosity drives human‌ learning, and how this‌ can be leveraged in‌‌ educational technologies, LEAD symposium organized by University‌ of Tuebingen.
(Oct 2025)‌ IA générative: enjeux, interventions‌‌ et résultats expérimentaux, for the AI seminar‌ organized by DRANE from‌ Grenoble academy.
(Dec 2025)‌‌ with Julien Pourcel and Cédric Colas, SOAR, a‌ self-improving LLM-based evolutionary algorithm‌, for the ARC-Prize.‌‌

11.1.4 Scientific expertise

Hélène Sauzéon was:

Vice-president of the Pluridisciplinary committee (Digital sciences and Humanities) of the French National Research Agency (CES 38 - Interface, ANR), since 2025
Member of‌ Research Council of Finland‌‌ POC (12 projects on applied computer sciences) in‌ 2025
Member of the‌ Scientific Committee of Calyxis,‌‌ a center focused on research and development of‌ technological solutions to prevent‌ daily accidents through public‌‌ laboratory-enterprise collaborations, since 2019.
Expert for grant applications: evaluation of 2 CIFRE-ANRT PhD proposals; evaluation of 1 GATES (Grenoble ATtractiveness and ExcellenceS) proposal for the SHS Cluster of Université Grenoble in 2025.
Member of the committee for a permanent Professor position in psychology at the University of Bordeaux
Member of the committee for a permanent Assistant Professor position in Occupational Science (91 section) at the University of Limoges

PY Oudeyer was:

a reviewer and‌ expert for ANR (National‌ Research Agency) as well‌‌ as for the European Research Council (ERC), the‌ Cyprus Research Council and‌ for the Swedish Foundation‌‌ for Strategic Research.
an invited expert for the‌ "Curiosity Convening" event organized‌ by the Scratch Foundation‌‌ at OECD, Paris.
invited to be a member of the scientific council of the "La main à la pâte" foundation.
a member of the‌ GT "IA et éducation"‌ at Conseil Scientifique de‌‌ l'Education Nationale.

Cécile Mazon reviewed one proposal for‌ ANR-AAPG JCJC.

11.1.5 Research‌ administration

Hélène Sauzéon was:‌‌

Member of the Research Committee of IMT Atlantique‌ since 2025, working to‌ promote Human and Social‌‌ Sciences (SHS) in engineering education.
Co-organizer (Inria) since‌ 2024 of the annual‌ "JS & GT 'Handicap'"‌‌ (Thematic Days & Working Groups on Disability) and‌ contributor to the consultation‌ for Inria's 2025 Disability‌‌ Roadmap.
Member of the extended "BCP" of the BSO Inria centre, since 2020. Advisory roles for the center's “surrounding” scientific policy and strategy, recruitment of permanent researchers, monitoring and assistance in setting up Inria teams, organization of internal scientific events, writing support to communication staff for popularization content on AI, disability, health and education, etc.
PIQ referent for the Inria centre at the University of Bordeaux, covering 3 universities (Bordeaux, La Rochelle, Limoges), since 2024. The role is twofold: 1) to follow up with and help site referents and applicants define and draft projects while ensuring compliance with PIQ program policy, i.e. in close dialogue with PIQ staff, and 2) to inform the center's scientific management of applications in progress in Nouvelle-Aquitaine via a dedicated "pad", and of their positioning relative to the PIQ program's national results.
Referent for the Education topic for the Inria centre at the University of Bordeaux (covering 3 teams: Bivouac, Flowers, Mnemosyne), and the centre's representative at RTP CNRS Éducation.
Head of the Inria Associate Team CuriousTech, Inria-University of Waterloo (Canada), since 2023. The multidisciplinary program (Prof. M. Fernandes' psychology lab and Edith Law's HCI lab) involves designing innovative interactive systems for education and cognitive health at all ages, with the distinctive feature of leveraging intrinsic motivations (self-determination and curiosity) as reinforcers of human performance.
Member of the Direction Committee of IFR Handicap (Inserm), labelled Fedhra since 2023, since 2019
Member of‌ the Direction Committee of BIND - centre excellence‌ BIND de Bordeaux, since 2019
Member of the‌ scientific Committee of SOUND - centre excellence TND‌ Bordeaux, since 2025
Resp. of Research Axis on‌ Innovating Interventions at ACTIVE Team (BPH Lab), since‌ 2022

Cécile Mazon is:

Co-responsible for the Digital Tools work package of the PIA Atypie-Friendly
Local contact‌ for Inria HandiTechLab
Member of the Digital Tools‌ axis of the Bordeaux Excellence center for Neurodevelopmental‌ disorders (SOUND project)

PY Oudeyer was:

head of‌ the Flowers AI & CogSci lab
member of‌ the piloting committee of the France 2030 BPI‌ project GAIMHE
representative of Inria in the piloting‌ committee of the Nouvelle Aquitaine Research Network on‌ Educational Technologies (R3NumEd)

11.2 Teaching - Supervision -‌ Juries - Educational and pedagogical outreach

11.2.1 Teaching‌

Cécile Mazon is responsible for:

Cognitive science curriculum‌ in MIASHS bachelor (Mathematics and Computer Science applied‌ to Social and Human Sciences) - since 2024‌
Technology, Ergonomy, Cognition, Disability curriculum in Cognitive Sciences‌ master, since 2022
Apprenticeship academic coordination for Technology,‌ Ergonomy, Cognition, Disability curriculum in Cognitive Sciences master‌ - since 2023

Leslie Tricoche, as ATER, gave the following courses:

L2 MIASHS - UFR‌ Sciences and Technology, Bordeaux University: Neurobiology (lectures and‌ tutorials)
L3 MIASHS - UFR Sciences and Technology,‌ Bordeaux University: Neuropathology (lectures and tutorials)
M1 Cognitive‌ Sciences - UFR Sciences and Technology, Bordeaux University:‌ Cognitive functions in situations and disabilities (lectures and‌ tutorials)
M2 Cognitive Sciences - UFR Sciences and‌ Technology, Bordeaux University: Multiple forms of the profession‌ (lectures and tutorials)

Marie-Sarah Desvaux, as Teaching Assistant, gave the following courses:

M2 Cognitive Sciences‌ - UFR Sciences and Technology, Bordeaux University: Multiple‌ forms of the profession - Project management
L3 MIASHS - UFR Sciences‌ and Technology, Bordeaux University:‌ Web Accessibility

Juliette Deyts, as Teaching Assistant, gave the following courses:

M2 Cognitive Sciences -‌ UFR Sciences and Technology,‌‌ Bordeaux University: Disability, Autonomy, Cognition and Technology

Matisse Poupard, as ATER, gave the following courses:

L1 MIASHS - UFR Sciences and Technology, Bordeaux‌ University: Introduction to Cognitive‌ Science
L2 MIASHS -‌‌ UFR Sciences and Technology, Bordeaux University: Neurological Foundations,‌ Cognitive Fundamentals, and Learning‌
L3 MIASHS - UFR‌‌ Sciences and Technology, Bordeaux University: Knowledge and Representations,‌ Language, and Natural Language‌ Processing
M1 Cognitive Sciences‌‌ - UFR Sciences and Technology, Bordeaux University:
- Scientific‌ Foundations
- Cognitive Functions in‌ Situations and Disabilities
M2‌‌ Cognitive Sciences - UFR Sciences and Technology, Bordeaux‌ University:
- Disability, Activity, Cognition,‌ Technology
- Multiple Forms of‌‌ the Profession
- Virtual Reality, Interaction, and Health Applications‌

Cécile Mazon, as assistant professor, gave lectures and tutorials (280hETD) in cognitive sciences to students in the MIASHS bachelor (L1-2-3) and the Cognitive Sciences master (M1-M2). Key teaching topics include introduction to cognitive sciences, cognitive psychology (main cognitive functions, experimental methods), cognitive sciences applied to disability and/or technology design, as well as methodology and statistics.

Hélène Sauzéon participated in the Inria mentoring program as mentor of one PhD student from the Inria Paris centre.

11.2.2 Supervision

PY Oudeyer (co-)supervised the‌ following PhD students:

PhD defended in 2025: Grgur Kovac, "Developmental training of socio-cognitive abilities in AI systems" (supervisors: PF. Dominey and PY. Oudeyer)
PhD defended in 2025: Gautier Hamon, "Open-endedness in artificial life and artificial intelligence: an eco-evo-devo perspective" (supervisor: C. Moulin-Frier)
PhD defended in 2025: Nicolas Yax, "Studying‌ cognitive and metacognitive skills‌ in foundation models" (supervisors:‌‌ S. Palminteri, PY. Oudeyer)
PhD defended in 2025: Clément Romac, "Grounding LLMs with online RL" (supervisors: T. Wolf and PY. Oudeyer)
PhD in progress: Julien Rosenberger, "Models and experimental study of the co-development of curiosity and metacognition in adolescents" (supervisors: H. Sauzéon, PY. Oudeyer)
PhD in progress:‌‌ Paul Tabbara, "Autotelic generative AI systems for automated‌ discovery in mathematics" (supervisors:‌ G. Baudart, PY. Oudeyer)‌‌
PhD in progress: Julien Pourcel, "Autotelic LLMs that‌ learn how to code",‌ (supervisors: C. Moulin-Frier and‌‌ PY. Oudeyer)
PhD in progress: Thomas Carta, "LLM-based‌ Autotelic deep reinforcement learning‌ agents", (supervisors: O. Sigaud,‌‌ S. Lamprier and PY. Oudeyer)
PhD in progress:‌ Jeremy Perez, "Studying mechanisms‌ and roles of curiosity‌‌ in socio-cultural contexts" (supervisors: C. Moulin-Frier, M. Derex,‌ PY. Oudeyer)
PhD in progress: Timothé Boulet, "Controller synthesis for artificial agents in simulated environments using generative AI" (supervisors: C. Moulin-Frier, X. Hinault, N. Fijalkow)
PhD in progress: Marko Cvjetko, "Autotelic exploration algorithms for automated search of open-endedness in artificial life" (supervisors: C. Moulin-Frier, PY. Oudeyer)
PhD in progress: Loris Gaven, "Metacognitive prediction of learning progress for guiding autotelic agents" (supervisors: PY. Oudeyer and C. Moulin-Frier)

H.‌ Sauzéon (co-)supervised the following‌‌ PhD students:

PhD defended in 2025: M. POUPARD "Curious and thus not overloaded!" (supervisors: H. Sauzéon and A. Tricot / CIFRE with CATIE)
PhD in progress: L. PETIOT "AR effect on memory distortions" (supervisors: H. Sauzéon and P. Dragicevic) (AEx I'AM, 2023-25).
PhD in progress: C. DESVAUX "Design and Assessment of metacognitive interventions supporting curiosity and creativity at school" (Alloc. MESRI, ED SP2).
PhD in progress: J. DEYTS "Self-determination driven technologies for healthy aging" (Alloc. from Projet ANR Innovcare)
PhD in progress: J. ROSENBERGER "Curiosity-driven learning as a developmental function of metacognition in adolescents aged 12 to 16" (supervisors: H. Sauzéon and PY. Oudeyer) (Alloc. from ORA funds, ED SP2 - Univ. Bordeaux)
PhD in progress: M. BOURDIL "A neurotechnological approach using EEG for the characterisation and therapeutic treatment of small vessel syndrome" (supervisors: F. Lotte and H. Sauzéon) (Alloc. from IHU-VBHI project)

11.2.3 Juries

PY Oudeyer was a member of:‌

the selection committee of the Inria Prizes from‌ Académie des Sciences.
the PhD juries of Marie‌ Martin (Université Interdisciplinaire de Paris), Théo Cachet (Sorbonne‌ University) and J. Daly (Univ. Texas, Austin)
the PhD "comité de suivi" of Reem al Najjar (Sorbonne Université), Matisse Poupard (Univ. Bordeaux), Paul Pacaud (Université Paris Sciences Lettres)

Hélène Sauzéon was a member of 6 PhD boards:

"Conception, développement et évaluation d'un exergame en réalité augmentée pour la rééducation cognitivo-motrice d'enfants atteints de Paralysie Cérébrale ou de Lésions Cérébrales Acquises : le projet TERAPACE" by Maxime Balloufaud - Limoges
"Optimizing sensory‌ feedback and manual interaction efficiency within XR experiments"‌ by Julien Cauquis - Ecole nationale supérieure Mines-Télécom‌ Atlantique Bretagne Pays de la Loire
"Careless or care-led innovation? : socio-ethnography of social robots and social ties in eldercare settings in France and Japan : tensions and contradictions in needs, temporalities and representations" by Yuko Tamaki - Paris, EHESS
"Neurocognitive‌ mechanisms of self-referenced memory encoding: a naturalistic and‌ embodied approach to episodic memory" by Sylvain Penaud‌ - Université Paris Cité
"Intrinsic vs. Extrinsic Motivation: Computational Modelling, Neural Bases, and Clinical Applications" by Jade Seguin - Sorbonne Université
"Prise de décision‌ lors de la planification d'itinéraires avec des applications‌ : une approche cognitive pour la régulation des‌ flux voyageurs dans les transports en commun" by‌ Archana Prabhakar - Université Paris Cité

Cécile Mazon is a permanent member of the jury for Cognitive Sciences master thesis defenses (M1/M2) and for bachelor undergraduate projects (L3 MIASHS).

11.2.4 Support to public‌ policies

PY. Oudeyer and H. Sauzéon and the‌ whole team were involved in several major actions‌ to support public policies on the topic of‌ AI and education. Members of the team designed‌ and conducted training sessions in different academies for‌ supervisory staff and teachers, e.g. ETAPP-IA day in‌ Nouvelle-Aquitaine (January 2025); departmental training of CPE and‌ documentary teachers of Nouvelle-Aquitaine during a day at‌ the Lycée Les Iris in Lormont (May 2025);‌ Academic Days of Innovation for teachers of Nouvelle-Aquitaine,‌ Spring Days of Education Research at INSPEs, (June‌ 2025); PhilosophIA Citizens' Convention (April 2025), twin conference of Cnesco/Cardie Charente-Maritime (January‌ 2025), working group Education‌ and Cognitive Sciences of‌‌ the academies of Créteil, Versailles and Paris, scheduled‌ for March 2026.

H.‌ Sauzéon and PY. Oudeyer‌‌ were interviewed and wrote reports to contribute to‌ the report of French‌ Senate on AI and‌‌ education.

PY Oudeyer was heard by the commission on cultural and educational affairs in the French parliament, to discuss the major challenges and opportunities of AI and education.

11.2.5 Educational and‌‌ pedagogical outreach

Cécile Mazon participated in events promoting university programs in cognitive science: the Salon de l’Étudiant (January 2025), the University of Bordeaux Open Days (January 2025), and the Orientation Days (May 2025).

11.3 Popularization

11.3.1 Specific official responsibilities‌ in science outreach structures‌

PY. Oudeyer collaborated with the Pix organization as main scientific and editorial design consultant for the Pix IA training modules, which will be disseminated to all French students in 4ème, 2nde and CAP in 2026.

11.3.2‌‌ Productions (articles, videos, podcasts, serious games, ...)

PY. Oudeyer gave several public talks on AI and education, available on a YouTube channel.

D. Roy and P-Y. Oudeyer wrote a popular science book to introduce generative AI (mechanisms, applications, societal dimensions) to adolescents, as well as to their teachers and families. It is entitled "C'est (pas) moi, c'est l'IA", and was published in September 2024 by Nathan. It was reviewed in widely distributed magazines (e.g. Magazine de l'APEL) and on radio stations (e.g. France Culture, RFI). The web page of the book is here: link.

A. Torres-Leguet, C.‌‌ Romac, T. Carta and PY. Oudeyer produced the‌ pedagogical video series "ChatGPT‌ explained in 5 mn",‌‌ aimed at training generative AI literacy in a‌ wide diversity of students‌ (e.g. high school), available‌‌ here: link. They are under a Creative‌ Commons licence, CC-BY, enabling‌ open and free reuse.‌‌ They were already integrated in the MOOC AI4T‌ (link), as‌ well as in an‌‌ internal training platform of "Académie du Numérique du‌ Ministère de la défense",‌ in a mobile app‌‌ made by Inria with educational materials related to‌ AI (link),‌ and are being adapted‌‌ and integrated in a training platform for the‌ whole population of civil‌ servants in France, coordinated‌‌ by DINUM.

PY Oudeyer wrote a note for‌ the French educational institutions‌ on "IA générative, société‌‌ et éducation: En quoi l’IA générative représente-elle un‌ enjeu dans la formation‌ des citoyens ?",‌‌ in the context of the Conférence de Consensus‌ on Nouveaux Savoirs et‌ Nouvelles Compétences des Jeunes‌‌ of Cnesco, (Nov. 2024)

Hélène Sauzéon wrote a‌ web article on the‌ following topic: "Why agency‌‌ is a key ability in the workplace"

Hélène Sauzéon participated in the "Mental health and Technology" podcast organized by BPH - Inserm (October 2025) in Bordeaux

Marie-Sarah Desvaux was interviewed by Curieux! Live on Educational Technologies for learning

11.3.3 Participation‌ in Live events

Jeremy‌ Perez and Clément Romac‌‌ gave a presentation on‌ Artificial Intelligence to high school teachers as part‌ of the "Journée formation IA pour les enseignant.es"‌ at the Bordeaux INRIA Center on February 5th.‌

Clément Romac gave a talk on generative AIs‌ to La main à la pâte, a French‌ association promoting science in classrooms.

Hélène Sauzéon, Cécile Mazon, Sophie Lepennetier, Julien Rosenberger, Loris Gaven, Paul Tabbara and Julien Pourcel hosted a stand at the Village des Sciences on October 11th and 12th. It gave an opportunity to introduce curiosity to visitors of CapScience, especially kids and parents.

Hélène Sauzéon and Marie-Sarah Desvaux led a workshop on curiosity-driven learning for teachers and trainers during the "Learning Show" 2025 (13th of October) in Rennes

Hélène Sauzéon gave a talk at the event organized by "Science with and for Society" at the University of Bordeaux: Samedis Sciences #4 "Artificial Intelligence and Education: the future of learning?"

Hélène Sauzéon gave a talk at the event "Journée académique de l’expérimentation" organized by CARDIE Grenoble, Grenoble (14 May 2025)

Hélène Sauzéon participated in "CoAnimation" for‌ the Portes Fermées at INRIA, for a workshop‌ to promote dialogue between digital and social sciences‌

Hélène Sauzéon participated in "Circuit scientifique Hors les‌ murs" in October 2025

Hélène Sauzéon participated in‌ the Chiche program, visiting 2 to 3 classrooms‌

Marie-Sarah Desvaux gave a talk on the use‌ of Generative AI in classrooms to INSPE students‌ of University of Bordeaux (March 2025)

Marie-Sarah Desvaux led a workshop on the use of Generative AI in classrooms during the Journée Académique (August 2025) organized by CARDIE Poitiers

Marie-Sarah Desvaux gave an interactive talk on curiosity-driven learning in classrooms to teachers during the Cogni'Forum 2025 (October) organized by "Apprendre et Former avec les Sciences Cognitives"

PY Oudeyer participated in several live events:

(Feb‌ 2025) Presentation of the researcher job to high‌ school students at La Sauque high-school (Nouvelle-Aquitaine)
(March‌ 2025) La créativité dans tous ses états,‌ with E. Koechlin, F. Guedy, P. Ribault, organized‌ by Institute of Advanced Studies, Paris.
(Nov 2025)‌ Presentation of the societal stakes of AI to‌ a group of high-school students in Compiègne, in‌ the context of the Roberval prize event.
(Jan 2025) D. Roy and PY. Oudeyer, interview and presentation of the book "C'est (pas) moi, c'est l'IA", during a meeting with middle-school students organized by the Mollat bookshop.
(Feb. 2025) PY Oudeyer, Curiosité, cognition et intelligence artificielle : Comment mieux apprendre à apprendre ?, MIA Seconde conference series.
(Oct 2025) PY Oudeyer, IA générative,‌ société et éducation, for the conference on‌ AI organized by PolarIA at Fleurance, in the‌ context of the annual science festival.

11.3.4 Other science outreach relevant activities

Press:

PY. Oudeyer was‌ interviewed, or the work of the team was‌ discussed, in various newspapers, magazines and radios/podcasts:

Epsiloon (Oct. 2025), IA, et maintenant les robots !
Télérama (August 2025), Le risque, c’est l’affaiblissement de la pensée: quand l’IA met l’école à l’épreuve
Version Femina (August 2025), Comment leur apprendre à utiliser l’IA?
Télérama (May 2025), Intelligence artificielle : comment bien accompagner les enfants ?
Le Monde (Feb. 2025), L’IA à‌ l’école, une révolution déjà‌ en marche
Magazine de l'APEL (Jan 2025) PY. Oudeyer, L'IA, un outil pour la différenciation pédagogique, interview for the Magazine de l'APEL.

12 Scientific production

12.1‌ Major publications

1 Rania Abdelghani, Edith Law, Chloé Desvaux, Pierre-Yves Oudeyer and Hélène Sauzéon. Interactive environments for training children’s curiosity through the practice of metacognitive skills: a pilot study. IDC 2023 - The 22nd annual ACM Interaction Design and Children Conference, Chicago IL, United States, ACM, November 2023, 495-501.
2 Rania Abdelghani, Pierre-Yves Oudeyer, Edith Law, Catherine de Vulpillières and Hélène Sauzéon. Conversational agents for fostering curiosity-driven learning in children. International Journal of Human-Computer Studies 167, November 2022, 102887.
3 Rania Abdelghani, Yen-Hsiang Wang, Xingdi Yuan, Tong Wang, Pauline Lucas, Hélène Sauzéon and Pierre-Yves Oudeyer. GPT-3-driven pedagogical agents for training children's curious question-asking skills. International Journal of Artificial Intelligence in Education, June 2023.
4 Maxime Adolphe, Masataka Sawayama, Denis Maurel, Alexandra Delmas, Pierre-Yves Oudeyer and Hélène Sauzéon. An Open-Source Cognitive Test Battery to Assess Human Attention and Memory. Frontiers in Psychology 13, June 2022.
5 Adrien Baranes and Pierre-Yves Oudeyer. Active Learning of Inverse Models with Intrinsically Motivated Goal Exploration in Robots. Robotics and Autonomous Systems 61(1), January 2013, 69-73.
6 Thomas Carta, Clément Romac, Thomas Wolf, Sylvain Lamprier, Olivier Sigaud and Pierre-Yves Oudeyer. Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning. International Conference on Machine Learning 2023, 202, Honolulu, Hawaii, United States, 2023, 3676-3713.
7 Pierre-Antoine Cinquin, Pascal Guitton and Hélène Sauzéon. Towards Truly Accessible MOOCs for Persons with Cognitive Impairments: a Field Study. Human-Computer Interaction, 2021.
8 Cédric Colas, Pierre Fournier, Olivier Sigaud, Mohamed Chetouani and Pierre-Yves Oudeyer. CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning. International Conference on Machine Learning, Long Beach, United States, June 2019.
9 Cédric Colas, Tristan Karch, Nicolas Lair, Jean-Michel Dussoux, Clément Moulin-Frier, Peter Ford Dominey and Pierre-Yves Oudeyer. Language as a Cognitive Tool to Imagine Goals in Curiosity-Driven Exploration. NeurIPS 2020 - 34th Conference on Neural Information Processing Systems (contains main article and supplementaries), Vancouver / Virtual, Canada, December 2020.
10 Mayalen Etcheverry, Clément Moulin-Frier and Pierre-Yves Oudeyer. Hierarchically Organized Latent Modules for Exploratory Search in Morphogenetic Systems. NeurIPS 2020 - 34th Conference on Neural Information Processing Systems, Vancouver / Virtual, Canada, December 2020.
11 article‌M.Mayalen Etcheverry, C.Clément Moulin-Frier,‌ P.-Y.Pierre-Yves Oudeyer and M.Michael Levin.‌ AI-driven Automated Discovery Tools Reveal Diverse Behavioral Competencies‌ of Biological Networks.eLifeAugust 2024HAL‌DOI
12 articleS.Sébastien Forestier, R.‌Rémy Portelas, Y.Yoan Mollard and P.-Y.‌Pierre-Yves Oudeyer. Intrinsically Motivated Goal Exploration Processes‌ with Automatic Curriculum Learning.Journal of Machine‌ Learning ResearchApril 2022HAL back to text‌
13 inproceedingsL.Loris Gaven, T.Thomas‌ Carta, C.Clément Romac, C.Cédric‌ Colas, S.Sylvain Lamprier, O.Olivier‌ Sigaud and P.-Y.Pierre-Yves Oudeyer. MAGELLAN: Metacognitive‌ predictions of learning progress guide autotelic LLM agents‌ in large goal spaces.ICML 2025 -‌ 42nd International Conference on Machine Learning267Vancouver‌ (BC), Canada2025HAL
14 articleJ.Jacqueline‌ Gottlieb and P.-Y.Pierre-Yves Oudeyer. Towards a‌ neuroscience of active sampling and curiosity.Nature‌ Reviews Neuroscience1912December 2018, 758-770‌HAL
15 articleG.Gautier Hamon, M.‌Mayalen Etcheverry, B.-C. W.Bert Wang-Chak Chan‌, C.Clément Moulin-Frier and P.-Y.Pierre-Yves Oudeyer‌. Discovering Sensorimotor Agency in Cellular Automata using‌ Diversity Search.Science Advances 11442025‌HAL DOI
16 articleG.Grgur Kovač,‌ R.Rémy Portelas, M.Masataka Sawayama,‌ P. F.Peter Ford Dominey and P.-Y.Pierre-Yves‌ Oudeyer. Stick to your role! Stability of‌ personal values expressed in large language models.‌PLoS ONE198August 2024, e0309114‌HAL DOI
17 inproceedingsA.Adrien Laversanne-Finot,‌ A.Alexandre Péré and P.-Y.Pierre-Yves Oudeyer.‌ Curiosity Driven Exploration of Learned Disentangled Goal Spaces‌.CoRL 2018 - Conference on Robot Learning‌Zürich, SwitzerlandOctober 2018HAL
18 articleC.‌Cécile Mazon, B.Benjamin Clément, D.‌Didier Roy, P.-Y.Pierre-Yves Oudeyer and H.‌Hélène Sauzéon. Pilot study of an intervention‌ based on an intelligent tutoring system (ITS) for‌ instructing mathematical skills of students with ASD and/or‌ ID.Education and Information Technologies2022HAL‌DOI
19 inproceedingsE.Eleni Nisioti, K.‌Katia Jodogne-del Litto and C.Clément Moulin-Frier.‌ Grounding an Ecological Theory of Artificial Intelligence in‌ Human Evolution.NeurIPS 2021 - Conference on‌ Neural Information Processing Systems / Workshop: Ecological Theory‌ of Reinforcement Learningvirtual event, FranceDecember 2021‌HAL
20 inproceedings Eleni Nisioti, Elías Masquil, Gautier Hamon and Clément Moulin-Frier. Autotelic Reinforcement Learning in Multi-Agent Environments. CoLLAs 2023, Conference on Lifelong Learning Agents, Montréal, Canada, August 2023. HAL
21 inproceedingsE.Eleni‌ Nisioti, S.Sebastian‌ Risi, I.Ida‌‌ Momennejad, P.-Y.Pierre-Yves Oudeyer and C.Clément‌ Moulin-Frier. Collective Innovation‌ in Groups of Large‌‌ Language Models.ALIFE 2024 - The Conference‌ on Artificial LifeCopenhagen,‌ DenmarkMIT Press2024‌‌HAL DOI
22 inproceedingsA.Alexandre Péré,‌ S.Sébastien Forestier,‌ O.Olivier Sigaud and‌‌ P.-Y.Pierre-Yves Oudeyer. Unsupervised Learning of Goal‌ Spaces for Intrinsically Motivated‌ Goal Exploration.ICLR2018‌‌ - 6th International Conference on Learning RepresentationsVancouver,‌ CanadaApril 2018HAL‌
23 inproceedings Jérémy Perez, Grgur Kovač, Corentin Léger, Cédric Colas, Gaia Molinaro, Maxime Derex, Pierre-Yves Oudeyer and Clément Moulin-Frier. When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings. The Thirteenth International Conference on Learning Representations (ICLR 2025), Singapore, Singapore, 2025. HAL
24 inproceedings Erwan Plantec, Gautier Hamon, Mayalen Etcheverry, Pierre-Yves Oudeyer, Clément Moulin-Frier and Bert Wang-Chak Chan. Flow-Lenia: Towards open-ended evolution in cellular automata through mass conservation and parameter localization. The 2023 Conference on Artificial Life, Tokyo, Japan, MIT Press, July 2023. HAL DOI
25 inproceedings Rémy Portelas, Cédric Colas, Lilian Weng, Katja Hofmann and Pierre-Yves Oudeyer. Automatic Curriculum Learning For Deep RL: A Short Survey. IJCAI 2020 - International Joint Conference on Artificial Intelligence, Kyoto / Virtual, Japan, January 2021. HAL
26 article Matisse Poupard, Florian Larrue, Martin Bertrand, Dominique Liguoro, André Tricot and Hélène Sauzéon. Using virtual reality for enhancing neuroanatomy learning by optimizing cognitive load and intrinsic motivation. Computers and Education 235, October 2025, 105332. HAL DOI
27 inproceedings‌J.Julien Pourcel,‌‌ C.Cédric Colas, G.Gaia Molinaro,‌ P.-Y.Pierre-Yves Oudeyer and‌ L.Laetitia Teodorescu.‌‌ ACES: Generating diverse programming puzzles with autotelic language‌ models and semantic descriptors‌.NeurIPS 2024 -‌‌ The 38th Annual Conference on Neural Information Processing‌ SystemsVancouver, Canada2024‌HAL
28 articleJ.‌‌Julien Pourcel, C.Cédric Colas and P.-Y.‌Pierre-Yves Oudeyer. Self-Improving‌ Language Models for Evolutionary‌‌ Program Synthesis: A Case Study on ARC-AGI.‌Proceedings of Machine Learning‌ Research2025HAL DOI‌‌
29 inproceedings Chris Reinke, Mayalen Etcheverry and Pierre-Yves Oudeyer. Intrinsically Motivated Discovery of Diverse Patterns in Self-Organizing Systems. International Conference on Learning Representations (ICLR). Source code and videos at https://automated-discovery.github.io/. Addis Ababa, Ethiopia, April 2020. HAL
30 articleY.Yadurshana‌‌ Sivashankar, M.Myra Fernandes, P.-Y.Pierre-Yves‌ Oudeyer and H.Hélène‌ Sauzéon. The beneficial‌‌ role of curiosity on‌ route memory in children.Frontiers in Cognition‌3March 2024HALDOI
31 articleA.‌Alexandr Ten, P.Pramod Kaushik, P.-Y.‌Pierre-Yves Oudeyer and J.Jacqueline Gottlieb. Humans‌ monitor learning progress in curiosity-driven exploration.Nature‌ Communications121December 2021HAL DOI
32‌ inproceedingsZ.Ziang Xiao, X.Xingdi Yuan‌, Q. V.Q. Vera Liao, R.‌Rania Abdelghani and P.-Y.Pierre-Yves Oudeyer. Supporting‌ Qualitative Analysis with Large Language Models: Combining Codebook‌ with GPT-3 for Deductive Coding.IUI 2023‌ - 28th International Conference on Intelligent User Interfaces‌Sydney, AustraliaACMMarch 2023, 75-78HAL‌DOI
33 inproceedingsN.Nicolas Yax, P.-Y.‌Pierre-Yves Oudeyer and S.Stefano Palminteri. PhyloLM‌ : Inferring the Phylogeny of Large Language Models‌ and Predicting their Performances in Benchmarks.ICLR‌ 2025Singapore, Singapore2025HAL

12.2 Publications of‌ the year

International journals

34 articleG.Gautier‌ Hamon, M.Mayalen Etcheverry, B.-C. W.‌Bert Wang-Chak Chan, C.Clément Moulin-Frier and‌ P.-Y.Pierre-Yves Oudeyer. Discovering Sensorimotor Agency in‌ Cellular Automata using Diversity Search.Science Advances‌ 11442025HALDOI back to text‌back to text
35 article Eric Meyer, Hélène Sauzéon, Isabeau Saint-Supery and Cécile Mazon. Evaluating a Web-Based Application to Facilitate Family-School-Health Care Collaboration for Children With Neurodevelopmental Disorders in Inclusive Settings: Protocol for a Nonrandomized Trial. JMIR Research Protocols 14, April 2025, e63378. HAL DOI
36 articleM.Marion Pech, M.‌Maxime Adolphe, P.-Y.Pierre-Yves Oudeyer and H.‌Hélène Sauzéon. Broadening the lens: A review‌ of multi-object tracking task and its use in‌ cognitive training.Acta Psychologica258July 2025‌, 105271HAL DOIback to text
37‌ articleM.Matisse Poupard, F.Florian Larrue‌, M.Martin Bertrand, D.Dominique Liguoro‌, H.Hélène Sauzéon and A.André Tricot‌. From Movement to Learning: Leveraging VR Behavioral‌ Metrics to Evaluate Cognitive Load and Curiosity.‌International Journal of Human-Computer Studies209February 2026‌, 103751HAL DOIback to text
38 article Matisse Poupard, Florian Larrue, Martin Bertrand, Dominique Liguoro, André Tricot and Hélène Sauzéon. Using virtual reality for enhancing neuroanatomy learning by optimizing cognitive load and intrinsic motivation. Computers and Education 235, October 2025, 105332. HAL DOI
39 articleM.Matisse Poupard, F.Florian‌ Larrue, H.Hélène Sauzéon and A.André‌ Tricot. A systematic review of immersive technologies‌ for education: effects of cognitive load and curiosity‌ state on learning performance.British Journal of‌ Educational Technology5612025, 5-41HAL‌DOI back to textback to text
40‌ articleJ.Julien Pourcel, C.Cédric Colas‌ and P.-Y.Pierre-Yves Oudeyer. Self-Improving Language Models for Evolutionary Program Synthesis:‌ A Case Study on‌ ARC-AGI.Proceedings of‌‌ Machine Learning Research2025HAL DOI back to‌ text
41 articleI.‌Isabeau Saint-Supery, H.‌‌Hélène Sauzéon, E.Eric Meyer and C.‌Cécile Mazon. CoEd,‌ an interactive website for‌‌ the stakeholders of school inclusion of children with‌ ASD: an iterative design‌ including user testing.‌‌Technology, Pedagogy and EducationSeptember 2025, 1-23‌HAL DOI back to‌ text
42 articleA.‌‌Alexandr Ten, P.-Y.Pierre-Yves Oudeyer, M.‌Michiko Sakaki and K.‌Kou Murayama. The‌‌ Curious U : Integrating Theories Linking Knowledge and‌ Information-Seeking Behavior.Open‌ Mind9October 2025‌‌, 1763-1785HAL DOIback to text

International‌ peer-reviewed conferences

43 inproceedings‌M. S.Mohamed Salim‌‌ Aissi, C.Clément Romac, T.Thomas‌ Carta, S.Sylvain‌ Lamprier, P.-Y.Pierre-Yves‌‌ Oudeyer, O.Olivier Sigaud, L.Laure‌ Soulier and N.Nicolas‌ Thome. Reinforcement Learning‌‌ for Aligning Large Language Models Agents with Interactive‌ Environments: Quantifying and Mitigating‌ Prompt Overfitting.NAACL‌‌ 2025 - Findings of the Association for Computational‌ LinguisticsAlbuquerque, United States‌Association for Computational Linguistics‌‌May 2025, 7030-7046HAL DOI
44 inproceedings‌T.Timothé Boulet,‌ X.Xavier Hinaut and‌‌ C.Clément Moulin-Frier. Software Engineering Agents for‌ Embodied Controller Generation :‌ A Study in Minigrid‌‌ Environments.NeurIPS 2025 Efficient Reasoning WorkshopSan‌ Diego (CA), United States‌December 2025HAL back‌‌ to text back to text
45 inproceedings Marko Cvjetko, Gautier Hamon, Clément Moulin-Frier and Pierre-Yves Oudeyer. Discovering and Controlling Diverse Self-Organised Patterns in Cellular Automata Using Autotelic Reinforcement Learning. ALIFE 2025 - Conference on Artificial Life, Kyoto, Japan, October 2025. HAL
46‌ inproceedingsJ.Juliette Deyts‌, R.Rafik Belloum‌‌, L.Lucile Dupuy, L.Loïc Blouin‌ and H.Hélène Sauzéon‌. Towards an Interactive‌‌ Longitudinal Visualization of Daily Living Activities for Aging‌ in Place.IHM‌ 2025 - 36e Conférence‌‌ Internationale Francophone sur l’Interaction Humain-MachineToulouse, FranceNovember‌ 2025HAL back to‌ text
47 inproceedingsL.‌‌Loris Gaven, T.Thomas Carta, C.‌Clément Romac, C.‌Cédric Colas, S.‌‌Sylvain Lamprier, O.Olivier Sigaud and P.-Y.‌Pierre-Yves Oudeyer. MAGELLAN:‌ Metacognitive predictions of learning‌‌ progress guide autotelic LLM agents in large goal‌ spaces.ICML 2025‌ - 42nd International Conference‌‌ on Machine Learning267Vancouver (BC), Canada2025‌HAL back to text‌back to text back‌‌ to text
48 inproceedingsS.Sina Khajehabdollahi,‌ G.Gautier Hamon,‌ M.Marko Cvjetko,‌‌ P.-Y.Pierre-Yves Oudeyer, C.Clément Moulin-Frier and‌ C.Cédric Colas.‌ Expedition & Expansion: Leveraging‌‌ Semantic Representations for Goal-Directed Exploration in Continuous Cellular‌ Automata.ALIFE 2025‌ - Conference on Artificial‌‌ LifeKyoto, Japan2025HAL back to text‌back to text
49 inproceedings Grgur Kovač, Jérémy Perez, Rémy Portelas, Peter Ford Dominey and Pierre-Yves Oudeyer. Recursive Training Loops in LLMs: How training data properties modulate distribution shift in generated data? Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025), Suzhou, China, Association for Computational Linguistics, November 2025, 32278-32297. HAL DOI
50‌ inproceedingsA.Aleksa Marusic, S. M.Sao‌ Mai Nguyen and A.Adriana Tapus. Skeleton-Based‌ Transformer for Classification of Errors and Better Feedback‌ in Low Back Pain Physical Rehabilitation Exercises.‌ICORR 2025 - 19th IEEE/RAS-EMBS International Conference on‌ Rehabilitation RoboticsHRI '23: Companion of the 2023‌ ACM/IEEE International Conference on Human-Robot InteractionMichigan, United‌ StatesIEEEMay 2025HAL
51 inproceedingsT.‌Thomas Michel, M.Marko Cvjetko, G.‌Gautier Hamon, P.-Y.Pierre-Yves Oudeyer and C.‌Clément Moulin-Frier. Exploring Flow-Lenia Universes with a‌ Curiosity-driven AI Scientist: Discovering Diverse Ecosystem Dynamics.‌Artificial Life Conference Proceedings 37ALIFE 2025 -‌ Conference on Artificial Life20251Kyoto /‌ Virtual, Japan2025, 68HAL DOI back‌ to text back to text
52 inproceedingsB.‌Bastien Morel, C.Clément Moulin-Frier and P.‌Pascal Barla. Complex System Exploration with Interactive‌ Human Guidance.WIVACE 2025 - XIX International‌ Workshop on Artificial Life and Evolutionary ComputationSiena,‌ Italy2025HAL back to text
53 inproceedings‌L.Léana Petiot, H.Hélène Sauzéon and‌ P.Pierre Dragicevic. The Effect of Augmented‌ Reality on Involuntary Autobiographical Memory.CHI 2025‌ - Conference on Human Factors in Computing Systems‌Yokohama, JapanApril 2025HAL DOI back to‌ text back to text
54 inproceedingsL.Léana‌ Petiot, H.Hélène Sauzéon and P.Pierre‌ Dragicevic. Using Visual Cues to Prevent Memory‌ Confusion Between the Virtual and the Real in‌ Augmented Reality.Late-breaking work / poster published‌ in the CHI extended abstracts (CHI EA ’25)‌CHI 2025 - Conference on Human Factors in‌ Computing SystemsYokohama, JapanApril 2025HAL DOI‌back to text
55 inproceedingsBest paperM.‌Max Taylor-Davies, G.Gautier Hamon, T.‌Timothé Boulet and C.Clément Moulin-Frier. Emergent‌ kin selection of altruistic feeding via non-episodic neuroevolution‌.Applications of Evolutionary ComputationInternational Conference on‌ the Applications of Evolutionary Computation (Part of EvoStar)‌Trieste, Italy2025HALback to text back‌ to text back to text
56 inproceedingsN.‌Nicolas Yax, P.-Y.Pierre-Yves Oudeyer and S.‌Stefano Palminteri. PhyloLM : Inferring the Phylogeny‌ of Large Language Models and Predicting their Performances‌ in Benchmarks.ICLR 2025Singapore, Singapore2025‌HAL back to text

National peer-reviewed conferences

57‌ inproceedingsS.Sofiya Kobylyanskaya, C.Catherine de‌ Vulpillères and P.-Y.Pierre-Yves Oudeyer. A hybrid‌ AI approach to educational technologies : augmenting ITS‌ with generative AI.Actes de l'atelier Intelligence Artificielle générative et ÉDUcation‌ : Enjeux, Défis et‌ Perspectives de Recherche 2025‌‌ (IA-ÉDU)20e Conférence en Recherche d’Information et Applications‌ (CORIA) 32ème Conférence sur‌ le Traitement Automatique des‌‌ Langues Naturelles (TALN) 27ème Rencontre des Étudiants Chercheurs‌ en Informatique pour le‌ Traitement Automatique des Langues‌‌ (RECITAL) Les 18e Rencontres Jeunes Chercheurs en RI‌ (RJCRI)Marseille, FranceATALA‌ et ARIA2025,‌‌ 145-148HAL

Conferences without proceedings

58 inproceedingsH.‌Hana Al-Mrayati, É.‌Éric Meyer, H.‌‌Hélène Sauzéon and C.Cécile Mazon. Value‌ of Parent-Professional partnerships in‌ inclusive education : evaluation‌‌ of a digital tool supporting Family-School-Healthcare collaboration for‌ students with NDD.‌EIAH 2025 - 12ème‌‌ Conférence sur les Environnements Informatiques pour l'Apprentissage Humain‌Villeneuve d’Ascq, FranceJune‌ 2025HAL back to‌‌ text
59 inproceedings Guillaume Levy, Cédric Colas, Pierre-Yves Oudeyer, Thomas Carta and Clément Romac. WorldLLM: Improving LLMs' World Modeling Using Curiosity-Driven Theory-Making. RLDM 2025 - Multi-disciplinary Conference on Reinforcement Learning and Decision Making, Dublin, Ireland, June 2025. HAL
60 inproceedings Jérémy Perez, Grgur Kovač, Corentin Léger, Cédric Colas, Gaia Molinaro, Maxime Derex, Pierre-Yves Oudeyer and Clément Moulin-Frier. When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings. The Thirteenth International Conference on Learning Representations (ICLR 2025), Singapore, Singapore, 2025. HAL

Doctoral dissertations‌ and habilitation theses

61‌‌ thesisG.Gautier Hamon. Towards open-ended dynamics‌ in Artificial Life and‌ Artificial Intelligence : an‌‌ eco-evo-devo perspective.Université de BordeauxMarch 2025‌HAL
62 thesisG.‌Grgur Kovač. Building,‌‌ evaluating and understanding socio-cultural AI : leveraging concepts‌ and methods from human‌ sciences.Université de‌‌ BordeauxNovember 2025HAL
63 thesis Matisse Poupard. Curious and therefore not overloaded: Towards an integrated understanding of curiosity and cognitive load in XR learning environments. Université de Bordeaux, September 2025. HAL

Reports & preprints

64 misc‌J.Jérémy Perez,‌ M.Maxime Derex,‌‌ P.-Y.Pierre-Yves Oudeyer and C.Clément Moulin-Frier.‌ Intrinsic motivation is key‌ to understanding peer cultures:‌‌ Commentary on target article: Lew-Levy, S. & Amir,‌ D. (2024). Children as‌ agents of cultural adaptation.‌‌ Behavioral and Brain Sciences, 1–68.November 2025‌HAL back to text‌
65 misc Matisse Poupard, Florian Larrue, Martin Bertrand, Dominique Liguoro, Hélène Sauzéon and André Tricot. Enhancing Anatomy Learning Through Mixed Reality-Supported Drawing: Investigating Learning Performance, Cognitive Load, and Intrinsic Motivation. April 2025. HAL
66 miscM.Matisse‌‌ Poupard, F.Florian Larrue, M.Martin‌ Bertrand, D.Dominique‌ Liguoro, H.Hélène‌‌ Sauzéon and A.André Tricot. Reducing Load,‌ Fostering Curiosity: Empirical Validation‌ of the IMCLM-XR.‌‌July 2025HAL back‌ to text

Scientific popularization

67 misc Clément Romac, Pierre-Yves Oudeyer and Thomas Carta. Can AIs understand our world? Functionally grounding LLMs in interactive environments. April 2025. HAL

12.3 Cited publications

68 inproceedingsR.Rania‌ Abdelghani, E.Edith Law, C.Chloé‌ Desvaux, P.-Y.Pierre-Yves Oudeyer and H.Hélène‌ Sauzéon. Interactive environments for training children's curiosity‌ through the practice of metacognitive skills : a‌ pilot study.IDC 2023 - The 22nd‌ annual ACM Interaction Design and Children ConferenceChicago‌ IL, United StatesACMJune 2023, 495-501‌HAL DOI back to text
69 articleR.‌Rania Abdelghani, P.-Y.Pierre-Yves Oudeyer, E.‌Edith Law, C.Catherine de Vulpillières and‌ H.Hélène Sauzéon. Conversational agents for fostering‌ curiosity-driven learning in children.International Journal of‌ Human-Computer Studies167November 2022, 102887HAL‌DOI back to textback to text
70 inproceedings Rania Abdelghani, Hélène Sauzéon and Pierre-Yves Oudeyer. Generative AI in the Classroom: Can Students Remain Active Learners? NeurIPS 2023 - GAIED Workshop - Conference on Neural Information Processing Systems, New Orleans, United States, arXiv, December 2023. HAL DOI
71 unpublishedM.Maxime Adolphe, M.Marion‌ Pech, M.Masataka Sawayama, D.Denis‌ Maurel, A.Alexandra Delmas, P.-Y.Pierre-Yves‌ Oudeyer and H.Hélène Sauzéon. Exploring the‌ Potential of Artificial Intelligence in Individualized Cognitive Training:‌ a Systematic Review.December 2023, working‌ paper or preprintHALDOI back to text‌
72 inproceedingsM.Mehdi Alaimi, E.Edith‌ Law, K. D.Kevin Daniel Pantasdo,‌ P.-Y.Pierre-Yves Oudeyer and H.Hélène Sauzéon.‌ Pedagogical Agents for Fostering Question-Asking Skills in Children‌.CHI '20 - CHI Conference on Human‌ Factors in Computing SystemsHonolulu / Virtual, United‌ StatesApril 2020HALDOI back to text‌
73 articleM.Mark Alfano, K.Kathryn‌ Iurino, P.Paul Stey, B.Brian‌ Robinson, M.Markus Christen, F.Feng‌ Yu and D.Daniel Lapsley. Development and‌ validation of a multi-dimensional measure of intellectual humility‌.PloS one1282017, e0182950‌back to text back to text
74 inproceedings‌A.Aurélien Appriou, J.Jessy Ceha,‌ S.Smeety Pramij, D.Dan Dutartre,‌ E.Edith Law, P.-Y.Pierre-Yves Oudeyer and‌ F.Fabien Lotte. Towards measuring states of‌ epistemic curiosity through electroencephalographic signals.IEEE SMC‌ 2020 - IEEE International conference on Systems, Man‌ and CyberneticsToronto / Virtual, CanadaOctober 2020‌HAL back to textback to text
75‌ inproceedingsP.Paul Barde, T.Tristan Karch‌, D.Derek Nowrouzezahrai, C.Clément Moulin-Frier‌, C.Christopher Pal and P.-Y.Pierre-Yves Oudeyer‌. Learning to Guide and to Be Guided‌ in the Architect-Builder Problem.International Conference on‌ Learning RepresentationsVirtual, FranceApril 2022HAL back to text
76 inproceedings‌J. C.Jonathan C.‌ Brant and K. O.‌‌Kenneth O. Stanley. Minimal Criterion Coevolution: A‌ New Approach to Open-Ended‌ Search.Proceedings of‌‌ the Genetic and Evolutionary Computation ConferenceGECCO '17‌2017, 67--74back‌ to text
77 article‌‌L.Levin Brinkmann, F.Fabian Baumann,‌ J.-F.Jean-François Bonnefon,‌ M.Maxime Derex,‌‌ T. F.Thomas F. Müller, A.-M.Anne-Marie‌ Nussberger, A.Agnieszka‌ Czaplicka, A.Alberto‌‌ Acerbi, T. L.Thomas L. Griffiths,‌ J.Joseph Henrich,‌ J. Z.Joel Z.‌‌ Leibo, R.Richard McElreath, P.-Y.Pierre-Yves‌ Oudeyer, J.Jonathan‌ Stray and I.Iyad‌‌ Rahwan. Machine Culture.Nature Human Behaviour‌711November 2023‌, 1855--1868DOI back‌‌ to text
78 articleM.Mauricio Cantor,‌ M.Michael Chimento,‌ S. Q.Simeon Q‌‌ Smeele, P.Peng He, D.Danai‌ Papageorgiou, L. M.‌Lucy M Aplin and‌‌ D. R.Damien R Farine. Social network‌ architecture and the tempo‌ of cumulative cultural evolution‌‌.9back to text
79 articleT.‌Thomas Carta, C.‌Clément Romac, T.‌‌Thomas Wolf, S.Sylvain Lamprier, O.‌Olivier Sigaud and P.-Y.‌Pierre-Yves Oudeyer. Grounding‌‌ large language models in interactive environments with online‌ reinforcement learning.arXiv‌ preprint arXiv:2302.026622023back‌‌ to text
80 article Jessy Ceha, Edith Law, Dana Kulić, Pierre-Yves Oudeyer and Didier Roy. Identifying Functions and Behaviours of Social Robots for In-Class Learning Activities: Teachers' Perspective. International Journal of Social Robotics, September 2021. HAL DOI
81 proceedings Lenia and Expanded Universe. ALIFE 2020: The 2020 Conference on Artificial Life; ALIFE 2021: The 2021 Conference on Artificial Life, July 2020, 221-229. URL: https://doi.org/10.1162/isal_a_00297 DOI
82 article‌B.-C. W.Bert Wang-Chak‌ Chan. Lenia-biology of‌‌ artificial life.Complex Systems2832019‌, 251-286back to‌ text
83 miscF.‌‌François Chollet. On the Measure of Intelligence‌.November 2019DOI‌back to text
84‌‌ articleJ.Junyi Chu and L. E.Laura‌ E. Schulz. Play,‌ Curiosity, and Cognition.‌‌Annual Review of Developmental Psychology212020‌, 317-343URL: https://doi.org/10.1146/annurev-devpsych-070120-014806‌DOI back to text‌‌
85 articleJ.Junyi Chu, J. B.‌Joshua B. Tenenbaum and‌ L. E.Laura E.‌‌ Schulz. In Praise of Folly: Flexible Goals‌ and Human Cognition.‌Trends in Cognitive Sciences‌‌287July 2024, 628--642DOI back‌ to text
86 phdthesis‌B.Benjamin Clément.‌‌ Adaptive Personalization of Pedagogical Sequences using Machine Learning‌.Université de Bordeaux‌December 2018HAL back‌‌ to text back to text
87 articleB.‌Benjamin Clément, D.‌Didier Roy, P.-Y.‌‌Pierre-Yves Oudeyer and M.Manuel Lopes. Multi-Armed‌ Bandits for Intelligent Tutoring‌ Systems.Journal of‌‌ Educational Data Mining (JEDM)‌72June 2015, 20--48HAL back‌ to text back to text
88 inproceedingsC.‌Cédric Colas, T.Tristan Karch, N.‌Nicolas Lair, J.-M.Jean-Michel Dussoux, C.‌Clément Moulin-Frier, P.Peter Dominey and P.-Y.‌Pierre-Yves Oudeyer. Language as a Cognitive Tool‌ to Imagine Goals in Curiosity Driven Exploration.‌Advances in Neural Information Processing Systems33Curran‌ Associates, Inc.2020, 3761--3774URL: https://proceedings.neurips.cc/paper/2020/hash/274e6fcf4a583de4a81c6376f17673e7-Abstract.htmlback‌ to text
89 articleC.Cédric Colas,‌ T.Tristan Karch, C.Clément Moulin-Frier and‌ P.-Y.Pierre-Yves Oudeyer. Language and culture internalization‌ for human-like autotelic AI.412December‌ 2022, 1068--1076URL: https://doi.org/10.1038/s42256-022-00591-4DOI back to‌ text back to text
90 articleC.Cédric‌ Colas, T.Tristan Karch, O.Olivier‌ Sigaud and P.-Y.Pierre-Yves Oudeyer. Autotelic Agents‌ with Intrinsically Motivated Goal-Conditioned Reinforcement Learning: A Short‌ Survey.Journal of Artificial Intelligence Research74‌July 2022, 1159--1199URL: https://www.jair.org/index.php/jair/article/view/13554DOI back‌ to text
91 unpublishedC.Cédric Colas,‌ T.Tristan Karch, O.Olivier Sigaud and‌ P.-Y.Pierre-Yves Oudeyer. Intrinsically Motivated Goal-Conditioned Reinforcement‌ Learning: a Short Survey.January 2021,‌ working paper or preprintHAL back to text‌
92 article Enrico Sandro Colizzi, Renske MA Vroomans and Roeland MH Merks. Evolution of multicellularity by collective integration of spatial information. eLife 9, October 2020, e56349. URL: https://doi.org/10.7554/eLife.56349 DOI
93 articleG.Guy Davidson,‌ G.Graham Todd, J.Julian Togelius,‌ T. M.Todd M. Gureckis and B. M.‌Brenden M. Lake. Goals as Reward-Producing Programs‌.Nature Machine Intelligence72February 2025‌, 205--220DOI back to text back to‌ text
94 articleM.Maxime Derex and R.‌Robert Boyd. Partial connectivity increases cultural accumulation‌ within groups.Proceedings of the National Academy‌ of Sciences11311March 2016, 2982--2987‌URL: http://www.pnas.org/lookup/doi/10.1073/pnas.1518798113DOI back to text back to‌ text
95 articleM.Maxime Derex and A.‌Alex Mesoudi. Cumulative Cultural Evolution within Evolving‌ Population Structures.Trends in Cognitive Sciences24‌82020, 654--667DOI back to text‌
96 phdthesisM.Mayalen Etcheverry. Curiosity-driven AI‌ for Science : Automated Discovery of Self-Organized Structures‌.Université de BordeauxNovember 2023HAL back‌ to text back to text
97 miscM.‌Mayalen Etcheverry. Intrinsically Motivated Discovery of Diverse‌ Patterns in Self-Organizing Systems.Self-organisation occurs in‌ many physical, chemical and biological systems, as well‌ as in artificial systems like the Game of‌ Life. Yet, these systems are still full of‌ mysteries and we are far from fully grasping‌ what structures can self-organize, how to represent and‌ classify them, and how to predict their evolution.‌ In this blog post, we present our recent‌ paper which formulates the problem of automated discovery‌ of diverse self-organized patterns in such systems. Using a continuous Game of‌ Life as a testbed,‌ we show how intrinsically-motivated‌‌ goal exploration processes, initially developed for learning of‌ inverse models in robotics,‌ can efficiently be transposed‌‌ to this novel application area.March 2020HAL‌back to text
98‌ inproceedingsM.Mayalen Etcheverry‌‌, C.Clément Moulin-Frier and P.-Y.Pierre-Yves Oudeyer‌. Hierarchically Organized Latent‌ Modules for Exploratory Search‌‌ in Morphogenetic Systems.NeurIPS 2020 - 34th‌ Conference on Neural Information‌ Processing SystemsVancouver /‌‌ Virtual, CanadaDecember 2020HAL back to text‌
99 articleM.Mayalen‌ Etcheverry, C.Clément‌‌ Moulin-Frier, P.-Y.Pierre-Yves Oudeyer and M.Michael‌ Levin. AI-driven Automated‌ Discovery Tools Reveal Diverse‌‌ Behavioral Competencies of Biological Networks.eLifeAugust‌ 2024HAL DOI back‌ to text back to‌‌ text back to text
100 miscM.Maxence‌ Faldor, J.Jenny‌ Zhang, A.Antoine‌‌ Cully and J.Jeff Clune. OMNI-EPIC: Open-endedness‌ via Models of human‌ Notions of Interestingness with‌‌ Environments Programmed in Code.2025, URL:‌ https://arxiv.org/abs/2405.15568back to text‌
101 articleM.Maxime‌‌ Gasse, D.Damien Grasset, G.Guillaume‌ Gaudron and P.-Y.Pierre-Yves‌ Oudeyer. Using Confounded‌‌ Data in Latent Model-Based Reinforcement Learning.Transactions‌ on Machine Learning Research‌ JournalAugust 2023HAL‌‌back to text
102 articleJ.Jacqueline Gottlieb‌, P.-Y.Pierre-Yves Oudeyer‌, M.Manuel Lopes‌‌ and A.Adrien Baranes. Information-seeking, curiosity, and‌ attention: computational and neural‌ mechanisms.Trends in‌‌ Cognitive Sciences1711November 2013, 585-93‌HAL DOI back to‌ text
103 articleL.‌‌Louise Goupil and J.Joëlle Proust. Curiosity‌ as a Metacognitive Feeling‌.Cognition231February‌‌ 2023, 105325DOIback to text
104‌ articleJ. P.J.‌ P. Guilford. Creativity:‌‌ Yesterday, Today, and Tomorrow.The Journal of‌ Creative Behavior11‌1967, 3--14DOI‌‌back to text
105 articleY.Yejia Guo‌ and B.Baker Ayoun‌. What's in It‌‌ for Them? The Role of Social Curiosity and‌ Social Needs in Motivating‌ and Retaining Hospitality Employees‌‌.International Journal of Hospitality Management1152023‌, 1--12DOI back‌ to text back to‌‌ text
106 articleL. P.Lydia Paine Hagtvedt‌, K.Karyn Dossinger‌, S. H.Spencer‌‌ H. Harrison and L.Li Huang. Curiosity‌ Made the Cat More‌ Creative: Specific Curiosity as‌‌ a Driver of Creativity.Organizational Behavior and‌ Human Decision Processes150‌2019, 1--13DOI‌‌back to text
107 articleW. D.W.‌ D. Hamilton. The‌ genetical evolution of social‌‌ behaviour. I.Journal of Theoretical Biology7‌1July 1964,‌ 1--16URL: https://www.sciencedirect.com/science/article/pii/0022519364900384DOI‌‌back to text
108 article W. D. Hamilton. The genetical evolution of social behaviour. II. Journal of Theoretical Biology 7(1), 1964, 17-52. URL: https://www.sciencedirect.com/science/article/pii/0022519364900396 DOI
109 articleR. A.Ryan A. Hargrove and‌ J. L.John L.‌ Nietfeld. The Impact‌‌ of Metacognitive Instruction on‌ Creative Problem Solving.Journal of Experimental Education‌8332015, 291--318DOI back to‌ text
110 articleF.-M.Freda-Marie Hartung and B.‌Britta Renner. Perceived and Actual Social Discrimination:‌ The Case of Overweight and Social Inclusion.‌Frontiers in Psychology42013DOI back to‌ text back to text
111 articleJ.Joseph‌ Henrich, R.Robert Boyd, M.Maxime‌ Derex, M. A.Michelle A Kline,‌ A.Alex Mesoudi, M.Michael Muthukrishna,‌ A. T.Adam T Powell, S. J.‌Stephen J Shennan and M. G.Mark G‌ Thomas. Understanding cumulative cultural evolution.Proceedings‌ of the National Academy of Sciences11344‌2016, E6724--E6725back to text back to‌ text
112 articleX.Xiaoyu Jia, W.‌Weijian Li and L.Liren Cao. The‌ Role of Metacognitive Components in Creative Thinking.‌Frontiers in Psychology102019DOI back to‌ text
113 articleF.Frederic Kaplan and P.-Y.‌Pierre-Yves Oudeyer. In Search of the Neural‌ Circuits of Intrinsic Motivation.Frontiers in Neuroscience‌11October 2007, 225--236URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2518057/‌DOI back to text
114 articleT. B.‌Todd B. Kashdan, D. J.David J.‌ Disabato, F. R.Fallon R. Goodman and‌ P. E.Patrick E. McKnight. The Five-Dimensional‌ Curiosity Scale Revised (5DCR): Briefer Subscales While Separating‌ Overt and Covert Social Curiosity.Personality and‌ Individual Differences157April 2020, 109836DOI‌back to text back to text back to‌ text
115 articleT. B.Todd B Kashdan‌, M. C.Melissa C Stiksma, D.‌ J.David J Disabato, P. E.Patrick‌ E McKnight, J.John Bekier, J.‌Joel Kaji and R.Rachel Lazarus. The‌ five-dimensional curiosity scale: Capturing the bandwidth of curiosity‌ and identifying four unique subgroups of curious people‌.Journal of Research in Personality732018‌, 130--149back to text
116 articleH.‌Hiroaki Kitano. Biological robustness.Nature Reviews‌ Genetics5112004, 826--837back to‌ text
117 articleW.Wilma Koutstaal, K.‌Kara Kedrick and J.Joshua Gonzalez-Brito. Capturing,‌ Clarifying, and Consolidating the Curiosity-Creativity Connection.12‌1September 2022, 15300DOI back to‌ text back to text
118 unpublished Grgur Kovač, Rémy Portelas, Katja Hofmann and Pierre-Yves Oudeyer. SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents. October 2021, working paper or preprint. HAL
119 Elizabeth J. Krumrei-Mancuso, Megan C. Haggard, Jordan P. LaBouff and Wade C. Rowatt. Links between intellectual humility and acquiring knowledge. The Journal of Positive Psychology 15(2), 2020, 155--170.
120 Kevin N. Laland, Tobias Uller, Marcus W. Feldman, Kim Sterelny, Gerd B. Müller, Armin Moczek, Eva Jablonka and John Odling-Smee. The extended evolutionary synthesis: its structure, assumptions and predictions. Proceedings of the Royal Society B: Biological Sciences 282(1813), 2015, 20151019.
121 David Lazer and Allan Friedman. The Network Structure of Exploration and Exploitation. Administrative Science Quarterly 52(4), December 2007, 667--694. URL: http://journals.sagepub.com/doi/10.2189/asqu.52.4.667
122 Joel Z. Leibo, Vinicius Zambaldi, Marc Lanctot, Janusz Marecki and Thore Graepel. Multi-Agent Reinforcement Learning in Sequential Social Dilemmas. Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems (AAMAS '17), São Paulo, Brazil, 2017, 464--473.
123 Sheina Lew-Levy and Dorsa Amir. Children as Agents of Cultural Adaptation. The Behavioral and Brain Sciences, December 2024, 1--68.
124 Winter A. Mason, Andy Jones and Robert L. Goldstone. Propagation of innovations in networked groups. Journal of Experimental Psychology: General 137(3), 2008, 422--433. URL: http://doi.apa.org/getdoi.cfm?doi=10.1037/a0012798
125 Cécile Mazon, Benjamin Clément, Didier Roy, Pierre-Yves Oudeyer and Hélène Sauzéon. Pilot study of an intervention based on an intelligent tutoring system (ITS) for instructing mathematical skills of students with ASD and/or ID. Education and Information Technologies, 2022.
126 Cécile Mazon, Kattalin Etchegoyhen, Isabeau Saint-Supery, Anouck Amestoy, Manuel Bouvard, Charles Consel and Hélène Sauzéon. Fostering parents-professional collaboration for facilitating the school inclusion of students with ASD: Design of the "ToGather" web-based prototype. Educational Technology Research and Development, December 2021.
127 Cécile Mazon, Charles Fage and Hélène Sauzéon. Effectiveness and usability of technology-based interventions for children and adolescents with ASD: A systematic review of reliability, consistency, generalization and durability related to the effects of intervention. Computers in Human Behavior 93, April 2019.
128 Cécile Mazon and Hélène Sauzéon. Utilisation des technologies mobiles auprès des enfants avec TSA. Autisme et usages du numériques en éducation, 2022.
129 Eric Meyer, Hélène Sauzéon, Isabeau Saint-Supery and Cécile Mazon. Systematic review of technologies to collaborate and co-educate students with special educational needs and supporting their schooling. IHIET 2023 - 10th International Conference on Human Interaction and Emerging Technologies 111, Nice, France, AHFE International, August 2023, 1--12.
130 Andrea B. Migliano, Federico Battiston, Sylvain Viguier, Abigail E. Page, Mark Dyble, Rodolph Schlaepfer, Daniel Smith, Leonora Astete, Marilyn Ngales, Jesus Gomez-Gardenes, Vito Latora and Lucio Vinicius. Hunter-gatherer multilevel sociality accelerates cumulative cultural evolution. Science Advances 6(9), February 2020, eaax5913.
131 Clément Moulin-Frier. The Ecology of Open-Ended Skill Acquisition. PhD thesis, Université de Bordeaux (UB), December 2022.
132 Kou Murayama. A Reward-Learning Framework of Knowledge Acquisition: An Integrated Account of Curiosity, Interest, and Intrinsic--Extrinsic Rewards. Psychological Review 129(1), 2022, 175--198.
133 Kou Murayama, Lily FitzGibbon and Michiko Sakaki. Process Account of Curiosity and Interest: A Reward-Learning Perspective. Educational Psychology Review 31(4), December 2019, 875--895. URL: http://link.springer.com/10.1007/s10648-019-09499-9
134 Arun Nair, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, Alessandro De Maria, Vedavyas Panneershelvam, Mustafa Suleyman, Charles Beattie, Stig Petersen, Shane Legg, Volodymyr Mnih, Koray Kavukcuoglu and David Silver. Massively Parallel Methods for Deep Reinforcement Learning. Technical report, arXiv:1507.04296 [cs], July 2015. URL: http://arxiv.org/abs/1507.04296
135 Xinxiao Nie, Yuan Tian, Mengjie Liu, Di Wu and Yunxiao Guo. The impact of generative artificial intelligence on students' higher order thinking: Evidence from a three-level meta-analysis. Education and Information Technologies, 2025, 1--32.
136 Eleni Nisioti, Mateo Mahaut, Pierre-Yves Oudeyer, Ida Momennejad and Clément Moulin-Frier. Social Network Structure Shapes Innovation: Experience-sharing in RL with SAPIENS. July 2022, working paper or preprint.
137 Eleni Nisioti, Mateo Mahaut, Pierre-Yves Oudeyer, Ida Momennejad and Clément Moulin-Frier. Social Network Structure Shapes Innovation: Experience-sharing in RL with SAPIENS. arXiv:2206.05060 [cs], November 2022. URL: http://arxiv.org/abs/2206.05060
138 Eleni Nisioti, Sebastian Risi, Ida Momennejad, Pierre-Yves Oudeyer and Clément Moulin-Frier. Collective Innovation in Groups of Large Language Models. MIT Press, July 2024. URL: https://dx.doi.org/10.1162/isal_a_00730
139 Eleni Nisioti, Sebastian Risi, Ida Momennejad, Pierre-Yves Oudeyer and Clément Moulin-Frier. Collective Innovation in Groups of Large Language Models. ALIFE 2024 - The Conference on Artificial Life, Copenhagen, Denmark, MIT Press, July 2024.
140 Kenneth O. Stanley, Joel Lehman and Lisa Soros. Open-endedness: The last grand challenge you've never heard of. December 2017. URL: https://www.oreilly.com/radar/open-endedness-the-last-grand-challenge-youve-never-heard-of/
141 Pierre-Yves Oudeyer, F. Kaplan and V. Hafner. Intrinsic Motivation Systems for Autonomous Mental Development. IEEE Transactions on Evolutionary Computation 11(2), 2007, 265--286.
142 Pierre-Yves Oudeyer, Frédéric Kaplan and Véréna Hafner. Intrinsic Motivation for Autonomous Mental Development. IEEE Transactions on Evolutionary Computation 11(2), January 2007, 265--286.
143 P.-Y. Oudeyer, Jacqueline Gottlieb and Manuel Lopes. Intrinsic motivation, curiosity, and learning: Theory and applications in educational technologies. Progress in Brain Research 229, 2016, 257--284.
144 Jérémy Perez, Grgur Kovač, Corentin Léger, Cédric Colas, Gaia Molinaro, Maxime Derex, Pierre-Yves Oudeyer and Clément Moulin-Frier. When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings. The Thirteenth International Conference on Learning Representations (ICLR 2025), Singapore, April 2025.
145 Julien Perolat, Joel Z. Leibo, Vinicius Zambaldi, Charles Beattie, Karl Tuyls and Thore Graepel. A multi-agent reinforcement learning model of common-pool resource appropriation. Advances in Neural Information Processing Systems 30, 2017.
146 Richard Phillips. Curious about Others: Relational and Empathetic Curiosity for Diverse Societies. New Formations 88(88), March 2016, 123--142.
147 J. Piaget. The Language and Thought of the Child. Oxford, England, Harcourt, Brace, 1926, xxiii, 246.
148 Paul R. Pintrich, David A. F. Smith, Teresa Garcia and Wilbert J. McKeachie. A manual for the use of the Motivated Strategies for Learning Questionnaire (MSLQ). 1991.
149 Erwan Plantec, Gautier Hamon, Mayalen Etcheverry, Pierre-Yves Oudeyer, Clément Moulin-Frier and Bert Wang-Chak Chan. Flow-Lenia: Towards open-ended evolution in cellular automata through mass conservation and parameter localization. The 2023 Conference on Artificial Life, Tokyo, Japan, MIT Press, July 2023.
150 Julien Pourcel, Cédric Colas, Gaia Molinaro, Pierre-Yves Oudeyer and Laetitia Teodorescu. ACES: Generating diverse programming puzzles with autotelic language models and semantic descriptors. NeurIPS 2024 - The 38th Annual Conference on Neural Information Processing Systems, Vancouver, Canada, December 2024.
151 Rogelio Puente-Diaz and Judith Cavazos-Arroyo. Creative Metacognitive Feelings as a Source of Information for Creative Self-efficacy, Creativity Potential, Intrapersonal Idea Selection, and Task Enjoyment. The Journal of Creative Behavior 54(3), 2020, 499--507.
152 Justin K. Pugh, Lisa B. Soros and Kenneth O. Stanley. Quality Diversity: A New Frontier for Evolutionary Computation. Frontiers in Robotics and AI 3, 2016. URL: https://www.frontiersin.org/articles/10.3389/frobt.2016.00040
153 Isabeau Saint-Supery, Cécile Mazon, Eric Meyer and Hélène Sauzéon. Conception d'une application de soutien à la coéducation pour l'inclusion scolaire des élèves TSA. Éthiques inclusives en éducation. Recherches, contextes et pratiques (p. 145-160), Parentalité & Handicap, Champs Social, 2023, 260.
154 Isabeau Saint-Supery, Hélène Sauzéon, Christelle Maillart, Nicolas Neu, Eric Meyer and Cécile Mazon. Cross-cultural evaluation of a web application to support communication and collaboration among stakeholders of the school inclusion of children with ASD. AAATE 2023 - The 17th International Conference of the Association for the Advancement of Assistive Technology in Europe, AAATE, Paris, France, August 2023.
155 Isabeau Saint-Supery, Hélène Sauzéon, Eric Meyer and Cécile Mazon. ToGather, an interactive website for the stakeholders of school inclusion of children with ASD: an iterative design including user testing. 2022, working paper or preprint.
156 Nicola S. Schutte and John M. Malouff. Connections between Curiosity, Flow and Creativity. 152, January 2020, 109555.
157 Michael Shulman. Strange new universes: Proof assistants and synthetic foundations. Bulletin of the American Mathematical Society 61(2), 2024, 257--270.
158 Olivier Sigaud, Gianluca Baldassarre, Cedric Colas, Stephane Doncieux, Richard Duro, Pierre-Yves Oudeyer, Nicolas Perrin-Gilbert and Vieri Giuliano Santucci. A Definition of Open-Ended Learning Problems for Goal-Conditioned Agents. June 2024.
159 Yadurshana Sivashankar, Myra Fernandes, Pierre-Yves Oudeyer and Hélène Sauzéon. The Beneficial Role of Curiosity on Route memory in Children. January 2024, working paper or preprint.
160 J. Maynard Smith. Group selection and kin selection. Nature 201, 1964, 1145--1147. URL: https://doi.org/10.1038/2011145a0
161 Kenneth O. Stanley, Jeff Clune, Joel Lehman and Risto Miikkulainen. Designing Neural Networks through Neuroevolution. Nature Machine Intelligence 1(1), January 2019, 24--35.
162 Adam Tapal, Ela Oren, Reuven Dar and Baruch Eitam. The sense of agency scale: A measure of consciously perceived control over one's mind, body, and the immediate environment. Frontiers in Psychology 8, 2017, 1552.
163 Alexandr Ten, Pierre-Yves Oudeyer and Clément Moulin-Frier. Curiosity-Driven Exploration: Diversity of Mechanisms and Functions. The Drive for Knowledge: The Science of Human Information Seeking, 2022.
164 Daryl R. Van Tongeren, Vincent Ng, Louis Hickman and Louis Tay. Behavioral measures of humility: Part 1. Theoretical and methodological review. The Journal of Positive Psychology 18(5), 2023, 711--721.
165 John P. Veillette, Letitia Ho and Howard C. Nusbaum. Metacognition bridges experiences and beliefs in sense of agency. Consciousness and Cognition 124, 2024, 103745.
166 Stéphan Vincent-Lancrin and Reyer Van der Vlies. Trustworthy artificial intelligence (AI) in education: Promises and challenges. OECD Education Working Papers 218, 2020, 0_1--17.
167 Sophie Von Stumm and Phillip L. Ackerman. Investment and intellect: a review and meta-analysis. Psychological Bulletin 139(4), 2013, 841.
168 Lev Semenovich Vygotsky and Michael Cole. Mind in society: Development of higher psychological processes. Harvard University Press, 1978.
169 Dennis Whitcomb, Heather Battaly, Jason Baehr and Daniel Howard-Snyder. Intellectual humility. Philosophy and Phenomenological Research 94(3), 2017, 509--539.
170 Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani and Pierre-Yves Oudeyer. Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding. IUI 2023 - 28th International Conference on Intelligent User Interfaces, Sydney, Australia, ACM, March 2023, 75--78.

