2025 Activity report: Project-Team FLOWERS
RNSR: 200820949R
- Research center: Inria Centre at the University of Bordeaux
- In partnership with: Ecole nationale supérieure des techniques avancées - Institut polytechnique de Paris, Université de Bordeaux
- Team name: FLOW in Exploration, leaRning, and diScovery
Creation of the Project-Team: 2025 March 01
Keywords
Computer Science and Digital Science
- A5.1.1. Engineering of interactive systems
- A5.1.2. Evaluation of interactive systems
- A5.1.4. Brain-computer interfaces, physiological computing
- A5.1.5. Body-based interfaces
- A5.1.6. Tangible interfaces
- A5.1.7. Multimodal interfaces
- A5.8. Natural language processing
- A5.10.5. Robot interaction (with the environment, humans, other robots)
- A5.10.8. Cognitive robotics and systems
- A6.3.1. Inverse problems
- A9.2. Machine learning
- A9.4. Natural language processing
- A9.5. Robotics and AI
- A9.7. AI algorithmics
- A9.9. Distributed AI, Multi-agent
- A9.10. Hybrid approaches for AI
- A9.11. Generative AI
- A9.12.1. Object recognition
- A9.12.2. Activity recognition
- A9.13. Agentic AI
- A9.14. Evaluation of AI models
- A9.16. Societal impact of AI
Other Research Topics and Application Domains
- B1.2.1. Understanding and simulation of the brain and the nervous system
- B1.2.2. Cognitive science
- B5.8. Learning and training
- B9. Society and Knowledge
- B9.1. Education
- B9.1.1. E-learning, MOOC
- B9.1.2. Serious games
- B9.2. Art
- B9.2.1. Music, sound
- B9.2.4. Theater
- B9.6. Humanities
- B9.6.1. Psychology
- B9.6.8. Linguistics
- B9.7. Knowledge dissemination
1 Team members, visitors, external collaborators
Research Scientists
- Pierre-Yves Oudeyer [Team leader, INRIA, Senior Researcher, HDR]
- Clément Moulin-Frier [INRIA, Researcher, until Apr 2025]
- Hélène Sauzéon [INRIA, Professor Detachement, HDR]
Faculty Member
- Cécile Mazon [UNIV BORDEAUX, Associate Professor]
Post-Doctoral Fellows
- Olivier Clerc [INRIA, Post-Doctoral Fellow]
- Cedric Colas [INRIA, Post-Doctoral Fellow]
- Sina Khajehabdollahi [INRIA, Post-Doctoral Fellow, from Apr 2025 until Jul 2025]
- Marion Pech [INRIA, Post-Doctoral Fellow, until Mar 2025]
- Leslie Tricoche [UNIV BORDEAUX, from Oct 2025]
PhD Students
- Timothe Boulet [INRIA]
- Thomas Carta [INRIA, from Feb 2025 until Nov 2025]
- Thomas Carta [UNIV BORDEAUX, until Jan 2025]
- Marko Cvjetko [UNIV BORDEAUX]
- Marie-Sarah Desvaux [UNIV BORDEAUX]
- Juliette Deyts [UNIV BORDEAUX]
- Loris Gaven [INRIA]
- Gautier Hamon [INRIA, until Apr 2025]
- Sina Khajehabdollahi [INRIA, until Mar 2025]
- Grgur Kovac [INRIA, from Aug 2025 until Nov 2025]
- Grgur Kovac [INRIA, until Jul 2025]
- Jeremy Perez [UNIV BORDEAUX]
- Matisse Poupard [UNIV BORDEAUX, from Sep 2025]
- Matisse Poupard [INRIA, from Apr 2025 until Jul 2025]
- Matisse Poupard [CATIE, CIFRE, until Mar 2025]
- Julien Pourcel [INRIA]
- Clément Romac [HUGGING FACE SAS, CIFRE]
- Julien Rosenberger [INRIA, from Oct 2025]
- Julien Rosenberger [UCL, from May 2025 until Sep 2025]
- Isabeau Saint-Supery [UNIV BORDEAUX, until Apr 2025]
- Paul Tabbara [INRIA, from Dec 2025]
- Nicolas Yax [ENS Paris]
Technical Staff
- Camille Anthounet [UNIV BORDEAUX, Engineer, from Nov 2025]
- Zacharie Bugaud [INRIA, Engineer, until Jan 2025]
- Ludovic Matar [INRIA, Engineer, from Feb 2025]
Interns and Apprentices
- Hana Al Mrayati [UNIV BORDEAUX, Intern, from Mar 2025 until Jun 2025]
- Loic Blouin [UNIV BORDEAUX, Intern, from May 2025 until Aug 2025]
- Sophie Lepennetier [UNIV PADOVA, Intern, from Sep 2025 until Nov 2025]
- Eliott Poisson [UNIV BORDEAUX, Intern, from May 2025 until Jul 2025]
- Paul Tabbara [INRIA, Intern, from May 2025 until Nov 2025]
- Kan Yao [INRIA, Intern, from Mar 2025 until Jul 2025]
Administrative Assistants
- Fabienne Cuyollaa [INRIA]
- Nathalie Robin [INRIA]
External Collaborators
- Eleni Nisioti [UNIV COPENHAGUE]
- Didier Roy [EPFL]
2 Overall objectives
Abstract: This project-team aims to study the fundamental mechanisms that can enable open-ended learning and development in humans and machines, i.e. how individuals, or groups of individuals, can continuously discover and learn novel skills of increasing complexity. We also aim to leverage this fundamental understanding for human-centered real-world applications in education and in assisted scientific discovery.
In particular, we focus on studying mechanisms enabling Autotelic and Aligned Intelligence in humans and machines. A first key ingredient of open-ended learning is curiosity-driven autotelic learning, which is the ability of individuals to set and pursue their own goals (from the Greek ‘telos’/goal, and ‘auto’/self), a form of intrinsic motivation pushing organisms to continuously seek new knowledge and skills, self-organizing their own learning curriculum, using meta-cognition and leading to creative exploration.
To enable abstraction, collective intelligence, and alignment of autotelic systems with human cultures (values, preferences), we also aim to study how language and social interaction, both as a communication system and as a cognitive tool, can guide autotelic exploration. Symmetrically, using multi-scale models, we aim to study how curiosity-driven autotelic exploration could self-organize at the group level. We also aim to study the ecosystemic and evolutionary origins of autotelic systems.
Context: Humans continuously explore, learn and discover novel skills and knowledge through open-ended processes. In fact, humans, and some other life forms, are equipped with intrinsic motivation systems (“curiosity”) pushing them to spontaneously explore and actively seek new knowledge 102, setting and pursuing their own goals (they are autotelic 90), ranging from the most concrete (e.g. stack cubes) to the most abstract (e.g. invent new maths problems). This happens at the level of individuals, starting with children who eagerly and spontaneously explore their bodies and their environment as they develop, up to adults of all ages and all backgrounds. This autotelic exploration process also benefits from social and collective dynamics, leveraging past discoveries and being guided to align with the culture (values, preferences, ethics) of a given group 111. The iteration of this collective intelligence process, accumulating and transmitting discoveries over generations, gives rise to open-ended cultural evolution, and to autotelic exploration at the level of collectives. Understanding the mechanisms that enable the origins and functionalities of autotelic learning in interaction with social groups and culture, giving rise to open-endedness, is still a major mystery for science.
We study these mechanisms from four complementary scientific perspectives, structured into three main objectives:
- Objective 1: Improve understanding of human autotelic and aligned intelligence.
A first objective of this project is to advance our fundamental understanding of the origins, mechanisms and functionalities of autotelic learning and exploration, and of how it interacts and aligns with collective dynamics. This will involve a combination of computational models for developing new theories and hypotheses, as well as the design of new human experimental paradigms analysed with these computational models. Particular scientific questions we target include the links between autotelic learning, metacognition (one’s own ability to know and control one’s own knowledge and cognitive functions) and creativity in humans, e.g. how do these skills develop in childhood and which internal and external factors influence them? How do they link to language and processes of social interaction?
- Objective 2: Building curiosity-driven autotelic and aligned AI systems.
Our second major objective will be to build and study curiosity-driven autotelic artificial agents that learn by interacting with external environments and within socio-cultural collectives. To do this, we leverage and extend state-of-the-art deep reinforcement learning algorithms and transform them into autotelic RL systems. Crucially, we will also study how algorithms for autotelic learning can be made better aligned (teachable, driveable), more robust and more creative by using language both as a tool for social interaction and as a cognitive tool to make abstractions and leverage knowledge acquired by others. To achieve this, we will use pre-trained generative AI models as a cognitive tool bootstrap, enabling us to address the poor sample efficiency (i.e. the need for large amounts of environment interaction) and poor generalisation of classical (autotelic) RL algorithms. In addition, autotelic architectures will enable us to achieve incremental grounding and alignment of generative AI models with external physical and social dynamics. To further improve abstraction and generalisation, we aim to establish links between autotelic learning and program synthesis techniques, whereby 1) autotelic generative models will self-improve their coding abilities by setting learnable coding problems of increasing complexity; 2) we will use code for autotelic procedural self-generation of environments, tasks and policies. Because of its expressivity and abstractness, we believe working in code spaces will open new perspectives on open-endedness. We also aim to study how autotelic algorithms, using the learning progress theory, will enable frugal adaptation of generative models thanks to automatic curriculum learning.
Finally, we aim to study how groups of autotelic language-augmented agents can work together and give rise to higher-order forms of autotelic collective intelligence, using multi-agent autotelic reinforcement learning techniques and measurement tools from the field of cultural evolution to track collective innovations.
- Objective 3: Applications in education and assisted scientific discovery.
Objective 3.1: Train curiosity-driven autotelic learning in humans across the lifespan. We aim to develop educational technologies and interventions that help children and adults across the lifespan to learn in ways that are more motivating and more efficient, for example by stimulating curiosity and meta-cognition and using both models of curiosity and generative AI. Our approach combines 1) an interdisciplinary perspective drawing on cognitive science, educational sciences and machine learning; 2) a user-centric approach with real-world field studies, in particular with real classrooms in the French educational system, or with field studies with adult or ageing populations; 3) consideration of both neurotypical and neurodiverse populations. Beyond directly training curiosity and meta-cognition, and given their transversal role, we also aim to study how personalised training techniques (e.g. using adaptive curricula with algorithms that maximize learning progress measures) can enable more efficient and more motivating training of disciplinary skills (e.g. maths, languages) and other cognitive dimensions (attention, working memory, etc.). Beyond showing effective impact in RCTs (randomized controlled trials), our objective is that our techniques and interventions be used at large scale in the real world. To achieve this, we will combine focused collaboration with educational institutions and the edTech industry with user-centered design of open-licence pedagogical material that will aim to be directly and easily reusable by teachers. We also aim to help public action in this domain, through interaction with and advice to national and European public institutions.
Objective 3.2: Assisted scientific discovery with autotelic exploration algorithms. Some of the greatest scientific challenges include the study and design of novel materials, molecules or networks with complex dynamics, where the space of possible self-organized behaviour is often initially mostly unknown, and the space of parameters very large, making exploration and discoveries very costly and difficult for physicists, chemists or biologists. We aim to study and show how autotelic aligned exploration algorithms can be used as powerful discovery assistants in these contexts. We believe they have specific capabilities making them highly relevant for this application: they are made to explore and discover in a sample efficient manner a high diversity of behaviours in complex systems (autotelic), while being driveable so that scientists can drive them in directions of interest (aligned). To maximize diversity, we aim to develop methods learning a diversity of goal representations (autotelic and quality-diversity exploration using meta-diversity search). To enable abstractness and high-level guidance from human scientists (e.g. to provide feedback on measures of interestingness), we aim to leverage language and multimodal generative models. To make fast progress in this direction, we first aim to use artificial life environments, such as continuous cellular automata, as an experimental domain, aiming to use autotelic exploration algorithms to help discover the origins of autopoietic systems (and even autotelic systems self-organised from the ground up) as well as study how evolutionary processes themselves could self-organise. For further real world impact, we aim to develop collaborations with physics/chemistry/biology academic labs, as well as various industrial companies working on the design of new physical or biomolecular systems.
Beyond core scientific questions across disciplines, this project addresses two key societal challenges: 1) How can we build AI systems that serve humans and human societies in their diversity, helping their curiosity and cultures to bloom? 2) How can we provide educational opportunities for all children, and adults across the lifespan, in a world with many challenges, to become intrinsically motivated learners, critical thinkers, autotelic explorers?
3 Research program
3.1 Background:
Around the mid-20th century, psychologists started studying the hypothesis that humans, and some other animals, are endowed with mechanisms of intrinsic motivation, also called “curiosity” in everyday language, leading them to spontaneously explore novel activities for their own sake. Such curiosity-driven exploration processes were hypothesised to play important roles in learning, both in cognitive and educational sciences. However, until the start of the 21st century, research into the underlying mechanisms remained very scarce. This also explains why such mechanisms were overlooked in machine learning and robotics.
In the first years of the 2000s, several labs in the world began studying these mechanisms by proposing various computational theories and hypotheses. Among these groups, Pierre-Yves Oudeyer and his colleagues, first at Sony CSL Paris and then at Inria Bordeaux, proposed several theoretical ideas and techniques to build some of the foundations of a new emerging field studying curiosity at the crossroads of AI, machine learning, cognitive sciences, psychology and neuroscience. In particular, one major contribution has been the development of the Learning Progress Hypothesis (LPH), proposing that human brains are intrinsically motivated to explore activities with high learning progress, leveraging meta-cognitive processes and leading to the self-organisation of efficient learning curricula 113, 143. A second major contribution has been the development of a theoretical framework to account for autotelic learning, a form of learning where individuals learn to represent, sample and pursue their own goals 141, 89.
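To make the idea of preferring activities with high learning progress more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the team's model: the class name `LearningProgressSampler`, the sliding window, the use of absolute progress (newer half of the window minus older half), and the epsilon-random exploration are all hypothetical choices made for this example.

```python
import random
from collections import deque


class LearningProgressSampler:
    """Choose activities in proportion to their absolute learning progress.

    Hypothetical illustration: window size, epsilon and the progress
    measure are assumptions of this sketch.
    """

    def __init__(self, n_activities, window=20, epsilon=0.2, seed=0):
        self.epsilon = epsilon  # share of purely random choices
        self.rng = random.Random(seed)
        # One bounded history of recent outcomes (0.0 to 1.0) per activity.
        self.history = [deque(maxlen=window) for _ in range(n_activities)]

    def learning_progress(self, activity):
        """Absolute gap between the mean outcome of the newer and older half-window."""
        h = list(self.history[activity])
        if len(h) < 4:
            return 0.0
        half = len(h) // 2
        older, newer = h[:half], h[half:]
        return abs(sum(newer) / len(newer) - sum(older) / len(older))

    def sample(self):
        n = len(self.history)
        lps = [self.learning_progress(a) for a in range(n)]
        # Fall back to a uniform choice when no progress is measurable yet.
        if sum(lps) == 0.0 or self.rng.random() < self.epsilon:
            return self.rng.randrange(n)
        return self.rng.choices(range(n), weights=lps, k=1)[0]

    def update(self, activity, outcome):
        self.history[activity].append(float(outcome))


def simulate(steps=600, seed=0):
    """Toy learner: activity 0 is mastered, 1 is impossible, 2 improves with practice."""
    rng = random.Random(seed)
    sampler = LearningProgressSampler(n_activities=3, seed=seed)
    practice = [0, 0, 0]
    for _ in range(steps):
        a = sampler.sample()
        practice[a] += 1
        if a == 0:
            p_success = 1.0
        elif a == 1:
            p_success = 0.0
        else:
            p_success = min(1.0, practice[a] / 200)
        sampler.update(a, 1.0 if rng.random() < p_success else 0.0)
    return practice
```

Calling `simulate()` returns the number of times each toy activity was practised. In this setup the mastered and the impossible activities yield no measurable progress, so the sampler spends most of its choices on the learnable one until it is mastered as well.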
Based on several proof-of-concept studies of these computational theories 142, the Flowers team was founded in 2011 by Pierre-Yves Oudeyer (joining Inria) and David Filliat (Ensta ParisTech), with a research program aiming at scaling up these theories along two main dimensions: 1) showing how the LPH could account for key properties of sensorimotor development in human infants; 2) showing how it was possible to develop curiosity-driven autotelic learning algorithms that would enable high-dimensional real world robots to acquire complex sensorimotor skills in a human-like way. Several major results were achieved in the 2011-2016 period along these lines.
In the 2017-25 period, we made a strategic scientific and applicative pivot: while keeping curiosity-driven autotelic learning in humans and machines as our core research activity, we 1) started projects testing our theoretical predictions in human psychology experiments, and articulated links between curiosity and metacognition; 2) integrated modern Deep RL techniques with autotelic algorithms, and shifted from the developmental robotics community to the machine learning community as the target of our contributions to the design of more open, flexible and robust learning machines; 3) shifted from sensorimotor autotelic learning to language-based abstract yet grounded autotelic learning, and built synergetic bridges with recent advances in generative AI; 4) scaled up our research in educational technology by taking a translational approach and developing industrial collaborations, with actions to support public policies; 5) started the new application domain of automated scientific discovery. These constitute the pillars of our current research program, structured as follows:
3.2 Understanding Autotelic Learning in Humans
3.2.1 Curiosity, meta-cognition and agency across the lifespan.
The Learning Progress hypothesis, as well as other theories of curiosity-driven learning, all assume meta-cognitive competencies (e.g. ability to evaluate one’s own uncertainty, knowledge gaps or learning progress) as well as forms of agency. However, experimental studies of human curiosity have so far mostly overlooked the influence of meta-cognition and agency, and have rarely even measured them together with various dimensions of curiosity 132. Another major limit of current models and experimental studies of human curiosity is that they have not studied how curiosity develops across the lifespan. In fact, the scientific community knows very little about how various forms of curiosity change across childhood, adolescence, and up to ageing populations.
We will aim to address some of these limitations by collaborating with various international groups, including M. Gruber (Univ. Cardiff) and Y. Fandakova (Max Planck Institute for Human Development) with whom we just submitted a major ANR/DFG/ESRC project on this topic. In particular, we propose an interdisciplinary approach to make new breakthroughs in understanding how metacognition contributes to the development of curiosity-based learning, and set the stage for educational interventions that could help children develop their curiosity. Given the links between curiosity and metacognition, and the fact that metacognition continues to improve across childhood and adolescence, we formulate the hypothesis that the efficiency of curiosity-based learning, i.e. the ability to inquire about and prioritise learning of information associated with high curiosity, improves across child and adolescent development.
3.2.2 Experimental paradigms for studying autotelic learning in humans.
Another limit of existing experimental studies of human curiosity, including the ones mentioned above, is that most of them have so far focused on how humans choose to explore one of several pre-existing stimuli or learning activities 163. However, as shown in our theoretical and AI work described above, and as argued in complementary arguments from Laura Schulz and Junyi Chu 84, exploration of self-generated goals, including arbitrary goals or games, may be key in accounting for human development, and further in accounting for human innovation and cultural evolution 85. Only very few exploratory experimental protocols have started to be investigated in the literature 93: we aim to further develop this form of experimental protocol, informed by predictions made by our theoretical models, in collaboration with researchers such as G. Molinaro and A. Collins (Univ. Berkeley, both on a 6-month research visit at Inria Flowers and Mnemosyne in 2024), J. Chu (Harvard Univ, US), L. Rat-Fischer (Univ. Nanterre) and A. Ruggeri (TU Munich).
3.2.3 Links between curiosity and creativity for autotelic learning in children:
The ability to imagine abstract and new goals is essential for creative discovery and open-ended learning throughout life. Children achieve this by using the compositionality of language as a tool to imagine situations they have never experienced before, targeting them as goals during play 147, 168. Echoing the IMAGINE architecture 88, an intrinsically motivated deep reinforcement learning architecture modelling compositional imagination (the creation of new linguistic associations for new goals), we aim to investigate the links between curiosity and creativity in humans, focusing on the metacognitive role of language in guiding autonomous learning behaviours. Although the nature of the links between curiosity and creativity is currently not well defined, a recent meta-analysis shows that higher levels of curiosity are significantly associated with higher levels of creativity 156. Divergent thinking mechanisms are said to be the cognitive resource common to both skills 104, 117, and some authors even identify curiosity as a facilitator, a trigger for creativity 106: high curiosity states induce better ideation and greater idea associations conducive to problem solving. Also, both creativity and curiosity are governed by metacognitive processes of self-regulation of learning 151 enabling the identification of information gaps, problem situations or uncertainties, the generation of ideas and paths to resolution, and the monitoring and evaluation of the value of ideas as creative output or as increasing knowledge 109, 112. We aim to investigate developmental differences in curiosity-based learning and problem-solving tasks while studying their relationships and their dependence on intrapersonal factors (especially metacognitive skills and personality dimensions such as epistemic curiosity, creativity or intellectual humility traits) in late childhood (from 6 to 11 years old).
To run the experiments needed to address these topics, we will leverage an educational Léa-Ifé collaboration network established with 10 primary schools around Bordeaux. As a whole, in this part of the project, we aim to demonstrate that curiosity as a process to seek knowledge in the face of self-generated goals or knowledge gaps, or as a metacognitive feeling 103, leads to better initiation of the creative process.
3.2.4 Curiosity to learn about others and social interaction:
Social curiosity is defined as the desire to acquire knowledge about others in society, encompassing an interest in their emotions, thoughts, and behaviours. This type of curiosity can be divided into two forms 146: 1) empathetic curiosity (the desire to acquire knowledge about others), and 2) relational curiosity (the desire to interact with others). Like other types of curiosity, social curiosity motivates people to engage in exploratory behaviours directed toward the social world, seeking novel information about how people think, behave, and feel. Reference 110 proposes three functions of social curiosity: 1) acquiring information useful for learning and development, 2) establishing interpersonal relationships and increasing a sense of social belonging, and 3) controlling the social world by making it more predictable and manageable. Thus, social curiosity enhances social functioning and has been linked to improved social behaviour adaptation, the ability to establish and maintain social relationships, and better social judgement abilities 110. Recently, another distinction has been proposed in social curiosity 114: 1) overt social curiosity, an explicit interest in understanding other people, which motivates direct communication with others; 2) covert social curiosity, a “hidden” interest that motivates more indirect and furtive behaviours to understand others, such as discreetly observing people, listening to others’ conversations, and reading tabloids and human-interest stories. Covert curiosity is often associated with negative outcomes like gossiping or spying 114, but it can also drive the understanding of the social world through observation and ultimately motivate interactions with others 105. On the other hand, overt social curiosity has been linked with open-mindedness, extraversion, and sociability 114, and has been associated with better job performance 105.
3.2.5 Autotelic game invention and cultural transmission.
Leveraging the theoretical ideas on the interaction between autotelic learning and cultural evolution as described in the previous section, we also aim to study these interactions experimentally in chains of humans incentivized to transmit to each other games or artefacts of their own intrinsically motivated invention (either physical or video games, e.g. using experimental setups like 93). We aim to design new experimental protocols and run them both across various age ranges in European populations and in populations from non-Western cultures, leveraging associated collaborations with Maxime Derex at IAST, Toulouse, Sheina Lew-Levy at Durham University, and Sarah Pope-Caldwell at Georgia State University.
3.3 Building Curiosity-Driven Autotelic and Aligned AI
3.3.1 Language-Augmented Autotelic Agents with Foundational Models
We will develop architectures where LLMs function as cognitive tools for autotelic RL agents across five dimensions:
- (1) LLM-based agents with environmental alignment: extending our work on grounding LLMs through online RL 79 where LLMs generate goals, evaluate achievement, relabel experiences, and provide natural language interfaces. We will extend goal generation to creative, time-extended, and learning-oriented goals including self-generated causal questions and hypotheses 101, while correcting hallucinations through incremental LLM updates via environment interaction.
- (2) Multimodal grounding and social environments: extending our SocialAI School framework 118 to incorporate theory of mind, joint intentionality, and social norms, investigating whether social curiosity can drive efficient acquisition of complex social skills.
- (3) Real-time human-in-the-loop learning: enabling agents to interpret instructions, respond to feedback, explain exploration processes, and adapt to user preferences for education and discovery applications.
- (4) Learning to use cognitive tools: agents will learn when to invoke APIs, generate/execute code, query knowledge bases, or request human assistance, including chain-of-thought and self-reflection mechanisms.
- (5) Metacognitive curriculum learning with coordinated interestingness measures: leveraging our MAGELLAN architecture 47 which enables LLM agents to learn metacognitive predictions of their own competence and learning progress across large language-defined goal spaces. By capturing semantic relationships between goals, MAGELLAN enables sample-efficient progress estimation and dynamic adaptation to evolving goal spaces. We will extend this to develop meta-diversity search algorithms 97 leveraging LLMs to generate novel conceptual dimensions, enabling exploration across objective (learning progress, novelty) and subjective, culturally-contextualized criteria 158.
Long-term objectives will include studying how agents pursue goals across extended timescales using cultural artifacts for long-term planning.
3.3.2 Program Synthesis for Abstract and Verifiable Intelligence
We will explore autotelic learning in formal language spaces where goals and policies are represented as programs. Our ACES architecture demonstrates autotelic LLMs self-improving coding skills by iteratively generating diverse problems and solutions using code interpreters 150. This addresses three limitations: environments are not truly open-ended (code is), LLMs lack grounding (interpreters provide it), and code LLMs struggle beyond training distributions (autotelic learning enables self-improvement). We will train small models with advanced coding capabilities and extend to mathematical problem invention and theorem proving. Within Inria LLM4Code, we will develop autotelic LLMs interacting with proof assistants like Coq and Lean 157. This approach provides compact interpretable representations, formal verification, and compositional generalization. Progress on benchmarks like ARC 83 will validate these methods.
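The loop of generating problems and solutions and checking them with a code interpreter can be sketched as follows. This is a hedged toy illustration, not the ACES implementation: the LLM generator is replaced by a hard-coded list `CANDIDATES`, and the puzzle format (`f(x)` returns a boolean, `g()` returns a candidate answer), the keyword-based `descriptor`, and the one-entry-per-cell archive are assumptions introduced for this example.

```python
def verify(puzzle_src, solution_src):
    """Run both snippets in a fresh namespace and check that f(g()) is True."""
    namespace = {}
    try:
        exec(puzzle_src, namespace)    # defines f
        exec(solution_src, namespace)  # defines g
        return namespace["f"](namespace["g"]()) is True
    except Exception:
        return False


def descriptor(puzzle_src):
    """Hypothetical skill descriptor: which of a few keywords the puzzle uses."""
    skills = ("len", "sum", "sorted", "%")
    return tuple(s in puzzle_src for s in skills)


def build_archive(candidates):
    """Keep the first verified (puzzle, solution) pair found for each descriptor cell."""
    archive = {}
    for puzzle_src, solution_src in candidates:
        if not verify(puzzle_src, solution_src):
            continue  # the interpreter rejected this pair
        archive.setdefault(descriptor(puzzle_src), (puzzle_src, solution_src))
    return archive


# Stand-in for model-generated candidates (an assumption of this sketch).
CANDIDATES = [
    ("def f(x): return len(x) == 3 and sum(x) == 6", "def g(): return [1, 2, 3]"),
    ("def f(x): return x % 7 == 0 and x > 10", "def g(): return 14"),
    ("def f(x): return x % 7 == 0 and x > 10", "def g(): return 5"),  # wrong answer
    ("def f(x): return sorted(x) == x and len(x) > 1", "def g(): return [1, 2]"),
    ("def f(x): return len(x) == 2 and sum(x) == 0", "def g(): return [-1, 1]"),  # cell already filled
]

archive = build_archive(CANDIDATES)
```

Here the interpreter provides the grounding signal: only pairs that actually execute and satisfy the puzzle enter the archive, and the descriptor keeps the archive diverse. Note that `exec` on generated code is unsafe outside a sandbox, so a real system would need isolated execution.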
3.3.3 Curiosity in Cultural Evolution, Collective Intelligence, and AI Science Teams
Understanding how groups coordinate curiosity-driven exploration and self-organize collective intelligence represents both a fundamental scientific question and a path toward transformative applications. This research direction addresses several interconnected challenges. First, how individual curiosity combines with social transmission and collective innovation 111, 95 is still poorly understood and modeled. Second, as generative AI increasingly participates in human cultural production, understanding cultural evolution in hybrid human-AI groups becomes essential for anticipating societal impacts. Third, many scientific and creative challenges require coordinated teams leveraging complementary expertise and perspectives—motivating our vision of autotelic AI science teams collaborating with human researchers.
Near-term work will investigate coordination when self-generated goals conflict in shared environments. Some goals require collaboration with agents possessing complementary skills—agents must negotiate joint goals serving individual curiosity. We will study how network topology influences innovation dynamics 137, how agents develop communication protocols 75, and whether groups display curiosity at collective levels—pursuing structured exploration maximizing diversity of learned goals through simple individual-level mechanisms. The increasing role of generative AI in cultural production necessitates understanding these dynamics in hybrid settings—the emerging field of "Machine Culture" 77. We will systematically investigate how interaction protocols, social structures, and model capabilities shape cultural dynamics in LLM populations and mixed human-AI groups (collaboration with Derex, IAST).
3.4 Applications in Education and Scientific Discovery
3.4.1 Training Curiosity and Metacognition Across the Lifespan
Addressing 21st century educational challenges—inclusive education, cross-disciplinary skills (attention, curiosity, learning to learn), and digital transformation—requires technologies fostering curiosity and metacognition. Our ZPDES algorithm personalizes curricula by maximizing learning progress 87. RCTs with >1,000 children demonstrated enhanced learning efficiency and motivation versus expert-designed curricula. We will extend ZPDES to new domains (attention training, language learning) and populations: aging adults, neurodiverse learners, professional contexts (sports, gaming). To address ZPDES's requirement for expert-formatted content, we will leverage generative AI to automate exercise generation from textbooks, training smaller LLMs for lightweight systems avoiding foreign-hosted dependencies. We will scale up metacognitive skills and curious question training 69, studying transfer to creativity and including pioneering work using GPT-3 conversational agents in real classrooms 70.
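As a rough illustration of curriculum personalization driven by learning progress, the following Python sketch implements a small bandit over exercise levels. It is loosely inspired by the description above and is not the ZPDES algorithm itself: the class name `ZpdBandit`, the window size, the weight update, the exploration rate `gamma` and the unlocking threshold are all assumptions of this example.

```python
import random


class ZpdBandit:
    """Simplified bandit over exercise levels ordered from easiest to hardest."""

    def __init__(self, n_levels, window=6, threshold=0.8, gamma=0.2, seed=0):
        self.n_levels = n_levels
        self.window = window        # number of recent answers used per level
        self.threshold = threshold  # success rate that unlocks the next level
        self.gamma = gamma          # uniform exploration rate
        self.rng = random.Random(seed)
        self.answers = [[] for _ in range(n_levels)]
        self.weights = [1.0] * n_levels
        self.active = [0]           # levels currently proposed to the learner

    def propose(self):
        """Sample an active level, mostly in proportion to its weight."""
        w = [self.weights[level] for level in self.active]
        total = sum(w)
        probs = [(1 - self.gamma) * x / total + self.gamma / len(w) for x in w]
        return self.rng.choices(self.active, weights=probs, k=1)[0]

    def record(self, level, correct):
        """Store an answer, refresh the level's weight, and maybe unlock a level."""
        self.answers[level].append(1.0 if correct else 0.0)
        recent = self.answers[level][-self.window:]
        if len(recent) < self.window:
            return
        half = len(recent) // 2
        older, newer = recent[:half], recent[half:]
        # Learning progress proxy: change in success rate within the window.
        progress = abs(sum(newer) / len(newer) - sum(older) / len(older))
        self.weights[level] = max(0.05, 0.5 * self.weights[level] + progress)
        top = self.active[-1]
        success_rate = sum(recent) / len(recent)
        if level == top and success_rate >= self.threshold and top + 1 < self.n_levels:
            self.active.append(top + 1)  # widen the set of proposed levels


tutor = ZpdBandit(n_levels=4, seed=1)
for _ in range(300):
    level = tutor.propose()
    tutor.record(level, correct=True)  # a learner who always succeeds
```

With a learner who always answers correctly, mastered levels lose weight because they no longer show progress, and harder levels are unlocked one by one. A learner who always fails stays on the first level. A fuller version would also retire mastered levels and use real learner responses.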
A distinctive objective involves metacognitive empowerment: developing interventions helping children understand their own learning progress to self-generate curricula independently—addressing limited technology contexts while fostering autonomy. Long-term work will include teacher training programs embedding curiosity-fostering practices (Peterson, 2020) and unplugged activity versions. Partnerships with educational institutions (Académie de Bordeaux), industry (EvidenceB, Ubisoft), and NGOs (France IOI) will enable deployment and policy influence.
3.4.2 Assisted Scientific Discovery with Autotelic Exploration
Scientists studying complex systems struggle to map behavioral spaces when they lack models and representations and when experimental resources are scarce. Autotelic algorithms offer sample-efficient discovery of diverse behaviors while remaining steerable through natural language. Proof-of-concept work efficiently maps spaces in cellular automata 98 and gene regulatory networks (99, with Levin, Harvard), and was independently adopted by physics researchers (U. Washington). We will maximize diversity through meta-diversity search and leverage language/multimodal models for abstract guidance.
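One common way to frame autotelic exploration is as a goal exploration loop: sample a goal in behavior space, reuse the past experiment whose outcome was closest to that goal, perturb its parameters, and record the new outcome. The sketch below is a deliberately simplified illustration of that loop, not the team's algorithms or software. The function names (`toy_system`, `goal_exploration`, `coverage`), the two-parameter toy system standing in for a real experiment, the uniform goal sampling and the Gaussian mutation are all assumptions made for the example.

```python
import math
import random


def toy_system(params):
    """Hypothetical stand-in for a costly experiment: 2 parameters -> 2-D behavior."""
    x, y = params
    return (math.tanh(3 * x * y), math.sin(2 * x) * math.cos(y))


def goal_exploration(system, n_bootstrap=10, n_iterations=200, sigma=0.1, seed=0):
    """Simplified goal exploration loop over parameters in [-1, 1]^2."""
    rng = random.Random(seed)
    history = []  # list of (params, behavior) pairs

    def run(params):
        history.append((params, system(params)))

    # 1. Bootstrap with uniformly random parameters.
    for _ in range(n_bootstrap):
        run((rng.uniform(-1, 1), rng.uniform(-1, 1)))

    for _ in range(n_iterations):
        # 2. Self-generate a goal, here sampled uniformly in behavior space.
        goal = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        # 3. Retrieve the past experiment whose behavior is closest to the goal.
        params, _ = min(history, key=lambda item: math.dist(item[1], goal))
        # 4. Mutate its parameters (clipped to bounds) and run the experiment.
        child = tuple(max(-1.0, min(1.0, p + rng.gauss(0, sigma))) for p in params)
        run(child)
    return history


def coverage(history, bins=10):
    """Number of distinct grid cells of the behavior space that were reached."""
    cells = set()
    for _, (bx, by) in history:
        i = min(bins - 1, int((bx + 1) / 2 * bins))
        j = min(bins - 1, int((by + 1) / 2 * bins))
        cells.add((i, j))
    return len(cells)


history = goal_exploration(toy_system)
print(len(history), "experiments,", coverage(history), "behavior cells reached")
```

The `coverage` count is one simple proxy for the diversity of discovered behaviors; in a real setting the behavior descriptors and the goal sampling strategy would be chosen for the system under study.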
Near-term work will leverage artificial life as testbeds, studying origins of autopoietic systems and self-organizing evolutionary processes in cellular automata 34, 149 (2023 Best Paper ALife). Long-term objectives will transition to real-world systems through collaborations: Levin (synthetic biology), Murugan (soft condensed matter, U. Chicago), Aymonier (chemistry, ICMCB Bordeaux), with applications to power networks, neuromuscular models, and artistic domains. We will explore autotelic algorithms for mathematical problem/proof exploration within LLM4Code. Success will establish autotelic exploration as methodology for materials science, systems biology, and mathematical discovery, with industrial translation potential (e.g. Solvay/Syensqo).
4 Application domains
Neuroscience, Developmental Psychology and Cognitive Sciences. As experts in curiosity and its links with open-ended learning, our aim has been to build and grow internationally an integrated science of curiosity. By leveraging and integrating concepts and techniques used in AI, psychology and education, we aim to reinforce our existing contributions in this direction, ranging from building theories and experiments that add to the corpus of scientific knowledge on curiosity, to leading the organisation of international events dedicated to this integrated science. For example, we co-lead the organisation of a Gordon Research Conference series entitled “The New Science of Curiosity”. Complementarily, we take part in a European ORA project on the cognitive science study of curiosity and metacognition (with M. Gruber and Y. Fandakova). Another example is the study of the role of intrinsic motivation in the adoption of technologies fostering autonomy in ageing populations, with a view to assessing its value as a protective ingredient against cognitive aging. This includes the CuriousTECH associate team with M. Fernandes from the Cognitive Neuroscience Lab of the University of Waterloo, the InnovCare project (with S. Lechevalier) within the PPR Autonomie-France 2030 (and with Fondation France-Japan of EHESS), and the project VBHI - France 2030 (IHU, S. Debette), with F. Lotte and F. Wagner from Inria.
Development and open-endedness in generative AI. There have been revolutionary advances in AI in the last few years, especially around generative systems such as multi-modal foundation models. However, as described above, these systems are still strongly limited in several key dimensions: they are not pro-active agents interacting with external environments, and they lack grounding, meta-cognition and curiosity. One of our goals is to make fundamental scientific and technological contributions to adapt and extend current generative AI systems by integrating forms of curiosity, meta-cognition and grounding, for which we recently produced proofs of concept. Conversely, we aim to take advantage of the powerful capabilities of foundation models to build new kinds of curiosity-driven learning systems capable of creative and abstract exploration, learning and discovery.
Machine culture. Beyond technological advances, generative AI is also starting to have a major influence on human cultural evolution. Generative AI systems are now massively used as intermediation platforms between individuals and existing corpora of knowledge and culture, conveying multiple forms of biased cultural perspectives that they can amplify. This phenomenon has recently become massive as social networks are pervaded by bots powered by generative AI systems, playing the roles of humans with particular opinions or backgrounds, and increasingly interacting directly with each other, beyond interaction with humans. While generative AI offers unique potential in enabling humans to make discoveries and to know and understand each other's cultures, these properties have also been leveraged by diverse organisations to influence, in unfair and dangerous ways, what populations think and do. Even though this poses major societal issues, this evolution has been so rapid that basic scientific understanding of cultural evolution in hybrid human-machine groups is strongly lacking. Thus, we believe that the parts of our project which aim at modelling cultural evolution in groups of generative AI agents, or hybrid groups, as well as its links with properties of curiosity-driven learning at the level of individuals, have the potential to make very useful contributions to these high-stakes issues.
Translational educational technologies that foster curiosity and critical thinking. We live in a world that is evolving fast: global factors such as climate change and geopolitical processes weaken the context children live in. New technologies, such as generative AI, are profoundly impacting economic dynamics, democracy and cultural evolution. Yet, in most educational contexts, including in Europe, what is taught in classrooms is very similar to what was taught 50 years ago. Even for so-called “fundamental knowledge”, studies such as PISA show a worrying decrease of skills and motivation in children. As mentioned in a recent report from OECD 166, we believe it is essential to train children to become autonomous lifelong learners, by fostering and training their curiosity and their critical minds, their ability to search for new information by themselves, and to question the validity of the information they collect, as well as their own knowledge and opinions. Thus, our research program, which aims to train curiosity and the associated metacognitive skills that underlie the critical mindset, has the potential to contribute in this perspective. We aim to leverage our fundamental research in translational projects where we will work directly with major educational stakeholders from the start (e.g. students, teachers, parents, educational institutions like individual schools, Académie de Bordeaux, edTech companies like EvidenceB, government and in particular the ministry of education) to build educational interventions that will be efficient, adapted to the needs and constraints of real-world educational contexts, and aimed at large-scale adoption and use (a first step in this direction is the AdaptivMaths and MIA Seconde educational software now deployed in all French primary schools and supported by the French ministry of education).
Generative AI and education: scientific understanding of stakes and opportunities in support of public policies. One particular topic we focus on is the study of the opportunities and challenges of generative AI in education. While very recent (ChatGPT was introduced only 1.5 years ago), generative AI has already strongly impacted the educational world in the last few months. More than 50% of children in the 12-18 age range have already used generative AI systems for their homework, and this share is rising quickly, including in Europe. Associated challenges include uses of generative AI by students that may harm their abilities to learn, understand, and be motivated to put in effort and be actively engaged in these processes. It also profoundly impacts the way teachers design homework - for which students are already massively using these tools. On the other hand, generative AI opens unique opportunities for rich personalised tutoring, ranging from tailored explanations and feedback to the opportunity to discuss and train in foreign languages. Such opportunities may be particularly magnified for countries where the educational system is underdeveloped 135. Key aspects of our research program are geared towards studying these opportunities and challenges, for example running field studies in middle and high schools to understand how students currently (mis)understand and (mis)use generative AI tools. In complement, we continue working on outreach, especially developing educational tools that improve generative AI literacy among students, teachers and parents, for example by further developing and disseminating our pedagogical video series “ChatGPT explained in 5mn”, which has been integrated in various tools from DNE and in the European MOOC made for introducing teachers to AI (the AI4Teacher MOOC).
Participating in popular science events, visiting middle and high schools and welcoming students in the lab, writing popular science books (such as “C'est pas moi, c'est l'IA”, published by Nathan), and participating in discussions on these topics in wide audience media constitute another application axis. Given the high societal challenges associated with this line of work, we also aim to strongly develop our activities in informing and supporting public policies: a key vector for such public support is actively participating in interactions and discussions with public bodies that analyze current stakes and propose new actions and laws. In this vein, we recently supported Inria in writing notes on generative AI and its societal dimensions for the cabinet of E. Macron, and we participated in interviews with senators preparing a report on AI and education. We presented the stakes associated with training curiosity and metacognition using AI technologies at Conseil Scientifique de l'Education Nationale and at an annual scientific event organized by DNE (Direction du Numérique Educatif), and were invited by BPI to participate in the evaluation and monitoring of projects related to education/edTech funded by this institution. We are also working to develop collaborations between Inria and the UK AI safety institute, towards building a French institution similar to the UK one. This includes developing a collaboration with Chris Summerfield on field studies assessing the current state of use of generative AI in middle and high schools, to inform public policies on this topic. Lastly, our activities aim at sharing AI models and data that are fully open-source (open weights and open data) and trained on data associated with appropriate rights (here we are also collaborating with the Hugging Face company to distribute these open models and data on their platform).
For example, we recently built a project with the EvidenceB company, in collaboration with Région Ile-de-France, to build an open model trained on data from free manuals, whose authors will be remunerated in an appropriate manner: this kind of model will enable wider and legally compliant access to AI models by the edTech ecosystem in France.
Automated discovery in science. Machine learning algorithms integrating intrinsically-motivated goal exploration processes (IMGEPs) with flexible modular representation learning are very promising directions to help human scientists discover novel structures in complex dynamical systems, in fields ranging from biology to physics. The automated discovery project aims to boost the efficiency of these algorithms to empower discovery in science and engineering. This entails real-world applications with high societal stakes, such as helping scientists make new discoveries that may, for example, help build more sustainable materials, generate cleaner energy or save energy, find molecules with medical applications, design accessible and efficient educational tools, or help design more sustainable forms of plant growing in agriculture. In many cases, the complexity of self-organising materials or biological systems poses significant scientific and engineering challenges for understanding, controlling and inventing. Following several of our recent proof-of-concept projects 96, 99, we aim to do translational research in this domain as well, enabling chemists, physicists and biologists, in both academia and industry, to efficiently use our tools for curiosity-driven exploration to help them make new discoveries. In particular, we are now starting to explore several new collaborations in these fields: with Solvay/Syensqo, we have started several discussions to develop collaborations on using autotelic exploration algorithms to efficiently explore and map the space of material design and properties, with the aim of helping scientists at Syensqo discover new materials with strong environmental and functional properties; with IRT Saint Exupery, we have an ongoing consortium collaboration around the project AIxIA, where we study the use of autotelic exploration algorithms to map the space of interference behaviours on embedded software and hardware.
Building self-organising AI from the ground up. When using continuous cellular automata as a playground for designing and evaluating our algorithms for curiosity-driven automated discovery for the sciences, we are also making direct contributions to the domain of Artificial Life. We believe the tools and approach we are taking, in particular exploring the self-organisation of sensorimotor agency and open-ended evolutionary processes, have the potential to make a significant impact in this domain. This has been attested recently by our Best Paper award at the ALife 2023 conference (and also by wider impact, e.g. through more than 2 million views of the popular science videos of Sciences Etonnantes and EGO presenting - in part - our work on this topic). As we are aiming to study the self-organisation of basic forms of memory, learning, and even autotelic learning in such environments, this may also constitute a foundational approach to build AI systems from the ground up, possibly opening new possibilities in terms of robustness, adaptivity and generalisation.
5 Social and environmental responsibility
5.1 Footprint of research activities
AI is a field of research that currently requires a lot of computational resources, which is a challenge as these resources have an environmental cost. In the team we try to address this challenge in two ways:
- by working on developmental machine learning approaches that model how humans manage to learn open-ended and diverse repertoires of skills under severe limits of time, energy and compute: for example, curiosity-driven learning algorithms can be used to guide agents' exploration of their environment so that they learn a world model in a sample-efficient manner, i.e. by minimizing the number of runs and computations they need to perform in the environment;
- by monitoring the number of CPU and GPU hours required to carry out our experiments. For instance, our work 9 used a total of 2.5 CPU years. More globally, our work uses large-scale computational resources, such as the Jean Zay supercomputer platform, on which we use several hundred thousand hours of GPU and CPU each year.
5.2 Impact of research results
Our research activities are organized along two fundamental research axes (models of human learning and algorithms for developmental machine learning) and one application research axis (involving multiple domains of application, see the Application Domains section). This entails different dimensions of potential societal impact:
- Towards autonomous agents that can be shaped to human preferences and be explainable. We work on reinforcement learning architectures where autonomous agents interact with a social partner to explore a large set of possible interactions and learn to master them, using language as a key communication medium. As a result, our work contributes to facilitating human intervention in the learning process of agents (e.g. digital assistants, video game characters, robots), which we believe is a key step towards more explainable and safer autonomous agents.
- Reproducibility of research: By releasing the code of our research papers, we believe that we help efforts in reproducible science and allow the wider community to build upon and extend our work in the future. In that spirit, we also provide clear explanations of the statistical testing methods when reporting the results.
- Digital transformation and competence challenges facing schools in the 21st century. We expect our findings to inform the broader societal challenges inherent to the School of the 21st Century: helping children (and their teachers) develop cross-domain learning skills such as curiosity and meta-cognition, improving inclusivity in schools (learners with disabilities, especially cognitive disabilities), and promoting lifelong learning in older adults (successful aging), using cognitive-based research findings.
- AI and personalized educational technologies to reduce inequalities due to neurodiversity. The Flowers team develops AI technologies aiming to personalize sequences of educational activities in digital educational apps: this entails the central challenge of designing systems which can have equitable impact over a diversity of students and reduce inequalities in academic achievement. Using models of curiosity-driven learning to design AI algorithms for such personalization, we have been working to enable them to be positively and equitably impactful across several dimensions of diversity: for young learners or for aging populations; for learners with low initial levels as well as for learners with high initial levels; for "normally" developing children and for children with developmental disorders; and for learners of different socio-cultural backgrounds (e.g. we could show in the KidLearn project that the system is equally impactful along these various kinds of diversities).
- Health: Bio-printing. The Flowers team is studying the use of curiosity-driven exploration algorithms in the domain of automated discovery, enabling scientists in physics/chemistry/biology to efficiently explore and build maps of the possible structures of various complex systems. One particular domain of application we are studying is bio-printing, where a challenge consists in exploring and understanding the space of morphogenetic structures self-organized by bio-printed cell populations. This could facilitate the design and bio-printing of personalized skins or organoids, and thus could have a major impact on the health of people needing transplants.
- Tools for human creativity and the arts. Curiosity-driven exploration algorithms could also in principle be used as tools to help human users in creative activities ranging from writing stories to painting or musical creation. These are domains we aim to consider in the future, and they constitute another societal and cultural domain where our research could have impact.
- Education to AI. As artificial intelligence takes a greater role in human society, it is of foremost importance to empower individuals with an understanding of these technologies. For this purpose, the Flowers lab has been actively involved in educational and popularization activities, in particular by designing educational robotics kits that form a motivating and tangible context to understand basic concepts in AI: these include the Inirobot kit (used by >30k primary school students in France) and the Poppy Education kit, now supported by the Poppy Station educational consortium.
- Health: optimization of intervention strategies during pandemic events. Modelling the dynamics of epidemics helps in proposing control strategies based on pharmaceutical and non-pharmaceutical interventions (contact limitation, lockdown, vaccination, etc). Hand-designing such strategies is not trivial because of the number of possible interventions and the difficulty of predicting long-term effects. This task can be cast as an optimization problem where state-of-the-art machine learning algorithms, such as deep reinforcement learning, might bring significant value. However, the specificity of each domain (epidemic modelling on one side, optimization on the other) requires strong collaborations between researchers from different fields of expertise. Due to its fundamentally multi-objective nature, the problem of optimizing intervention strategies can benefit from the goal-conditioned reinforcement learning algorithms we develop at Flowers. In this context, we have developed EpidemiOptim, a Python toolbox that facilitates collaborations between researchers in epidemiology and optimization.
6 Highlights of the year
- Renewal of the team: After a decade of research and applications, the team was renewed and is now named the Flowers AI & CogSci Lab. This new name highlights our activities at the crossroads of AI and cognitive sciences, studying curiosity and its roles in open-ended learning in humans and machines, from individuals to collectives. Our new detailed research program is available online. The team is associated with both Inria and the University of Bordeaux, France.
- Understanding human curiosity and metacognition. We started a new European project in collaboration with the cognitive neuroscience labs of M. Gruber at Univ. Cardiff and Y. Fandakova at the University of Trier, aiming to study the joint development of curiosity and metacognition in adolescents through a set of behavioural and neuro-imaging studies. This project also aims to apply the resulting insights in educational technologies. We collaborated with Alexandr Ten, Michiko Sasaki and Kou Murayama (Univ. Tuebingen) in producing a theoretical framework that integrates multiple theories of curiosity developed across the literature and relates them to the well-known "Curious U" effect: this framework was published in the Open Mind journal 42. Collaborating with A. Tricot (Univ. Montpellier), we developed a theoretical perspective to study the links between intrinsic motivation and cognitive load in the context of extended-reality educational interventions 39, and published an associated study about the use of virtual reality for optimizing cognitive load and intrinsic motivation in educational technologies 38. Finally, we collaborated with M. Derex (IAST, Toulouse), as well as with Sheina Lew-Levy (Durham University) and Sarah Pope-Caldwell (Georgia State University), in the design and implementation of a study of cross-cultural similarities and differences in curiosity-driven exploration, conducted in Congo with Bayaka and Bandongo populations.
- Autotelic curiosity and open-ended learning in agentic generative AI. We continued building the foundations of a new generation of genAI systems that are open-ended, curious, autotelic, grounded and continuously self-improving. To do so, we leveraged GLAM, an approach we designed 3 years ago to turn LLMs into agents that learn to solve goals in interactive environments through online RL (not to produce texts that humans like, but to achieve practical goals!), as the basis for building curious agents that sample their own goals. We designed MAGELLAN (published at ICML 2025), a method enabling genAI agents to navigate very large spaces of goals, where millions of them may be either too easy or too difficult 47. MAGELLAN makes this possible by leveraging the learning progress hypothesis, which we developed to account for human curiosity-driven learning: goals with high expected learning progress are sampled in priority. Achieving this requires advanced metacognitive skills, which LLMs have lacked so far: MAGELLAN learns these metacognitive skills, making it possible to predict learning progress on goals that were never sampled, using semantic information in embedding spaces. Curious LLM agents can also enact artificial scientists that explore the environment to hypothesize, experiment, test, confirm or revise abstract rules to build human-readable world models. First steps in this direction were made in WorldLLM 59. Imagining new abstract goals that maximize learning progress is another challenge for autotelic genAI systems. One approach is to formulate them directly as code, as in the AutotelicLLM agent (40), enabling open-ended exploration in Crafter (a 2D Minecraft). These projects involved collaborations with S. Aissi, O. Sigaud, L. Soulier, N. Tome (Sorbonne Université), S. Lamprier (Univ. Angers), T. Wolf (Hugging Face) and G. Pourcel (Univ. Amsterdam).
- Autotelic Generative AI for Self-Improving Program Synthesis and ARC-Prize. In the context of the LLM4Code Inria challenge, the team started collaborating on projects at the intersection of generative AI, program synthesis and AI-assisted discovery in mathematics. In particular, we developed collaborations with N. Fijalkow (Labri, CNRS), X. Hinault (Mnemosyne), G. Baudart (PiCube). This year we continued leveraging the ACES method, which enables large language models to self-generate diverse and challenging programming puzzles (using autotelic exploration), to transpose and adapt it to the domain of mathematics. In the domain of program synthesis, we developed a new approach, called SOAR, enabling continuous self-improvement of LLMs as operators of evolutionary algorithms. This new method, published at ICML 2025, advanced the state of the art on the ARC-AGI 1 benchmark (category of approaches based on open-source models and program synthesis), and was awarded 2nd place at the ARC-Prize (paper category). We also started to explore how full-fledged generative AI agents can be used to search and optimize program controllers for simulated agents 44.
- Collective intelligence and social learning in AI systems. We continued exploring key questions at the crossroads of AI and society, studying how methods from human sciences may (or may not) be used to understand socio-cultural properties of genAI. In particular, this year we focused on studying fundamental properties of GenAI systems as cultural transmission technologies: they massively (re)produce cultural artifacts (e.g. texts) which are in turn viewed by, and influence, both humans and other GenAI systems. It is important to understand the dynamics of the evolution of cultural artefacts when GenAI systems are part of the transmission chains. As first steps in this direction, we adapted the so-called "iterative chain design" from the cultural evolution community, where LLMs basically play a version of the telephone game. This allowed us to identify dynamical properties like collapse or attractors that depend on various properties of data. This work resulted in one paper published at ICLR 20260, and another at EMNLP 2025 49. This involved collaborations with Maxime Derex (IAST Toulouse), Cédric Colas (MIT), Gaia Molinaro (UC Berkeley), Eleni Nisioti and Sebastian Risi (ITU Copenhagen), Peter-Ford Dominey (INSERM), Ida Momennejad (Microsoft Research), Remy Portelas (Ubisoft).
- Education, generative AI and cognitive training. We started a large-scale collaborative project, called GAIMHE, to study the design of educational technologies that combine the power of pedagogically grounded ITS for cross-exercise personalization with the flexibility of generative AI for pre-generation of exercises and within-exercise personalization. This project, funded by BPI, involves collaboration with the EvidenceB company, as well as the ClassCode and Café Pédagogique educational NGOs.
We also continued working on developing and evaluating in classrooms various pedagogical interventions training curiosity and metacognition (both conceptually and procedurally), and focused on studying the comparative impact of interventions when delivered by teachers themselves as opposed to researchers. Furthermore, we continued conducting our series of experiments in middle schools to study whether schoolchildren understand and know how to use generative AI tools in the context of educational exercises, showing strong limits and pointing to two needs: training their metacognition and their AI literacy (a first series of results is available online). We also developed a software library (LLM4Humanities) that uses LLMs to partially automate qualitative analysis methods in social sciences, leveraging our prior work in this direction 170 and opening new perspectives for the qualitative study of large text corpora or verbal data from psychology or educational experiments. We also continued working on evaluating the use of adaptive personalization algorithms (in particular ZPDES, based on the learning progress theory) for cognitive training, and with diverse populations. This was associated with a review of AI-based approaches to cognitive training, published in PLOS One. This involved collaborations with R. Abdelghani and Kou Murayama (University of Tuebingen), C. Kidd (Univ. Berkeley).
We also continued working on developing frameworks and tools to support social curiosity among the stakeholders working with children with ASD 41, 35, 58. Finally, we continued to develop and adapt AI-based personalization technologies for supporting learning and well-being in aging populations, for example through cognitive training 36 and monitoring of daily activities 46.
- Curiosity-driven AI for assisted scientific discovery: We continued studying how curiosity-driven AI algorithms can enable scientists (physicists, chemists, biologists, etc) to explore and map the space of self-organized behaviours in diverse complex systems 96. We published a milestone article (34) in Science Advances presenting the results of our multi-year projects using autotelic AI algorithms to investigate the possibilities for guided self-organization of robust sensorimotor agents from low-level interactions in continuous cellular automata. We also started to explore how generative AI semantic models can be used to drive open-ended exploration of self-organized patterns in cellular automata (ALife 2025 paper 48), how autotelic algorithms can explore full ecosystems (ALife 2025 paper 51), how autotelic reinforcement learning techniques can control and grow such self-organized patterns in an online manner (ALife 2025 paper 45), and how human guidance can be integrated in these loops 52. On this line of research, we continued developing collaborations with M. Levin at Tufts University. In particular, we studied how autotelic AI systems (IMGEP algorithms) can enable cost-effective discovery of diverse, sophisticated and robust behaviors in gene regulatory networks, resulting in a milestone paper published in eLife 99.
- Clément Moulin-Frier recently moved for personal reasons to Inria Lyon, but we are still strongly collaborating with him. He joined the BioTiC Inria team, which focuses on fundamental research in theoretical and computational biology, with a specific interest in modeling evolutionary processes. We are exploring two main research directions in collaboration with BioTiC: (1) large-scale eco-evolutionary simulations in cellular automata, which is an important topic in both teams (from a Computational Biology perspective in BioTiC 92 vs. an Artificial Life perspective in Flowers 55) and (2) studying the similarities and differences between biological and cultural evolution, both in the natural world and in computer simulation (see Section 8.2.6).
- Reverse engineering large language models and cheaply predicting their performances in benchmarks. In collaboration with N. Yax and Stefano Palminteri (ENS), we developed a new algorithmic method, called PhyloLM and published at ICLR 2025 56, that aims at reverse-engineering generative AI model origins from only black-box access — which models derive from which (e.g. reusing data or algorithm or architecture or other features). It proved remarkably powerful at reconstructing model evolutionary trees and predicting benchmark performance cheaply. This approach opens new possibilities for safety applications as hundreds of new models appear daily.
- Software: The team continued to develop several key software libraries: Lamorel, enabling LLMs to be used as agents in interactive environments; ADTool, enabling easy use of autotelic exploration algorithms for automated discoveries in physics/chemistry/ALife; Vivarium, for building and running multi-agent simulations using JAX, with a focus on educational use; LLM4Humanities, to enable researchers in human sciences to leverage generative AI models for tasks like annotation or analysis of text corpora, using a solid methodological approach.
- Outreach: The team participated in multiple events such as the science festival at Cap Sciences (Bordeaux), AI days for teachers (Bordeaux), Main à la pâte foundation, Learning Show (Rennes), Science with and for society (Bordeaux), the Chiche program (Nouvelle Aquitaine), Academic days of Poitiers academy, or CogniForum, and welcomed several middle and high-school students for their internships. The team also continued to produce the pedagogical video series "ChatGPT explained in 5 mn", aimed at training generative AI literacy in a wide diversity of students (e.g. high school), available here. The videos are under a Creative Commons licence, CC-BY, enabling open and free reuse. They were already integrated in the MOOC AI4T (see here), as well as in an internal training platform of "Académie du Numérique du Ministère de la défense", in a mobile app made by Inria with educational materials related to AI (see here), and are being adapted and integrated in a training platform for the whole population of civil servants in France, coordinated by DINUM.
- Support to public policy: The team was involved in several major actions to support public policies on the topic of AI and education. Members of the team designed and conducted training sessions in different academies for supervisory staff and teachers, e.g. the ETAPP-IA day in Nouvelle-Aquitaine (January 2025); departmental training of CPE and documentary teachers of Nouvelle-Aquitaine during a day at the Lycée Les Iris in Lormont (May 2025); Academic Days of Innovation for teachers of Nouvelle-Aquitaine, Spring Days of Education Research at INSPEs (June 2025); the PhilosophIA Citizens' Convention (April 2025); the twin conference of Cnesco/Cardie Charente-Maritime (January 2025); and the working group Education and Cognitive Sciences of the academies of Créteil, Versailles and Paris, scheduled for March 2026. H. Sauzéon and PY. Oudeyer were interviewed and wrote reports to contribute to the report of the French Senate on AI and education. PY. Oudeyer was heard by the commission on cultural and educational affairs of the French parliament, to discuss the major challenges and opportunities of AI and education.
- PY. Oudeyer was selected by the French National Research Agency (ANR) as one of 20 researchers across all disciplines to highlight research projects funded in the last 20 years, on the occasion of the 20th anniversary of ANR. He was also invited to give a keynote talk on curiosity-driven learning in humans (learning progress, autotelic exploration and open-ended development) at the Budapest Conference on Cognitive Development, see video.
6.1 Awards
- Julien Pourcel, Cédric Colas and Pierre-Yves Oudeyer were awarded the 2nd place ARC-Prize in the paper category, for their article and method SOAR, published at ICML 2025. This method introduced a novel approach to enable self-improvement of LLMs when used as operators of evolutionary search algorithms in general program synthesis, and pushed the frontier of state-of-the-art results on the ARC-AGI 1 benchmark (in the category of approaches using open-source models and program synthesis).
- We received two Best Paper Awards at the Evostar 2025 conference for our paper Emergent kin selection of altruistic feeding via non-episodic neuroevolution 55 (Best paper of the EvoApp track + Best student paper award to the first author Max Taylor-Davis).
- Didier Roy, Pierre-Yves Oudeyer (authors) and Clémentine Latron (illustrator) obtained the 38th Prize Roberval, for the best popular science book in the youth category, for their book C'est pas (moi), c'est l'IA (Nathan), and were selected among the 3 finalists of the Prize "Goût des Sciences" organized by the French Ministry of Higher Education, Science and Space. D. Roy and P-Y. Oudeyer gave many general public presentations in the context of this book.
- Matisse Poupart was awarded the Best PhD prize from R3NumEd, the research network on educational technologies in Nouvelle-Aquitaine, for his PhD entitled Curious and therefore not overloaded: Towards an integrated understanding of curiosity and cognitive load in XR learning environments.
- Leana Petitot, Hélène Sauzéon and Pierre Dragicevic obtained an Honorable Mention at the ACM CHI conference for their paper entitled "The Effect of Augmented Reality on Involuntary Autobiographical Memory". The paper reports the co-design of an augmented reality (AR) application simulating a museum visit, carried out in the context of the I-am Associated Team (2023), together with an evaluation of involuntary and uncontrollable memory revival. The study confirmed our hypothesis: AR enhances this type of memory compared to 3D images 53, suggesting potential cognitive manipulations.
6.2 PhD defenses
- Grgur Kovac defended his PhD entitled Building, evaluating and understanding socio-cultural AI: leveraging concepts and methods from human sciences (see also the video of his talk, as well as slides and outlines).
- Gautier Hamon defended his PhD entitled Towards open-ended dynamics in Artificial Life and Artificial Intelligence: an eco-evo-devo perspective.
- Matisse Poupart defended his PhD entitled Curious and therefore not overloaded: Towards an integrated understanding of curiosity and cognitive load in XR learning environments (see also the video of his PhD defense talk).
- Thomas Carta defended his PhD entitled Language as a cognitive tool for open-ended agents (see also the video of his PhD defense talk and slides and outline).
- Nicolas Yax defended his PhD entitled Are we smart enough to understand Large Language Models? Tools for studying and improving LLMs' cognition (see also the slides and outline).
7 Latest software developments, platforms, open data
7.1 Latest software developments
7.1.1 SocialAI
-
Name:
SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents
-
Keywords:
Artificial intelligence, Deep learning, Reinforcement learning, Large Language Models
-
Functional Description:
Source code for the paper https://arxiv.org/abs/2107.00956.
A suite of environments for testing socio-cognitive abilities of artificial agents. Environments can be used in the multimodal setting (suitable for RL agents) and in the pure text setting (suitable for Large Language Model-based agents). Also contains RL and LLM baselines.
- URL:
-
Contact:
Grgur Kovac
7.1.2 AutoDisc
-
Keyword:
Complex Systems
-
Functional Description:
AutoDisc is software built for automated scientific discovery in complex systems (e.g. self-organizing systems). It can be used as a tool to experiment with automated discovery in various systems using exploration algorithms (e.g. curiosity-driven ones). Our software is fully open source and allows users to add their own systems, exploration algorithms or visualization methods.
- URL:
-
Contact:
Clément Romac
7.1.3 ADTool
-
Keywords:
Machine learning, Python, Cellular automaton, Physical simulation, Pattern discovery, Exploration
-
Functional Description:
ADTool is a versatile and open-source Python framework designed to explore complex parametric systems using IMGEP algorithms (Intrinsic Motivation for Goal Exploration Processes) as described in https://arxiv.org/pdf/1708.02190. This curiosity-driven approach enables automatic exploration and the discovery of new behaviors across a wide range of domains, offering a novel way to study complex systems.
With ADTool, users can explore cellular automata such as Lenia, Particle Lenia, and Flow Lenia to uncover patterns and emergent behaviors. Its capabilities extend to drug discovery, exploring chemical spaces to identify promising protein-ligand affinity profiles. The framework also ventures into physics, with applications such as searching for trajectories in the N-body problem, simulating the Kuramoto model, exploring the Gray-Scott reaction-diffusion system, and studying hypergraph rewriting systems for Wolfram physics. In digital art, ADTool fosters creativity by exploring processes like subtractive sound synthesis and other artistic methods.
The framework is designed to be flexible and extensible, allowing users to define their own systems and integrate custom exploration strategies. It includes mechanisms for saving discoveries to disk, making it easier to resume experiments or share results with collaborators. Additionally, an integrated visualization tool provides a user-friendly interface to track exploration progress, enhancing the understanding and analysis of results.
The scientific foundation of ADTool lies in "curiosity-search" algorithms, which autonomously explore behavioral spaces to identify interesting phenomena without predefined objectives. These algorithms, initially developed for robotic learning, are now applied to the study of emergent behaviors in various systems.
Whether you are a physicist, chemist, biologist, or digital artist, ADTool can help you explore and understand complex systems.
Reproducibility is guaranteed with a predefined Python environment, and experiments can be launched with a simple command line: python3 run.py --config_file examples/grayscott/gray_scott.json
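The goal-exploration loop at the heart of IMGEP algorithms can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not ADTool's API: the toy system, the function names (toy_system, imgep_explore), the uniform goal sampling and the Gaussian mutation are all hypothetical choices made for the example.

```python
import random


def toy_system(params):
    """Hypothetical stand-in for a complex system: maps parameters to a 2D behavior descriptor."""
    x, y = params
    return (x * x - y, x + y * y)


def imgep_explore(system, n_bootstrap=10, n_iterations=200, sigma=0.1, seed=0):
    """Minimal goal-exploration loop: sample a goal, reuse the closest known outcome, mutate."""
    rng = random.Random(seed)
    history = []  # archive of (params, behavior) pairs discovered so far

    # Bootstrap phase: try random parameters to seed the archive.
    for _ in range(n_bootstrap):
        params = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        history.append((params, system(params)))

    for _ in range(n_iterations):
        # 1. Sample a goal in behavior space (uniform sampling is an assumption).
        goal = (rng.uniform(-1, 2), rng.uniform(-1, 2))
        # 2. Retrieve the parameters whose observed behavior is closest to the goal.
        best_params, _ = min(
            history,
            key=lambda pb: (pb[1][0] - goal[0]) ** 2 + (pb[1][1] - goal[1]) ** 2,
        )
        # 3. Mutate those parameters with Gaussian noise.
        params = tuple(p + rng.gauss(0, sigma) for p in best_params)
        # 4. Run the system and store the new discovery.
        history.append((params, system(params)))
    return history


history = imgep_explore(toy_system)
print(len(history), "discoveries stored")
```

No external objective is optimized here: the archive grows because self-sampled goals keep pulling the search toward unexplored regions of the behavior space.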
-
Contact:
Zacharie Bugaud
7.1.4 Kids Ask
-
Keywords:
Human Computer Interaction, Cognitive sciences
-
Functional Description:
Kids Ask is a web-based educational platform that involves an interaction between a child and a conversational agent. The platform is designed to teach children how to generate curiosity-based questions and use them in their learning in order to gain new knowledge in an autonomous way.
- URL:
-
Contact:
Rania Abdelghani
7.1.5 ToGather
-
Keywords:
Education, Handicap, Environment perception
-
Scientific Description:
With participatory design methods, we have designed an interactive website application for educational purposes. This application aims to provide interactive services with continuously updated content for the stakeholders of school inclusion of children with specific educational needs.
-
Functional Description:
Website gathering information on middle school students with neurodevelopmental disorders. Authentication is required to access the site's content. Each user can only access the student file(s) of the young person(s) they are accompanying. A student file contains 6 tabs, in which each type of user can add, edit or delete information:
1. Profile: to quickly get to know the student
2. Skills: evaluation at a given moment and evolution over time
3. Compendium of tips: includes psycho-educational tips
4. Meetings: manager and reports
5. News: share information over time
6. Contacts: contact information for stakeholders
The student only has the right to view information about him/her.
- Publication:
-
Contact:
Cécile Mazon
-
Participant:
4 anonymous participants
7.1.6 mc_training
-
Name:
Platform for metacognitive training
-
Keywords:
Human Computer Interaction, Education
-
Functional Description:
This is a web platform for children between 9 and 11 years old, designed to help them practice 4 metacognitive skills that are thought to be involved in curiosity-driven learning:
- the ability to identify uncertainties
- the ability to generate informed hypotheses
- the ability to ask questions
- the ability to evaluate the value of a preconceived inference.
Children work on a reading-comprehension task and, for each of these skills, the platform offers help through a "conversation" with conversational agents that give instructions to perform the task and can give suggestions if the child asks for them.
-
Contact:
Rania Abdelghani
7.1.7 Evolution of adaptation mechanisms in complex environments
-
Name:
Plasticity and evolvability under environmental variability: the joint role of fitness-based selection and niche-limited competition
-
Keywords:
Evolution, Ecology, Dynamic adaptation
-
Functional Description:
This is the code accompanying our paper "Plasticity and evolvability under environmental variability: the joint role of fitness-based selection and niche-limited competition", presented at the GECCO 2022 conference.
In this work we have studied the evolution of a population of agents in a world where the fitness landscape changes with generations based on climate function and a latitudinal model that divides the world in different niches. We have implemented different selection mechanisms (fitness-based selection and niche-limited competition).
The world is divided into niches that correspond to different latitudes and whose state evolves based on a common climate function.
We model the plasticity of an individual using tolerance curves originally developed in ecology. Plasticity curves have the form of a Gaussian that captures the benefits and costs of plasticity when comparing a specialist (left) with a generalist (right) agent.
The repo contains the following main elements:
- folder source contains the main functionality for running a simulation
- scripts/run/reproduce_gecco.py can be used to rerun all simulations in the paper
- scripts/evaluate contains scripts for reproducing figures. reproduce_figures.py will produce all figures (provided you have already run scripts/run/reproduce_gecco.py to generate the data)
- folder projects contains data generated from running a simulation
How to run
To install all package dependencies you can create a conda environment as:
conda env create -f environment.yml
All script executions need to be run from folder source. Once there, you can use simulate.py, the main interface of the codebase, to run a simulation. For example:
python simulate.py --project test_stable --env_type stable --num_gens 300 --capacity 1000 --num_niches 10 --trials 10 --selection_type NF --climate_mean_init 2
will run a simulation with an environment with a climate function whose state is constantly 2, consisting of 10 niches, for 300 generations and 10 independent trials. The maximum population size will be 1000*2 and selection will be fitness-based (higher fitness means higher chances of reproduction) and niche-limited (individuals reproduce independently in each niche and compete only within a niche).
You can also take a look at scripts/run/reproduce_gecco.py to see which flags were used for the simulations presented in the paper.
Running all simulations requires some days. You can instead download the data produced by running scripts/run/reproduce_gecco.py from this google folder and unzip them under the projects directory.
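The trade-off captured by the Gaussian tolerance curves can be illustrated with a short sketch. It assumes the curve is a Gaussian normalized to unit area, so that a wider curve (generalist) has a lower peak than a narrow one (specialist); whether the repository normalizes the curve in exactly this way is an assumption, and the function name tolerance_fitness is hypothetical.

```python
import math


def tolerance_fitness(env_state, preferred_state, sigma):
    """Gaussian tolerance curve with unit area (normalization is an assumption).

    sigma is the plasticity of the individual: a small sigma describes a
    specialist, a large sigma a generalist.
    """
    z = (env_state - preferred_state) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Both agents prefer an environmental state of 2.0.
specialist_at_optimum = tolerance_fitness(2.0, 2.0, sigma=0.2)
generalist_at_optimum = tolerance_fitness(2.0, 2.0, sigma=1.0)

# The environment drifts away from the preferred state.
specialist_far = tolerance_fitness(3.0, 2.0, sigma=0.2)
generalist_far = tolerance_fitness(3.0, 2.0, sigma=1.0)

print(specialist_at_optimum > generalist_at_optimum)  # benefit of specialization
print(generalist_far > specialist_far)                # benefit of plasticity
```

Under this normalization, plasticity has a cost at the preferred state and a benefit as soon as the environment moves away from it.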
- URL:
-
Contact:
Eleni Nisioti
7.1.8 SAPIENS
-
Name:
SAPIENS: Structuring multi-Agent toPology for Innovation through ExperieNce Sharing
-
Keywords:
Reinforcement learning, Multi-agent
-
Functional Description:
SAPIENS is a reinforcement learning algorithm where multiple off-policy agents solve the same task in parallel and exchange experiences on the go. The group is characterized by its topology, a graph that determines who communicates with whom.
All agents are DQNs, and the exchanged experiences have the form of transitions from their replay buffers.
Using SAPIENS we can define groups of agents that are connected with others based on a) a fully-connected topology, b) a small-world topology, c) a ring topology or d) a dynamic topology.
Install required packages
You can install all required Python packages by creating a new conda environment containing the packages in environment.yml:
conda env create -f environment.yml
And then activating the environment:
conda activate sapiens
Example usages
Under notebooks there is a Jupyter notebook that will guide you through setting up simulations with a fully-connected and a dynamic social network structure for solving Wordcraft tasks. It also explains how you can access visualizations of the metrics produced during th$
Reproducing the paper results
Scripts under the scripts directory are useful for reproducing results and figures appearing in the paper.
With scripts/reproduce_runs.py you can run all simulations presented in the paper from scratch.
This file is useful for looking at how the experiments were configured, but it is better to avoid running it: simulations will run locally and sequentially and will take months to complete.
Instead, you can access the data files output by simulations on this online repo.
Download this zip file and uncompress it under the projects directory. This should create a projects/paper_done sub-directory.
You can now reproduce all visualizations presented in the paper. Run:
python scripts/reproduce_visuals.py
This will save some general plots under visuals, while project-specific plots are saved under the corresponding project in projects/paper_done.
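The idea of a group topology that determines who shares transitions with whom can be sketched as follows. This is an illustration only, not the SAPIENS codebase: the helper names (fully_connected, ring, share_experiences), the batch size and the synchronous sharing schedule are assumptions, and the DQN learners themselves are left out.

```python
import random
from collections import deque


def fully_connected(n):
    """Every agent is a neighbour of every other agent."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def ring(n):
    """Each agent is connected to its two adjacent agents on a circle."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def share_experiences(buffers, topology, batch_size, rng):
    """Each agent sends a random batch of its transitions to all of its neighbours."""
    # Sample all outgoing batches first so that sharing is synchronous:
    # transitions received in this round are not forwarded in the same round.
    outgoing = {
        agent: rng.sample(list(buf), min(batch_size, len(buf)))
        for agent, buf in buffers.items()
    }
    for sender, neighbours in topology.items():
        for receiver in neighbours:
            buffers[receiver].extend(outgoing[sender])


rng = random.Random(0)
n_agents = 4
# A transition is (state, action, reward, next_state); contents are placeholders.
buffers = {
    i: deque([(f"s{i}", "a", 0.0, f"s{i}_next")] * 3, maxlen=100)
    for i in range(n_agents)
}
share_experiences(buffers, ring(n_agents), batch_size=2, rng=rng)
print({agent: len(buf) for agent, buf in buffers.items()})
```

Swapping ring(n_agents) for fully_connected(n_agents) changes only the graph, which is what makes the topology an experimental variable.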
- URL:
-
Contact:
Eleni Nisioti
7.1.9 architect-builder-abig
-
Name:
Architect-Builder Iterated Guiding
-
Keyword:
Artificial intelligence
-
Functional Description:
Codebase for the paper Learning to guide and to be guided in the Architect-Builder Problem
ABIG stands for Architect-Builder Iterated Guiding and is an algorithmic solution to the Architect-Builder Problem. The algorithm leverages a learned model of the builder to guide it while the builder uses self-imitation learning to reinforce its guided behavior.
- URL:
-
Contact:
Tristan Karch
7.1.10 EAGER
-
Name:
Exploit question-Answering Grounding for effective Exploration in language-conditioned Reinforcement learning
-
Keywords:
Reinforcement learning, Language, Question Generation Question Answering, Reward shaping
-
Functional Description:
A novel QG/QA framework for RL called EAGER. In EAGER, an agent reuses the initial language goal sentence to generate a set of questions (QG): each of these self-generated questions defines an auxiliary objective. Here, generating a question consists in masking a word of the initial language goal. Then the agent tries to answer these questions (guess the missing word) only by observing its trajectory so far. When it manages to answer a question correctly (QA), it obtains an intrinsic reward proportional to its confidence in the answer. The QA module is trained using a set of successful example trajectories. If the agent follows a path too different from correct ones at some point in its trajectory, the QA module will not answer the question correctly, resulting in zero intrinsic reward. The sum of all the intrinsic rewards measures the quality of a trajectory in relation to the given goal. In other words, maximizing this intrinsic reward incentivizes the agent to produce behaviour that unambiguously explains various aspects of the given goal.
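The question generation and reward computation described above can be sketched as follows. This is a simplified illustration, not the EAGER implementation: every word of the goal is masked in turn, and toy_qa_model is a hypothetical rule-based stand-in for the trained QA module, with an arbitrary fixed confidence.

```python
def generate_questions(goal):
    """Question generation: mask each word of the goal in turn (simplifying assumption)."""
    words = goal.split()
    questions = []
    for i, answer in enumerate(words):
        masked = words[:i] + ["<mask>"] + words[i + 1:]
        questions.append((" ".join(masked), answer))
    return questions


def intrinsic_reward(questions, qa_model, trajectory):
    """Sum of confidences over correctly answered questions; wrong answers give zero."""
    total = 0.0
    for question, answer in questions:
        predicted, confidence = qa_model(question, trajectory)
        if predicted == answer:
            total += confidence
    return total


def toy_qa_model(question, trajectory):
    """Hypothetical QA module: answers with the first observed word absent from the question."""
    known = set(question.split())
    for word in trajectory:
        if word not in known:
            return word, 0.9
    return "", 0.0


questions = generate_questions("pick up the red ball")
# The trajectory is reduced to the words describing what the agent has observed so far.
reward = intrinsic_reward(questions, toy_qa_model, trajectory=["red", "ball"])
print(len(questions), reward)
```

In this toy run only the questions masking "red" and "ball" can be answered from the trajectory, so only those two contribute to the reward.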
- URL:
-
Contact:
Thomas Carta
7.1.11 Flow-Lenia
-
Name:
Flow Lenia: Mass conservation for the study of virtual creatures in continuous cellular automata
-
Keywords:
Cellular automaton, Self-organization
-
Functional Description:
This repo contains the code to run the Flow Lenia system, a continuous parametrized cellular automaton with mass conservation. This work extends the classic Lenia system with mass conservation and makes it possible to implement new features such as local parameters, environment components, etc.
Several variants of the system (one or several channels, etc.) are available.
Please refer to the associated paper for the details of the system.
Implemented in JAX.
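What a mass-conserving update means can be shown with a deliberately tiny example. The sketch below is a toy 1D transport step on a ring, not the Flow Lenia update rule (which is described in the associated paper): the function name move_mass and the integer displacement field are assumptions made for the illustration.

```python
def move_mass(mass, flow):
    """Move the whole mass of each cell by an integer displacement on a ring.

    Matter is only relocated, never created or destroyed, so the total is preserved.
    """
    n = len(mass)
    new_mass = [0.0] * n
    for i, m in enumerate(mass):
        target = (i + flow[i]) % n  # wrap around the ring
        new_mass[target] += m
    return new_mass


mass = [0.0, 1.0, 2.0, 0.5]
flow = [0, 1, 1, -1]  # hypothetical displacement of each cell's content
new_mass = move_mass(mass, flow)
print(new_mass, sum(mass) == sum(new_mass))
```

An update that adds a growth term to each cell, by contrast, gives no such guarantee on the total.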
- URL:
-
Contact:
Gautier Hamon
7.1.12 Kidlearn: money game application
-
Functional Description:
The game is instantiated in a browser environment where students are offered exercises in the form of money/token games (see Figure 1). For an exercise type, one object is presented with a given tagged price and the learner has to choose which combination of bank notes, coins or abstract tokens needs to be taken from the wallet to buy the object, with various constraints depending on exercise parameters. The games have been developed using web technologies: HTML5, JavaScript and Django.




Figure 1: Four principal regions are defined in the graphical interface. The first is the wallet location where users can pick and drag the money items and drop them on the repository location to compose the correct price. The object and the price are present in the object location. Four different types of exercises exist: M: customer/one object, R: merchant/one object, MM: customer/two objects, RM: merchant/two objects.
- URL:
-
Contact:
Benjamin Clement
7.1.13 cognitive-testbattery
-
Name:
Cognitive test battery of human attention and memory
-
Keywords:
Open Access, Cognitive sciences
-
Scientific Description:
Cognitive test batteries are widely used in diverse research fields, such as cognitive training, cognitive disorder assessment, or brain mechanism understanding. Although they need to be flexible according to the objectives of their usage, most test batteries are not available as open-source software and cannot be tuned in detail by researchers. The present study introduces an open-source cognitive test battery to assess attention and memory, using a JavaScript library, p5.js. Because of the ubiquitous nature of dynamic attention in our daily lives, it is crucial to have tools for its assessment or training. For that purpose, our test battery includes seven cognitive tasks (multiple-objects tracking, enumeration, go/no-go, load-induced blindness, task-switching, working memory, and memorability), common in the cognitive science literature. Using the test battery, we conducted an online experiment to collect benchmark data. Results collected on two separate days showed high cross-day reliability: task performance did not change substantially between days. Besides, our test battery captures diverse individual differences and can evaluate them based on the cognitive factors extracted from latent factor analysis. Since we share our source code as open-source software, users can expand and manipulate experimental conditions flexibly. Our test battery is also flexible in terms of the experimental environment, i.e., it is possible to run experiments either online or in a laboratory environment.
-
Functional Description:
The evaluation battery consists of 7 cognitive activities (serious games: multi-object tracking, enumeration, go/no-go, Corsi, load-induced blindness, task-switching, memorability). Easily deployable as a web application, it can be re-used and modified for new experiments. The tool is documented in order to facilitate the deployment and the analysis of results.
- URL:
- Publication:
-
Contact:
Maxime Adolphe
-
Participant:
4 anonymous participants
7.1.14 Sensorimotor-lenia
-
Keywords:
Cellular automaton, Gradient descent, Curriculum Learning
-
Functional Description:
Source code for the search of sensorimotor agency in cellular automata, associated with this blog post: https://developmentalsystems.org/sensorimotor-lenia/. The code makes it possible to find rules in the cellular automaton Lenia (through gradient descent, curriculum learning and diversity search) that lead to the self-organization of moving agents robust to perturbation by obstacles.
- URL:
-
Contact:
Gautier Hamon
7.1.15 Lamorel
-
Keywords:
Large Language Models, Reinforcement learning, Distributed computing
-
Scientific Description:
Lamorel allows for seamless scaling of LLMs when using embodied artificial agents such as Reinforcement Learning agents. One can use and modify the LLM in any part of such agents (policy, goal sampler, social peer...). Lamorel is particularly useful when performing large-scale experiments on clusters.
It was already used in several papers, notably leading to the first paper performing online RL on an LLM-based agent in an embodied environment (Carta et al., 2023).
-
Functional Description:
Lamorel was initially designed to easily use LLMs in interactive environments. It is especially made for high throughput using a distributed architecture. The philosophy of Lamorel is to be very permissive and allow as many uses of LLMs as possible while maintaining scaling: the application should run with 1 or N LLMs.
For this reason, it is specialised neither in RL nor, in particular, in RLHF. Our examples illustrate how Lamorel can be used for various applications including RLHF-like finetuning. However, one must understand that Lamorel's philosophy means that users must implement themselves what they want to do with the LLM(s).
This is why we advise users knowing in advance they want to do RLHF, especially without any modification of classic implementations, to use libs specialised in RLHF that already come with RL implementations (e.g. RL4LMs, TRL). On the other hand, users more inclined to experiment with implementations or looking for an LLM lib they can use in different projects may prefer Lamorel.
Here are Lamorel's key features:
1. Abstracts the use of LLMs (e.g. tokenization, batches) into simple calls
2. Provides a method to compute the log probability of token sequences (e.g. action commands) given a prompt
3. Is made for scaling up your experiments by deploying multiple instances of the LLM and dispatching the computation thanks to a simple configuration file
4. Provides access to open-sourced LLMs from the Hugging Face hub along with Model Parallelism to use multiple GPUs for an LLM instance
5. Allows one to give their own PyTorch modules to compute custom operations (e.g. to add new heads on top of the LLM)
6. Allows one to train the LLM (or part of it) thanks to a Data Parallelism setup where the user provides their own update method
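The quantity mentioned in feature 2, the log probability of a token sequence given a prompt, is the sum of the log probabilities of each token given everything before it. The sketch below illustrates that computation only; it does not use Lamorel's API, and toy_next_token_probs is a hypothetical stand-in for an LLM's next-token distribution.

```python
import math


def toy_next_token_probs(context):
    """Hypothetical stand-in for an LLM: next-token distribution given the context tokens."""
    if context and context[-1] == "go":
        return {"left": 0.7, "right": 0.2, "go": 0.1}
    return {"go": 0.6, "left": 0.2, "right": 0.2}


def sequence_log_prob(prompt_tokens, sequence_tokens, next_token_probs):
    """log p(sequence | prompt) = sum over t of log p(token_t | prompt, tokens before t)."""
    context = list(prompt_tokens)
    total = 0.0
    for token in sequence_tokens:
        probs = next_token_probs(context)
        total += math.log(probs[token])
        context.append(token)  # the scored token becomes part of the context
    return total


prompt = ["you", "see", "a", "door"]
scores = {
    action: sequence_log_prob(prompt, action.split(), toy_next_token_probs)
    for action in ["go left", "go right"]
}
print(scores)
```

Scoring every candidate action command this way yields one number per action, which an agent can then compare or normalise.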
- URL:
- Publications:
-
Contact:
Clément Romac
7.1.16 GLAM
-
Name:
Grounding LAnguage Models
-
Keywords:
Large Language Models, Reinforcement learning
-
Scientific Description:
Recent works successfully leveraged Large Language Models' (LLM) abilities to capture abstract knowledge about world's physics to solve decision-making problems. Yet, the alignment between LLMs' knowledge and the environment can be wrong and limit functional competence due to lack of grounding. In this paper, we study an approach (named GLAM) to achieve this alignment through functional grounding: we consider an agent using an LLM as a policy that is progressively updated as the agent interacts with the environment, leveraging online Reinforcement Learning to improve its performance to solve goals. Using an interactive textual environment designed to study higher-level forms of functional grounding, and a set of spatial and navigation tasks, we study several scientific questions: 1) Can LLMs boost sample efficiency for online learning of various RL tasks? 2) How can it boost different forms of generalization? 3) What is the impact of online learning? We study these questions by functionally grounding several variants (size, architecture) of FLAN-T5.
-
Functional Description:
GLAM is a new approach to achieve alignment between a Large Language Model (LLM) and a considered environment/world through functional grounding: we consider an agent using an LLM as a policy that is progressively updated as the agent interacts with the environment, leveraging online Reinforcement Learning to improve its performance to solve goals.
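Using an LLM as a policy requires turning its scores for candidate actions into a distribution from which an action can be sampled. One simple way to do this, shown here as an assumption rather than as the procedure of the GLAM codebase, is to softmax-normalise per-action log probabilities; the action names and values are hypothetical, and the online RL update of the LLM is left out.

```python
import math
import random


def action_distribution(action_log_probs):
    """Softmax-normalise per-action log probabilities into a policy over actions."""
    max_lp = max(action_log_probs.values())  # subtract the max for numerical stability
    weights = {a: math.exp(lp - max_lp) for a, lp in action_log_probs.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def sample_action(policy, rng):
    """Draw one action according to the policy probabilities."""
    actions = list(policy)
    return rng.choices(actions, weights=[policy[a] for a in actions], k=1)[0]


# Hypothetical log probabilities that an LLM assigns to each action given a prompt.
log_probs = {
    "turn left": math.log(0.1),
    "go forward": math.log(0.3),
    "pick up": math.log(0.1),
}
policy = action_distribution(log_probs)
action = sample_action(policy, random.Random(0))
print(policy, action)
```

Because the policy is a differentiable function of the LLM's outputs, an RL algorithm can in principle update the LLM from the rewards obtained by the sampled actions.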
- URL:
- Publication:
-
Contact:
Clément Romac
7.1.17 SBMLtoODEjax
-
Keywords:
SBML, JAX, Python, Numerical simulations, Numerical optimization, Automatic differentiation, Ordinary differential equations, Biomedical data
-
Scientific Description:
Advances in bioengineering and biomedicine demand a deep understanding of the dynamic behavior of biological systems, ranging from protein pathways to complex cellular processes. Biological networks like gene regulatory networks and protein pathways are key drivers of embryogenesis and physiological processes. Comprehending their diverse behaviors is essential for tackling diseases, including cancer, as well as for engineering novel biological constructs. Despite the availability of extensive mathematical models represented in Systems Biology Markup Language (SBML), researchers face significant challenges in exploring the full spectrum of behaviors and optimizing interventions to efficiently shape those behaviors. Existing tools designed for simulation of biological network models are not tailored to facilitate interventions on network dynamics nor to facilitate automated discovery. Leveraging recent developments in machine learning (ML), this paper introduces SBMLtoODEjax, a lightweight library designed to seamlessly integrate SBML models with ML-supported pipelines, powered by JAX. SBMLtoODEjax facilitates the reuse and customization of SBML-based models, harnessing JAX's capabilities for efficient parallel simulations and optimization, with the aim to accelerate research in biological network analysis.
-
Functional Description:
SBMLtoODEjax extends SBMLtoODEpy, a Python library developed in 2019 for converting SBML files into Python files written in NumPy/SciPy. The chosen conventions for the generated variables and modules are slightly different from the standard SBML conventions (used in the SBMLtoODEpy library), with the aim of accommodating more flexible manipulations while preserving a JAX-like functional programming style.
- URL:
- Publication:
-
Contact:
Mayalen Etcheverry
-
Partner:
Tufts University
7.1.18 Vivarium
-
Name:
Large-scale simulator for research and teaching in Artificial Intelligence and Artificial Life
-
Keywords:
Simulation, Artificial intelligence, Artificial Life, Multi-Agents System, Teaching of programming, Research
-
Functional Description:
This project aims to seize these opportunities through the design and implementation of a software platform providing an integrated simulation environment for research, teaching, and dissemination in the fields of Artificial Intelligence (AI) and Artificial Life (AL). The project is titled The Vivarium, which reflects a fundamental aspect of the convergence between these two domains: the emergence of complex behaviors, whether in the natural or artificial world, necessarily relies on a need for adaptation to a complex environment in which many autonomous entities interact.
It was used as educational software in a course of the CSIM Master at UPF Barcelona in January 2025.
-
Release Contributions:
This release corresponds to the state of the repo after all fixes were made following the SDIC course at Universitat Pompeu Fabra in Barcelona (CSIM Master) in January 2025.
This version mostly focuses on educational purposes, with ready-to-use practical sessions in notebooks/sessions. Corentin Léger was the main contributor over the last year.
-
News of the Year:
Corentin Léger, a research engineer recruited on the ANR JCJC ECOCURL project (led by Clément Moulin-Frier), carried out substantial development work on the software during 2024. Its applications for teaching are now validated. In particular, the software was used for 10 hours of practical sessions in the CSIM Master at Universitat Pompeu Fabra in Barcelona, Spain.
- URL:
-
Contact:
Clément Moulin-Frier
-
Participant:
3 anonymous participants
7.1.19 LLM_Culture
-
Keywords:
LLM, Multi-Agents System, Natural language processing
-
Functional Description:
Code for the 'Cultural evolution in populations of Large Language Models' paper. This repository provides a comprehensive framework for studying the cultural evolution of linguistic content in populations of Large Language Models (LLM).
It allows organizing LLM agents into networks wherein each agent interacts with neighboring agents by exchanging texts. Each agent can be assigned specific personalities and transmission instructions, serving as prompts for generating new texts from their neighbors’ narratives. Once the network structure and agent characteristics are defined, you can simulate the cultural evolution of texts across generations of agents. We also provide built-in metrics and visualizations to analyze the results.
- URL:
-
Contact:
Jeremy Perez
-
Participant:
2 anonymous participants
7.1.20 TelephoneGameLLMs
-
Keywords:
Large Language Models, Multi-Agents System, Cultural Evolution
-
Functional Description:
Code for the paper "When LLMs Play the Telephone Game: Cumulative Changes and Attractors in Iterated Cultural Transmissions" (https://arxiv.org/abs/2407.04503). In this paper, we introduce conceptual and methodological tools for evaluating Large Language Models in multi-turn settings. These tools are inspired by cultural evolutionary theory, and in particular by the concept of cultural attractors.
- URL:
- Publication:
-
Contact:
Jeremy Perez
7.1.21 styr
-
Name:
Stick To Your Role
-
Keywords:
LLM, Cognitive sciences
-
Functional Description:
Code for our paper https://arxiv.org/abs/2402.14846 and leaderboard https://huggingface.co/spaces/flowers-team/StickToYourRoleLeaderboard.
Enables evaluating LLMs using personal value questionnaires (PVQ, SVS). More precisely, it instructs the LLM to simulate various personas and exposes it to different contexts (e.g. long Reddit posts). It then evaluates the value stability of the simulated population between those contexts. Additionally, it computes confirmatory factor analysis fit indices (CFI, SRMR, RMSEA) and the structure of expressed values (stress metric).
- URL:
-
Contact:
Grgur Kovac
7.1.22 transformerXL_PPO_JAX
-
Keywords:
Reinforcement learning, Transformer
-
Functional Description:
This repository provides a JAX implementation of TransformerXL with PPO in an RL setup, following "Stabilizing Transformers for Reinforcement Learning" by Parisotto et al. (https://arxiv.org/abs/1910.06764).
The code uses the PureJaxRL template for PPO and ports some of the code from the Hugging Face Transformer-XL repository to JAX. We also took inspiration from the PyTorch code in https://github.com/MarcoMeter/episodic-transformer-memory-ppo, which simplifies gradient propagation and positional encoding compared to TransformerXL as described in the original paper (https://arxiv.org/abs/1901.02860). The training handles Gymnax (https://github.com/RobertTLange/gymnax) environments.
We also tested it on Craftax, on which it beat the baselines presented in the paper (https://arxiv.org/abs/2402.16801), including PPO-RNN, training with unsupervised environment design, and intrinsic motivation. Notably, we reach the 3rd level (the sewer) and obtain several advanced achievements, which was not achieved by the methods presented in the paper. See Craftax Results for more information.
Training a 5M-parameter transformer on Craftax for 1e9 steps (with 1024 environments) takes about 6h30 on a single A100.
-
Contact:
Gautier Hamon
7.1.23 ER-MRL
-
Keywords:
Reinforcement learning, Evolutionary Algorithms, Recurrent network
-
Functional Description:
Code for the "Evolving-Reservoirs-for-Meta-Reinforcement-Learning" (ER-MRL) paper (https://arxiv.org/abs/2312.06695).
We adopt a computational framework based on meta reinforcement learning, modeling the interplay between evolution and development. At the evolutionary scale, we evolve reservoirs, a family of recurrent neural networks generated from hyperparameters. These evolved reservoirs are then utilized to facilitate the learning of a behavioral policy through reinforcement learning. This is done by encoding the environment state through the reservoir before providing it to the agent's policy.
-
Contact:
Corentin Léger
7.1.24 LLM4Humanities
-
Keywords:
LLM, Python, Data Generator, Generative AI
-
Scientific Description:
Qualitative research in experimental psychology and the humanities often relies on manual annotation of textual data using defined codebooks. This process is indispensable but time-consuming and costly. Moreover, best practices require at least two independent annotators in order to compute inter-rater reliability (IRR), which further increases the required resources. IRR is crucial to distinguish variance due to coder subjectivity from variance due to the phenomenon under study, yet in practice it is frequently omitted, misreported, or computed using inadequate metrics (e.g., raw percentage agreement or simple correlations). The objective of the LLM4Humanities project is to design an open-source, Python-based toolkit and web application that leverages large language models (LLMs) to support, accelerate, and improve the methodological rigor of qualitative annotation workflows. In addition to annotation assistance, LLM4Humanities includes a generation mode designed to support the creation of experimental material. In this mode, users can select one or several template items (e.g., a mathematics exercise) and specify a set of constraints. The system then generates multiple new variants of the item. These generated items can subsequently be passed through the same annotation and evaluation pipeline, providing a first automated assessment of the quality and consistency of the generated content.
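To make the point about agreement metrics concrete, the sketch below contrasts raw percentage agreement with Cohen's kappa, a standard chance-corrected IRR statistic for two annotators. It is an illustrative example and not code from the LLM4Humanities toolkit: the function name `cohen_kappa` and the toy labels are assumptions.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same code.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both annotators coded independently,
    # each with their own marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy data (hypothetical): 10 items, binary codebook, human vs LLM annotator.
human = ["yes"] * 8 + ["no", "no"]
llm = ["yes"] * 7 + ["no", "yes", "no"]
raw_agreement = sum(a == b for a, b in zip(human, llm)) / len(human)
kappa = cohen_kappa(human, llm)
```

On this toy data the two annotators agree on 8 of 10 items, yet kappa is much lower because most of that agreement is expected by chance when one label dominates.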
-
Functional Description:
LLM4Humanities is an open-source, Python-based toolkit and web application that integrates LLM-assisted annotation, inter-rater reliability analysis, and the generation of controlled variants of experimental material within a single end-to-end workflow.
- URL:
-
Contact:
Olivier Clerc
-
Participant:
Olivier Clerc
7.2 New platforms
7.2.1 ToGather application
Participants: Cécile Mazon, Hélène Sauzéon, Eric Meyer, Isabeau Saint-Supery.
-
Name:
Application for Specialized education
-
Keywords:
Parent-professional relationships; user-centered design; school inclusion; autism spectrum disorder; ecosystemic approach
-
Participants:
Isabeau Saint-Supery, Cécile Mazon, Hélène Sauzéon, Agilonaute
-
Scientific Description:
Using participatory design methods, we have designed an interactive web application for educational purposes. This application aims to provide interactive services with continuously updated content for the stakeholders of school inclusion of children with specific educational needs. Specifically, the services provide: 1) the student's profile with strengths and weaknesses; 2) an evaluation and monitoring over time of the student's repertoire of acquired, emerging or targeted skills; 3) a shared notebook of effective psycho-educational solutions for the student; 4) a shared messaging system for exchanging "news" about the student and his/her family; and 5) a meeting manager allowing updates of evaluations (student progress). This application is currently being assessed in a field study. It will then be transferred to the Academy of Nouvelle-Aquitaine-Bordeaux of the National Education Ministry.
-
URL:
The website is not online yet, but all information, such as tutorials, is available here.
- Publication:
8 New results
The team's research program, within the domain of developmental artificial intelligence, aims to study mechanisms of open-ended learning, and in particular the role of curiosity-driven autotelic learning and the role of language as a cognitive tool. We study these topics both in humans and AI systems, both at the level of individuals and at the level of cultural groups, and both at the fundamental and application levels.
Here, we present our recent results along the following research dimensions:
- Open-ended learning and autotelic AI with large language models;
- Models of cultural evolution in humans and AI systems;
- An Eco-Evo-Devo perspective on Artificial Intelligence;
- Generative AI and educational technologies;
- Theories of human curiosity-driven learning;
- Curiosity-driven learning in educational technologies;
- Curiosity-driven AI for assisted scientific discovery.
8.1 Open-ended learning and autotelic AI with large language models
The team continued to lay the foundations of autotelic AI 89, i.e. the science studying mechanisms enabling artificial agents to learn to represent and sample their own goals and achieve open-ended learning.
8.1.1 ACES: Generating a Diversity of Challenging Programming Puzzles with Autotelic Generative Models
Participants: Julien Pourcel [correspondant], Cédric Colas, Gaia Molinaro, Pierre-Yves Oudeyer, Laetitia Teodorescu.
Motivation.
In this project, we examine how one can generate an interesting diversity of programming puzzles (same domain as Codeplay). We recall that this is an important case study for linguistic autotelic agents because it is a first step towards generalist agents inventing their own problems. Inspired by the Evolution Through Large Models (ELM) method where authors evolve robot morphologies expressed as Sodarace programs using a Large Language Model as a mutation operator, we aim to develop an evolutionary method to create a diverse population of problems using pretrained Language Models. We remark that diversity-producing methods (such as Map-Elites) need a Behavioral Characterization (BC) space in which to measure the diversity of their evolved populations; this is feasible with virtual creatures but seems pretty hard with programming puzzles. We thus introduce the notion of a Semantic BC space, composed of abstract categories, and labelling inside this space is done through LLM responses. In our case, we introduce 10 programming descriptors:
- 0 - Sorting and Searching
- 1 - Counting and Combinatorics
- 2 - Trees and Graphs
- 3 - Mathematical Foundations
- 4 - Bit Manipulation
- 5 - String Manipulation
- 6 - Geometry and Grid Problems
- 7 - Recursion and Dynamic Programming
- 8 - Stacks and Queues
- 9 - Optimization Algorithms
We then define an archive of generated programming puzzles and their solutions, and the position of a puzzle in the archive is given by the combination of descriptors that the puzzle-solution pair belongs to (the semantic representation of a puzzle thus being a 10-dimensional vector). The semantic archive is used to store puzzles.
We then perform experiments with the following algorithms:
- ACES: our proposed method samples a target cell (combination of descriptors) in the archive at random and populates a few-shot prompt for the language model with puzzles from neighboring cells in the archive. See Figure 2 for an illustration.
- ELM Semantic: based on ELM, example puzzles and solutions are given as few-shot in-context examples and a puzzle sampled from the archive is then mutated.
- ELM: same as the previous one, except that we do not use the semantic archive for sampling: instead we build an archive with centroidal Voronoi tessellations, from the embedding of puzzles inside the latent space of a Language Model. This baseline allows us to compare the semantic archive with a more classical one.
- Static Gen: in this method, puzzles are sampled from the train set and added as few-shot examples in the prompt.
For all experiments we seed the archive with the P3 train set.
Results.
We report results of our runs in Figure 3. Overall, the methods based on semantic archives, ACES and ELM-Semantic, achieve the highest diversity in the semantic space. We report diversity measures inside the embedding spaces of various smaller language models in Figure 4. In these figures we see that overall ACES outperforms other methods in this measure of diversity. We additionally perform tests of the suitability of generated puzzles as finetuning data for smaller LMs. For all methods, we finetune a smaller model (OpenLlama-3b) on the generated set and we test the pass@k metric for different values of k on the P3 test set; we report the scores in Figure 5. From that figure we see that we encounter a tradeoff between how diverse the data is and how useful it is to get a high score on the P3 test set. Further work is needed to get data that is both diverse and useful.
Overview of ACES. ACES maintains an archive of discovered puzzles grouped into cells indexed by their semantic representation (skill combination). ACES runs in several steps: 1) sample a target semantic goal and relevant examples from the archive; 2) given these, generate a puzzle f and its solution g with the puzzle generator; 3) test the validity of that pair by running assert(f(g())) in the interpreter; 4) if the pair is valid, obtain its semantic representation with the puzzle labeler; 5) add the new pair to its corresponding cell in the archive.
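The five steps in the ACES overview can be sketched as a single loop iteration. This is a minimal illustration under stated assumptions, not the team's implementation: `generate` and `label` are hypothetical stand-ins for the two LLM calls, the target cell is drawn uniformly over all skill combinations, and neighboring cells are chosen by Hamming distance. Note that `exec` runs untrusted code without any sandboxing here.

```python
import random

N_SKILLS = 10  # number of programming descriptors in the semantic space


def is_valid(puzzle_src, solution_src):
    """Step 3: run the pair and check that f(g()) is truthy."""
    scope = {}
    try:
        exec(puzzle_src + "\n" + solution_src, scope)  # unsandboxed, toy only
        return bool(scope["f"](scope["g"]()))
    except Exception:
        return False


def hamming(a, b):
    """Number of descriptors on which two cells differ."""
    return sum(x != y for x, y in zip(a, b))


def aces_step(archive, generate, label, rng, n_examples=3):
    """One iteration; `generate` and `label` stand in for the LLM calls."""
    # 1) Sample a target cell (skill combination) and nearby examples.
    target = tuple(rng.randint(0, 1) for _ in range(N_SKILLS))
    cells = sorted(archive, key=lambda cell: hamming(cell, target))
    examples = [archive[c][0] for c in cells[:n_examples]]
    # 2) Ask the generator for a new puzzle f and its solution g.
    puzzle_src, solution_src = generate(target, examples)
    # 3) Keep the pair only if it passes the interpreter check.
    if not is_valid(puzzle_src, solution_src):
        return None
    # 4) Label the pair, then 5) store it in the matching cell.
    cell = tuple(label(puzzle_src, solution_src))
    archive.setdefault(cell, []).append((puzzle_src, solution_src))
    return cell


# Hypothetical stubs replacing the puzzle generator and puzzle labeler.
def stub_generate(target, examples):
    return "def f(x): return x + 1 == 43", "def g(): return 42"


def stub_label(puzzle_src, solution_src):
    return (0, 0, 0, 1, 0, 0, 0, 0, 0, 0)  # "Mathematical Foundations" only


seed_cell = (0, 0, 0, 0, 1, 0, 0, 0, 0, 0)
archive = {seed_cell: [("def f(s): return s == 'a'", "def g(): return 'a'")]}
new_cell = aces_step(archive, stub_generate, stub_label, random.Random(0))
```

Each cell is keyed by a 10-dimensional binary tuple, one entry per descriptor, which mirrors the semantic representation described in the text.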
Diversity of generated puzzles in semantic space. We report the evolution of several diversity metrics computed in the semantic space as a function of the number of puzzle-solution pairs generated by the puzzle generator. Semantic algorithms (ACES and ELM Semantic) achieve higher diversity in the semantic space.
Diversity of generated puzzles in embedding spaces. We report the evolution of the pairwise distance between puzzle-solution pair embeddings as a function of the number of generated puzzle-solution pairs, for three different embedding representation spaces (average across seeds).
Downstream performance on the P3 test set. Pass@k is the fraction of puzzles solved after k attempts (k in [1:10]). Green overlaps with yellow.
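As a reading aid for the pass@k definition in the caption above, here is the simplest empirical version of the metric: a puzzle counts as solved at k if any of its first k attempts succeeds. The function name and the toy outcome table are assumptions for illustration; the evaluation code used for the figure may rely on a different estimator.

```python
def pass_at_k(attempts, k):
    """Fraction of puzzles with at least one success in the first k attempts."""
    if k < 1:
        raise ValueError("k must be at least 1")
    solved = sum(any(outcomes[:k]) for outcomes in attempts)
    return solved / len(attempts)


# Toy outcomes (hypothetical): one row per puzzle, one boolean per attempt.
results = [
    [False, True, False],
    [False, False, False],
    [True, False, False],
    [False, False, True],
]
curve = {k: pass_at_k(results, k) for k in (1, 2, 3)}
```

By construction the curve is non-decreasing in k, since extra attempts can only add solved puzzles.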
8.1.2 MAGELLAN: Metacognitive Generalization of Learning Progress for Online RL in LLM agents
Participants: Loris Gaven [correspondant], Thomas Carta, Clément Romac, Cédric Colas, Pierre-Yves Oudeyer, Olivier Sigaud [ISIR Sorbonne Université, Paris, France], Sylvain Lamprier [Univ Angers, LERIA].
We are developing MAGELLAN 47, a method designed to enable LLM-based reinforcement learning (RL) agents to estimate their own Learning Progress (LP) and use it to dynamically organize their training curriculum. By leveraging the LLM's rich semantic representations, MAGELLAN allows agents to generalize LP estimations to unseen, language-defined goals, overcoming limitations of classical methods that require direct evaluation of each goal.
MAGELLAN uses the LLM to generate latent representations of goals and tasks, capturing their semantic relationships. It continuously monitors the agent's performance over time, estimating LP as the change in success rates for specific goals. This approach enables the agent to identify goals where it is making progress and focus its training on those areas. MAGELLAN's integration ensures that the LLM-based agent can simultaneously refine its policy and competence estimations, adapting both to new tasks in real time.
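The notion of LP as a change in success rate, and its use for goal sampling, can be illustrated with a classical windowed estimator. This sketch is not MAGELLAN: it is the kind of per-goal estimator that requires direct evaluation of each goal, shown only to make the quantity concrete. The class name, window size, epsilon-greedy sampling rule, and goal strings are assumptions.

```python
import random
from collections import deque


class WindowedLP:
    """Absolute learning progress from two consecutive windows of outcomes."""

    def __init__(self, window=10):
        self.window = window
        self.history = {}  # goal -> most recent binary outcomes

    def update(self, goal, success):
        buf = self.history.setdefault(goal, deque(maxlen=2 * self.window))
        buf.append(float(success))

    def lp(self, goal):
        buf = list(self.history.get(goal, []))
        if len(buf) < 2 * self.window:
            return 0.0  # not enough evidence yet, including unseen goals
        older, recent = buf[: self.window], buf[self.window:]
        return abs(sum(recent) / self.window - sum(older) / self.window)

    def sample_goal(self, goals, rng, epsilon=0.2):
        """Sample proportionally to LP, with uniform exploration."""
        weights = [self.lp(g) for g in goals]
        if rng.random() < epsilon or sum(weights) == 0.0:
            return rng.choice(goals)
        return rng.choices(goals, weights=weights, k=1)[0]


# Hypothetical goals: one being learned, one already mastered.
est = WindowedLP(window=5)
for outcome in [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]:
    est.update("grow carrot", outcome)
for outcome in [1] * 10:
    est.update("collect water", outcome)
chosen = est.sample_goal(["grow carrot", "collect water"],
                         random.Random(0), epsilon=0.0)
```

The estimator returns zero for any goal it has never evaluated, which is exactly the limitation that motivates generalizing competence estimates through the LLM's representations.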
Our experiments in the Little-Zoo environment, which features hierarchical and commonsense-driven tasks, demonstrate that MAGELLAN effectively prioritizes high-LP goals, even when faced with novel or unseen tasks. Unlike traditional LP estimation methods, which rely on direct evaluations and struggle with generalization, MAGELLAN enables the agent to quickly identify meaningful learning opportunities. This results in faster adaptation, improved sample efficiency, and more effective curriculum organization, paving the way for truly autonomous agents capable of navigating vast and complex goal spaces.
8.1.3 When goals are beyond reach: Metacognitive monitoring guides autonomous discovery of frugal assistance-seeking in LLMs
Participants: Clément Romac [correspondant], Pierre-Yves Oudeyer.
Enhancing LLMs with metacognitive capabilities has been identified as a key challenge for improving the trustworthiness and interpretability of these models. In this work, we investigate how such metacognitive abilities can be leveraged to trigger external assistance when the model’s own capabilities are insufficient. While improving LLMs’ own capabilities is essential, it is equally critical that models learn to recognize their limitations, and to seek or rely on external support in real-world settings where functional competence may be partial or underdeveloped. This ability forms a crucial part of a broader learning loop: requesting help when needed, then internalizing the knowledge or skills acquired through that assistance.
Augmenting LLMs with external assistance and, in particular, what has been named "tools", has become a well-established practice. These augmentations range from calculators and retrieval systems to code interpreters, and even other LLMs. This shift has led to a rethinking of the role of LLMs—not as general-purpose solvers, but as assistants (often referred to as action models) that must learn to orchestrate the use of external resources and integrate their outputs into coherent, human-readable responses.
This reframing introduces a new class of decision-making problems: LLMs must determine when and which external assistance to invoke. However, the optimal assistance strategy is not known in advance. Some tasks may be solvable independently by the LLM, while others may require external help. Additionally, the tools themselves may be fallible—for instance, even large or specialized LLMs can return suboptimal results. To address this, most prior approaches rely on supervised learning, fine-tuning LLMs on curated datasets containing examples of effective tool use. More recently, several works have begun exploring how RL can be used to learn assistance-seeking strategies from scratch, without requiring predefined tool-use demonstrations.
While both RL and more conventional supervised learning approaches have shown promise, an important dimension of the assistance-seeking problem remains largely understudied: external assistance comes at a cost. This cost may take the form of increased latency in the LLM’s response, financial charges for calling APIs, or computational overhead. Although early work on tool use acknowledged this issue, it has received limited attention since. A recent exception introduced a first approach to this multi-objective problem: maximizing task performance while minimizing assistance costs. That method involves a multi-stage learning pipeline: (1) an estimator of LLM performance is trained using interaction data between the LLM and the task space; (2) a separate model is trained to simulate the outputs of both the LLM and its assistance sources; and (3) given a predefined cost budget, Dynamic Programming is used to derive the optimal assistance strategy. While effective, this method is computationally intensive and requires extensive data collection and training across multiple stages.
In this work, we propose a fully online approach (see Figure 6) based on multi-objective contextual multi-armed bandits. Given a task, we frame the decision of whether to keep the task with the LLM or delegate it to external assistance as the selection of an arm. We consider the dual objective of maximizing the answer's performance while minimizing its cost, and we adopt scalarization, i.e., combining the two objectives into a single weighted sum that our approach aims to maximize. Crucially, our method naturally adapts to any user-specified budget by treating the budget as the scalarization weight that balances the two objectives. A central challenge of this approach lies in efficiently estimating the performance and cost associated with each option (i.e., the LLM and all available assistance sources), using as few interactions as possible. To address this, we draw inspiration from MAGELLAN and leverage the LLM itself to learn these estimations.
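The scalarized bandit formulation can be sketched as follows. This is a simplified illustration, not the proposed method: it keeps tabular running means per (context, arm) pair instead of LLM-based estimators, uses epsilon-greedy selection, and assumes the weighted sum takes the form performance minus weight times cost. The class name, the arm names, and the toy task are all hypothetical.

```python
import random
from collections import defaultdict


class ScalarizedBandit:
    """Contextual bandit maximizing performance - cost_weight * cost."""

    def __init__(self, arms, cost_weight, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.cost_weight = cost_weight  # trade-off set from the user's budget
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.count = defaultdict(int)
        self.perf = defaultdict(float)  # running mean of performance
        self.cost = defaultdict(float)  # running mean of cost

    def utility(self, context, arm):
        key = (context, arm)
        return self.perf[key] - self.cost_weight * self.cost[key]

    def select(self, context):
        untried = [a for a in self.arms if self.count[(context, a)] == 0]
        if untried:
            return untried[0]  # try every arm once per context
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda a: self.utility(context, a))

    def update(self, context, arm, performance, cost):
        key = (context, arm)
        self.count[key] += 1
        n = self.count[key]
        self.perf[key] += (performance - self.perf[key]) / n
        self.cost[key] += (cost - self.cost[key]) / n


def run_toy(cost_weight, steps=20):
    """Toy task: the LLM alone solves 'easy' problems, the tool solves all."""
    bandit = ScalarizedBandit(["llm", "calculator"], cost_weight, epsilon=0.0)
    for step in range(steps):
        context = "easy" if step % 2 == 0 else "hard"
        arm = bandit.select(context)
        solved = arm == "calculator" or context == "easy"
        cost = 1.0 if arm == "calculator" else 0.0
        bandit.update(context, arm, 1.0 if solved else 0.0, cost)
    return {c: max(bandit.arms, key=lambda a: bandit.utility(c, a))
            for c in ("easy", "hard")}


cheap_tool_policy = run_toy(cost_weight=0.5)
costly_tool_policy = run_toy(cost_weight=2.0)
```

With a small cost weight the learned strategy calls the calculator only on hard problems; with a large one it never does, which shows how a single weight moves the strategy along the performance-cost trade-off.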
We first evaluate our method on a set of carefully designed math problems with calculator tools as assistance, for which the optimal strategy is known. This setting enables us to investigate how the strategy discovered by our method compares to the optimal one, as well as the sample efficiency of our approach (i.e., the number of interactions required to converge). We notably show that our LLM-based estimation of performance and cost reaches similar or even better performance than a classic moving average approach which has access to privileged information—namely, the problem category. Finally, we demonstrate the broader applicability of our method by applying it to real-world problems faced by LLMs. In particular, we apply it to a standard question-answering benchmark: MMLU-Pro. The results show that our approach is scalable to complex natural language tasks without access to any external expert knowledge.
We study how LLMs can autonomously learn how and when to use tools when solving tasks.
8.1.4 LLM-based goal generation for autotelic exploration with goal-conditioned RL
Participants: Guillaume Pourcel [correspondant], Grgur Kovač, Thomas Carta, Cédric Colas, Pierre-Yves Oudeyer.
Designing autotelic agents capable of autonomously generating and pursuing their own goals represents a promising endeavor for open-ended learning and skill acquisition in reinforcement learning. This challenge is especially difficult in open worlds that require inventing new, previously unobserved goals. In this work, we propose an architecture where a single generalist autotelic agent is trained on an automatic curriculum of goals. We leverage large language models (LLMs) to generate goals as code for reward functions based on learnability and difficulty estimates. The goal-conditioned RL agent is trained on those goals, sampled based on learning progress. We compare our method to an adaptation of OMNI-EPIC to goal-conditioned RL. Our preliminary experiments indicate that our method generates a higher proportion of learnable goals, suggesting better adaptation to the goal-conditioned learner. This work is described in this technical report.
The architecture used.
8.1.5 Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI
Participants: Julien Pourcel [correspondant], Cédric Colas, Pierre-Yves Oudeyer.
In our work on SOAR (Self-improving Operators for Automated program Refinements), published at ICML 2025, we address a fundamental limitation in program synthesis: while large language models struggle to solve complex tasks in single attempts, traditional evolutionary approaches are constrained by the fixed capabilities of their underlying generative models. We developed a framework that integrates language models into a self-improving evolutionary loop, enabling continuous performance enhancement through experience rather than relying on static model capabilities.
SOAR architecture.
Our method operates through an iterative two-phase process that we designed to create a virtuous cycle of improvement. First, an evolutionary search phase employs a language model to sample and refine candidate program solutions. Second, a hindsight learning phase converts these search attempts—both successful and unsuccessful—into valid problem-solution pairs that we use to fine-tune the LLM's sampling and refinement capabilities. This approach leverages positive transfer between the sampling and refinement fine-tuning tasks, allowing the system to bootstrap its own improvement without requiring human-engineered training data.
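One way to read the hindsight learning phase is as relabeling: a candidate program that fails the intended task is still a correct solution to the task defined by its own outputs. The sketch below illustrates that idea only; the function name, the dictionary layout, and the toy grids are assumptions and do not reproduce the SOAR training pipeline.

```python
def hindsight_relabel(program, task_inputs):
    """Turn any runnable candidate into a (synthetic task, program) pair."""
    examples = []
    for grid in task_inputs:
        try:
            # The program's own output becomes the target of the new task.
            examples.append((grid, program(grid)))
        except Exception:
            return None  # programs that crash give no usable pair
    return {"examples": examples, "solution": program}


def flip_horizontal(grid):
    """A candidate that may be wrong for the original task."""
    return [list(reversed(row)) for row in grid]


# Toy ARC-like inputs (hypothetical).
inputs = [[[1, 0], [2, 3]], [[0, 5, 0]]]
pair = hindsight_relabel(flip_horizontal, inputs)
crashed = hindsight_relabel(lambda grid: grid[5], inputs)
```

The resulting pair is valid by construction, so it can serve as fine-tuning data even when the search attempt did not solve the original problem.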
We evaluated SOAR on the challenging ARC-AGI benchmark, which tests abstract reasoning and program induction capabilities. Our framework solves 52% of the public test set, establishing state-of-the-art results for program synthesis using open-source language models. These improvements compound through iterations, with models showing enhanced abilities to both generate initial program ideas and refine existing solutions. Notably, the gains carry over to test-time adaptation, enabling continuous improvement on target problems even after deployment.
Our research demonstrates that program synthesis systems can transcend the limitations of their base models through self-improvement, opening new possibilities for autonomous AI development. By showing how iterative model improvement can overcome performance plateaus inherent to search methods, SOAR provides a drop-in upgrade for existing systems like AlphaEvolve or ShinkaEvolve, transforming their fixed LLM operators into continuously improving ones.
8.1.6 WorldLLM: Improving LLMs' world modeling using curiosity-driven theory-making
Participants: Guillaume Levy, Cédric Colas, Pierre-Yves Oudeyer, Thomas Carta, Clément Romac [correspondant].
Large Language Models (LLMs) possess broad knowledge about the world, but leveraging this knowledge for precise dynamics modeling remains challenging. While LLMs can engage in general reasoning, they struggle to make accurate predictions in specific domains with structured observations and dynamics, such as physics simulations or video games. This limitation stems from the gap between their general capabilities and the need for grounded, domain-specific understanding.
In this paper, we present WorldLLM 59, a framework for autonomous improvement of an LLM's world modeling abilities. Our approach combines 1) probabilistic theory induction to produce hypotheses that are given in our LLM's prompt to improve its predictions and 2) curiosity-driven RL to explore the environment and collect transitions poorly predicted by the current hypotheses (see Figure 9). Formally, our LLM's world model is the conditional probability $P(s' \mid s, a, H)$, where $s$ represents a state, $a$ an action, $s'$ the resulting next state, and $H$ a set of natural language hypothesized theories. This probability is computed by the LLM by giving it $s$, $a$, and $H$ in its prompt and taking the probability of $s'$ to follow this prompt. Our key insight is that natural language theories can help ground an LLM's broad knowledge into precise predictive power by providing domain-specific rules. Our approach consists of three interacting components: (1) our LLM that computes $P(s' \mid s, a, H)$ by conditioning its predictions on both a state-action pair and the current hypotheses, (2) a theory generator that updates natural language hypotheses using Bayesian inference, and (3) a curiosity-driven reinforcement learning agent trained to collect evidence against the current hypotheses. Inspired by how humans, from children to scientists, actively update their internal world model by performing experiments, our agent's exploration provides new evidence for hypothesis refinement, creating a virtuous cycle of improvement.
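The interaction between the three components can be sketched with a toy scoring function in place of the LLM. Everything here is an assumption made for illustration: `toy_log_prob` replaces the LLM's conditional probability, hypotheses are stored as a dictionary of rules rather than natural language, the curiosity reward is taken to be the negative log-probability of a transition, and `accept` is a greedy likelihood comparison standing in for the Bayesian update.

```python
import math


def toy_log_prob(next_state, state, action, hypotheses):
    """Stand-in for the LLM: log-probability of next_state given the prompt."""
    rule = (state, action)
    if rule in hypotheses:
        hit = hypotheses[rule] == next_state
        return math.log(0.9) if hit else math.log(0.05)
    return math.log(0.25)  # no applicable theory: spread over 4 outcomes


def log_likelihood(transitions, hypotheses, log_prob=toy_log_prob):
    """How well a set of hypotheses explains the collected transitions."""
    return sum(log_prob(s2, s, a, hypotheses) for s, a, s2 in transitions)


def curiosity_reward(transition, hypotheses, log_prob=toy_log_prob):
    """Higher reward for transitions the current hypotheses predict poorly."""
    s, a, s2 = transition
    return -log_prob(s2, s, a, hypotheses)


def accept(current, candidate, transitions):
    """Greedy simplification: keep whichever hypotheses explain data better."""
    better = (log_likelihood(transitions, candidate)
              > log_likelihood(transitions, current))
    return candidate if better else current


# Toy transitions (hypothetical) from an object-combination game.
data = [("water+seed", "combine", "plant"),
        ("plant+sun", "combine", "flower")]
h0 = {}
h1 = {("water+seed", "combine"): "plant"}
best = accept(h0, h1, data)
explained = curiosity_reward(data[0], best)
unexplained = curiosity_reward(data[1], best)
```

After the update, the transition not yet covered by any theory yields the larger reward, so exploration is steered toward evidence that can refine the hypotheses further.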
WorldLLM.
We demonstrate our approach in a video game environment where agents manipulate and combine objects, showing that WorldLLM successfully learns accurate predictive models while generating human-interpretable theories about environment dynamics. This work contributes to a growing body of research on improving LLMs' world modeling capabilities and grounding their knowledge in specific domains. By combining ideas from theory-based RL, Bayesian inference, and active exploration, we provide a framework for learning structured, interpretable world models that leverage both the broad knowledge of LLMs and domain-specific experiences without any costly gradient-based learning.
8.1.7 HERAKLES: Hierarchical Skill Compilation for Open-ended LLM Agents
Participants: Thomas Carta [correspondant], Clément Romac, Loris Gaven, Pierre-Yves Oudeyer, Olivier Sigaud, Sylvain Lamprier.
In our work on HERAKLES (HiERarchicAl sKill compiLation for open-Ended llm agentS), we address a fundamental challenge in open-ended AI: as goal spaces expand, increasingly complex goals require composing multiple elementary actions, leading to combinatorial explosion that impedes learning progress. While existing hierarchical reinforcement learning approaches rely on expert-defined skill spaces and pre-trained low-level policies, such designs are inadequate for open-ended scenarios where goal spaces naturally diversify across a broad spectrum of difficulties. We developed a framework that enables continuous skill compilation, dynamically expanding the agent's capabilities through experience rather than relying on fixed, predefined abstractions.
Our method operates through a two-level hierarchical architecture designed to create a virtuous cycle of skill acquisition. A high-level policy, instantiated as a Large Language Model, decomposes complex goals into subgoals and selects skills from an evolving skill space. A low-level policy, implemented as lightweight neural networks, executes these skills through primitive actions. Crucially, as the hierarchical agent masters a goal, the complete trajectory is compiled into the low-level policy as a new reusable skill. A competence estimator predicts the low-level policy's success probability for each skill, ensuring the high-level policy only invokes skills that can be reliably executed. This approach leverages language's compositional and combinatorial properties to structure the skill space, enabling generalization across semantically related goals.
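The gating role of the competence estimator and the effect of compilation can be sketched as a recursive planner. This is an illustrative simplification: `decompose` stands in for the LLM high-level policy, `competence` for the learned estimator, and the threshold, recipe table, and goal names are hypothetical.

```python
def plan(goal, decompose, competence, threshold=0.8):
    """Expand a goal into skills the low-level policy can run reliably."""
    if competence(goal) >= threshold:
        return [goal]  # already compiled: a single skill call suffices
    subgoals = decompose(goal)
    if not subgoals:
        raise ValueError(f"no reliable skill or decomposition for {goal!r}")
    steps = []
    for sub in subgoals:
        steps.extend(plan(sub, decompose, competence, threshold))
    return steps


# Hypothetical Crafter-like goal hierarchy and competence scores.
recipes = {
    "make wood pickaxe": ["collect wood", "place table", "craft wood pickaxe"],
}
scores = {"collect wood": 0.95, "place table": 0.9, "craft wood pickaxe": 0.85}


def decompose(goal):
    return recipes.get(goal, [])


def competence(goal):
    return scores.get(goal, 0.0)


before = plan("make wood pickaxe", decompose, competence)
# Once the goal is mastered, its trajectory is compiled into the low-level
# policy; the estimator then reports high competence for it.
scores["make wood pickaxe"] = 0.9
after = plan("make wood pickaxe", decompose, competence)
```

Before compilation the goal expands into three skill calls; afterwards it is a single call, which is how compiled skills shorten the high-level decision sequence for harder goals built on top of it.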
We evaluated HERAKLES in the Crafter environment, a procedurally generated Minecraft-like world designed to assess agent capabilities within a unified open-ended framework. Our framework achieves Crafter scores above 70, while baselines plateau below 30. More importantly, HERAKLES scales near-linearly with goal difficulty, whereas non-hierarchical methods exhibit exponential growth in learning time. The framework also demonstrates strong generalization: when tested on synonymous goal formulations, HERAKLES shows only a 16% performance drop compared to 24-27% for baselines, and maintains robust performance on compositional variants requiring repeated skill execution.
Our research demonstrates that open-ended agents can transcend the limitations of fixed skill spaces through continuous compilation, opening new possibilities for lifelong learning systems. By showing how mastered behaviors can be recursively encoded at lower levels for rapid reuse—mirroring how humans overcome complexity barriers through hierarchical learning—HERAKLES provides a principled approach for building agents that autonomously expand their competencies over time.
8.1.8 Software Engineering Agents for Embodied Controller Generation : A Study in Minigrid Environments
Participants: Timothé Boulet [correspondant], Xavier Hinaut [Mnemosyne, Inria Bordeaux], Clément Moulin-Frier, Nathanaël Fijalkow.
Motivation.
Software Engineering Agents (SWE-Agents) have proven effective for traditional software engineering tasks with accessible codebases, but their performance for embodied tasks requiring well-designed information discovery remains unexplored. In this paper 44, we present the first extended evaluation of SWE-Agents on controller generation for embodied tasks, adapting Mini-SWE-Agent (MSWEA) to solve 20 diverse embodied tasks from the Minigrid environment. Our experiments compare agent performance across different information access conditions: with and without environment source code access, and with varying capabilities for interactive exploration. We quantify how different information access levels affect SWE-Agent performance for embodied tasks and analyze the relative importance of static code analysis versus dynamic exploration for task solving. This work establishes controller generation for embodied tasks as a crucial evaluation domain for SWE-Agents and provides baseline results for future research in efficient reasoning systems.
This work investigates a fundamental question: how do SWE-Agents perform in controller generation for embodied tasks? Our approach involves a code-agent (the SWE-Agent interacting with a code-environment involving codebases and terminals) that generates controller-agents (Python programs) to solve tasks in an embodied setup, creating a two-level agency structure that differs from the direct LLM-environment interaction approaches common in embodied AI. Figure 10 illustrates this two-level agency structure. The agent can evaluate its proposed solutions by executing them in the environment and receiving feedback in the form of success/failure and reward. A task terminates either when the agent validates with a special command or when the maximum number of steps or cost is reached.
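To make the notion of a controller-agent concrete, the sketch below shows the kind of Python program a code-agent might write, together with a test harness returning success and reward. It does not use the Minigrid API: the grid encoding, the function names, and the reward formula (a step-penalized reward loosely modeled on Minigrid-style rewards) are assumptions, and the setting is fully observable.

```python
from collections import deque

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def free(grid, cell):
    """True if the cell is inside the grid and not a wall ('#')."""
    r, c = cell
    return 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] != "#"


def bfs_controller(grid, start, goal):
    """Controller-agent: shortest list of moves from start to goal."""
    queue, parents = deque([start]), {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        for name, (dr, dc) in MOVES.items():
            nxt = (cell[0] + dr, cell[1] + dc)
            if free(grid, nxt) and nxt not in parents:
                parents[nxt] = (cell, name)
                queue.append(nxt)
    if goal not in parents:
        return []
    path, cell = [], goal
    while parents[cell] is not None:  # walk back from goal to start
        cell, name = parents[cell]
        path.append(name)
    return path[::-1]


def evaluate(controller, grid, start, goal, max_steps=20):
    """Feedback given to the code-agent: success flag and reward."""
    pos = start
    actions = controller(grid, start, goal)[:max_steps]
    for step, action in enumerate(actions, start=1):
        dr, dc = MOVES[action]
        nxt = (pos[0] + dr, pos[1] + dc)
        if free(grid, nxt):
            pos = nxt
        if pos == goal:
            return {"success": True, "reward": 1.0 - 0.9 * step / max_steps}
    return {"success": False, "reward": 0.0}


GRID = ["..#",
        ".#.",
        "..."]
route = bfs_controller(GRID, (0, 0), (2, 2))
feedback = evaluate(bfs_controller, GRID, (0, 0), (2, 2))
```

In the two-level structure, the code-agent would iterate on the source of `bfs_controller` using only the `feedback` dictionary, plus whatever it learns from reading environment code or running probe scripts.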
Two-level agency structure: a code-agent interacts with a code-environment to generates controller-agents (Python programs) to solve the embodied task.
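The two-level structure can be made concrete with a minimal sketch. Everything below is hypothetical and only illustrative: `CorridorEnv` is a toy stand-in for a Minigrid task (it is not the Minigrid API), `Controller` plays the role of the Python program the code-agent would write, and `evaluate` mimics the success/reward feedback the code-agent receives when it tests a solution.

```python
from dataclasses import dataclass


@dataclass
class CorridorEnv:
    """Hypothetical 1-D corridor: start at cell 0, reach the last cell."""
    length: int = 6
    max_steps: int = 20

    def reset(self) -> int:
        self.pos, self.steps = 0, 0
        return self.pos

    def step(self, action: str):
        # "forward" moves one cell toward the goal; any other action is a no-op.
        if action == "forward":
            self.pos = min(self.pos + 1, self.length - 1)
        self.steps += 1
        success = self.pos == self.length - 1
        done = success or self.steps >= self.max_steps
        # Assumed reward shape: higher when the goal is reached faster.
        reward = 1.0 - 0.9 * self.steps / self.max_steps if success else 0.0
        return self.pos, reward, done, success


class Controller:
    """Controller-agent: the program a code-agent would generate."""

    def act(self, observation: int) -> str:
        return "forward"


def evaluate(controller, env, episodes: int = 5) -> dict:
    """Feedback given to the code-agent: success rate and mean reward."""
    successes, total_reward = 0, 0.0
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            obs, reward, done, success = env.step(controller.act(obs))
        successes += int(success)
        total_reward += reward
    return {"success_rate": successes / episodes,
            "mean_reward": total_reward / episodes}


feedback = evaluate(Controller(), CorridorEnv())
print(feedback)
```

In this sketch the code-agent never acts in the environment itself: it only edits `Controller` and reads `feedback`, which is the distinction from direct LLM-environment interaction.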
We evaluate the challenge of controller generation for embodied tasks by adapting Mini-SWE-Agent (MSWEA) to solve diverse Minigrid tasks under different information access conditions:
- Source Code Access: When the agent can read Minigrid environment code, it can analyze environment mechanics, constraints, and object interactions to inform controller design.
- Interactive Exploration: When the agent can write and execute scripts to probe the environment, it can discover dynamics through exploration by observing the outcomes of actions in various states.
Results.
The best@5 success rates of MSWEA across different tasks and information access conditions are summarized in Figure 11. We display standard deviation as error bars in all our plots.
Partially Observable Minigrid (Minigrid PO) was very hard for MSWEA to solve, with many tasks remaining unsolved even with full access. In Fully Observable Minigrid (Minigrid FO), however, all tasks except one are solved at least by MSWEA with full access. Partial observability, as a component of embodied tasks, is thus a hard obstacle for SWE-Agents.
Best@5 success rate of MSWEA across different tasks and information access conditions in Fully Observable Minigrid.
To identify how the type of task influences performance under the different information access conditions, we grouped the average best@5 success rate into four categories (navigation, manipulation, hazard, memory) and also report the overall average across all tasks. The results are shown in Figures 12 and 13.
Mean-by-category best@5 success rate in Fully Observable Minigrid
Mean-by-category best@5 success rate in Partially Observable Minigrid
In the Fully Observable benchmark, comparing MSWEA (blue bars) with its fully ablated version (red bars), which has neither source code read access nor interactive exploration, we observe a dramatic drop in performance. An agent with only the Test-Access capability (i.e. the ability to test its controller on the task and obtain its success rate) obtains much worse results, but surprisingly still manages to solve some tasks through iterated submissions.
If we try to recover the MSWEA performance level by adding only code access (cyan bars), we see very limited improvement, which means that reading the source code helps the agent only partially and that the difficulty lies elsewhere. If we instead add only the interactive execution capability (orange bars), performance returns to a level comparable to that of MSWEA. This pattern is consistent across all task categories and is particularly marked for manipulation tasks, where exact knowledge of how the environment operates is required to solve the task. This systematic pattern indicates that interactive access is an essential capability of SWE-Agents that allows them to perform significantly better in embodied tasks.
In the Partially Observable benchmark, performance is much lower than in Minigrid FO, in particular for the complex manipulation tasks. Different patterns appear depending on the task category, but we do not attempt to interpret them, as they may arise either from statistical variability, given the relatively high standard errors, or from subtle, hard-to-infer, task-specific factors that bias the agent’s behavior in ways not observed in similar tasks. The overall performance does not vary significantly with the information access conditions. We interpret this as the PO tasks being inherently too hard for MSWEA, such that the agent only solves the simplest tasks, such as the easiest navigation tasks, and can make little use of the different information access conditions to increase performance. This leads us to believe that strongly embodied tasks such as Minigrid PO tasks represent a good benchmark for SWE-Agents: they perform decently on some tasks, but on others, even with good LLMs and access to both source code and execution, they still have significant room for improvement in understanding how the environment works. These results encourage the use of embodied tasks in future benchmarks for software engineering agents.
8.2 Models of cultural evolution in humans and AI systems
As generative AI systems become powerful cultural transmission technologies that influence human cultural evolution in important ways, and can also have their own cultural processes through large-scale machine-machine interaction, studying the dynamics of cultural processes in populations of AI systems and humans becomes crucial.
8.2.1 The effect of social network structure on collective innovation
Participants: Eleni Nisioti [correspondant], Mateo Mahaut, Pierre-Yves Oudeyer, Ida Momennejad, Sebastian Risi, Clément Moulin-Frier.
Innovations are a central component of open-ended skill acquisition: they denote the emergence of new solutions by the recombination of existing ones, and their presence is necessary to ensure a continuous complexification of an agent's cultural repertoire. While we often tend to attribute discoveries to certain innovative individuals, if we take a broad perspective on the history of our species we see that human innovation is primarily a collective process. Fields such as psychology and anthropology have been studying the ability of human groups to innovate for some time, with studies indicating that the social network structure has a significant impact: fully-connected structures are better suited for quick convergence in easy problems with clear global optima, while partially-connected structures perform best in difficult tasks where local optima may lure agents away from the globally optimal solution 94. At the same time, a parallel story is unfolding in reinforcement learning (RL): distributed RL is a sub-field where multiple agents solve a task collectively 134. Compared to the single-agent paradigm, distributed RL algorithms converge quicker and often achieve superior performance. However, these algorithms have only considered full connectivity. In this inter-disciplinary project, we presented a novel learning framework that augments distributed RL with the notion of a social network structure, and employed it to test the hypothesis from human studies that partial connectivity performs best in innovation tasks.
Cultural evolution in populations of RL agents.
We implemented such innovation tasks using Wordcraft, a recently introduced RL playground inspired from the Little Alchemy 2 game (see left of figure 14 for an illustration of how this task works). We considered a wide diversity of social network structures: static structures that remain constant throughout learning (fully-connected, ring, small-world) and a dynamic structure where the group oscillates between phases of low and high connectivity (we illustrate this dynamic structure on the right of figure 14). Each agent in our implementation employs the DQN learning algorithm and exchanges experiences that have the form of sequences of state-action combinations with its neighbors.
(Left) Illustration of an innovation task, consisting of an initial set of elements (Earth, Water) and a recipe book indicating which combinations create new elements. Upon creating a new element the player moves up an innovation level and receives a reward that increases monotonically with levels. (Right) Dynamic social network structures oscillate between phases of low connectivity, where experience sharing takes place within clusters, and high connectivity, where experiences spread between clusters.
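The network structures described above can be sketched as neighbor maps over agent indices. This is a minimal illustration rather than the project's implementation: the function names, the cluster size, and the oscillation schedule (`period`, `high_phase`) are assumptions, the small-world structure is omitted, and replay buffers are plain lists of labels instead of state-action sequences.

```python
import random


def fully_connected(n):
    """Every agent shares experiences with every other agent."""
    return {i: [j for j in range(n) if j != i] for i in range(n)}


def ring(n):
    """Each agent shares only with its two adjacent agents."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def dynamic(n, step, cluster_size=2, period=10, high_phase=2):
    """Assumed schedule: isolated clusters, periodically fully mixed."""
    if step % period < high_phase:
        return fully_connected(n)  # high-connectivity phase
    return {  # low-connectivity phase: sharing stays within clusters
        i: [j for j in range(n)
            if j != i and j // cluster_size == i // cluster_size]
        for i in range(n)
    }


def share(buffers, neighbors, rng, k=1):
    """Each agent sends k sampled experiences to each of its neighbors."""
    outgoing = {i: [] for i in buffers}
    for i, buf in buffers.items():
        for j in neighbors[i]:
            outgoing[j].extend(rng.sample(buf, min(k, len(buf))))
    # Deliver after sampling so that nothing is forwarded twice in one round.
    for j, received in outgoing.items():
        buffers[j].extend(received)


rng = random.Random(0)
buffers = {i: [f"trajectory-{i}"] for i in range(4)}
share(buffers, dynamic(4, step=5), rng)  # low-connectivity phase
print(buffers)
```

With this schedule, agents 0 and 1 exchange experiences at step 5 while agents 2 and 3 form a separate cluster; at step 0 the same call would spread experiences across the whole group.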
A central conclusion of our empirical analysis was that the dynamic social network structure performs best. In addition to the performance that groups achieve, we measured behavioral and mnemonic metrics such as behavioral conformity and mnemonic diversity. Such metrics were inspired by human studies and helped us further analyze the behavior of groups. For example, one empirical observation was that sharing experiences did not help the group learn quicker in a very simple innovation task; instead, the fully-connected group was the slowest. By looking at the diversity in the memories of the agents, we observed that the fully-connected structure had the highest individual diversity (left of figure 15) and the lowest group diversity (right of figure 15): sharing experiences with others diversifies an individual's experiences but also homogenizes the group, which is bad for its performance.
We see the contribution of this project as two-fold. From the perspective of fields studying human intelligence, we have shown that using RL algorithms as a computational tool is a promising direction towards increasing the verisimilitude of simulations and analyzing both behavior and memory. From the perspective of RL, we have shown that distributed RL algorithms should move beyond the fully-connected architecture and explore groups with dynamic topologies. This work is currently a preprint 136 and is about to be submitted to PNAS. We open-source the code at this link.
Cultural evolution in populations of LLM agents.
In 2024, we extended this framework with agents equipped with Large Language Models (LLMs) playing Little Alchemy 2, a creative video game originally developed for humans (figure 16). We first study an LLM in isolation and discover that it exhibits both useful skills and crucial limitations. We then study groups of LLMs that share information related to their behaviour and focus on the effect of social connectivity on collective performance. In agreement with previous human and computational studies (including the one described above), we observe that groups with dynamic connectivity out-compete fully-connected groups. Our work reveals opportunities and challenges for future studies of collective innovation, which are becoming increasingly relevant as Generative Artificial Intelligence algorithms and humans innovate alongside each other. We published this work at the ALife 2024 conference 139.
Studying collective innovation in groups of LLMs: A) we experiment with Little Alchemy 2, a game where players combine real-world items to create new ones. A knowledge graph describes the possible combinations (we only present a small sub-part of the graph, which contains 720 items in total). B) Alice-LLM and Bob-LLM are two LLMs playing the game together. They are provided with the same intro prompt, explaining the rules of the game, and the same task (they start with the same set of items). Alice-LLM and Bob-LLM have identical weights but behave differently because the state prompt depends on their crafting history. They are informed about the actions of others through their prompt. In this paper, we study how groups of such LLM agents are able to efficiently explore a knowledge graph, focusing in particular on the effect of different social structures specifying with whom and when they can share information.
8.2.2 When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings
Participants: Jérémy Perez [correspondant], Grgur Kovač, Corentin Léger, Cédric Colas, Gaia Molinaro, Maxime Derex, Pierre-Yves Oudeyer, Clément Moulin-Frier.
As large language models (LLMs) start interacting with each other and generating an increasing amount of text online, it becomes crucial to better understand how information is transformed as it passes from one LLM to the next. While significant research has examined individual LLM behaviors, existing studies have largely overlooked the collective behaviors and information distortions arising from iterated LLM interactions. Small biases, negligible at the single output level, risk being amplified in iterated interactions, potentially leading the content to evolve towards attractor states.
In this project, we ran a series of telephone game experiments, applying a transmission chain design borrowed from the human cultural evolution literature: LLM agents iteratively receive, produce, and transmit texts from the previous to the next agent in the chain.
This figure depicts the method used to estimate the strength and position of theoretical attractors. Each dot in this figure corresponds to one chain, for a total of 100 chains (20 initial texts * 5 seeds). The position of a dot on the x-axis corresponds to the value of the property (positivity in this example) in the initial text, while the position on the y-axis corresponds to the value of this property in the text produced after 50 generations. We then used these 100 data points to fit a linear regression predicting the relationship between the initial and final values of the property.
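The regression step can be sketched as follows. The caption only states that a line is fitted between initial and final property values; reading the attractor position as the fixed point of that line (where the final value equals the initial one) and the strength as one minus the slope is an assumption made here for illustration, and the data points are synthetic.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def attractor(initial_values, final_values):
    """Assumed definitions: position = fixed point, strength = 1 - slope."""
    slope, intercept = fit_line(initial_values, final_values)
    strength = 1.0 - slope
    # A slope of exactly 1 means the line never crosses the identity line.
    position = intercept / (1.0 - slope) if slope != 1.0 else None
    return position, strength


# Synthetic chains: the final value is pulled strongly toward 0.5.
initial = [-1.0, -0.5, 0.0, 0.5, 1.0]
final = [0.2 * x + 0.4 for x in initial]
position, strength = attractor(initial, final)
print(position, strength)
```

Under these assumptions, a flat regression line (slope near 0) means the final value barely depends on the initial text, i.e. a strong attractor, whereas a slope near 1 means the property is mostly preserved along the chain.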
Our main contributions are:
- We propose that there might be a gap in current LLM evaluation methods (single-turn evaluations might not be suited to assess the properties of multi-turn interactions).
- We empirically confirm this hypothesis by showing that multi-turn interactions indeed often lead to distributions of text properties that are significantly different from what is observed after a single interaction.
- We introduce novel conceptual and methodological tools to fill this gap, grounded in research in cultural evolution, and in particular the concept of cultural attractor.
- We showcase the potential of this method by applying it to compare the effect of different tasks, of different models, of temperature, and of fine-tuning on the properties of multi-turn interactions.
- We find several robust effects, such as the fact that less constrained tasks lead to stronger attractors, that some properties possess stronger attractors than others, and that fine-tuning can shift the position and modify the strength of attractors.
The height of the bars represents the position (top row) and strength (bottom row) of theoretical attractors, for each property (columns), task, and model. Less constrained tasks, such as Continue, appear to produce stronger attractors than more constrained tasks, such as Rephrase. Attractors appear to be stronger for toxicity than for length. Finally, the position of attractors appears to vary between models.
These findings highlight the importance of accounting for multi-step transmission dynamics and represent a first step towards a more comprehensive understanding of LLM cultural dynamics.
This work was presented in a 15-minute talk at the 2024 Cultural Evolution Society conference, and was accepted as a conference paper at the International Conference on Learning Representations 2025 (ICLR 2025) 144. The code is available here. We also created a website featuring a Data Explorer tool, which allows direct inspection of the texts generated during our experiments.
8.2.3 Recursive Training Loops in LLMs: How training data properties modulate distribution shift in generated data?
Participants: Grgur Kovač [correspondant], Jérémy Perez [correspondant], Remy Portelas, Peter Ford Dominey, Pierre-Yves Oudeyer.
Large language models (LLMs) are increasingly used in the creation of online content, creating feedback loops as subsequent generations of models will be trained on this synthetic data. Such loops were shown to lead to distribution shifts - models misrepresenting the true underlying distributions of human data (also called model collapse). However, how human data properties affect such shifts remains poorly understood. In this paper, we provide the first empirical examination of the effect of such properties on the outcome of recursive training. We first confirm that using different human datasets leads to distribution shifts of different magnitudes. Through exhaustive manipulation of dataset properties combined with regression analyses, we then identify a set of properties associated with distribution shift magnitudes. Lexical diversity is found to amplify these shifts, while semantic diversity and data quality mitigate them. Furthermore, we find that these influences are highly modular: data scraped from a given internet domain has little influence on the content generated for another domain. Finally, experiments on political bias reveal that human data properties affect whether the initial bias will be amplified or reduced. Overall, our results portray a novel view, where different parts of the internet may undergo different types of distribution shift.
The main contributions of this work are:
- We propose and experimentally confirm the hypothesis that different training datasets lead to different distribution shift dynamics, motivating an investigation on the underlying causes.
- Through an extensive set of experiments (four datasets over three domains), we outline several data properties as influencing distribution shift dynamics.
- We reveal that these influences are highly modular, with generated content being mostly influenced by human data properties from the same domain.
- We find that distribution shifts also occur in terms of political lean, and that the type of shift (bias amplification, reduction or inversion) depends on the political lean of the human data.
Iterative chain. In each generation, a fresh base model is fine-tuned on texts sampled from the accumulated data pool (except generation 0, where it is trained only on human posts). The model generates posts, which are added to the pool alongside some newly sampled human posts.
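The structure of this iterative chain can be sketched with a toy stand-in. Here each "post" is reduced to a single number (for instance one text property), and "fine-tuning a fresh base model" is replaced by fitting a Gaussian from scratch to the training sample. These substitutions, along with all sample sizes and function names, are assumptions made for illustration only; no language model is involved.

```python
import random
import statistics


def run_chain(human_posts, generations=5, train_size=200,
              n_generated=100, n_new_human=100, seed=0):
    rng = random.Random(seed)
    pool = []      # accumulated data pool (generated + human posts)
    history = []   # fitted (mean, std) of the model at each generation
    for generation in range(generations):
        if generation == 0:
            # Generation 0 is trained on human posts only.
            train = rng.sample(human_posts, train_size)
        else:
            train = rng.sample(pool, min(train_size, len(pool)))
        # Stand-in for fine-tuning a fresh base model: refit from scratch.
        mean, std = statistics.fmean(train), statistics.stdev(train)
        generated = [rng.gauss(mean, std) for _ in range(n_generated)]
        # Generated posts join the pool alongside newly sampled human posts.
        pool.extend(generated)
        pool.extend(rng.sample(human_posts, n_new_human))
        history.append((mean, std))
    return history, pool


human_rng = random.Random(1)
human_posts = [human_rng.gauss(0.0, 1.0) for _ in range(1000)]
history, pool = run_chain(human_posts)
for generation, (mean, std) in enumerate(history):
    print(generation, round(mean, 3), round(std, 3))
```

Tracking how `history` drifts away from the human distribution across generations is the toy analogue of measuring distribution shift; varying the properties of `human_posts` corresponds to the dataset manipulations described above.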
This work was published as a conference paper at the EMNLP2025 conference 49.
8.2.4 Intrinsic motivation is key to understanding peer cultures
Participants: Jérémy Perez [correspondant], Maxime Derex, Pierre-Yves Oudeyer, Clément Moulin-Frier.
This paper 64 is a commentary on 123, as part of the call for open-peer commentary on this target article in Behavioral and Brain Sciences, 1–68. In the target paper, the authors make an intriguing case that peer cultures could play a key role in cultural adaptation by generating qualitatively different cultural variation compared to adult cultures. However, the mechanisms responsible for this distinction remain unclear. In our commentary, we discuss how accounting for the role of intrinsic motivation in shaping the content of peer cultures may help explain their evolutionary dynamics.
8.2.5 Cultural variation and regularities in intrinsically motivated exploration: investigating autonomous goal selection in BaYaka foragers and Bandongo fisher-farmers
Participants: Jérémy Perez [correspondant], Sarah Pope-Caldwell, Sheina Lew-Levy, Pierre-Yves Oudeyer, Maxime Derex, Clément Moulin-Frier.
TLDR: This study investigates how recent performance and recent progress influence autonomous goal selection in children and adults from two cultural groups in the Congo Basin. All data necessary for this project has been collected during Jérémy Perez's mission in Congo in July-August 2025, and analyses are still ongoing. This project was made possible through a collaboration with Sheina Lew-Levy from Durham University and Sarah Pope-Caldwell from Georgia State University. An abstract for this project will be submitted to the 2026 conference of the European Human Behaviour & Evolution Association.
Objective:
By influencing which goals individuals set for themselves, intrinsic motivation plays a central role in structuring autonomous learning trajectories. Grounded in theoretical work, recent empirical studies have uncovered the features that make an activity intrinsically motivating. For instance, having experienced recent progress towards a goal was found to influence the probability of selecting it, reflecting curiosity-driven exploration. However, these studies have exclusively focused on humans from Western cultures. The role of the cultural environment in determining the strategies used during intrinsically-motivated goal exploration thus remains unclear.
Method:
In the present study, we investigated how recent performance and recent progress influence autonomous goal selection in 60 Congolese BaYaka foragers (30 children, 30 adults) and 57 Bandongo fisher-farmers (29 children, 28 adults). To do so, we adapted the free-choice paradigm used in the previous studies, in which participants are free to select, and switch between, learning activities of different difficulties. Pre-registered analyses were used to uncover how recent performance and recent progress predict activity choices.
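To make the two predictors concrete, the sketch below computes them from a participant's trial history on one activity. The exact definitions used in the pre-registered analyses are not given here, so the ones below are assumptions: recent performance is the success rate over a sliding window, and recent progress is the change in success rate between the older and newer halves of that window. The window size is also hypothetical.

```python
def recent_performance(outcomes, window=10):
    """Success rate over the last `window` trials (1 = success, 0 = failure)."""
    recent = outcomes[-window:]
    return sum(recent) / len(recent) if recent else 0.0


def recent_progress(outcomes, window=10):
    """Change in success rate between the two halves of the recent window."""
    recent = outcomes[-window:]
    if len(recent) < 2:
        return 0.0  # not enough trials to estimate any change
    half = len(recent) // 2
    older, newer = recent[:half], recent[half:]
    return sum(newer) / len(newer) - sum(older) / len(older)


# Hypothetical history on one activity: the participant is improving.
outcomes = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
performance = recent_performance(outcomes)
progress = recent_progress(outcomes)
print(performance, progress)
```

An activity can thus have moderate recent performance but high recent progress, which is the situation where a progress-driven strategy and a performance-driven strategy would make different choices.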
Preliminary results:
Preliminary results indicate that the strategies used by participants in the present study are qualitatively similar to those previously observed in Western participants. Specifically, many Bandongo and BaYaka participants rely on recent progress to guide their activity choices. However, clear cross-cultural differences exist: for instance, recent performance had a greater influence on goal choices in Bandongo participants than in BaYaka participants. Our results also indicate noticeable heterogeneity within cultural groups with respect to the strategies guiding self-directed learning.
Conclusion:
By taking a cross-cultural perspective on intrinsic motivation, this study highlights the role of the cultural niche in shaping the mechanisms underlying self-directed learning, and contributes to building a more representative picture of human curiosity-driven exploration.
8.2.6 The cultural evolution of human goals: How individuals generate, select, and transmit goals
Participants: Jérémy Perez [correspondant], Cédric Colas, Gaia Molinaro, Pierre-Yves Oudeyer, Maxime Derex, Clément Moulin-Frier.
This work has been submitted to the special issue on Goal Dynamics in Cognition of the journal Topics in Cognitive Science, and is currently under review.
Abstract:
Humans pursue goals that are remarkably diverse and vary over time and cultures. These goals shape which behaviors are explored, valued, and socially transmitted, yet most theories of cultural evolution focus on how behaviors evolve while leaving the origins of goals unexamined. We argue that a complete understanding of cultural evolution requires explaining how goals themselves emerge, vary, and persist across generations. Building on studies of motivation and curiosity in cognitive science and artificial intelligence, we introduce the notion of cultural autotelic agents—individuals who actively generate, select, and transmit their own goals within social environments. By highlighting the cognitive and motivational mechanisms that drive goal formation and selection, this framework extends existing models of cultural evolution and helps explain the open-ended, self-propelling character of human culture.
We introduce the notion of cultural autotelic agents, i.e. agents that combine individual and social learning to represent, generate, select, and transmit their own goals. This model departs from the historical conceptualization of agents as problem-solvers (left column). This standard perspective focuses on how agents optimize behaviors toward goals that are externally imposed. This view is largely present in research on individual cognition (top-left) and has inspired most experimental paradigms in cultural evolution (bottom-left, e.g. transmission chains). Research on motivation, developmental psychology, and developmental artificial intelligence has extended this view toward the concept of autotelic agents, i.e. agents able to self-generate and pursue their own goals (top-right). This conceptualization affords a more complete understanding of proactively exploratory behaviors in humans, in particular how their past behavior influences their goal generation and selection mechanisms. Here, we propose integrating such insights from cultural evolution and autotelic learning to introduce the concept of cultural autotelic agents (bottom-right). Under this view, agents are active in the generation and selection of the goals they pursue. These goal generation and selection mechanisms are influenced both by social information and by individually collected information. We argue that this conceptualization is necessary to think about the cultural evolutionary dynamics of goals.
8.2.7 Evolving Interaction Protocols for Open-Ended Collective Innovation
Participants: Akhi Mocherla, Jérémy Perez, Eleni Nisioti, Cédric Colas.
In exploratory domains such as science, art, and design, progress emerges not from achieving predefined objectives but from accumulating novel and meaningful discoveries 140. Lab and field studies of human collective innovation have shown that a group's exploration and, thus, innovation abilities critically depend on how individuals communicate with each other 94, 130, 124. For example, increasing group connectivity speeds up innovation in the short term but reduces diversity within the collective, negatively impacting long-term innovation. Partially connected groups thus accumulate the most innovations in deceptive search spaces. Computational studies have confirmed this in groups of evolving agents 121, 78, reinforcement learning agents 137, and Large Language Models (LLMs) 138, highlighting the key role of collective dynamics in engineered multi-agent systems. Despite this, systematic approaches for optimising how groups interact remain underdeveloped. In this work, we propose an approach for designing interaction protocols (IPs) that govern who communicates with whom, what is communicated, and when. Similarly to past computational studies 137, 138, we use the text-based game Little Alchemy 2 (LA2) as a test-bed of collective innovation. To explore the IP space systematically, we employ a Quality-Diversity (QD) algorithm 152 that discovers repertoires with high performance and behavioral diversity. We maintain an archive of IPs, each evaluated via multiple trials. Similarly to previous works 100, our approach follows the Novelty Search with Local Competition (NS-LC) paradigm and employs LLMs within QD for solution generation and novelty estimation.
This work has been presented as a poster at the 2025 workshop on Intrinsically Motivated Open-ended Learning (IMOL 2025).
Overview of the framework used to evolve Interaction Protocols for groups of agents playing the Little Alchemy 2 game. The system iteratively generates new IPs using a language model (LLM), evaluates their performance, and maintains an archive of candidate solutions. Candidate IPs are debugged and tested, then evaluated for fitness and novelty relative to archived solutions. The archive is updated based on fitness, and novel or improved protocols are used to guide further LLM generations, enabling continual improvement and diversity in discovered solutions. Comparison of the performance of an evolved IP to dynamic and fully-connected IPs from past studies.
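The archive loop described above can be sketched in a strongly simplified form. The sketch is not the system itself: an interaction protocol is reduced to a pair of numbers in [0, 1] (a hypothetical encoding), LLM-based generation is replaced by Gaussian mutation, LLM-based novelty estimation by Euclidean distance to the nearest archived behaviors, the fitness function is a toy, and local competition is reduced to comparing a candidate with its single nearest archived neighbor.

```python
import math
import random


def nearest(archive, behavior, k):
    """The k archived entries closest to `behavior`."""
    return sorted(archive,
                  key=lambda e: math.dist(e["behavior"], behavior))[:k]


def novelty(archive, behavior, k=3):
    """Mean distance to the k nearest archived behaviors."""
    near = nearest(archive, behavior, k)
    if not near:
        return float("inf")  # anything is novel for an empty archive
    return sum(math.dist(e["behavior"], behavior) for e in near) / len(near)


def evaluate(ip, rng, trials=3):
    """Toy fitness averaged over several noisy trials (assumed shape)."""
    scores = [1.0 - abs(ip[0] - 0.5) + rng.gauss(0.0, 0.01)
              for _ in range(trials)]
    return sum(scores) / trials


def evolve(iterations=50, novelty_threshold=0.1, seed=0):
    rng = random.Random(seed)
    archive = []
    for _ in range(iterations):
        if archive:
            # Stand-in for LLM generation: mutate an archived protocol.
            parent = rng.choice(archive)["ip"]
            ip = tuple(min(1.0, max(0.0, x + rng.gauss(0.0, 0.2)))
                       for x in parent)
        else:
            ip = (rng.random(), rng.random())
        fitness, behavior = evaluate(ip, rng), ip
        if novelty(archive, behavior) > novelty_threshold:
            # Novel enough: add it as a new entry.
            archive.append({"ip": ip, "behavior": behavior,
                            "fitness": fitness})
        else:
            # Local competition: replace the nearest entry only if better.
            rival = nearest(archive, behavior, 1)[0]
            if fitness > rival["fitness"]:
                rival.update(ip=ip, behavior=behavior, fitness=fitness)
    return archive


archive = evolve()
best = max(archive, key=lambda e: e["fitness"])
print(len(archive), best["ip"], round(best["fitness"], 3))
```

The design point the sketch preserves is that candidates compete only against behaviorally similar protocols, so the archive keeps diverse protocols rather than converging on a single high-fitness one.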
8.2.8 Inferring the Phylogeny of Large Language Models and Predicting their Performances in Benchmarks
Participants: Nicolas Yax [correspondant], Pierre-Yves Oudeyer, Stefano Palminteri.
In recent months, the number of Large Language Models (LLMs) released has reached an all-time high. On the one hand, multiple private companies such as OpenAI, Anthropic, Google and Mistral are making cutting-edge models that have a lot of visibility in society and science. However, as the number of LLMs rises, training methods are becoming more secretive, making the field increasingly opaque to science. On the other hand, every day, a few hundred open-access language models are uploaded to the Hugging Face Hub, which is far too many to keep track of the evolution of LLMs in the field. Given that not all of these open models are fully transparent about their training methods and that very few of them are benchmarked (due to the high cost of benchmarking), there is an increasing need for methods that help keep track of the progress and evolution of these models.
We developed an algorithm, named PhyloLM, inspired by phylogenetics, to compute evolutionary trees of LLMs. We show that this method is effective in reconstructing the evolutionary history of LLMs within families 22, in discriminating between families, and in finding similarities between these families. Additionally, the genetic information can be used to predict LLM capabilities such as benchmark scores, with a highly significant correlation between predicted and true scores. These advances could be instrumental in navigating the field of LLMs by making it more transparent at a very low cost. This work was published at ICLR 2025.
Phylogenetic tree reconstruction. One panel shows the ground truth relations between some LLMs of the Mistral family; the other shows the tree reconstructed by PhyloLM when run on the five latest models of this family (the "leaves" of the phylogenetic tree). The numerical labels (0:3) map the true common ancestors ("ground truth") to the inferred ones ("reconstructed"). The true and the reconstructed trees are topologically equivalent.
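As an illustration of how a phylogenetics-style distance between language models could be computed, the sketch below treats each prompt as a "gene" and each possible next token as an "allele", then applies Nei's standard genetic distance to the resulting frequency tables. Whether this matches the exact PhyloLM definition is an assumption, and the three toy models and their token probabilities are invented for the example.

```python
import math


def nei_distance(p, q):
    """Nei's standard genetic distance between two populations.

    p and q map each gene (here: a prompt) to a dict of allele
    frequencies (here: next-token probabilities).
    """
    genes = p.keys() & q.keys()
    j_pq = sum(p[g].get(a, 0.0) * q[g].get(a, 0.0)
               for g in genes for a in p[g])
    j_p = sum(v * v for g in genes for v in p[g].values())
    j_q = sum(v * v for g in genes for v in q[g].values())
    if j_pq == 0.0:
        return float("inf")  # no shared alleles at all
    return -math.log(j_pq / math.sqrt(j_p * j_q))


# Invented token frequencies for three hypothetical models.
models = {
    "base": {"prompt A": {"the": 0.7, "a": 0.3},
             "prompt B": {"cat": 0.5, "dog": 0.5}},
    "finetune": {"prompt A": {"the": 0.6, "a": 0.4},
                 "prompt B": {"cat": 0.6, "dog": 0.4}},
    "other": {"prompt A": {"one": 0.9, "the": 0.1},
              "prompt B": {"bird": 0.8, "cat": 0.2}},
}

names = list(models)
distances = {(m, n): nei_distance(models[m], models[n])
             for m in names for n in names}
for m in names:
    print(m, [round(distances[(m, n)], 3) for n in names])
```

A distance matrix of this kind is what a standard tree-building procedure (for example neighbor joining) takes as input; in the toy data the fine-tuned model stays much closer to its base than the unrelated model does.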
8.3 An Eco-Evo-Devo perspective on Artificial Intelligence
8.3.1 Research perspective: The Ecology of Open-Ended skill Acquisition
Participants: Clément Moulin-Frier [correspondant], Eleni Nisioti, Pierre-Yves Oudeyer.
An intriguing feature of the human species is our ability to continuously invent new problems and to proactively acquire new skills in order to solve them: what is called Open-Ended Skill Acquisition (OESA). Understanding the mechanisms underlying OESA is an important scientific challenge in both cognitive science (e.g. by studying infant cognitive development) and in artificial intelligence (aiming at computational architectures capable of open-ended learning). Both fields, however, mostly focus on cognitive and social mechanisms at the scale of an individual’s life. It is rarely acknowledged that OESA, an ability that is fundamentally related to the characteristics of human intelligence, has necessarily been shaped by ecological, evolutionary and cultural mechanisms interacting at multiple spatiotemporal scales.
The ORIGINS framework identifies central components (boxes) and their interactions (arrows) driving Open-Ended Skill Acquisition, both in terms of its evolution from environmental complexity (roughly: left to right arrows) as well its open-ended aspect through feedback mechanisms (right to left arrows). The employed terminology reflects a diversity of mechanisms considered in both Artificial Intelligence and Human Behavioral Ecology.
We have recently initiated a new research direction aiming at understanding, modeling and simulating the dynamics of OESA in artificial systems, grounded in theories studying its eco-evolutionary bases in the human species. To this end, we have proposed a conceptual framework, called ORIGINS (illustrated in Fig. 23 and developed in 131), expressing the complex interactions between environmental, adaptive, multi-agent and cultural dynamics. This framework raises three main research questions:
- What are the ecological conditions favoring the evolution of autotelic agents?
- How to bootstrap the formation of a cultural repertoire in populations of adaptive agents?
- What is the role of cultural feedback effects in the open-ended dynamics of human skill acquisition?
The contributions described below address some aspects of these research questions. Note that there might be a thematic overlap between the last two research questions outlined above and the previous section on Models of Cultural Evolution 8.2, where we also present related results.
8.3.2 Eco-evolutionary Dynamics of Non-episodic Neuroevolution in Large Multi-agent Environments
Participants: Gautier Hamon [correspondant], Eleni Nisioti, Clément Moulin-Frier.
This work was published in 2023, but we keep it in this report as it introduces a general computational framework, called non-episodic neuroevolution, that forms the basis of the next two contributions.
This contribution focuses on eco-evolutionary dynamics where "organisms are not solely products but, by modifying their niche and therefore its associated fitness landscape, are also causes of evolution" 120. The main objective of this paper is to propose a method for studying large-scale eco-evolutionary dynamics in agent-based simulations with a reasonable level of biological and ecological plausibility. For this aim, we implement a system with the following properties (see Fig. 24 for illustration):
- Non-episodic simulation environment with complex intrinsic dynamics. We model our environment after common-pool resource (CPR) appropriation problems, where a group of agents competes for finite resources. We extend an existing environment of CPR appropriation 145 with the presence of multiple niches, where resources regrow proportionally to the density of nearby resources at different rates in different regions of the environment (Fig 24). We prevent any environment or population reset during a whole simulation run, enabling coupled environmental and population dynamics leading to complex eco-evolutionary feedback effects.
- Continuous neuroevolution in a large, size-varying agent population. The environment contains thousands of agents, each controlled by a neural network whose weights are optimized using neuroevolution 161.
- Physiology-driven death and reproduction. There is no notion of reward; agents are instead equipped with a physiological system that modulates their energy level according to the resources they consume, in a non-linear way. At the evolutionary scale, agents reproduce as long as they are able to maintain their energy level within a reasonable range and die if this level goes below a minimum threshold. This is a departure from the notion of fitness-based selection and is more in line with minimal criterion selection 76. Note that the population size can vary with time.
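The three properties listed above can be illustrated with a minimal sketch. It is not the published implementation: the thresholds, the linear energy update (the report describes a non-linear one), the per-row regrowth rates standing in for the latitudinal model, and all names (`regrow`, `life_cycle`, `Agent`) are assumptions chosen for readability.

```python
import random
from dataclasses import dataclass

# Hypothetical illustration values, not taken from the paper.
E_DEATH = 0.0         # an agent dies when its energy falls to or below this
E_REPRO = 5.0         # an agent reproduces when its energy reaches this
METABOLIC_COST = 0.1  # energy paid at every step
FOOD_GAIN = 1.0       # energy gained per resource eaten (linear here)
MUTATION_STD = 0.02   # std of Gaussian noise added to offspring weights


@dataclass
class Agent:
    weights: list        # flat list of neural-network weights
    energy: float = 2.0


def regrow(grid, rates, spontaneous=1e-4, rng=random):
    """Fill empty cells with a probability proportional to the number of
    neighbouring resource cells, scaled by a per-row (latitude) rate."""
    h, w = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for y in range(h):
        for x in range(w):
            if grid[y][x]:
                continue
            neighbours = sum(
                grid[ny][nx]
                for ny in (y - 1, y, y + 1)
                for nx in (x - 1, x, x + 1)
                if (ny, nx) != (y, x) and 0 <= ny < h and 0 <= nx < w
            )
            p = min(1.0, rates[y] * neighbours / 8 + spontaneous)
            if rng.random() < p:
                new[y][x] = 1
    return new


def life_cycle(population, eaten, rng=random):
    """One physiological update without any fitness function: agents pay a
    metabolic cost, gain energy from food, then die or reproduce."""
    next_pop = []
    for agent, food in zip(population, eaten):
        agent.energy += FOOD_GAIN * food - METABOLIC_COST
        if agent.energy <= E_DEATH:
            continue  # death: the agent is simply removed
        next_pop.append(agent)
        if agent.energy >= E_REPRO:
            agent.energy /= 2  # parent shares its energy with the child
            child = [w + rng.gauss(0, MUTATION_STD) for w in agent.weights]
            next_pop.append(Agent(child, energy=agent.energy))
    return next_pop
```

Because neither function resets the grid or the population, calling them in a loop gives a population size that rises and falls with resource availability, which is the coupling the non-episodic setting is meant to expose.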
Our simulation environment (Left) is an extension of the Common Pool Resource (CPR) environment 145, 122: a two-dimensional grid-world where some cells contain resources (in green) that the agents (in black) can collect. Resources grow depending on the presence of other resources around them (local growth, Middle) with an additional very sparse spontaneous growth, which means that over-consumption may lead to their local depletion. We introduce a latitudinal model of resource regrowth. We prevent any environment and population reset during a whole simulation, enabling continual eco-evolutionary dynamics to take place. Each agent may reproduce or die according to a physiological model modulating its energy level as a function of life time and resource consumption (Top-Right). The population size varies during the simulation according to the current amount of available resources and the current ability of agents to collect them. Evolution occurs through the mutation of a parent's network weights when it produces an offspring.
In addition to experiments conducted in the large environment presented above, we also conduct experiments in "lab environments" (as opposed to the "natural environment") to isolate the study of certain behaviors (which are often intertwined with many other dynamics in the natural environment).
One interesting result of these simulations is the emergence of sustainable foragers which, as shown in the lab environment of Fig. 25, tend not to overconsume when there are enough resources in their neighbourhood. This leaves a certain amount of resources free to spread, which is beneficial for their future survival as well as the survival of their offspring (as there is no reset of the environment).
Greediness of a sustainable forager agent across evaluation environments that differ in the amount of resources. Sustainable agents are far less greedy in environments where a certain amount of resources is available. This strategy preserves resources so that they can spread, avoiding their overdepletion.
This work was published at the Genetic and Evolutionary Computation Conference (GECCO) 2023. The computational framework it introduced led to the next two, more recent contributions.
8.3.3 Emergent kin selection of altruistic feeding via non-episodic neuroevolution
Participants: Max Taylor-Davies, Gautier Hamon, Timothé Boulet, Clément Moulin-Frier [correspondant].
This work extends the project presented in the previous contribution (Sec. 8.3.2). It results from a visit to the team by Max Taylor-Davies, a PhD student at the School of Informatics, University of Edinburgh, Scotland. It has been accepted at the International Conference on the Applications of Evolutionary Computation (part of EvoStar) 55.
At first glance, it seems difficult to square the phenomenon of purely altruistic behaviour (acts which confer a benefit to the recipient at a cost to the actor) with the basic principle of natural selection: how can a gene be selected for when it decreases, rather than increases, the fitness of its host? One plausible account can be made through the theory of inclusive fitness. Key to this theory is the recognition that individual organisms within a social environment are not isolated from their conspecifics in terms of fitness. Whether a given gene is selected for is thus determined by its effect(s) on the fitness of any bearers of copies of that gene. Under this view, we can think of an altruistic act as an exchange of fitness from one agent to another. If the exchange is positive-sum and both sides are bearers of the gene in question, then from the gene's perspective the behaviour confers a fitness benefit–even while it decreases the fitness of the acting individual.
Kin selection theory 160 has proven to be a popular and widely accepted account of how altruistic behaviour can evolve under natural selection. Hamilton's rule, first published in 1964 107, 108, has since been experimentally validated across a range of different species and social behaviours. In contrast to this large body of work in natural populations, however, there has been relatively little study of kin selection in silico. In the current work, we offer what is to our knowledge the first demonstration of kin selection emerging naturally within a population of agents undergoing continuous neuroevolution. Specifically, we find that zero-sum transfer of resources from parents to their infant offspring evolves through kin selection in environments where it is hard for offspring to survive alone. In an additional experiment, we show that kin selection in our simulations relies on a combination of kin recognition and population viscosity. We believe that our work may contribute to the understanding of kin selection in minimal evolutionary systems, without explicit notions of genes and fitness maximisation.
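For readers unfamiliar with Hamilton's rule, it states that an altruistic trait is favoured by selection when r·b > c, where r is the genetic relatedness between actor and recipient, b the fitness benefit to the recipient and c the fitness cost to the actor. The sketch below only encodes this inequality; the function name and the numerical values are illustrative assumptions and are not taken from the paper. Note that a transfer can be zero-sum in resources while still satisfying the rule in fitness terms, for instance when the recipient would hardly survive without it.

```python
def hamilton_favours_altruism(relatedness: float, benefit: float, cost: float) -> bool:
    """Return True when Hamilton's rule r * b > c holds.

    relatedness: genetic relatedness r between actor and recipient
    benefit:     fitness benefit b conferred on the recipient
    cost:        fitness cost c paid by the actor
    """
    return relatedness * benefit > cost


# Hypothetical values: a large benefit to a close relative satisfies the rule,
# whereas an equal benefit and cost does not unless relatedness exceeds 1.
print(hamilton_favours_altruism(0.5, 3.0, 1.0))  # True
print(hamilton_favours_altruism(0.5, 1.0, 1.0))  # False
```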
The relationship between the estimated benefit to infants of being fed and both the amount and selectivity of feeding observed, shown separately for each of the three experimental parameters we varied (and combined in the rightmost column). Each scatterplot point represents a single 500k-timestep simulation run (with values averaged over the final 50k timesteps); regression lines (with 95% confidence intervals) are shown in green. Note that the -axis shows (measure) for both amount and selectivity.
This paper was accepted at The International Conference on the Applications of Evolutionary Computation (EvoAPPS) 2025 (part of EvoStar).
8.3.4 Evolving large populations of adaptive neural agents in ecologically plausible environments
Participants: Timothé Boulet [correspondant], Gautier Hamon, Clément Moulin-Frier.
This work continues the project presented in the previous paragraph, with a focus on the ability of agents to develop adaptive behaviors. Specifically, we extend the framework by adding fruits, a spatially variable resource, and by giving agents a memory of the value of each type of fruit. The goal is to observe whether agents manage to exploit their knowledge of fruit values to decide which fruit to exploit.
Results: the agents were able to exploit the fruit value information to optimize their behavior. There were also some results that we did not necessarily expect and that stem from our choice of model. Notably, it seems that an agent's choice to exploit a cluster is heavily influenced by social criteria (the number of agents already exploiting it) and cultural criteria (whether the cluster is empty or full of fruits). This effect exceeds the adaptability effect in the latest stages of the simulations.
8.4 Theories and experiments on human curiosity-driven learning
8.4.1 DevCur Project: studying the co-development of curiosity, metacognition and agency in adolescents
Participants: Julien Rosenberger [correspondent], Pierre-Yves Oudeyer, Hélène Sauzéon.
Under the scope of the DevCur project, the PhD of Julien Rosenberger was started on the following topic: “How curiosity enhances learning across childhood and adolescence: Models and experimentation of the role of metacognition and agency”. After a review of the literature, a specific project was defined that aims to compare the personality constructs surrounding the intellect. The investigated personality constructs are metacognitive skills (e.g., 148), curiosity traits (e.g., 115), sense of agency (e.g., 162) and intellectual humility (e.g., 73). Intellectual humility is about correctly appraising one’s cognitive limitations 169. The self-report scale of Alfano et al. 73 grounds intellectual humility in other intellectual traits: open-mindedness (recognizing one’s cognitive limitations and having appetite for knowledge without concerns for social status), intellectual modesty (having low concern for being deemed smart), engagement (being able to confront what one doesn’t understand or what differs from one’s perspective) and corrigibility (being emotionally stable when one is intellectually challenged). The labels are slightly odd but emphasize the diversity of intellectual traits one could consider in learning.
A first axis is to understand the organization of those constructs. For instance, intellectual humility has been linked to greater general knowledge and a tendency to underestimate one’s cognitive ability 119. Those dependent variables are also related, respectively, to curiosity trait 167 and to low self-esteem and low metacognition 165. A second axis is to obtain behavioral markers of those constructs. This need for situationally-bounded measures is crucial for intellectual humility 164, which is currently measured through self-reports or other-reports. Yet self-reports pose an issue: being intellectually humble is socially desirable 119, self-reporting requires some recall to form that self-referenced attribute, and it faces the paradox of self-attribution (i.e., some humility is required to say whether one is humble). Other-reports, in turn, are resource-intensive and bring in other factors (context, relationship, etc.).
8.5 Generative AI and educational technologies
8.5.1 Investigating the use of LLMs in middle school
Participants: Pierre-Yves Oudeyer, Hélène Sauzéon [correspondant], Rania Abdelghani.
ChatGPT, one of the most widely used generative AI (LLM) tools, has made accessing mass and personalized information easy and straightforward, even for users without expertise in AI. More particularly, recent reports indicate that the majority of surveyed students aged nine and older have already used this tool for school-related tasks. However, while we know that students are using ChatGPT, there is limited understanding of how they use it and its effects on their learning processes and outcomes, particularly among middle and high school students and in subjects outside programming.
Investigating these patterns of use is a critical step toward identifying the necessary educational interventions to mitigate risks associated with misuse or harmful interactions with ChatGPT, which are particularly likely among non-expert users. To address this, we recruited 63 students aged 14 to 15 and asked them to solve science problems using ChatGPT. We examined their prompt choices, evaluations of ChatGPT's responses, and final problem-solving outcomes. Overall, our results indicate that students are still inefficient users of AI tools such as ChatGPT and are vulnerable to incorporating its misinformation, even when they report high domain knowledge and previous experience with generative AI. This highlights potential misconceptions about these tools’ capabilities and the skills required to use them effectively. Furthermore, domain knowledge alone appears insufficient to shield students from adopting misinformation generated by ChatGPT. Implementing formal educational interventions to correct these misconceptions and train students for informed usage thus seems both timely and essential, given the growing reliance on generative AI tools in education. In the longer term, fostering metacognitive skills may further promote responsible and effective use of such tools (paper in preparation).
8.5.2 Studying the impact of a pedagogical intervention on GenAI in middle school students
Participants: Pierre-Yves Oudeyer, Hélène Sauzéon, Olivier Clerc [correspondant], Chloé Desvaux, Rania Abdelghani, Eliott Poisson, Kan Yao, Didier Roy.
Context and Objective.
Generative AI (GenAI) systems such as ChatGPT are increasingly used by students, including for schoolwork. A pilot study conducted in 2024 by the Flowers team with 63 students aged 14–15 showed that students experience major difficulties in formulating effective prompts and in evaluating the quality of AI-generated answers, which negatively impacts their performance in scientific problem-solving tasks. Building on this work, we evaluated the impact of a short pedagogical intervention (2 hours) aimed at improving students’ ability to formulate and critically evaluate prompts before querying a large language model (LLM).
Task
Students had to solve six middle-school science problems using ChatGPT (or a similar system such as DuckDuckAI). Each problem included a statement, an image, a question, and a suggested prompt. Students could choose to use or ignore the suggested prompt. Two types of prompts were provided: valid prompts (clear context and precise instructions) and invalid prompts (insufficient or vague context).
Schematic representation of the science exercise proposed to the children. The experimental task consisted of six science exercises to be completed within 90 minutes. Exercises were provided on paper sheets to prevent students from directly copying and pasting the task description and accompanying image into the chatbot interface.
Pedagogical Intervention.
For the experimental group, the study took place in two phases. Two days before the task session, students participated in a two-hour classroom workshop designed to strengthen their theoretical and practical understanding of GenAI. The workshop consisted of three parts:
- an introduction explaining how generative AI systems work,
- a discussion of their limitations, risks, and biases,
- a practical session in which students trained to analyze and reformulate prompts.
Main Results.
Overall, students who benefited from the pedagogical intervention achieved higher performance than those in the control group. In quantitative terms, the mean score (out of 20) was approximately 10.3 in the control group—comparable to the 2024 pilot study—and approximately 11.4 in the experimental group. This difference is statistically significant (p < .05). The experimental group not only obtained significantly higher scores, but also demonstrated more strategic use of the AI system. In particular, they rejected invalid prompts more frequently and were more likely to reformulate or refine their queries when the initial answer was unsatisfactory. Moreover, formulating their own prompts tended to maintain or improve performance, even in cases where the suggested prompt was already valid. Importantly, self-reported prior knowledge about AI was not associated with better performance, suggesting that explicit instruction and practice played a more decisive role than familiarity with the technology alone.
Workshop effects on performance and prompt acceptance
Future Research.
Future work will aim to extend and consolidate these findings in several directions. First, longitudinal studies will be needed to assess whether the strategies acquired during the workshop are retained over time and whether students continue to apply them beyond the immediate post-intervention period. Second, the present study focused on science problem solving with middle-school students. Future research should examine whether similar short pedagogical interventions yield comparable benefits in other subjects (e.g., mathematics, history, writing) and with learners of different age groups.
8.5.3 LLM4Humanities: An Open-Source Toolkit for LLM-Assisted Qualitative Research
Participants: Olivier Clerc [correspondant], Grgur Kovač, Chloé Desvaux, Gaia Molinaro, Pierre-Yves Oudeyer.
Context and Objective.
Qualitative research in experimental psychology and the humanities often relies on manual annotation of textual data using defined codebooks. This process is indispensable but time-consuming and costly. Moreover, best practices require at least two independent annotators in order to compute inter-rater reliability (IRR), which further increases the required resources. IRR is crucial to distinguish variance due to coder subjectivity from variance due to the phenomenon under study, yet in practice it is frequently omitted, misreported, or computed using inadequate metrics (e.g., raw percentage agreement or simple correlations). The objective of the LLM4Humanities project is to design an open-source, Python-based toolkit and web application that leverages LLMs to support, accelerate, and improve the methodological rigor of qualitative annotation workflows.
System and Workflow.
LLM4Humanities provides an end-to-end pipeline combining manual annotation, automated classification, and statistical evaluation. In a typical workflow, researchers first manually annotate a small subset of the dataset. An LLM is then used to automatically classify the remaining data. The system subsequently compares the model’s predictions to the human-annotated subset using appropriate IRR metrics, confidence intervals, and decision guidance, allowing researchers to assess both annotation reliability and model performance.
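As an illustration of the comparison step, the sketch below computes Cohen's kappa between human and model labels together with a percentile bootstrap confidence interval. It is a generic example under stated assumptions and does not reproduce the LLM4Humanities API: the function names, the choice of kappa as the IRR metric, and the bootstrap settings are all hypothetical.

```python
import random
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of the two raters' marginal label frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label everywhere; kappa is formally
        # undefined, and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


def bootstrap_ci(a, b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for kappa (items resampled)."""
    rng = random.Random(seed)
    n = len(a)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(cohen_kappa([a[i] for i in idx], [b[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

Unlike raw percentage agreement, kappa discounts the agreement expected by chance, which is why the two raters in a heavily imbalanced codebook can agree on most items and still obtain a low score.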
Generation Mode.
In addition to annotation assistance, LLM4Humanities includes a generation mode designed to support the creation of experimental material. In this mode, users can select one or several template items (e.g., a mathematics exercise) and specify a set of constraints. The system then generates multiple new variants of the item. These generated items can subsequently be passed through the same annotation and evaluation pipeline, providing a first automated assessment of the quality and consistency of the generated content.
8.5.4 GAIMHE: Generative AI and Hybrid Models for Education
Participants: Pierre-Yves Oudeyer, Olivier Clerc, Hélène Sauzéon, EvidenceB.
Context and Objective.
Recent advances in generative AI have opened new possibilities for personalized education, but fully LLM-based educational systems raise major concerns in terms of cost, scalability, robustness, pedagogical control, and environmental impact. At the same time, classical Intelligent Tutoring Systems (ITS) offer strong pedagogical structure and efficiency, but lack flexibility for open-ended interaction and content generation. The GAIMHE project aims to design and evaluate hybrid educational architectures that combine the strengths of both approaches: frugal and pedagogically robust ITS for macro-level orchestration, and generative AI models (LLMs/SLMs) for micro-level personalization, feedback, and content generation. Beyond technical integration, the project also aims to structure an open ecosystem of methods, data, and benchmarks to support reproducible and scalable uses of generative AI in education.
Project Architecture
The proposed architecture is organized around two complementary modes of use of generative models. First, large banks of pedagogical exercises are pre-generated using large language models and then validated by human experts and orchestrated by structured teaching algorithms within ITS. This content is stored and reused in order to minimize live calls to large models during learner interactions. Second, smaller language models, or external APIs to larger proprietary models when needed, are used in real time to provide specific feedback. This design ensures personalized support while preserving computational efficiency, pedagogical control, and scalability.
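The two modes of use can be sketched as follows. This is a schematic reading of the architecture, not the project's code: the class and field names are hypothetical, the language models are represented by plain callables, and the rule for deciding when a larger model is "needed" (here, the small model returning `None`) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Exercise:
    skill: str
    text: str
    validated: bool = False  # set to True after human expert review


@dataclass
class HybridTutor:
    bank: list                                   # pre-generated exercises
    small_model: Callable[[str], Optional[str]]  # local model, may decline
    large_model: Callable[[str], str]            # external API, used sparingly
    live_large_calls: int = 0                    # tracks costly live calls

    def next_exercise(self, skill: str) -> Optional[Exercise]:
        """Mode 1: serve stored, expert-validated content (no live LLM call)."""
        for exercise in self.bank:
            if exercise.skill == skill and exercise.validated:
                return exercise
        return None

    def feedback(self, exercise: Exercise, answer: str) -> str:
        """Mode 2: real-time feedback, small model first, large model last."""
        prompt = (
            f"Exercise: {exercise.text}\n"
            f"Student answer: {answer}\n"
            "Give short feedback."
        )
        reply = self.small_model(prompt)
        if reply is None:  # assumed signal that the small model cannot answer
            self.live_large_calls += 1
            reply = self.large_model(prompt)
        return reply
```

Keeping a counter of live calls to the larger model makes the efficiency goal measurable: in this reading, a well-stocked exercise bank and a capable small model should keep that number close to zero.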
Data Generation and Evaluation Strategy.
A central component of the project concerns the large-scale generation, structuring, and validation of pedagogical datasets. This work relies on two existing software tools: Sphinx, an internal platform used for the annotation and creation of pedagogical content, and LLM4Humanities, an open-source toolkit providing similar functionalities through a Streamlit-based interface. In parallel, the project is developing unified data structures for representing exercises and real student learning trajectories collected from educational platforms, with the goal of sharing these resources as digital commons through open repositories such as GitHub and Hugging Face. We are also developing a web-based visualization platform for exploring learning trajectories and learner profiles, aimed at both researchers and non-technical stakeholders. A first prototype of this platform has already been implemented.
8.6 Curiosity-driven learning in educational technologies
Since 2019 (Idex cooperation fund between the University of Bordeaux and the University of Waterloo, Canada), and with the creation of the CuriousTECH associate team in 2022 (led by the Flowers team and involving F. Lotte from the Potioc team and M. Fernandes and E. Law from the University of Waterloo), we have continued our work on the development of new curiosity-driven interaction systems. Substantial progress has been made in this area of application of the FLOWERS team's work (see the website of the CuriousTECH team).
8.6.1 New digital approaches for studying curiosity-driven learning
Participants: Hélène Sauzéon [correspondant], Pierre-Yves Oudeyer [correspondant], Rania Abdelghani, Mehdi Alaimi, Fabien Lotte, Aurélien Appriou, Myra Fernandes, Edith Law, Yadurshana Sivashankar.
As curiosity is a recent research topic, we studied some basic mechanisms of curiosity-based learning through three completed studies.
The first one concerns a new interactive educational application to foster curiosity-driven question-asking in children. With the aim of improving children’s curiosity, we developed a new interactive system designed to foster curiosity-related question-asking from texts as well as children’s perception of curiosity. To assess its efficiency, we conducted a study with 95 fifth-grade students from Bordeaux elementary schools. Two types of interventions were designed, one focusing children on the construction of low-level (i.e. convergent) questions and one focusing them on high-level (i.e. divergent) questions, with the help of prompts or question-starter models. We observed that both interventions increased the number of divergent questions and the question fluency performance, while they did not significantly improve curiosity perception despite the high intrinsic motivation scores they elicited in children. The curiosity-trait score positively impacted the divergent question score under the divergent condition, but not under the convergent condition. Overall, the results support the efficiency and usefulness of digital applications for fostering children’s curiosity, which we need to explore further. These results are published in CHI'20 72. In parallel to these first experimental works, we wrote this year a review of the existing work on the subject 80.
The second study investigates the neurophysiological underpinnings of curiosity and the opportunities they offer for brain-computer interaction 74. Understanding the neurophysiological mechanisms underlying curiosity, and therefore being able to identify the curiosity level of a person, would provide useful information for researchers and designers in numerous fields such as neuroscience, psychology, and computer science. A first step towards uncovering the neural correlates of curiosity is to collect neurophysiological signals during states of curiosity, in order to develop signal processing and machine learning (ML) tools that distinguish curious states from non-curious ones. Thus, we ran an experiment in which we used electroencephalography (EEG) to measure the brain activity of participants as they were induced into states of curiosity, using trivia question and answer chains. We used two ML algorithms, i.e. Filter Bank Common Spatial Pattern (FBCSP) coupled with Linear Discriminant Analysis (LDA), as well as a Filter Bank Tangent Space Classifier (FBTSC), to classify the curious EEG signals from the non-curious ones. Global results indicate that both algorithms performed better in the 3-to-5 s time windows, suggesting an optimal time window length of 4 seconds for estimating curiosity states from EEG signals. These results have been published 74.
Using a virtual reality device, a third study investigates the role of intrinsic motivation in spatial learning in children 159. In this study, state curiosity is manipulated as a preference for a level of uncertainty during the exploration of new virtual environments. To this end, a series of virtual environments was created and presented to children. During encoding, participants explore routes in environments according to three levels of uncertainty (low, medium, and high), using a virtual reality headset and controllers, and are later asked to retrace the routes they travelled. The exploration area and the wayfinding score, i.e. the route overlap between the encoding and retrieval phases (an indicator of spatial memory accuracy), are measured. Neuropsychological tests are also performed. The results showed better performance under the medium uncertainty condition in terms of exploration area and wayfinding score. These first results support the idea that curiosity states are a learning booster. In the study by Sivashankar et al., 10-year-old children (20 females; 22 males) with low to high trait curiosity actively explored virtual environments (Fig. 29) containing varying levels of uncertainty (low, medium, high) (Fig. 30), after which memory for the route travelled was assessed 159.
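One simple way to quantify a route overlap between encoding and retrieval is sketched below. The exact scoring used in the study is not described here, so this is only an assumed formulation: routes are taken to be sequences of discretized grid cells, and the score is the fraction of encoded cells that the participant revisits when retracing. The function name is hypothetical.

```python
def route_overlap(encoded, retraced):
    """Fraction of distinct cells of the encoded route that also appear in
    the retraced route (1.0 means the whole route was recovered)."""
    encoded_cells = set(encoded)
    if not encoded_cells:
        raise ValueError("the encoded route must contain at least one cell")
    return len(encoded_cells & set(retraced)) / len(encoded_cells)


# Hypothetical routes on a grid: two of the four encoded cells are revisited.
encoded = [(0, 0), (0, 1), (1, 1), (2, 1)]
retraced = [(0, 0), (0, 1), (0, 2), (1, 2)]
print(route_overlap(encoded, retraced))  # 0.5
```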
First-person view and bird’s-eye view of the three styles of virtual environments. Participants only experienced the environments from a first-person perspective.
From left to right: Condition 1 with Low Uncertainty (1 character); Condition 2 with Medium Uncertainty (3 characters); and Condition 3 with High uncertainty (7 characters)
As trait curiosity increased (Fig. 31), so did memory performance in the high uncertainty condition, suggesting that children with high levels of curiosity can better recruit cognitive resources within such environments. Children with high compared to low curiosity also had higher feelings of presence during the immersive experience. Importantly, in environments with medium uncertainty, children with low trait curiosity were able to perform as well as those with high curiosity. Results suggest that individual differences in trait curiosity influence route memory in environments with varying levels of uncertainty.
Route Memory Score (black circles) and Intrinsic Motivation Score (white circles) in Low-and High-curiosity Groups as a Function of the Three Uncertainty Conditions (Low, Medium and High)
8.6.2 Fostering curiosity and metacognition in classrooms
Participants: Pierre-Yves Oudeyer, Hélène Sauzéon [correspondant], Rania Abdelghani, Chloé Desvaux.
Promoting curiosity by supporting divergent thinking
Previous work aimed to propose new educational technologies driven by epistemic curiosity. A central question of this work was to specify the impact of self-questioning aroused by states of curiosity (i.e., the identification of knowledge gaps and formulation of learning goals) on student performance. To this end, a web platform called "Kids Ask" was designed, developed, and tested in primary schools. The tool offered an interaction with a conversational agent that trained children's abilities to generate curiosity-driven questions and use these questions to explore a learning environment and acquire new knowledge. Results from this study suggested that the configuration helped enhance children's questioning and exploratory behaviors; they also showed that learning progress differences in children can be explained by differences in their curiosity-driven behaviors 69.
Illustration of KidsAsk application interface
The ability to formulate curiosity-driven questions (i.e., new learning goals) likely relies upon divergent thinking mechanisms, as suggested by literature highlighting links between curiosity and creativity 117, 156. In this regard, a novel version of the Kids Ask training was proposed and tested in a field study involving a total of 130 children aged 9 to 11 years. These experiments aimed to further assess the interplay between curiosity and creativity in question-asking behaviors. Drawing from creativity literature, we examined the process of question formulation through associative thinking involved in creativity. To do so, the conversational agent's behavior in "Kids Ask" was modified to prompt children to identify important keywords from a text, then generate free associations based on their prior knowledge. Given the intricate interplay between curiosity and creativity, it was hypothesized that this associative guidance would further enhance children's ability to formulate divergent, curiosity-driven questions (as shown in figure 33).
Screenshot of the associative method of prompting in Kids Ask. Children start off by reading a text containing highlighted keywords. They are prompted by the conversational agent to choose one from the list and make a free association with it, based on prior knowledge. They then use one or both words to ask a divergent question.
Promoting curiosity and metacognition in authentic settings
Curiosity-driven learning is crucial for academic achievement and autonomous learning, yet remains scarce in primary classrooms. Building on our previous work with the IGSA framework (Identify-Guess-Seek-Assess) introduced in 68, we developed a training paradigm that teaches curiosity-driven learning through metacognitive skills training. This approach leverages Murayama's framework 133 by personifying the four basic metacognitive skills as animated characters: the referee (identify knowledge gaps), the detective (formulate predictions), the explorer (seek information), and the second referee (assess information quality).
Curiosity-driven learning framework and link with the metacognitive skills we propose to train as facilitators during our IGSA-based intervention
The two-part intervention combined declarative knowledge about curiosity and metacognition with procedural training of the four metacognitive strategies. The first step consisted of animated videos explaining key concepts related to curiosity, metacognition, and the four skills through 2D characters. The second step involved the "Kids Reflect" web-based platform, where conversational agents with the same appearance and roles as the video characters prompted children to use these skills appropriately during reading-comprehension tasks (see figure below).
Screenshot of the "Kids Reflect" platform during the training, given one text
Our earlier pilot studies with small classroom samples demonstrated the accessibility and positive impact of this training on metacognitive efficiency, curiosity-driven question-asking, and learning outcomes. These promising initial results motivated a larger-scale validation study to assess both the intervention's effectiveness and its scalability in authentic educational settings.
Study design and implementation
Assessing scalability implies considering the intervention's effectiveness when teachers implement it themselves in their classrooms. Therefore, the multimedia-based metacognitive intervention was tested in a field study conducted with 159 students aged 9-10 years and 4 teachers across five elementary schools in Bordeaux Métropole, using a pseudo-RCT design in collaboration with the Académie de Bordeaux. Three main experimental conditions were compared: intervention led by researchers, intervention led by trained in-service teachers, and a control group. Additionally, complete and partial versions of the intervention were contrasted. Prior to the intervention, teachers underwent short training sessions that covered the concepts of curiosity and metacognition and familiarized them with the format and content, enabling them to autonomously implement the intervention in their classrooms during regular school hours.
Main findings
Results demonstrated that intervention groups significantly improved their divergent question-asking abilities and developed more positive perceptions of curiosity compared to the control group. Importantly, this held both in the ecological setting of classrooms where teachers managed the intervention themselves and with a lighter, easy-to-implement version of the training (see figure below).
Post-intervention results of question-asking abilities of children in each condition
However, nuanced findings emerged regarding teacher delivery conditions. Teacher-led groups showed lower performance during the intervention and poorer learning outcomes, alongside higher cognitive load, compared to researcher-led groups. This suggests that while the intervention can be effectively scaled to teacher-led implementations, some avenues for improvement remain. This point was further informed by qualitative interviews conducted with volunteer teachers who had facilitated the intervention in the study. Teachers rated the intervention highly on acceptability and usefulness, recognizing its pedagogical value. However, they provided lower ratings on usability, citing the complexity of metacognitive concepts and digital interface challenges as primary obstacles. The impact of these lower usability reports on students' performance highlights critical considerations for scaling educational interventions. While teachers appreciated the theoretical foundations and goals of the training, the cognitive demands of simultaneously managing complex pedagogical concepts and digital tools during classroom implementation appeared to affect their delivery quality, which in turn influenced student outcomes.
Implications and future directions
Together, these findings demonstrate that the metacognitive intervention can enhance curiosity-driven learning in authentic classroom settings. However, successful scaling requires strengthened teacher training. Future iterations of this work will focus on simplifying the intervention, providing more comprehensive teacher training programs, and developing materials increasing perceived usability for teachers as a way to favor adoption of such workshops. In response to these identified needs, we initiated in 2025 the creation of comprehensive resources for teachers around metacognitive interventions, motivation, and curiosity-driven learning. This development work focuses on providing teachers with accessible, evidence-based materials that bridge the gap between research findings and classroom practice. The resources include short, evidence-based exercises designed for direct implementation in the classroom, accompanied by detailed recommendations and pedagogical guidance. These materials aim to reduce the cognitive load on teachers by providing ready-to-use activities while maintaining the theoretical rigor and pedagogical effectiveness demonstrated in our research. The exercises are structured to be modular and adaptable to different classroom contexts, addressing the complexity concerns raised by teachers in our scalability study.
This latter point contributes to a broader research agenda of developing practical teacher resources on curiosity-driven learning in educational settings as a way to bridge the research-to-practice gap in educational interventions focused on curiosity and metacognition.
8.6.3 Machine Learning for Adaptive Personalization in Intelligent Tutoring Systems
Participants: Pierre-Yves Oudeyer [correspondant], Hélène Sauzéon [correspondant], Benjamin Clément, Didier Roy, Cécile Mazon.
The Kidlearn project.
Kidlearn is a research project studying how machine learning can be applied to intelligent tutoring systems (ITS). It aims at developing methodologies and software which adaptively personalize sequences of learning activities to the particularities of each individual student. Our systems aim at proposing to the student the right activity at the right time, concurrently maximizing their learning progress and their motivation. In addition to contributing to the efficiency of learning and motivation, the approach is also designed to reduce the time needed to design an ITS.
We continued to develop an approach to Intelligent Tutoring Systems which adaptively personalizes sequences of learning activities to maximize skills acquired by students, taking into account the limited time and motivational resources. At a given point in time, the system proposes to the student the activity which makes them progress fastest. We introduced two algorithms that rely on the empirical estimation of learning progress: RiARiT, which uses information about the difficulty of each exercise, and ZPDES, which uses much less knowledge about the problem.
The system is based on the combination of three approaches. First, it leverages recent models of intrinsically motivated learning by transposing them to active teaching, relying on empirical estimation of the learning progress provided by specific activities to particular students. Second, it uses state-of-the-art Multi-Armed Bandit (MAB) techniques to efficiently manage the exploration/exploitation challenge of this optimization process. Third, it leverages expert knowledge to constrain and bootstrap the initial exploration of the MAB, while requiring only coarse guidance from the expert and allowing the system to deal with didactic gaps in its knowledge. The system was evaluated in several large-scale experiments relying on a scenario where 7- to 8-year-old schoolchildren learn how to decompose numbers while manipulating money 87. Systematic experiments were also presented with simulated students.
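To make the combination of learning-progress estimation and bandit-style selection concrete, here is a minimal sketch in Python. It is not the ZPDES or RiARiT implementation: the class name `LearningProgressBandit`, the window size, the exploration rate, and the definition of learning progress as the difference between recent and older success rates (clipped at zero) are all assumptions made for illustration.

```python
import random
from collections import deque


class LearningProgressBandit:
    """Illustrative bandit that favors activities with high recent learning progress.

    Hypothetical simplification: one arm per activity, no expert graph,
    no zone-of-proximal-development expansion.
    """

    def __init__(self, activities, window=6, explore=0.2, seed=0):
        self.activities = list(activities)
        self.explore = explore  # share of probability mass spread uniformly
        self.history = {a: deque(maxlen=window) for a in self.activities}
        self.rng = random.Random(seed)

    def learning_progress(self, activity):
        """Recent success rate minus older success rate, clipped at zero."""
        h = list(self.history[activity])
        if len(h) < 2:
            return 0.0
        half = len(h) // 2
        older, recent = h[:half], h[half:]
        gain = sum(recent) / len(recent) - sum(older) / len(older)
        return max(0.0, gain)

    def weights(self):
        """Sampling probabilities: proportional to progress, mixed with uniform."""
        lp = [self.learning_progress(a) for a in self.activities]
        uniform = 1.0 / len(self.activities)
        total = sum(lp)
        if total == 0:
            return [uniform] * len(self.activities)
        return [(1 - self.explore) * x / total + self.explore * uniform for x in lp]

    def choose(self):
        """Sample the next activity to propose."""
        return self.rng.choices(self.activities, weights=self.weights(), k=1)[0]

    def update(self, activity, success):
        """Record whether the student succeeded on the proposed activity."""
        self.history[activity].append(1.0 if success else 0.0)
```

In this sketch an activity that the student already masters, or consistently fails, yields no measured progress and is therefore proposed less often, while the uniform exploration term keeps every activity reachable.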
Kidlearn Experiments 2018-2019: Evaluating the impact of ZPDES and choice on learning efficiency and motivation.
An experiment was held between March 2018 and July 2019 in order to test the Kidlearn framework in classrooms in Bordeaux Metropole. 600 students from Bordeaux Metropole participated in the experiment. This study had several goals. The first goal was to evaluate the impact of the Kidlearn framework on motivation and learning compared to an Expert Sequence without machine learning. The second goal was to observe the impact of using learning progress to select exercise types within the ZPDES algorithm compared to a random policy. The third goal was to observe the impact of combining ZPDES with the ability to let children make different kinds of choices during the use of the ITS. The last goal was to use the psychological and contextual measures to see whether correlations can be observed between the evolution of the students' psychological state, their profile, their motivation and their learning. We first show that LP-based personalization improves learning performance (reproducing and solidifying previous results) while producing a positive and motivating learning experience. We then show that the addition of self-choice as a playful feature triggers intrinsic motivation in the learner and reinforces the learning effectiveness of LP-based personalization. In doing so, it strengthens the links between intrinsic motivation and performance progress during the serious game. Conversely, deleterious effects of the playful feature are observed for hand-designed linear paths. Thus, the intrinsic motivation elicited by a playful feature is beneficial only if the curriculum personalization is effective for the learner. Such a result deserves great attention given the increasing use of playful features in non-adaptive educational technologies available on the market. Details of these new results, as well as the overall results of this project, are presented in Benjamin Clément's PhD thesis 86 and are currently being prepared for publication.
Kidlearn and Adaptiv'Math.
The algorithms developed during the Kidlearn project and Benjamin Clément's thesis 86 are being used in an innovation partnership for the development of a pedagogical assistant based on artificial intelligence intended for teachers and students of cycle 2. The algorithms are being rewritten in TypeScript for the needs of the project. The team's expertise in creating the pedagogical graph and defining the graph parameters used by the algorithms is also a crucial part of its role in the project. One of the team's main goals here is to transfer technologies developed in the team to a project with the perspective of industrial scaling, and to assess the impact and feasibility of such scaling.
Kidlearn for numeracy skills with individuals with autism spectrum disorders.
Few digital interventions targeting numeracy skills have been evaluated with individuals with autism spectrum disorder (ASD) 128, 127. Yet, some children and adolescents with ASD have learning difficulties and/or a significant academic delay in mathematics. While ITS have been successfully developed for typically developing students to personalize the learning curriculum and thereby foster the motivation-learning coupling, they are rarely, if ever, offered today to students with specific needs. The objective of this pilot study is to test the feasibility of a digital intervention using an ITS with high school students with ASD and/or intellectual disability (ID). This application (KidLearn) provides calculation training through currency exchange activities, with a dynamic exercise sequence selection algorithm (ZPDES). 24 students with ASD and/or ID enrolled in specialized classrooms were recruited and divided into two groups: 14 students used the KidLearn application, and 10 students received a control application. Pre-post evaluations show that students using KidLearn improved their calculation performance and had a higher level of motivation at the end of the intervention than the control group. These results encourage the use of an ITS with students with specific needs to teach numeracy skills, but need to be replicated on a larger scale. Adjustments to the interface and teaching method are proposed to improve the impact of the application on students with autism 125.
8.6.4 Machine learning for adaptive cognitive training
Participants: Pierre-Yves Oudeyer, Hélène Sauzéon [correspondant], Masataka Sawayama, Benjamin Clément, Maxime Adolphe, Marion Pech, Juliette Deyts.
Because it cuts across all cognitive activities, including learning tasks, attention is a hallmark of good cognitive health throughout life, and more particularly in the current context of a societal crisis of attention. Recent works have shown the great potential of computerized attention training, with efficient training transfers to other cognitive activities, and this over a wide spectrum of individuals (children, the elderly, individuals with cognitive pathologies such as Attention Deficit Hyperactivity Disorder). Despite these promising results, a major hurdle remains: the high inter-individual variability in responding to such interventions. Some individuals are good responders (significant improvement) to the intervention, others respond variably, and finally some respond poorly, not at all, or occasionally. A central limitation of computerized attention training systems is that the training sequences operate in a linear, non-personalized manner: difficulty increases in the same way and along the same dimensions for all subjects. However, different subjects require in principle a progression at a different, personalized pace according to the different dimensions that characterize attentional training exercises.
To tackle the issue of inter-individual variability, the present project proposes to apply some principles from intelligent tutoring systems (ITS) to the field of attention training. In this context, we have already developed automatic curriculum learning algorithms, such as those developed in the KidLearn project, which customize the learner's path according to his/her progress and thus optimize his/her learning trajectory while stimulating his/her motivation through the progress made. ITS are widely identified in intervention research as a successful way to address the challenge of personalization, but no studies to date have actually been conducted for attention training. Thus, whether ITS, and in particular personalization algorithms, can increase the number of responders to an attention training program remains an open question.
Grounded state-of-the-art.
To investigate this question, we first conducted a systematic review aiming at exploring existing methods in computerized cognitive training (CT) and analyzing their outcomes in terms of learning mechanics (intra-training performance) and effectiveness (near, far and everyday life transfer effects of CT) 71. A search of multiple databases up to June 2023 identified 19 computerized CT studies. Only two of them emphasized the favorable influence of individualization on CT effectiveness, while five underscored its capacity to enhance the training experience by boosting motivation, engagement, and offering diverse learning pathways. In sum, despite promising results in this new research avenue, more research is needed to fully understand and empirically support individualized techniques in cognitive training.
Distribution of AI techniques depending on type of CT studied (multi or single domain) from Adolphe et al., 2024
Complementing the study of adaptive methods applied to cognitive training, we reviewed the literature on the subject to gain a better understanding of the Multiple Object Tracking (MOT) task, which seems to have the best results in terms of attentional training efficiency in young and older adults. Our investigation pursues three main objectives: (1) identifying the cognitive processes influenced by each adjustable parameter of the MOT task; (2) determining which parameters, when progressively adapted during repeated MOT practice, produce the greatest enhancements in task performance; and (3) evaluating how improvements in MOT performance translate into effective transfer effects, including practical, real-world outcomes. The evidence suggests that the MOT task involves a nuanced interplay of visual processing, attentional resources, and working memory, shaped by the intrinsic properties of the objects and the task conditions. The results of this work highlight that: (1) multiple cognitive mechanisms are identified as active in the task (divided and sustained attention; foveal and peripheral attention; automatic and controlled inhibition, etc.); (2) a limited number of studies have actually implemented the MOT task in computer-assisted cognitive training; and (3) near (attention tasks) and far (other cognitive tasks) effects are well documented as positive outcomes of MOT-based training, while few studies have thoroughly analyzed the ecological effects of attentional training, namely the potential transfer effects in everyday life (paper in progress).
ZPDES calibration for MOT training (Young participants).
In parallel to this, a web platform has been designed for planning and implementing remote behavioural studies. This tool provides means for registering recruited participants remotely and executing complete experimental protocols: from presenting instructions and obtaining informed consents, to administering behavioural tasks and questionnaires, potentially throughout multiple sessions spanning days or weeks. In addition to this platform, a cognitive test battery composed of seven classical behavioural tasks has been developed. This battery aims to evaluate the evolution of the cognitive performance of participants before and after training. Fully open-source, it mainly targets attention and memory. A preliminary study on a large sample of 50 healthy participants showed that the developed tasks reproduced the results of previous studies, that there were large differences between individuals (no ceiling effect) and that the results were significantly reliable between two measurements taken on two days separated by one night 4.
Randomized controlled trial in young and older adults: predefined vs. ZPDES condition.
Utilizing these tools, a pilot study campaign was conducted to evaluate the impact of our AI-based personalized cognitive training program. The first pilot experiment involved n=27 participants and aimed to compare the effectiveness of a cognitive training program using a linear difficulty management procedure (staircase procedure) to a program using an ITS for difficulty manipulation. The online training lasted for 10 hours over a period of 2 weeks. The results indicated that the ITS-based intervention produced diverse learning trajectories compared to the linear procedure 38, leading to broader improvements in pre-post cognitive assessment. However, no significant differences were observed in subjective measures of motivation and engagement between the two groups. Subsequent to this initial experiment, two pilot studies (n=11 and n=10, respectively) were conducted with the goal of enhancing motivation and engagement in the game. The first study implemented gamified components such as scores and feedback, while the second study examined hyperparameter updates to the ITS. The analysis of learning trajectories, learning outcomes, and subjective measures yielded promising results in favor of the AI-based personalized procedure.
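For readers unfamiliar with the baseline condition, the sketch below shows one common form of staircase procedure in Python. The exact rule used in the study is not specified here, so the 2-up/1-down rule, the level bounds, and the function name `staircase` are assumptions made for illustration only.

```python
def staircase(level, outcomes, n_up=2, lo=1, hi=10):
    """Return the difficulty level reached after each trial.

    Assumed rule: difficulty increases by one step after `n_up` consecutive
    successes and decreases by one step after any failure. Every participant
    follows the same rule along the same single dimension.
    """
    streak = 0
    levels = []
    for success in outcomes:
        if success:
            streak += 1
            if streak == n_up:
                level = min(hi, level + 1)
                streak = 0
        else:
            level = max(lo, level - 1)
            streak = 0
        levels.append(level)
    return levels
```

Because the rule is identical for everyone and acts on one difficulty dimension at a time, the trajectories it produces are much less varied than those of a procedure that selects among several parameters based on each participant's measured progress.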
Different learning trajectories for a selected participant in the staircase group (left) and the ITS group (right). The color of a dot indicates the initial presentation of the parameter value, while the size of the dot represents the frequency of the parameter value.
Building on the preliminary findings, we expanded our research scope with a more comprehensive experimental setup involving two distinct studies. The first study encompassed 64 young adults, sourced through the Prolific platform, while the second study consisted of 50 older adults, recruited from the "Université du temps libre". Our experimental methodology mirrored that of our initial pilot studies, with a notable enhancement: the integration of new gamified elements (including mini-story creation and new visual content) aimed at boosting participant motivation and engagement.
(a) The MOT task. (b) Several visual snapshots of the intervention. (c) Schedule proposed to participants
The data analysis encompassed three primary dimensions: initially, an exploratory phase to delineate learning trajectories between control and intervention groups; subsequently, a comparative analysis of pre- and post-test performance on the cognitive battery; and lastly, an examination of participants' self-reported experiences during training, providing insights into their subjective perceptions of the experiment.
The pilot studies' preliminary outcomes were corroborated in these larger sample groups. Notably, learning trajectories exhibited greater diversity in the group undergoing the intervention procedure. This group also demonstrated a more pronounced improvement across a wider range of cognitive assessment tasks. Although participants engaging in the personalized cognitive training reported a higher cognitive load via questionnaires, the levels of engagement and frustration did not significantly differ between the two groups.
The results showed that ZPDES could be more effective than a control condition, with improved performance on trained tasks in both studies, underlining the benefits of individualized training paths. However, motivation and engagement were lower in the groups using ZPDES, probably due to cognitive load and metacognitive factors. Overall, individualizing cognitive training through systems like ZPDES provides a promising direction for future research by providing automatic methods for taking individual differences into account in CT programs while respecting methodological standards for evaluating the effectiveness of CT. As a result, our work contributes to the growing body of knowledge in both ITS and CT domains while stressing the crucial role of challenges related to motivation and engagement to optimize the effectiveness of these individualized approaches for cognitive and educational outcomes.
As part of the creation of the new University Hospital Institute (UHI) Vascular Brain Health Institute (VBHI), we aim to develop and test a personalized, multimodal digital therapeutic approach to slow down the functional consequences of small vessel disease. More specifically, we aim to:
- Evaluate the impact of personalized cognitive training compared to non-personalized conditions (comparative efficacy).
- Identify potential ElectroEncephaloGraphic (EEG) biomarkers that reflect cognitive activity impacted by small vessel disease and could later (in a subsequent study) be used as targets for exploratory EEG neurofeedback therapy.
- Identify brain areas to target for delivering non-invasive HD-tACS electrical stimulation, using previously acquired MRI data.
- Evaluate the impact of this stimulation on brain activity, neural synchronization, and cognitive performance.
To achieve this, 80 participants from the SHIVA cohort will be divided into two subgroups according to the severity of the disease:
- Severe group: presenting multiple lesions on MRI
- Non-severe group: presenting a few lesions
These groups will then be further divided based on the type of training: personalized tests (ZPDES) versus standard tests.
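The resulting two-by-two design (severity by training type) can be sketched as a stratified allocation. The Python example below is purely illustrative: the protocol's actual allocation method, the sizes of the severity groups, and the function name `allocate` are not given in this report and are assumptions.

```python
import random


def allocate(participants, seed=0):
    """Assign each participant to a training arm within their severity stratum.

    `participants` is a list of (participant_id, severity) pairs, where
    severity is "severe" or "non-severe". Within each stratum, participants
    are shuffled and alternately assigned to "ZPDES" or "standard", so the
    two arms stay balanced inside each severity group (assumed scheme).
    """
    rng = random.Random(seed)
    assignment = {}
    for severity in ("severe", "non-severe"):
        stratum = [pid for pid, sev in participants if sev == severity]
        rng.shuffle(stratum)
        for i, pid in enumerate(stratum):
            arm = "ZPDES" if i % 2 == 0 else "standard"
            assignment[pid] = (severity, arm)
    return assignment
```

Stratifying before assignment keeps the comparison between personalized and standard training from being confounded by disease severity.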


SHIVA study protocol and materials.
During the pre- and post-training sessions, participants will perform cognitive tests on a computer. Participants will be equipped with an EEG headset, which, combined with a tACS stimulator, will allow for both brain activity recording and stimulation.
We are carrying out an ancillary study with Myra Fernandez's laboratory in Canada, through our participation in the Inria Curiositytech international associate team. We have proposed to collaboratively analyze certain data and dimensions of interest to our respective laboratories (e.g., physical activities) associated with the cognitive training proposed in the SHIVA-DTX-COG project.
Qualitative Analysis with LLMs:
Because dropout is known to be more frequent among older adults than among young ones, we aimed to better understand the learning experience of trainees through feedback analyses. To this end, we designed a new method that uses several Large Language Models (LLMs) to extract, from participants' verbatim responses, the main topics and the main reasons for dropout related to the pragmatic, hedonic, and/or aesthetic dimensions of cognitive training. The results obtained with the various LLMs are encouraging (paper in progress). To support this new approach, we are exploring different prompts on other data corpora in order to ultimately propose a tutorial accessible to anyone wishing to carry out an LLM-based thematic qualitative analysis.
8.6.5 ToGather : Interactive website to foster collaboration among stakeholders of school inclusion for pupils with neurodevelopmental disorders
Participants: Hélène Sauzéon [correspondant], Cécile Mazon, Eric Meyer, Isabeau Saint-Supery, Christelle Maillart [Uni. Liège, Belgium], Kamélia Belassel, Mathieu Périé, Valentin Strahm.
Sustaining and supporting the follow-up of the school inclusion of children with neurodevelopmental disorders (e.g., autism, attention disorders, intellectual disabilities) has become urgent: the higher the school level, the lower the number of pupils with cognitive disabilities in school.
Technology-based interventions to improve school inclusion of children with neurodevelopmental disorders have mostly been individual-centered, focusing on their socio-adaptive and cognitive impairments and implying that they have to adapt themselves in order to fit our society's expectations. Although this approach centered on the normalization of the person has some advantages (reduction of clinical symptoms), it carries social stereotypes and misconceptions of cognitive disability that are not respectful of the cognitive diversity and intrinsic motivations of the person, and in particular of the student's wishes in terms of school curriculum to achieve his or her future life project 129.
The "ToGather" project aims at enlightening the field of educational technologies for special education by proposing an approach centered on the educational needs of the students and bringing a concerted and informed answer between all the stakeholders, including the student and all their support spheres (family, school, medico-social care). To this end, the ToGather project, which emanates from participatory design methods, primarily consisted of developing a pragmatic tool (an interactive website) to help students with cognitive disabilities and their caregivers formalize and visualize the student's repertoire of academic skills and make it evolve according to the student's zone of proximal development (in the sense of Vygotsky) on the one hand, and the student's intrinsic motivations (his or her own educational and life project) on the other 126.
This project is in partnership with the School Academy of Bordeaux of the French Ministry of Education, the ARI association, and the Centre of Autism of Aquitaine. It is funded by the FIRAH (foundation) and the Nouvelle-Aquitaine Region (see the dedicated webpages).
First, usability studies were conducted to evaluate the ergonomic qualities of the ToGather website, yielding positive results in French and Belgian contexts. Then, we conducted a large field study to assess the effectiveness of the tool in helping stakeholders to support children with neurodevelopmental disorders (NDD) 155 153 154.
The study protocol consisted of a longitudinal non-randomized controlled trial, with baseline, 3-month, and 6-month follow-up assessments. The recruitment was conducted across the entire French territory. Our local partners facilitated the dissemination of the call for participation in Gironde and provided us with contacts to extend it to other regions. Additionally, a recruitment campaign through social media was carried out to communicate about the study and encourage participants to test the ToGather tool.
As the tool was designed to support the co-educational process between parents and professionals, a support team had to consist of at least two stakeholders, including at least one of the parents. Initially, 157 participants were recruited in 37 support teams, but 30 individuals did not answer the baseline questionnaire, leading to the exclusion of 11 support teams. After baseline assessment, 13 support teams were allocated to the experimental condition (ToGather app) and 11 to the control condition (usual follow-up).
Primary outcome measures covered stakeholders’ relationships, self-efficacy, and attitudes towards inclusive education, while secondary outcome measures were related to stakeholders’ burden and quality of life, as well as children’s school well-being and quality of life.
As the study ended recently, data analysis is still ongoing. Preliminary results after 3 months of use are encouraging, showing an improvement in communication between stakeholders and in their respective quality of life (paper in progress).
8.6.6 Curious and therefore not overloaded : Study of the links between curiosity and cognitive load in learning mediated by immersive technologies
Participants: Hélène Sauzéon [correspondant], Matisse Poupard, André Tricot [Cosupervisor - Univ. Montpellier], Florian Larrue [Industrialist - Le Catie].
Conducted in collaboration with CATIE (industrial partner) and the EPSYLON laboratory at the University of Montpellier (under the supervision of Prof. André Tricot), this PhD research was initiated in April 2022, and the thesis was defended on September 11th, 2025. It pursued two main objectives:
- To establish theoretical links between cognitive load theory and models of curiosity-driven learning.
- To experimentally examine how the choice of educational technology modulates the relationship between pedagogical approaches (guided instruction vs. exploration) and learner expertise.
To address these objectives, the thesis was structured into three main phases.
Literature Review.
A systematic review examining the contributions and limitations of Virtual Reality (VR) and Augmented Reality (AR) for learning was conducted, with a specific focus on their effects on cognitive load and intrinsic motivation. This review identified both the pedagogical potential of immersive technologies and persistent methodological limitations in the field, particularly regarding the measurement of motivation and cognitive processes. The results were published in the British Journal of Educational Technology (BJET) 39.
Experimental Research in XR-Based Anatomy Learning.
Two experimental studies were conducted in 2023 with 131 second-year medical students and replicated in 2024 with 164 medical students from the second to fifth years.
The first experiment investigated whether supporting students’ drawing activity during lectures using augmented and mixed reality could reduce cognitive load and enhance motivation. Participants followed a 20-minute neuroanatomy video lecture while simultaneously reproducing drawings demonstrated by the instructor. Four experimental conditions were compared:
- Spatial Augmented Reality (SAR): A digital overlay of the anatomical structure was projected onto paper, allowing learners to trace it using a projector and tracking system.
- Mixed Reality (MR): The digital overlay was displayed through a HoloLens 2 headset.
- Mixed Reality with 3D Model (MR+3D): In addition to the digital overlay, learners could manipulate a 3D anatomical model.
- Control Condition: No digital overlay was provided.
Experimental conditions for experiment 1 : Support Drawing with Augmented Reality
Results from the 2023 dataset showed that both AR- and MR-supported drawing conditions significantly reduced extraneous cognitive load, increased intrinsic motivation, and improved drawing accuracy. However, no significant differences in knowledge acquisition were observed between conditions. Notably, in the stereoscopic 3D visualization condition, learners with higher intrinsic motivation exhibited poorer learning outcomes, possibly due to increased attentional focus on system interaction rather than conceptual understanding. Visuospatial ability and prior knowledge moderated the effectiveness of AR and MR interventions, with more experienced learners benefiting the most. These results are reported in a manuscript currently under review in the Journal of Computing in Higher Education 65.
The second experiment explored a different learning paradigm using virtual reality (VR), manipulating levels of interactivity and instructional guidance. This design enabled the examination of how exploration and embodied interaction with a 3D anatomical model affect learning outcomes, cognitive load, and curiosity.
Experimental conditions for experiment 2 : Embodied learning in virtual reality, effect of interactivity
Analyses, published in Computers & Education 38, showed that VR conditions led to superior learning performance, particularly in the passive and active interaction conditions. These conditions were associated with higher intrinsic motivation and a more optimized cognitive load profile. Moreover, intrinsic motivation was positively correlated with germane cognitive load (i.e., cognitive resources devoted to learning) and negatively correlated with extraneous cognitive load. In other words, highly motivated learners experienced fewer irrelevant cognitive demands, allowing them to allocate more resources to meaningful learning processes.
Following the systematic review, which highlighted the lack of reliable and context-sensitive measures of intrinsic motivation in XR research, a third study leveraged a key affordance of VR: the continuous collection of behavioral data. By analyzing head and hand movements during the neuroanatomy learning task, this study aimed to identify implicit behavioral indicators of curiosity and cognitive engagement. Results showed that increased hand movement was associated with lower intrinsic motivation, whereas greater head movement was positively associated with both germane cognitive load and intrinsic motivation, suggesting deeper cognitive engagement. Additionally, movement entropy emerged as a significant predictor of curiosity-driven learning, highlighting its potential as an implicit marker of learning-related behaviors in immersive environments. These findings are presented in a manuscript currently under review in the International Journal of Human–Computer Studies 37.
Illustration of movement entropy calculations in the virtual environment
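The report does not detail how movement entropy was computed. As a minimal sketch of one common approach, the code below discretizes tracked 3D positions into cubic cells and takes the Shannon entropy of the cell-visit distribution. The function name `movement_entropy`, the `cell_size` parameter and the choice of spatial binning are assumptions made for illustration, not the method used in the study.

```python
import math
from collections import Counter


def movement_entropy(positions, cell_size=0.1):
    """Shannon entropy (in bits) of the spatial cells visited by a tracker.

    positions: iterable of (x, y, z) samples, e.g. head or hand positions in meters.
    cell_size: edge length of the cubic cells used for discretization (assumed value).
    """
    # Map every sample to the integer index of the cell it falls into.
    cells = [tuple(math.floor(c / cell_size) for c in p) for p in positions]
    counts = Counter(cells)
    n = len(cells)
    # Entropy of the empirical distribution over visited cells.
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


# A tracker that never leaves one cell has zero entropy; one that spreads
# its time evenly over many cells has high entropy.
still = [(0.0, 0.0, 0.0)] * 8
spread = [(0.05, 0.05, 0.05), (0.15, 0.05, 0.05),
          (0.05, 0.15, 0.05), (0.05, 0.05, 0.15)]
print(movement_entropy(still), movement_entropy(spread))
```

With this definition, higher values indicate that the learner distributed movement over more distinct regions of the virtual space.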
Toward a Unified Model: Cognitive Load, Motivation, and Expertise.
Building on the empirical results of the previous studies, which revealed dynamic interactions between cognitive load and intrinsic motivation, this final phase addressed the second overarching objective of the thesis: the empirical integration of cognitive load theory and the Learning Progress Hypothesis. Using structural equation modeling (SEM), this study tested a comprehensive model describing the relationships between XR technologies, cognitive load, intrinsic motivation, perceived learning progress, and learning outcomes.
Results indicated that both AR and VR significantly reduced extraneous cognitive load and increased intrinsic motivation. However, intrinsic motivation did not directly predict immediate learning performance. Instead, extraneous cognitive load negatively affected perceived learning progress and autonomy, which in turn predicted intrinsic motivation, revealing a key mediating pathway.
Resulting model from the SEM.
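The mediating pathway described above can be written in standard path-analysis notation. This is only a schematic reading of the reported result: the symbols (ECL for extraneous cognitive load, PLP for perceived learning progress, AUT for autonomy, IM for intrinsic motivation) and the coefficient names are assumptions, and the fitted SEM includes further variables not shown here.

```latex
\begin{aligned}
\mathrm{PLP} &= a_1\,\mathrm{ECL} + \varepsilon_1,\\
\mathrm{AUT} &= a_2\,\mathrm{ECL} + \varepsilon_2,\\
\mathrm{IM}  &= b_1\,\mathrm{PLP} + b_2\,\mathrm{AUT} + c'\,\mathrm{ECL} + \varepsilon_3,\\
\text{indirect effect of ECL on IM} &= a_1 b_1 + a_2 b_2 .
\end{aligned}
```

With the reported signs (a_1, a_2 < 0 and b_1, b_2 > 0), the indirect effect is negative: higher extraneous load lowers intrinsic motivation through perceived progress and autonomy.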
Overall, these findings demonstrate that unnecessary cognitive demands not only hinder learning efficiency but also disrupt learners’ perceived progress and sense of control, thereby undermining curiosity and intrinsic motivation. This work contributes to a unified theoretical framework by showing how optimizing extraneous cognitive load in XR environments supports both cognitive efficiency and curiosity-driven learning. The results are presented in a manuscript currently under review in Educational Psychology Review 66.
Effect of XR technology displays on everyday memory
In addition to this work, we co-designed an augmented reality (AR) application simulating a museum visit (co-led with P. Dragicevic, Bivouac, under the I-am AEx project, 2023) and combined it with an evaluation of involuntary and uncontrollable memory revival. This study provided an original demonstration that AR enhances this type of memory compared to 3D images 53, suggesting that AR could be used for cognitive manipulations (Honorable Mention at CHI 2025).
Display features and personal factors (e.g., intellectual curiosity/humility) are being studied 54 to develop robust usage recommendations (L. Petiot’s PhD thesis, co-supervised by H. Sauzéon).
8.6.7 Self-determination-driven digital services for supporting aging in place and well-being: a study of relationships between longitudinal data from smart home and clinical data
Participants: Hélène Sauzéon, Juliette Deyts, Lucile Dupuy, Rafik Belloum.
This work relies on longitudinal data collected from frail older adults living alone at home who used the HomeAssist ambient assisted living platform for up to 24 months. HomeAssist was designed according to a self-determination and user-centered approach, covering three domains of need: daily activities, home safety, and social participation. The objective of this research is to analyze relationships between clinical data (e.g., cognitive assessments, frailty, autonomy, self-determination) and usage-related data (user experience questionnaires, usage diaries, and actimetric data derived from environmental sensors), in order to both assess the benefits of assistive and monitoring services and explore the predictive value of sensor-based data for explaining clinical outcomes.
A first study focused on identifying factors influencing user experience (UX) and long-term adoption of HomeAssist, based on data from 131 participants. Despite a user-centered design, long-term adoption remained limited, with only 18 users continuing after 24 months and 38 requesting removal within the first six months. Regression analyses showed that UX dimensions were mainly predicted by other UX dimensions rather than by individual health or psychosocial characteristics. In contrast, long-term adoption was weakly predicted by level of education and computer ownership, suggesting that while user-centered design may reduce the impact of individual characteristics on user experience, adoption remains influenced by digital literacy and social inequalities.
Overall, these activities contribute to the design of a user-centered visualization tool intended for clinicians (psychologists and physicians), enabling them to better understand the links between long-term usage data and clinical evolution, and to detect early “weak signals” of decline (e.g., changes in sleep patterns), thereby facilitating timely and targeted interventions.
8.7 Curiosity-driven AI for assisted scientific discovery
8.7.1 Design of an Interactive Software for Automated Discovery in Complex Systems
Participants: Clément Romac [correspondent], Zacharie Bugaud, Clément Moulin-Frier, Pierre-Yves Oudeyer.
We further developed our Automated Discovery software and particularly focused on adding and experimenting with new systems.
Our public software now features more than ten examples ranging from artificial life, to physics or protein docking. The software was publicly released in 2024: presentation thread.
Technical architecture of our software.
8.7.2 Discovering Sensorimotor Agency in Cellular Automata using Diversity Search
Participants: Gautier Hamon [correspondent], Mayalen Etcheverry, Bert Chan, Clément Moulin-Frier, Pierre-Yves Oudeyer.
As a continuation of the previous projects in Automated Discovery in Self-Organizing Systems, we have been working on expanding the set of discoveries of possible structures in continuous CAs such as Lenia 82, 81, and in particular we have been interested in searching for emerging agents with sensorimotor capabilities. Understanding what has led to the emergence of life and sensorimotor agency as we observe in living organisms is a fundamental question. In our work, we initially only assume environments made of low-level elements of matter (called atoms, molecules or cells) locally interacting via physics-like rules. There is no predefined notion of agent embodiment and yet we aim to answer the following scientific question: is it possible to find environments in which a subpart that could be called a sensorimotor agent exists or emerges?
We use the Lenia continuous cellular automaton as our artificial "world" 81. We introduce a novel method based on gradient descent and curriculum learning combined within an intrinsically-motivated goal exploration process (IMGEP) to automatically search parameters of the CA rule that can self-organize spatially localized 1 and moving patterns 2 within Lenia. The IMGEP defines an outer exploratory loop (generation of training goal/loss) and an inner optimization loop (goal-conditioned). We use a population-based version of IMGEP 12, 91 but introduce two novel elements compared to previous papers in the IMGEP literature. First, whereas previous work in 29 and 10 used a very basic nearest-neighbor goal-achievement strategy, our work relies on gradient descent for the local optimization of the (sensitive) parameters of the complex system, which has proven to be very powerful. To do so we made a differentiable version of the Lenia framework, which is also a contribution of this work. Secondly, we propose to control subparts of the environmental dynamics with functional constraints (through predefined channels and kernels in Lenia) to build a curriculum of tasks, and to integrate this stochasticity in the inner optimization loop. This has proven central to training the system to produce sensorimotor agents that are robust to stochastic perturbations in the environment. In particular, we focus on modeling obstacles in the environment physics and propose to probe the agent's sensorimotor capability as its performance in moving forward under a variety of obstacle configurations. We also provide in this work tests and metrics to measure the robustness of the obtained agents.
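The two-loop structure described above (an outer loop that samples goals, an inner loop that optimizes parameters toward the sampled goal by gradient descent, starting from the closest previously achieved outcome) can be sketched on a toy problem. Everything in this sketch is an assumption made for illustration: the two-parameter `system` stands in for the differentiable Lenia simulator, the finite-difference `grad` stands in for automatic differentiation, and the curriculum over obstacles is omitted.

```python
import math
import random


def system(theta):
    """Toy differentiable 'system' mapping 2 parameters to a 2-D behavior."""
    x, y = theta
    return (math.tanh(x + 0.5 * y), math.tanh(x - y))


def loss(theta, goal):
    """Squared distance between the produced behavior and the target goal."""
    return sum((b - g) ** 2 for b, g in zip(system(theta), goal))


def grad(theta, goal, eps=1e-5):
    """Central finite-difference gradient of the loss (stand-in for autodiff)."""
    g = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += eps
        down[i] -= eps
        g.append((loss(up, goal) - loss(down, goal)) / (2 * eps))
    return g


def inner_optimize(theta, goal, steps=50, lr=0.05):
    """Inner loop: goal-conditioned gradient descent on the parameters."""
    theta = list(theta)
    for _ in range(steps):
        theta = [t - lr * gi for t, gi in zip(theta, grad(theta, goal))]
    return theta


def imgep(n_iterations=20, n_bootstrap=5, seed=0):
    """Outer loop: sample a goal, start from the closest known outcome, optimize."""
    rng = random.Random(seed)
    history = []  # list of (parameters, achieved behavior)
    for _ in range(n_bootstrap):
        theta = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        history.append((theta, system(theta)))
    for _ in range(n_iterations):
        goal = (rng.uniform(-0.9, 0.9), rng.uniform(-0.9, 0.9))
        start, _ = min(
            history,
            key=lambda e: sum((b - g) ** 2 for b, g in zip(e[1], goal)),
        )
        theta = inner_optimize(start, goal)
        history.append((theta, system(theta)))
    return history


if __name__ == "__main__":
    print(len(imgep()), "parameter/behavior pairs discovered")
```

The population in `history` plays the role of the archive: each new goal is pursued from the stored parameters whose behavior is already closest to it, so earlier discoveries serve as starting points for later ones.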
Robustness test to harder/unseen obstacle configurations: straight wall, bigger obstacle, dead ends.
Change of scale obtained by changing the kernel size and initialization; the grid is the same size in both cases.
While many complex behaviors have already been observed in Lenia, among which some could qualify as sensorimotor behaviors, they have so far been discovered "by chance" as the result of time-consuming manual search or with simple evolutionary algorithms. Our method provides a more systematic way to automatically learn the CA rules leading to the emergence of basic sensorimotor structures, as shown in Figure 48. Moreover, we investigated and provided ways to measure the (zero-shot) generalization of the discovered sensorimotor agents to several out-of-distribution perturbations that were not encountered during training. Impressively, even though the agents still fail to preserve their integrity in certain configurations, they show very strong robustness to most of the tested variations. The agents are able to navigate in unseen and harder environmental configurations while self-maintaining their individuality (Figure 46). Not only are the agents able to recover their individuality when subjected to external perturbations, but also when subjected to internal perturbations: they resist variations of the morphogenetic processes such as less frequent cell updates, quite drastic changes of scale, as well as changes of initialization (Figure 47). Furthermore, when tested in a multi-entity initialization and despite having been trained alone, not only are the agents able to preserve their individuality, but they also show forms of coordinated interactions (attractiveness and reproduction). Our results suggest that, contrary to the (still predominant) mechanistic view on embodiment, biologically-inspired embodiment could pave the way toward agents with strong coherence and generalization to out-of-distribution changes, mimicking the remarkable robustness of living systems to maintain specific functions despite environmental and body perturbations 116.
Searching for rules at the cell-level in order to give rise to higher-level cognitive processes at the level of the organism and at the level of the group of organisms opens many exciting opportunities for the development of embodied approaches in AI in general.
Scatter plot of the agents according to their measured robustness to obstacles (y axis) and speed in obstacles (x axis), obtained by IMGEP (red), random search with the same compute resources as IMGEP (blue) and the agent from the original Lenia paper (green)
The work was released in 2022 as a Distill-like article which is currently hosted at this link. This article contains an interactive demo in WebGL and JavaScript, as well as many videos and animations of the results. A Colab notebook with the source code of the work is publicly available.
In 2024, additional quantitative experiments were conducted as well as ablations. This work was published in 2025 in the Science Advances journal.
8.7.3 Semantic Open-Endedness in Flow-Lenia using Vision Language Models and IMGEP
Participants: Sina Khajehabdollahi [correspondent], Gautier Hamon, Marko Cvjetko, Cédric Colas, Pierre-Yves Oudeyer, Clément Moulin-Frier.
Discovering diverse visual patterns in continuous cellular automata (CA) is challenging due to the vastness and redundancy of high-dimensional behavioral spaces. Traditional exploration methods like Novelty Search (NS) expand locally by mutating known novel solutions but often plateau when local novelty is exhausted, failing to reach distant, unexplored regions. We introduce Expedition & Expansion (E&E), a hybrid strategy where exploration alternates between local novelty-driven expansions and goal-directed expeditions. During expeditions, E&E leverages a Vision-Language Model (VLM) to generate linguistic goals—descriptions of interesting but hypothetical patterns that drive exploration toward uncharted regions. By operating in semantic spaces that align with human perception, E&E both evaluates novelty and generates goals in conceptually meaningful ways, enhancing the interpretability and relevance of discovered behaviors. Tested on Flow Lenia, a continuous CA known for its rich, emergent behaviors, E&E consistently uncovers more diverse solutions than existing exploration methods. A genealogical analysis further reveals that solutions originating from expeditions disproportionately influence long-term exploration, unlocking new behavioral niches that serve as stepping stones for subsequent search. These findings highlight E&E's capacity to break through local novelty boundaries and explore behavioral landscapes in human-aligned, interpretable ways, offering a promising template for open-ended exploration in artificial life and beyond. The project was published at the Artificial Life 2025 conference 48. A summary and the result visualization are available on the project website.
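The contrast between local expansion and goal-directed expedition can be sketched with two parent-selection rules over an archive of behavior embeddings. This is a simplified illustration under stated assumptions, not the E&E implementation: `novelty` uses the standard Novelty Search score (mean distance to the k nearest archive entries), and `goal_embedding` is a hypothetical vector standing for a VLM-generated linguistic goal already embedded in the same semantic space.

```python
import math


def novelty(candidate, others, k=3):
    """Mean Euclidean distance from candidate to its k nearest neighbors."""
    nearest = sorted(math.dist(candidate, o) for o in others)[:k]
    return sum(nearest) / len(nearest)


def select_parent(archive, goal_embedding=None, k=3):
    """Pick the archive entry to mutate next (archive needs at least 2 entries).

    Expansion (no goal): choose the most novel entry, i.e. push the frontier locally.
    Expedition (goal given): choose the entry closest to the target description.
    """
    if goal_embedding is None:
        best = max(
            range(len(archive)),
            key=lambda i: novelty(archive[i], archive[:i] + archive[i + 1:], k),
        )
        return archive[best]
    return min(archive, key=lambda e: math.dist(e, goal_embedding))


archive = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
print(select_parent(archive, k=2))                           # expansion step
print(select_parent(archive, goal_embedding=(0.2, 0.0)))     # expedition step
```

In an expedition, the goal may lie far from anything in the archive, which is what lets the search leave a region where local novelty is exhausted.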
8.7.4 Exploring Flow-Lenia Universes with a Curiosity-driven AI Scientist: Discovering Diverse Ecosystem Dynamics
Participants: Thomas Michel [correspondent], Marko Cvjetko [correspondent], Gautier Hamon, Pierre-Yves Oudeyer, Clément Moulin-Frier.
We present a method for the automated discovery of system-level dynamics in Flow-Lenia, a continuous cellular automaton with mass conservation and parameter localization, using a curiosity-driven AI scientist. This method aims to uncover processes leading to self-organization of evolutionary and ecosystemic dynamics in CAs. We build on previous work which uses diversity search algorithms in Lenia to find self-organized individual patterns, and extend it to large environments that support distinct interacting patterns. We adapt Intrinsically Motivated Goal Exploration Processes (IMGEPs) to drive exploration of diverse Flow-Lenia environments using simulation-wide metrics, such as evolutionary activity, compression-based complexity, and multi-scale entropy. We test our method in two experiments, showcasing its ability to illuminate significantly more diverse dynamics compared to random search. We show qualitative results illustrating how ecosystemic simulations enable self-organization of complex collective behaviors not captured by previous individual pattern search and analysis. We complement automated discovery with an interactive exploration tool, creating an effective human-AI collaborative workflow for scientific investigation. Though demonstrated specifically with Flow-Lenia, this methodology provides a framework potentially applicable to other parameterizable complex systems where understanding emergent collective properties is of interest.
This work was published at the Artificial Life 2025 conference 51, with a companion website containing videos of the discoveries, the interactive exploration tool and source code.
A showcase of discovered diversity while searching for ecosystemic dynamics.
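Among the simulation-wide metrics mentioned above, compression-based complexity is the simplest to illustrate. The sketch below is a generic version of the idea, assuming a grid of cell values in [0, 1] that is quantized and compressed with `zlib`; the function name, the quantization into 16 levels and the use of a compression ratio are assumptions, not the exact metric used in the paper.

```python
import random
import zlib


def compression_complexity(grid, levels=16):
    """Compressed size divided by raw size for a 2-D grid of values in [0, 1].

    Values near 0 indicate a highly regular state; values near 1 indicate a
    state that the compressor cannot summarize (e.g. noise).
    """
    raw = bytes(
        max(0, min(levels - 1, int(v * levels)))  # quantize and clamp each cell
        for row in grid
        for v in row
    )
    return len(zlib.compress(raw, 9)) / len(raw)


rng = random.Random(0)
uniform = [[0.5] * 64 for _ in range(64)]
noisy = [[rng.random() for _ in range(64)] for _ in range(64)]
print(compression_complexity(uniform), compression_complexity(noisy))
```

A goal space built from such scalars lets the exploration process target simulations by how structured they are, rather than by the shape of any individual pattern.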
8.7.5 Discovering and Controlling Diverse Self-Organized Patterns in Cellular Automata Using Autotelic Reinforcement Learning
Participants: Marko Cvjetko [correspondent], Gautier Hamon, Pierre-Yves Oudeyer, Clément Moulin-Frier.
Autotelic AI algorithms, which pursue self-generated goals, have proven to be effective as automated discovery assistants in cellular automata. Previous work in this domain focused on algorithms which produce diverse behaviors by setting the automaton’s initial conditions. Here, we extend these methods beyond initial-condition search and adapt them to systems that support sequences of closed-loop interventions. Using Lenia (a continuous cellular automaton) as a test environment, we train goal-conditioned reinforcement learning agents to perform targeted interventions during the system’s evolution, guiding it towards desired states. The resulting agent behaviors are robust and diverse, demonstrating the potential of closed-loop interaction for discovery and control. Furthermore, we show that goal-conditioned RL agents performing interventions can discover novel self-organising patterns and generalize to previously unseen and noisy environments. The project was presented as a late-breaking abstract at the Artificial Life 2025 conference 45, and disseminated through a website.
9 Bilateral contracts and grants with industry
9.1 Bilateral contracts with industry
- CATIE: CIFRE PhD grant of Matisse Poupard with CATIE and EPSYLON Lab (Univ. Montpellier) until April 2025.
- Hugging Face: PhD of Clément Romac with Hugging Face on "Augmenting curiosity-driven exploration with very large language models in deep reinforcement learning agents".
We received a 70 k€ grant from Google, as a PhD fellowship for Julien Pourcel.
9.2 Bilateral grants with foundations
CLEMENCE Cohort (Fondation de France and Théa Pharma)
Participants: Hélène Sauzéon [correspondent], Cécile Mazon, Cécile Delcourt.
The project "Cohorte LongitudinalE sur la Myopie et le développement oculaire dans l’ENfanCE" (CLEMENCE) is led by C. Delcourt from the Bordeaux Population Health lab (2M€). Hélène Sauzéon and Cécile Mazon participate in the research program by studying developmental changes in visual attention due to myopia.
10 Partnerships and cooperations
10.1 International initiatives
10.1.1 Inria associate team not involved in an IIL or an international program
Participants: Hélène Sauzéon, Edith Law.
CuriousTECH
- Title: Curiosity-Driven Learning Across the Lifespan
- Duration: 2023 -> 2025
- Coordinator: Edith Law (edith.law@uwaterloo.ca)
- Partners:
  - University of Waterloo, Waterloo (Canada)
- Inria contact: Hélène Sauzéon
- Summary:
For several years, the HCI lab and the cognitive neuroscience lab of the University of Waterloo (Canada) have been collaborating with researchers from the Bordeaux site, especially the Flowers team from Inria, as well as the ACTIVE team from the BPH laboratory (Inserm-Univ. Bordeaux). This collaboration is motivated by a common desire to better understand the role of curiosity in lifelong learning, and to establish a new multidisciplinary research avenue on the design of original interactive systems for (re)education. Several studies report that curiosity is beneficial not only to children and young adults but also to older adults and neurodiverse individuals. This field of study is in its infancy and deserves collaborative efforts to identify the underlying cognitive mechanisms and the learning situations that benefit them, in order to ultimately design and develop curiosity-driven (re)educational technologies (ETs), and then deploy them in natural environments (school, home) to be reliably and rigorously tested. For this multidisciplinary purpose, the consortium gathers expertise in AI, HCI, cognitive science and psychology in order to cover the objectives of the proposed associate team, i.e. the CuriousTech team. In addition to the scientific potential, this team structure also aims at a quick transfer of the ETs in France and Canada towards the socio-economic fields of Ed Tech and e-health.
10.2 European initiatives
10.2.1 Horizon Europe
Participants: Cédric Colas.
INTERACT
INTERACT project on cordis.europa.eu
- Title: Help Me Grow: Artificial Cognitive Development via Human-Agent Interactions Supported by New Interactive, Intrinsically Motivated Program Synthesis Methods.
- Duration: From October 1, 2022 to August 31, 2026
- Partners:
  - INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET AUTOMATIQUE (INRIA), France
  - MASSACHUSETTS INSTITUTE OF TECHNOLOGY (MIT), United States
- Inria contact: Cédric Colas
- Coordinator:
- Summary:
Building machines that interact with their world, discover interesting interactions and learn open-ended repertoires of skills is a long-standing goal in AI. This project aims at tackling the limits of current AI systems by building on three families of methods: Bayesian program induction, intrinsically motivated learning and human-machine linguistic interactions. It targets three objectives: 1) building autonomous agents that learn to generate programs to solve problems with occasional human guidance; 2) studying linguistic interactions between humans and machines via web-based experiments (e.g. properties of human guidance, its impact on learning, human subjective evaluations); and 3) scaling the approach to the generation of constructions in Minecraft, guided by real players. The researcher will collaborate with scientific pioneers and experts in the key fields and methods supporting the project. This includes supervisors Joshua Tenenbaum (program synthesis, MIT) and Pierre-Yves Oudeyer (autonomous learning, Inria); diverse collaborators, and an advisory board composed of an entrepreneur and leading scientists in developmental psychology and human-robot interactions. The 3rd objective will be pursued via a secondment with Thomas Wolf (CSO) at HuggingFace, a world-leading company in the open source development of natural language processing methods and their transfer to the industry. By enabling users to participate in the training of artificial agents, the project aims to open research avenues for more interpretable, performant and adaptive AI systems. This will result in scientific (e.g. interactive program synthesis approaches), societal (e.g. democratized AI training) and economic impacts (e.g. adaptive AI assistants). The dissemination, communication and exploitation plans support these objectives by targeting scientific (AI, cognitive science), industrial (video games, smart homes) and larger communities (gamers, software engineers, large public).
10.2.2 Other European programs/initiatives
Participants: Hélène Sauzéon, Pierre-Yves Oudeyer, Mathias Grüber.
DEVCUR:
ORA project 2024-2027 - Open Research Area (ORA) for the Social Sciences 8th call for proposals
- Title: How curiosity enhances learning across childhood and adolescence: The role of metacognition and agency.
- Duration: From Sept 1, 2025 to December 31, 2027
- Partners:
  - INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET AUTOMATIQUE (INRIA), France
  - Cardiff University, UK
  - Max Planck Institute, Berlin, Germany
- Inria contact: Pierre-Yves Oudeyer and Hélène Sauzéon
- Coordinator: Mathias Grüber, Brain and Imagery centre, Cardiff University, UK / Funds: 1,177 k€
- Summary:
This project investigates the bidirectional relationship between curiosity-based learning and metacognition during late childhood and adolescence, a critical period when both abilities develop. Using five experiments with behavioral, neuroimaging, training, and longitudinal methods, three research teams from Cardiff, Bordeaux, and Trier will examine how metacognition and agency enhance curiosity-driven learning. The study will explore both individual differences and developmental changes in how metacognitive awareness strengthens curiosity's learning benefits. Findings will be translated into classroom interventions to stimulate curiosity and metacognition in educational settings. This interdisciplinary collaboration aims to advance understanding of curiosity development with significant scientific and societal impact.
10.3 National initiatives
GAIMHE project (BPI France 2030):
GAIMHE is a strategic research and innovation project funded by Bpifrance, coordinated by EvidenceB in partnership with the Flowers AI & Cognitive Science Laboratory at Inria, Café pédagogique, and Association Class'Code. The project aims to develop next-generation intelligent tutoring systems that combine the pedagogical rigor of traditional adaptive learning algorithms with the flexibility of generative AI. Current educational AI technologies present a fundamental trade-off. Intelligent Tutoring Systems (ITS) offer pedagogically grounded, personalized curricula through algorithms such as ZPDES, but require substantial manual content development. Conversely, generative AI provides interactional flexibility yet lacks pedagogical structure, cannot support long-term curriculum personalization, and presents significant computational costs. GAIMHE proposes a hybrid methodology structured around three axes: automated content generation leveraging generative AI for the rapid creation of pedagogically validated exercises to populate ITS knowledge graphs; targeted generative assistance deploying optimally-sized models to provide pedagogically principled guidance at key learning moments; and advanced personalization through compact student models capable of predicting and adapting learning trajectories across extensive exercise spaces, building on prior MAGELLAN research. The project benefits from EvidenceB's existing infrastructure, which serves tens of thousands of classrooms across primary and secondary education in France. This enables large-scale evaluation with authentic learning data. In partnership with Région Île-de-France, the consortium will release annotated datasets, learning analytics tools, and software components as open-source digital commons to support France's educational technology ecosystem.
ANR Chaire Individuelle Deep Curiosity
- PY Oudeyer continued to work on the research program of this Chaire, funding 2 PhDs and 3 postdocs for five years (until 2025).
ANR JCJC ECOCURL
- C. Moulin-Frier obtained an ANR JCJC grant. The project is entitled "ECOCURL: Emergent communication through curiosity-driven multi-agent reinforcement learning". The project started in Feb 2021 for a duration of 48 months. It funds a PhD student (36 months) and a Research Engineer (18 months) as well as 4 Master internships (one per year).
Projet AIxIA: "Analyse d’Interférences par Intelligence Artificielle".
Pierre-Yves Oudeyer and Clément Moulin-Frier obtained a grant from the call for projects AIRSTRIP "L'intelligence Artificielle au service de l'IngénieRie des SysTèmes aéRonautIques et sPatiaux", in collaboration with the IRT Saint Exupery. The project was accepted in 2023 and funds 18 months of a research engineer position starting in 2024.
Inria Exploratory Action AIDE
- Didier Roy is a collaborator of the Inria Exploratory Action AIDE "Artificial Intelligence Devoted to Education", led by Frédéric Alexandre (Inria Mnemosyne Project-Team), Margarida Romero (LINE Lab) and Thierry Viéville (Inria Mnemosyne Project-Team, LINE Lab). The aim of this Exploratory Action is to explore to what extent approaches or methods from cognitive neuroscience, linked to machine learning and knowledge representation, could help to better formalize human learning as studied in educational sciences. AIDE is a four-year project that ran from mid-2020 until 2024.
Inria Exploratory Action I'AM
- Hélène Sauzéon is co-PI with P. Dragicevic of the Inria Exploratory Action I'AM "Impact of Augmented Reality on Autobiographical Memory: Examining Involuntary Memories and False Memories" (174,5k€). Started last September, this Exploratory Action aims to explore to what extent augmented reality-based devices can produce erroneous autobiographical memories, particularly in vulnerable people (children and older adults, or young adults with low source-monitoring memory abilities).
New collaboration with Maxime Derex from IAST Toulouse
for the co-direction of the PhD thesis of Jeremy Perez with Clément Moulin-Frier and Pierre-Yves Oudeyer on "Interactions between intrinsically motivated goal-exploration processes and cumulative cultural evolution" (see section 8.2.2).
France 2030 - PPR AUTONOMIE : Vieillissement Et Situations De Handicap - Projet INNOVCare (Lechevalier S., 3,5M€) (2023-26)
- Hélène Sauzéon and AS Rigaud will supervise the WP5 dedicated to two care-led innovation experiments with assistive technologies (400k € for Bordeaux).
- Hélène Sauzéon is responsible for the WP3 « Digital technology for aging in place » (470k€/3,5M€), Défi 4 - Numérique, Innovcare (PPR Autonomie PIA2030, 2023-28).
VBHI project (Vascular Brain Health Institute - IHU, led by S. Debette, 5M€) (2023-26)
- Hélène Sauzéon will supervise the WP4.3 dedicated to "Explore Digital Therapeutics To Slow Down Cognitive Decline In Covert Csvd" (150k€)
11 Dissemination
11.1 Promoting scientific activities
11.1.1 Scientific events: organisation
PY Oudeyer continued to be a member of the organization committee of the Life, Structure and Cognition symposium series at IHES, France. H. Sauzéon continued to be a member of the Technical Program Committee of the ACHI conference.
11.1.2 Reviewer - reviewing activities
Matisse Poupard has reviewed for Computers & Education, Education and Information Technologies, Frontiers in Psychology and Frontiers in Virtual Reality. Jeremy Perez has reviewed for the Judgment and Decision Making Journal and Topics in Cognitive Science. Hélène Sauzéon has reviewed for ACM-CHI25, British Journal of Psychology, Computers in Human Behavior, and Journal of Research in Science Teaching. Cécile Mazon has reviewed for Nature Scientific Reports, Education and Information Technologies, and BMC Psychology.
PY Oudeyer was a reviewer for the journals Developmental Science and Child Development.
11.1.3 Invited talks
Matisse Poupard gave 3 invited talks:
- (December 2025) "Technologies immersives pour l’enseignement : étude des relations entre charge cognitive et curiosité des apprenants", ReSCi “Ma Recherche j’en parle” - Education, outils numériques et intelligence artificielle", ANRT, Paris
- (November 2025) Round-table discussion to mark the end of the Dem’UP project, University of Poitiers
- (October 2025) "Curieux et cognitivement engagé", Séminaire « Numérique pour l’Education », R3NumED
Clément Romac gave 2 invited talks:
- (May 2025) Invited talk at the ISIR lab from Sorbonne University on “Grounding LLMs through curiosity-driven online RL”.
- (September 2025) Invited talk at the SMILES workshop at ICDL on “Grounding LLMs through curiosity-driven online RL”.
Marie-Sarah Desvaux gave 1 invited talk:
- (November 2025) "Curiosity-driven learning as a ZPD window for self-regulated learning" during International Society of Cultural-historical Activity Research, Southern Europe and Middle Eastern Conference 2025, Barcelona
Hélène Sauzéon gave 3 invited talks:
- (October 2025) "Supporting digital accessibility of MOOC based-learning for individuals with cognitive impairments: The Aïana project", Intersections: Translation, Accessibility, Inclusion, Forum des Savoirs, MSH Dijon
- (May 2025) "Les interventions de santé pour le bien-vieillir des personnes âgées à l'aide de technologies numériques", Journée IFRATH - Troubles sociocognitifs et technologies : Perspectives sur l'enfance et le vieillissement, Institut National de Jeunes Sourds de Paris, 254 Rue Saint-Jacques, 75005 Paris.
- (November 2025) "The intrinsic motivations as design principles of technologies for cognition: Examples about Educational Technologies and Technologies for aging in place", Institut de Psychologie, Université Paris Cité, Paris
Loris Gaven gave 2 invited talks:
- (January 2026) "Toward Artificial Curiosity" at University of Padua (Online)
- (August 2025) "MAGELLAN: Metacognitive predictions of learning progress guide autotelic LLM agents in large goal spaces" at the Metacognitive Satellite of CCN in Amsterdam.
PY Oudeyer gave these invited talks:
- (Jan 2025) Curiosity-driven learning in humans: learning progress, autotelic exploration and open-ended development, Keynote lecture at the Budapest Conference on Cognitive Development, see video.
- (Jan 2025) IA générative et éducation: enjeux sociétaux, for the "Conférence jumelle du Cnesco" co-organized by Cnesco, Canopé and CARDIE Charente et Charente-Maritime.
- (Jan 2025) Les enjeux sociétaux de l'IA dans l'éducation, at ETAPP-IA conference on AI and education organized by Nouvelle Aquitaine academy.
- (May 2025) IA générative, société et éducation: les enjeux de la formation des futurs citoyens, for the conference "Printemps de la recherche en éducation", organized by INSPE, Paris.
- (Oct 2025) How curiosity drives human learning, and how this can be leveraged in educational technologies, LEAD symposium organized by University of Tuebingen.
- (Oct 2025) IA générative: enjeux, interventions et résultats expérimentaux, for the AI seminar organized by DRANE from Grenoble academy.
- (Dec 2025) with Julien Pourcel and Cédric Colas, SOAR, a self-improving LLM-based evolutionary algorithm, for the ARC-Prize.
11.1.4 Scientific expertise
Hélène Sauzéon was:
- Vice-president of the multidisciplinary committee (Digital Sciences and Humanities) of the French National Research Agency (CES 38 - Interface, ANR), since 2025
- Member of the Research Council of Finland POC (12 projects on applied computer sciences) in 2025
- Member of the Scientific Committee of Calyxis, a center focused on research and development of technological solutions to prevent daily accidents through public laboratory-enterprise collaborations, since 2019.
- Expert for grant applications: evaluation of 2 CIFRE-ANRT PhD proposals; evaluation of 1 GATES (Grenoble ATtractiveness and ExcellenceS) proposal for the SHS Cluster of Université Grenoble in 2025.
- Member of the committee for a permanent Professor position in psychology at the University of Bordeaux
- Member of the committee for a permanent Assistant Professor position in Occupational Science (91 section) at the University of Limoges
PY Oudeyer was:
- a reviewer and expert for ANR (National Research Agency) as well as for the European Research Council (ERC), the Cyprus Research Council and for the Swedish Foundation for Strategic Research.
- an invited expert for the "Curiosity Convening" event organized by the Scratch Foundation at OECD, Paris.
- invited to be a member of the scientific council of the "La main à la pâte" foundation.
- a member of the GT "IA et éducation" at Conseil Scientifique de l'Education Nationale.
Cécile Mazon reviewed one proposal for ANR-AAPG JCJC.
11.1.5 Research administration
Hélène Sauzéon was:
- Member of the Research Committee of IMT Atlantique since 2025, working to promote Human and Social Sciences (SHS) in engineering education.
- Co-organizer (Inria) since 2024 of the annual "JS & GT 'Handicap'" (Thematic Days & Working Groups on Disability) and contributor to the consultation for Inria's 2025 Disability Roadmap.
- Member of the extended "BCP" of the BSO Inria centre, since 2020. Advisory roles on the centre's “surrounding” scientific policy and strategy, recruitment of permanent researchers, monitoring and assistance in setting up Inria teams, organization of internal scientific events, and writing support for communication staff on popularization content about AI, disability, health and education.
- PIQ referent for the Inria centre at the University of Bordeaux, covering 3 universities (Bordeaux, La Rochelle, Limoges), since 2024. The role is twofold: 1) to follow up and help site referents and applicants define and draft projects while ensuring compliance with PIQ program policy, in close dialogue with PIQ staff, and 2) to inform the centre's scientific management of applications in progress in Nouvelle-Aquitaine via a dedicated "pad", and of their positioning relative to the PIQ program's national results.
- Referent for the Education topic for the Inria centre at the University of Bordeaux (covering 3 teams: Bivouac, Flowers, Mnemosyne), and the centre's representative at RTP CNRS Éducation.
- Head of the Inria Associate Team CuriousTech, Inria-UW (Univ. Waterloo, Canada), since 2023. The multidisciplinary program (Prof. M. Fernandes' psychology lab and Edith Law's HCI lab) involves designing innovative interactive systems for education and cognitive health at all ages, with the distinctive feature of leveraging intrinsic motivations (self-determination and curiosity) as reinforcers of human performance.
- Member, since 2019, of the Direction Committee of IFR Handicap (Inserm), labelled Fedhra since 2023
- Member of the Direction Committee of BIND (Bordeaux centre of excellence), since 2019
- Member of the Scientific Committee of SOUND (Bordeaux centre of excellence for neurodevelopmental disorders), since 2025
- Head of the research axis on Innovating Interventions in the ACTIVE team (BPH Lab), since 2022
Cécile Mazon is:
- Co-leader of the Digital Tools work package of the PIA Atypie-Friendly
- Local contact for Inria HandiTechLab
- Member of the Digital Tools axis of the Bordeaux Excellence center for Neurodevelopmental disorders (SOUND project)
PY Oudeyer was:
- head of the Flowers AI & CogSci lab
- member of the piloting committee of the France 2030 BPI project GAIMHE
- representative of Inria in the piloting committee of the Nouvelle Aquitaine Research Network on Educational Technologies (R3NumEd)
11.2 Teaching - Supervision - Juries - Educational and pedagogical outreach
11.2.1 Teaching
Cécile Mazon is responsible for:
- Cognitive science curriculum in MIASHS bachelor (Mathematics and Computer Science applied to Social and Human Sciences), since 2024
- Technology, Ergonomics, Cognition, Disability curriculum in Cognitive Sciences master, since 2022
- Apprenticeship academic coordination for Technology, Ergonomics, Cognition, Disability curriculum in Cognitive Sciences master, since 2023
Leslie Tricoche, as ATER, gave the following courses:
- L2 MIASHS - UFR Sciences and Technology, Bordeaux University: Neurobiology (lectures and tutorials)
- L3 MIASHS - UFR Sciences and Technology, Bordeaux University: Neuropathology (lectures and tutorials)
- M1 Cognitive Sciences - UFR Sciences and Technology, Bordeaux University: Cognitive functions in situations and disabilities (lectures and tutorials)
- M2 Cognitive Sciences - UFR Sciences and Technology, Bordeaux University: Multiple forms of the profession (lectures and tutorials)
Marie-Sarah Desvaux, as Teaching Assistant, gave the following courses:
- M2 Cognitive Sciences - UFR Sciences and Technology, Bordeaux University: Multiple forms of the profession - Project management
- L3 MIASHS - UFR Sciences and Technology, Bordeaux University: Web Accessibility
Juliette Deyts, as Teaching Assistant, gave the following courses:
- M2 Cognitive Sciences - UFR Sciences and Technology, Bordeaux University: Disability, Autonomy, Cognition and Technology
Matisse Poupard, as ATER, gave the following courses:
- L1 MIASHS - UFR Sciences and Technology, Bordeaux University: Introduction to Cognitive Science
- L2 MIASHS - UFR Sciences and Technology, Bordeaux University: Neurological Foundations, Cognitive Fundamentals, and Learning
- L3 MIASHS - UFR Sciences and Technology, Bordeaux University: Knowledge and Representations, Language, and Natural Language Processing
- M1 Cognitive Sciences - UFR Sciences and Technology, Bordeaux University:
- Scientific Foundations
- Cognitive Functions in Situations and Disabilities
- M2 Cognitive Sciences - UFR Sciences and Technology, Bordeaux University:
- Disability, Activity, Cognition, Technology
- Multiple Forms of the Profession
- Virtual Reality, Interaction, and Health Applications
Cécile Mazon, as assistant professor, gave lectures and tutorials (280 h ETD) in cognitive sciences to students in the MIASHS bachelor (L1-2-3) and the Cognitive Sciences master (M1-M2). Key teaching topics include introduction to cognitive sciences, cognitive psychology (main cognitive functions, experimental methods), cognitive sciences applied to disability and/or technology design, as well as methodology and statistics.
Hélène Sauzéon participated in the Inria mentoring program as mentor of one PhD student from the Inria Paris centre.
11.2.2 Supervision
PY Oudeyer (co-)supervised the following PhD students:
- PhD defended in 2025: Grgur Kovac, "Developmental training of socio-cognitive abilities in AI systems" (supervisors: PF. Dominey and PY. Oudeyer)
- PhD defended in 2025: Gauthier Hamon, "Open-endedness in artificial life and artificial intelligence: an eco-evo-devo perspective" (supervisor: C. Moulin-Frier)
- PhD defended in 2025: Nicolas Yax, "Studying cognitive and metacognitive skills in foundation models" (supervisors: S. Palminteri, PY. Oudeyer)
- PhD defended in 2025: Clément Romac, "Grounding LLMs with online RL" (supervisors: T. Wolf and PY. Oudeyer)
- PhD in progress: Julien Rosenberger, "Models and experimental study of the co-development of curiosity and metacognition in adolescents" (supervisors: H. Sauzéon, PY. Oudeyer)
- PhD in progress: Paul Tabbara, "Autotelic generative AI systems for automated discovery in mathematics" (supervisors: G. Baudart, PY. Oudeyer)
- PhD in progress: Julien Pourcel, "Autotelic LLMs that learn how to code", (supervisors: C. Moulin-Frier and PY. Oudeyer)
- PhD in progress: Thomas Carta, "LLM-based Autotelic deep reinforcement learning agents", (supervisors: O. Sigaud, S. Lamprier and PY. Oudeyer)
- PhD in progress: Jeremy Perez, "Studying mechanisms and roles of curiosity in socio-cultural contexts" (supervisors: C. Moulin-Frier, M. Derex, PY. Oudeyer)
- PhD in progress: Timothé Boulet, "Controller synthesis for artificial agents in simulated environments using generative AI" (supervisors: C. Moulin-Frier, X. Hinault, N. Fijalkow)
- PhD in progress: Marko Cvjetko, "Autotelic exploration algorithms for automated search of open-endedness in artificial life" (supervisors: C. Moulin-Frier, PY. Oudeyer)
- PhD in progress: Loris Gaven, "Metacognitive prediction of learning progress for guiding autotelic agents" (supervisors: PY. Oudeyer and C. Moulin-Frier)
H. Sauzéon (co-)supervised the following PhD students:
- PhD defended in 2025: M. POUPARD, "Curious and thus not overloaded!" (supervisors: H. Sauzéon and A. Tricot; CIFRE with CATIE)
- PhD in progress: L. PETIOT, "AR effect on memory distortions" (supervisors: H. Sauzéon and P. Dragicevic) (AEx IAM, 2023-25)
- PhD in progress: C. DESVAUX, "Design and Assessment of metacognitive interventions supporting curiosity and creativity at school" (Alloc. MESRI, ED SP2)
- PhD in progress: J. DEYTS, "Self-determination driven technologies for healthy aging" (Alloc. from Projet ANR Innovcare)
- PhD in progress: J. ROSENBERGER, "Curiosity-driven learning as developmental function of metacognition in adolescents aged 12 to 16" (supervisors: H. Sauzéon and PY. Oudeyer) (Alloc. from ORA funds, ED SP2 - Univ. Bordeaux)
- PhD in progress: M. BOURDIL, "A neurotechnological approach using EEG for the characterisation and therapeutic treatment of small vessels syndrome" (supervisors: F. Lotte and H. Sauzéon) (Alloc. from IHU-VBHI project)
11.2.3 Juries
PY Oudeyer was a member of:
- the selection committee of the Inria Prizes from Académie des Sciences.
- the PhD juries of Marie Martin (Université Interdisciplinaire de Paris), Théo Cachet (Sorbonne University) and J. Daly (Univ. Texas, Austin)
- the PhD "comité de suivi" of Reem al Najjar (Sorbonne Université), Matisse Poupard (Univ. Bordeaux), Paul Pacaud (Université Paris Sciences Lettres)
Hélène Sauzéon was a member of 6 PhD juries:
- "Conception, développement et évaluation d'un exergame en réalité augmentée pour la rééducation cognitivo-motrice d'enfants atteints de Paralysie Cérébrale ou de Lésions Cérébrales Acquises : le projet TERAPACE" by Maxime Balloufaud - Limoges
- "Optimizing sensory feedback and manual interaction efficiency within XR experiments" by Julien Cauquis - Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire
- "Careless or care-led innovation?: socio-ethnography of social robots and social ties in eldercare settings in France and Japan: tensions and contradictions in needs, temporalities and representations" by Yuko Tamaki - Paris, EHESS
- "Neurocognitive mechanisms of self-referenced memory encoding: a naturalistic and embodied approach to episodic memory" by Sylvain Penaud - Université Paris Cité
- "Intrinsic vs. Extrinsic Motivation: Computational Modelling, Neural Bases, and Clinical Applications" by Jade Seguin - Sorbonne université
- "Prise de décision lors de la planification d'itinéraires avec des applications : une approche cognitive pour la régulation des flux voyageurs dans les transports en commun" by Archana Prabhakar - Université Paris Cité
Cécile Mazon is a permanent member of the jury for Cognitive Sciences master thesis defenses (M1/M2) and for bachelor undergraduate projects (L3 MIASHS).
11.2.4 Support to public policies
PY. Oudeyer, H. Sauzéon and the whole team were involved in several major actions to support public policies on AI and education. Members of the team designed and conducted training sessions in different academies for supervisory staff and teachers, e.g. the ETAPP-IA day in Nouvelle-Aquitaine (January 2025); departmental training of CPE and teacher-librarians of Nouvelle-Aquitaine during a day at the Lycée Les Iris in Lormont (May 2025); Academic Days of Innovation for teachers of Nouvelle-Aquitaine, Spring Days of Education Research at INSPEs (June 2025); the PhilosophIA Citizens' Convention (April 2025); the twin conference of Cnesco/Cardie Charente-Maritime (January 2025); and the working group Education and Cognitive Sciences of the academies of Créteil, Versailles and Paris, scheduled for March 2026.
H. Sauzéon and PY. Oudeyer were interviewed and wrote reports as contributions to the French Senate's report on AI and education.
PY Oudeyer was heard by the commission on cultural and educational affairs of the French parliament, to discuss the major challenges and opportunities of AI and education.
11.2.5 Educational and pedagogical outreach
Cécile Mazon participated in events promoting university programs in cognitive science: the Salon de l’Étudiant (January 2025), the University of Bordeaux Open Days (January 2025), and the Orientation Days (May 2025).
11.3 Popularization
11.3.1 Specific official responsibilities in science outreach structures
PY. Oudeyer collaborated with the Pix organization as main scientific and editorial design consultant for the Pix IA training modules, which will be disseminated to all French students in 4ème, 2nde and CAP in 2026.
11.3.2 Productions (articles, videos, podcasts, serious games, ...)
PY. Oudeyer gave several public talks on AI and education available on a YouTube channel.
D. Roy and P-Y. Oudeyer wrote a popular science book to introduce generative AI (mechanisms, applications, societal dimensions) to adolescents, as well as to their teachers and families. It is entitled "C'est (pas) moi, c'est l'IA", and was published in September 2024 by Nathan. It was reviewed in widely distributed magazines (e.g. Magazine de l'APEL) and on radio stations (e.g. France Culture, RFI). The web page of the book is here: link.
A. Torres-Leguet, C. Romac, T. Carta and PY. Oudeyer produced the pedagogical video series "ChatGPT explained in 5 mn", aimed at training generative AI literacy in a wide diversity of students (e.g. high school), available here: link. The videos are released under a Creative Commons licence, CC-BY, enabling open and free reuse. They have already been integrated in the MOOC AI4T (link), in an internal training platform of "Académie du Numérique du Ministère de la défense", and in a mobile app made by Inria with educational materials related to AI (link), and are being adapted and integrated in a training platform for the whole population of civil servants in France, coordinated by DINUM.
PY Oudeyer wrote a note for the French educational institutions on "IA générative, société et éducation: En quoi l’IA générative représente-t-elle un enjeu dans la formation des citoyens ?", in the context of the Conférence de Consensus on Nouveaux Savoirs et Nouvelles Compétences des Jeunes of Cnesco (Nov. 2024).
Hélène Sauzéon wrote a web article on the following topic: "Why agency is a key ability in the workplace"
Hélène Sauzéon participated in the "Mental Health and Technology" podcast organized by BPH - Inserm (October 2025) in Bordeaux
Marie-Sarah Desvaux was interviewed by Curieux! Live on Educational Technologies for learning
11.3.3 Participation in Live events
Jeremy Perez and Clément Romac gave a presentation on Artificial Intelligence to high school teachers as part of the "Journée formation IA pour les enseignant.es" at the Bordeaux INRIA Center on February 5th.
Clément Romac gave a talk on generative AIs to La main à la pâte, a French association promoting science in classrooms.
Hélène Sauzéon, Cécile Mazon, Sophie Lepennetier, Julien Rosenberger, Loris Gaven, Paul Tabbara and Julien Pourcel hosted a stand at the Village des Sciences on October 11th and 12th. It was an opportunity to introduce curiosity to visitors of CapScience, especially children and parents.
Hélène Sauzéon and Marie-Sarah Desvaux led a workshop on curiosity-driven learning for teachers and trainers during the "Learning Show" 2025 (13th of October) in Rennes
Hélène Sauzéon gave a talk at the event organized by "Science with and for Society" at the University of Bordeaux: Samedis Sciences #4 "Artificial Intelligence and Education: the future of learning?"
Hélène Sauzéon gave a talk at the "Journée académique de l’expérimentation" event organized by CARDIE Grenoble, Grenoble (14 May 2025)
Hélène Sauzéon participated in "CoAnimation" for the Portes Fermées at INRIA, for a workshop to promote dialogue between digital and social sciences
Hélène Sauzéon participated in "Circuit scientifique Hors les murs" in October 2025
Hélène Sauzéon participated in the Chiche program, visiting 2 to 3 classrooms
Marie-Sarah Desvaux gave a talk on the use of Generative AI in classrooms to INSPE students of University of Bordeaux (March 2025)
Marie-Sarah Desvaux led a workshop on the use of Generative AI in classrooms during the Journée Académique (August 2025) organized by CARDIE Poitiers
Marie-Sarah Desvaux gave an interactive talk on curiosity-driven learning in classrooms to teachers during the Cogni'Forum 2025 (October) organized by "Apprendre et Former avec les Sciences Cognitives"
PY Oudeyer participated in several live events:
- (Feb 2025) Presentation of the researcher job to high school students at La Sauque high-school (Nouvelle-Aquitaine)
- (March 2025) La créativité dans tous ses états, with E. Koechlin, F. Guedy, P. Ribault, organized by Institute of Advanced Studies, Paris.
- (Nov 2025) Presentation of the societal stakes of AI to a group of high-school students in Compiègne, in the context of the Roberval prize event.
- (Jan 2025) D. Roy and PY. Oudeyer, interview and presentation of the book "C'est (pas) moi, c'est l'IA", during a meeting with middle-school students organized by the Mollat bookshop.
- (Feb. 2025) PY Oudeyer, Curiosité, cognition et intelligence artificielle : Comment mieux apprendre à apprendre ?, MIA Seconde conference series.
- (Oct 2025) PY Oudeyer, IA générative, société et éducation, for the conference on AI organized by PolarIA at Fleurance, in the context of the annual science festival.
11.3.4 Other relevant science outreach activities
Press:
PY. Oudeyer was interviewed, or the work of the team was discussed, in various newspapers, magazines and radios/podcasts:
- Epsiloon (Oct. 2025), IA, et maintenant les robots !
- Télérama (August 2025), Le risque, c’est l’affaiblissement de la pensée: quand l’IA met l’école à l’épreuve
- Version Femina (August 2025), Comment leur apprendre à utiliser l’IA?
- Télérama (May 2025), Intelligence artificielle : comment bien accompagner les enfants ?
- Le Monde (Feb. 2025), L’IA à l’école, une révolution déjà en marche
- Magazine de l'APEL (Jan 2025), L'IA, un outil pour la différenciation pédagogique (interview with PY. Oudeyer)
12 Scientific production
12.1 Major publications
- 1 inproceedings Interactive environments for training children’s curiosity through the practice of metacognitive skills : a pilot study. IDC 2023 - The 22nd annual ACM Interaction Design and Children Conference, Chicago IL, United States, ACM, November 2023, 495-501. HAL DOI
- 2 article Conversational agents for fostering curiosity-driven learning in children. International Journal of Human-Computer Studies 167, November 2022, 102887. HAL DOI
- 3 article GPT-3-driven pedagogical agents for training children's curious question-asking skills. International Journal of Artificial Intelligence in Education, June 2023. HAL DOI
- 4 article An Open-Source Cognitive Test Battery to Assess Human Attention and Memory. Frontiers in Psychology 13, June 2022. HAL DOI
- 5 article Active Learning of Inverse Models with Intrinsically Motivated Goal Exploration in Robots. Robotics and Autonomous Systems 61(1), January 2013, 69-73. HAL DOI
- 6 inproceedings Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning. International Conference on Machine Learning 2023, 202, 3676-3713, Honolulu, Hawaii, United States, 2023. HAL
- 7 article Towards Truly Accessible MOOCs for Persons with Cognitive Impairments: a Field Study. Human-Computer Interaction, 2021. HAL
- 8 inproceedings CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning. International Conference on Machine Learning, Long Beach, United States, June 2019. HAL
- 9 inproceedings Language as a Cognitive Tool to Imagine Goals in Curiosity-Driven Exploration. NeurIPS 2020 - 34th Conference on Neural Information Processing Systems, Contains main article and supplementaries, Vancouver / Virtual, Canada, December 2020. HAL
- 10 inproceedings Hierarchically Organized Latent Modules for Exploratory Search in Morphogenetic Systems. NeurIPS 2020 - 34th Conference on Neural Information Processing Systems, Vancouver / Virtual, Canada, December 2020. HAL
- 11 article AI-driven Automated Discovery Tools Reveal Diverse Behavioral Competencies of Biological Networks. eLife, August 2024. HAL DOI
- 12 article Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning. Journal of Machine Learning Research, April 2022. HAL
- 13 inproceedings MAGELLAN: Metacognitive predictions of learning progress guide autotelic LLM agents in large goal spaces. ICML 2025 - 42nd International Conference on Machine Learning, 267, Vancouver (BC), Canada, 2025. HAL
- 14 article Towards a neuroscience of active sampling and curiosity. Nature Reviews Neuroscience 19(12), December 2018, 758-770. HAL
- 15 article Discovering Sensorimotor Agency in Cellular Automata using Diversity Search. Science Advances 11(44), 2025. HAL DOI
- 16 article Stick to your role! Stability of personal values expressed in large language models. PLoS ONE 19(8), August 2024, e0309114. HAL DOI
- 17 inproceedings Curiosity Driven Exploration of Learned Disentangled Goal Spaces. CoRL 2018 - Conference on Robot Learning, Zürich, Switzerland, October 2018. HAL
- 18 article Pilot study of an intervention based on an intelligent tutoring system (ITS) for instructing mathematical skills of students with ASD and/or ID. Education and Information Technologies, 2022. HAL DOI
- 19 inproceedings Grounding an Ecological Theory of Artificial Intelligence in Human Evolution. NeurIPS 2021 - Conference on Neural Information Processing Systems / Workshop: Ecological Theory of Reinforcement Learning, virtual event, France, December 2021. HAL
- 20 inproceedings Autotelic Reinforcement Learning in Multi-Agent Environments. CoLLAs 2023, Conference on Lifelong Learning Agents, Montréal, Canada, August 2023. HAL
- 21 inproceedings Collective Innovation in Groups of Large Language Models. ALIFE 2024 - The Conference on Artificial Life, Copenhagen, Denmark, MIT Press, 2024. HAL DOI
- 22 inproceedings Unsupervised Learning of Goal Spaces for Intrinsically Motivated Goal Exploration. ICLR2018 - 6th International Conference on Learning Representations, Vancouver, Canada, April 2018. HAL
- 23 inproceedings When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings. The Thirteenth International Conference on Learning Representations (ICLR 2025), Singapore, Singapore, 2025. HAL
- 24 inproceedings Flow-Lenia: Towards open-ended evolution in cellular automata through mass conservation and parameter localization. The 2023 Conference on Artificial Life, Tokyo, Japan, MIT Press, July 2023. HAL DOI
- 25 inproceedings Automatic Curriculum Learning For Deep RL: A Short Survey. IJCAI 2020 - International Joint Conference on Artificial Intelligence, Kyoto / Virtual, Japan, January 2021. HAL
- 26 article Using virtual reality for enhancing neuroanatomy learning by optimizing cognitive load and intrinsic motivation. Computers and Education 235, October 2025, 105332. HAL DOI
- 27 inproceedings ACES: Generating diverse programming puzzles with autotelic language models and semantic descriptors. NeurIPS 2024 - The 38th Annual Conference on Neural Information Processing Systems, Vancouver, Canada, 2024. HAL
- 28 article Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI. Proceedings of Machine Learning Research, 2025. HAL DOI
- 29 inproceedings Intrinsically Motivated Discovery of Diverse Patterns in Self-Organizing Systems. International Conference on Learning Representations (ICLR), Source code and videos at https://automated-discovery.github.io/, Addis Ababa, Ethiopia, April 2020. HAL
- 30 article The beneficial role of curiosity on route memory in children. Frontiers in Cognition 3, March 2024. HAL DOI
- 31 article Humans monitor learning progress in curiosity-driven exploration. Nature Communications 12(1), December 2021. HAL DOI
- 32 inproceedings Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding. IUI 2023 - 28th International Conference on Intelligent User Interfaces, Sydney, Australia, ACM, March 2023, 75-78. HAL DOI
- 33 inproceedings PhyloLM : Inferring the Phylogeny of Large Language Models and Predicting their Performances in Benchmarks. ICLR 2025, Singapore, Singapore, 2025. HAL
12.2 Publications of the year
International journals
International peer-reviewed conferences
National peer-reviewed Conferences
Conferences without proceedings
Doctoral dissertations and habilitation theses
Reports & preprints
Scientific popularization
12.3 Cited publications
- 68 inproceedingsInteractive environments for training children's curiosity through the practice of metacognitive skills : a pilot study.IDC 2023 - The 22nd annual ACM Interaction Design and Children ConferenceChicago IL, United StatesACMJune 2023, 495-501HALDOIback to text
- 69 articleConversational agents for fostering curiosity-driven learning in children.International Journal of Human-Computer Studies167November 2022, 102887HALDOIback to textback to text
- 70 inproceedings Generative AI in the Classroom: Can Students Remain Active Learners? NeurIPS 2023 - GAIED Workshop - Conference on Neural Information Processing Systems New orleans, USA, United States arXiv December 2023 HAL DOI back to text
- 71 unpublishedExploring the Potential of Artificial Intelligence in Individualized Cognitive Training: a Systematic Review.December 2023, working paper or preprintHALDOIback to text
- 72 inproceedingsPedagogical Agents for Fostering Question-Asking Skills in Children.CHI '20 - CHI Conference on Human Factors in Computing SystemsHonolulu / Virtual, United StatesApril 2020HALDOIback to text
- 73 articleDevelopment and validation of a multi-dimensional measure of intellectual humility.PloS one1282017, e0182950back to textback to text
- 74 inproceedingsTowards measuring states of epistemic curiosity through electroencephalographic signals.IEEE SMC 2020 - IEEE International conference on Systems, Man and CyberneticsToronto / Virtual, CanadaOctober 2020HALback to textback to text
- 75 inproceedingsLearning to Guide and to Be Guided in the Architect-Builder Problem.International Conference on Learning RepresentationsVirtual, FranceApril 2022HALback to text
- 76 inproceedingsMinimal Criterion Coevolution: A New Approach to Open-Ended Search.Proceedings of the Genetic and Evolutionary Computation ConferenceGECCO '172017, 67--74back to text
- 77 articleMachine Culture.Nature Human Behaviour711November 2023, 1855--1868DOIback to text
- 78 articleSocial network architecture and the tempo of cumulative cultural evolution.9back to text
- 79 articleGrounding large language models in interactive environments with online reinforcement learning.arXiv preprint arXiv:2302.026622023back to text
- 80 articleIdentifying Functions and Behaviours of Social Robots for In-Class Learning Activities: Teachers' Perspective.International Journal of Social RoboticsSeptember 2021HALDOIback to text
- 81 proceedingsLenia and Expanded Universe.ALIFE 2020: The 2020 Conference on Artificial LifeALIFE 2021: The 2021 Conference on Artificial Life07 2020, 221-229URL: https://doi.org/10.1162/isal_a_00297DOIback to textback to text
- 82 articleLenia-biology of artificial life.Complex Systems2832019, 251-286back to text
- 83 miscOn the Measure of Intelligence.November 2019DOIback to text
- 84 articlePlay, Curiosity, and Cognition.Annual Review of Developmental Psychology212020, 317-343URL: https://doi.org/10.1146/annurev-devpsych-070120-014806DOIback to text
- 85 articleIn Praise of Folly: Flexible Goals and Human Cognition.Trends in Cognitive Sciences287July 2024, 628--642DOIback to text
- 86 phdthesisAdaptive Personalization of Pedagogical Sequences using Machine Learning.Université de BordeauxDecember 2018HALback to textback to text
- 87 articleMulti-Armed Bandits for Intelligent Tutoring Systems.Journal of Educational Data Mining (JEDM)72June 2015, 20--48HALback to textback to text
- 88 inproceedingsLanguage as a Cognitive Tool to Imagine Goals in Curiosity Driven Exploration.Advances in Neural Information Processing Systems33Curran Associates, Inc.2020, 3761--3774URL: https://proceedings.neurips.cc/paper/2020/hash/274e6fcf4a583de4a81c6376f17673e7-Abstract.htmlback to text
- 89 articleLanguage and culture internalization for human-like autotelic AI.412December 2022, 1068--1076URL: https://doi.org/10.1038/s42256-022-00591-4DOIback to textback to text
- 90 article. Autotelic Agents with Intrinsically Motivated Goal-Conditioned Reinforcement Learning: A Short Survey. Journal of Artificial Intelligence Research 74, July 2022, 1159-1199. URL: https://www.jair.org/index.php/jair/article/view/13554
- 91 unpublished. Intrinsically Motivated Goal-Conditioned Reinforcement Learning: a Short Survey. January 2021, working paper or preprint.
- 92 article. Evolution of multicellularity by collective integration of spatial information. eLife 9, October 2020, e56349. URL: https://doi.org/10.7554/eLife.56349
- 93 article. Goals as Reward-Producing Programs. Nature Machine Intelligence 7(2), February 2025, 205-220.
- 94 article. Partial connectivity increases cultural accumulation within groups. Proceedings of the National Academy of Sciences 113(11), March 2016, 2982-2987. URL: http://www.pnas.org/lookup/doi/10.1073/pnas.1518798113
- 95 article. Cumulative Cultural Evolution within Evolving Population Structures. Trends in Cognitive Sciences 24(8), 2020, 654-667.
- 96 phdthesis. Curiosity-driven AI for Science: Automated Discovery of Self-Organized Structures. Université de Bordeaux, November 2023.
- 97 misc. Intrinsically Motivated Discovery of Diverse Patterns in Self-Organizing Systems. Self-organisation occurs in many physical, chemical and biological systems, as well as in artificial systems like the Game of Life. Yet, these systems are still full of mysteries and we are far from fully grasping what structures can self-organize, how to represent and classify them, and how to predict their evolution. In this blog post, we present our recent paper which formulates the problem of automated discovery of diverse self-organized patterns in such systems. Using a continuous Game of Life as a testbed, we show how intrinsically-motivated goal exploration processes, initially developed for learning of inverse models in robotics, can efficiently be transposed to this novel application area. March 2020.
- 98 inproceedings. Hierarchically Organized Latent Modules for Exploratory Search in Morphogenetic Systems. NeurIPS 2020 - 34th Conference on Neural Information Processing Systems, Vancouver / Virtual, Canada, December 2020.
- 99 article. AI-driven Automated Discovery Tools Reveal Diverse Behavioral Competencies of Biological Networks. eLife, August 2024.
- 100 misc. OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code. 2025. URL: https://arxiv.org/abs/2405.15568
- 101 article. Using Confounded Data in Latent Model-Based Reinforcement Learning. Transactions on Machine Learning Research Journal, August 2023.
- 102 article. Information-seeking, curiosity, and attention: computational and neural mechanisms. Trends in Cognitive Sciences 17(11), November 2013, 585-593.
- 103 article. Curiosity as a Metacognitive Feeling. Cognition 231, February 2023, 105325.
- 104 article. Creativity: Yesterday, Today, and Tomorrow. The Journal of Creative Behavior 1(1), 1967, 3-14.
- 105 article. What's in It for Them? The Role of Social Curiosity and Social Needs in Motivating and Retaining Hospitality Employees. International Journal of Hospitality Management 115, 2023, 1-12.
- 106 article. Curiosity Made the Cat More Creative: Specific Curiosity as a Driver of Creativity. Organizational Behavior and Human Decision Processes 150, 2019, 1-13.
- 107 article. The genetical evolution of social behaviour. I. Journal of Theoretical Biology 7(1), July 1964, 1-16. URL: https://www.sciencedirect.com/science/article/pii/0022519364900384
- 108 article. The genetical evolution of social behaviour. II. Journal of Theoretical Biology 7(1), 1964, 17-52. URL: https://www.sciencedirect.com/science/article/pii/0022519364900396
- 109 article. The Impact of Metacognitive Instruction on Creative Problem Solving. Journal of Experimental Education 83(3), 2015, 291-318.
- 110 article. Perceived and Actual Social Discrimination: The Case of Overweight and Social Inclusion. Frontiers in Psychology 4, 2013.
- 111 article. Understanding cumulative cultural evolution. Proceedings of the National Academy of Sciences 113(44), 2016, E6724-E6725.
- 112 article. The Role of Metacognitive Components in Creative Thinking. Frontiers in Psychology 10, 2019.
- 113 article. In Search of the Neural Circuits of Intrinsic Motivation. Frontiers in Neuroscience 1(1), October 2007, 225-236. URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2518057/
- 114 article. The Five-Dimensional Curiosity Scale Revised (5DCR): Briefer Subscales While Separating Overt and Covert Social Curiosity. Personality and Individual Differences 157, April 2020, 109836.
- 115 article. The five-dimensional curiosity scale: Capturing the bandwidth of curiosity and identifying four unique subgroups of curious people. Journal of Research in Personality 73, 2018, 130-149.
- 116 article. Biological robustness. Nature Reviews Genetics 5(11), 2004, 826-837.
- 117 article. Capturing, Clarifying, and Consolidating the Curiosity-Creativity Connection. Scientific Reports 12(1), September 2022, 15300.
- 118 unpublished. SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents. October 2021, working paper or preprint.
- 119 article. Links between intellectual humility and acquiring knowledge. The Journal of Positive Psychology 15(2), 2020, 155-170.
- 120 article. The extended evolutionary synthesis: its structure, assumptions and predictions. Proceedings of the Royal Society B: Biological Sciences 282(1813), 2015, 20151019.
- 121 article. The Network Structure of Exploration and Exploitation. Administrative Science Quarterly 52(4), December 2007, 667-694. URL: http://journals.sagepub.com/doi/10.2189/asqu.52.4.667
- 122 inproceedings. Multi-Agent Reinforcement Learning in Sequential Social Dilemmas. Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, AAMAS '17, São Paulo, Brazil, 2017, 464-473.
- 123 article. Children as Agents of Cultural Adaptation. The Behavioral and Brain Sciences, December 2024, 1-68.
- 124 article. Propagation of innovations in networked groups. Journal of Experimental Psychology: General 137(3), 2008, 422-433. URL: http://doi.apa.org/getdoi.cfm?doi=10.1037/a0012798
- 125 article. Pilot study of an intervention based on an intelligent tutoring system (ITS) for instructing mathematical skills of students with ASD and/or ID. Education and Information Technologies, 2022.
- 126 article. Fostering parents-professional collaboration for facilitating the school inclusion of students with ASD: Design of the "ToGather" web-based prototype. Educational Technology Research and Development, December 2021.
- 127 article. Effectiveness and usability of technology-based interventions for children and adolescents with ASD: A systematic review of reliability, consistency, generalization and durability related to the effects of intervention. Computers in Human Behavior 93, April 2019.
- 128 incollection. Utilisation des technologies mobiles auprès des enfants avec TSA. Autisme et usages du numériques en éducation, 2022.
- 129 inproceedings. Systematic review of technologies to collaborate and co-educate students with special educational needs and supporting their schooling. IHIET 2023 - 10th International Conference on Human Interaction and Emerging Technologies, vol. 111, Nice, France, AHFE International, August 2023, 1-12.
- 130 article. Hunter-gatherer multilevel sociality accelerates cumulative cultural evolution. Science Advances 6(9), February 2020, eaax5913.
- 131 phdthesis. The Ecology of Open-Ended Skill Acquisition. Université de Bordeaux (UB), December 2022.
- 132 article. A Reward-Learning Framework of Knowledge Acquisition: An Integrated Account of Curiosity, Interest, and Intrinsic-Extrinsic Rewards. Psychological Review 129(1), 2022, 175-198.
- 133 article. Process Account of Curiosity and Interest: A Reward-Learning Perspective. Educational Psychology Review 31(4), December 2019, 875-895. URL: http://link.springer.com/10.1007/s10648-019-09499-9
- 134 techreport. Massively Parallel Methods for Deep Reinforcement Learning. arXiv:1507.04296 [cs], arXiv, July 2015. URL: http://arxiv.org/abs/1507.04296
- 135 article. The impact of generative artificial intelligence on students' higher order thinking: Evidence from a three-level meta-analysis. Education and Information Technologies, 2025, 1-32.
- 136 unpublished. Social Network Structure Shapes Innovation: Experience-sharing in RL with SAPIENS. July 2022, working paper or preprint.
- 137 misc. Social Network Structure Shapes Innovation: Experience-sharing in RL with SAPIENS. arXiv:2206.05060 [cs], November 2022. URL: http://arxiv.org/abs/2206.05060
- 138 inproceedings. Collective Innovation in Groups of Large Language Models. MIT Press, July 2024. URL: https://dx.doi.org/10.1162/isal_a_00730
- 139 inproceedings. Collective Innovation in Groups of Large Language Models. ALIFE 2024 - The Conference on Artificial Life, Copenhagen, Denmark, MIT Press, July 2024.
- 140 misc. Open-endedness: The last grand challenge you've never heard of. December 2017. URL: https://www.oreilly.com/radar/open-endedness-the-last-grand-challenge-youve-never-heard-of/
- 141 article. Intrinsic Motivation Systems for Autonomous Mental Development. IEEE Transactions on Evolutionary Computation 11(2), 2007, 265-286.
- 142 article. Intrinsic Motivation for Autonomous Mental Development. IEEE Transactions on Evolutionary Computation 11(2), January 2007, 265-286.
- 143 article. Intrinsic motivation, curiosity, and learning: Theory and applications in educational technologies. Progress in Brain Research 229, 2016, 257-284.
- 144 inproceedings. When LLMs Play the Telephone Game: Cultural Attractors as Conceptual Tools to Evaluate LLMs in Multi-turn Settings. The Thirteenth International Conference on Learning Representations (ICLR 2025), Singapore, Singapore, April 2025.
- 145 article. A multi-agent reinforcement learning model of common-pool resource appropriation. Advances in Neural Information Processing Systems 30, 2017.
- 146 article. Curious about Others: Relational and Empathetic Curiosity for Diverse Societies. New Formations 88(88), March 2016, 123-142.
- 147 book. The Language and Thought of the Child. Oxford, England, Harcourt, Brace, 1926, xxiii, 246.
- 148 article. A manual for the use of the Motivated Strategies for Learning Questionnaire (MSLQ). 1991.
- 149 inproceedings. Flow-Lenia: Towards open-ended evolution in cellular automata through mass conservation and parameter localization. The 2023 Conference on Artificial Life, Tokyo, Japan, MIT Press, July 2023.
- 150 inproceedings. ACES: Generating diverse programming puzzles with autotelic language models and semantic descriptors. NeurIPS 2024 - The 38th Annual Conference on Neural Information Processing Systems, Vancouver, Canada, December 2024.
- 151 article. Creative Metacognitive Feelings as a Source of Information for Creative Self-efficacy, Creativity Potential, Intrapersonal Idea Selection, and Task Enjoyment. The Journal of Creative Behavior 54(3), 2020, 499-507.
- 152 article. Quality Diversity: A New Frontier for Evolutionary Computation. Frontiers in Robotics and AI 3, 2016. URL: https://www.frontiersin.org/articles/10.3389/frobt.2016.00040
- 153 incollection. Conception d'une application de soutien à la coéducation pour l'inclusion scolaire des élèves TSA. Éthiques inclusives en éducation. Recherches, contextes et pratiques (p. 145-160). Parentalité & Handicap, Champs Social, 2023, 260.
- 154 inproceedings. Cross-cultural evaluation of a web application to support communication and collaboration among stakeholders of the school inclusion of children with ASD. AAATE 2023 - The 17th International Conference of the Association for the Advancement of Assistive Technology in Europe, AAATE, Paris, France, August 2023.
- 155 unpublished. ToGather, an interactive website for the stakeholders of school inclusion of children with ASD: an iterative design including user testing. 2022, working paper or preprint.
- 156 article. Connections between Curiosity, Flow and Creativity. Personality and Individual Differences 152, January 2020, 109555.
- 157 article. Strange new universes: Proof assistants and synthetic foundations. Bulletin of the American Mathematical Society 61(2), 2024, 257-270.
- 158 misc. A Definition of Open-Ended Learning Problems for Goal-Conditioned Agents. June 2024.
- 159 unpublished. The Beneficial Role of Curiosity on Route memory in Children. January 2024, working paper or preprint.
- 160 article. Group selection and kin selection. Nature 201, 1964, 1145-1147. URL: https://doi.org/10.1038/2011145a0
- 161 article. Designing Neural Networks through Neuroevolution. Nature Machine Intelligence 1(1), January 2019, 24-35.
- 162 article. The sense of agency scale: A measure of consciously perceived control over one's mind, body, and the immediate environment. Frontiers in Psychology 8, 2017, 1552.
- 163 incollection. Curiosity-Driven Exploration: Diversity of Mechanisms and Functions. The Drive for Knowledge: The Science of Human Information Seeking, 2022.
- 164 article. Behavioral measures of humility: Part 1. Theoretical and methodological review. The Journal of Positive Psychology 18(5), 2023, 711-721.
- 165 article. Metacognition bridges experiences and beliefs in sense of agency. Consciousness and Cognition 124, 2024, 103745.
- 166 article. Trustworthy artificial intelligence (AI) in education: Promises and challenges. OECD Education Working Papers 218, 2020, 0_1-17.
- 167 article. Investment and intellect: a review and meta-analysis. Psychological Bulletin 139(4), 2013, 841.
- 168 book. Mind in society: Development of higher psychological processes. Harvard University Press, 1978.
- 169 article. Intellectual humility. Philosophy and Phenomenological Research 94(3), 2017, 509-539.
- 170 inproceedings. Supporting Qualitative Analysis with Large Language Models: Combining Codebook with GPT-3 for Deductive Coding. IUI 2023 - 28th International Conference on Intelligent User Interfaces, Sydney, Australia, ACM, March 2023, 75-78.