2025 Activity report, Project-Team LARSEN
RNSR: 201521241C
- Research center: Inria Centre at Université de Lorraine
- In partnership with: Université de Lorraine, CNRS
- Team name: Lifelong Autonomy and interaction skills for Robots in a Sensing ENvironment
- In collaboration with: Laboratoire lorrain de recherche en informatique et ses applications (LORIA)
Creation of the Project-Team: 2017 December 01
Keywords
Computer Science and Digital Science
- A5. Interaction, multimedia and robotics
- A5.10. Robotics
- A5.10.3. Planning
- A5.10.4. Robot control
- A5.10.5. Robot interaction (with the environment, humans, other robots)
- A5.10.6. Swarm robotics
- A5.10.8. Cognitive robotics and systems
- A8.2. Optimization
- A8.2.2. Evolutionary algorithms
- A8.11. Game Theory
- A9. Artificial intelligence
- A9.2. Machine learning
- A9.2.3. Reinforcement learning
- A9.2.5. Bayesian methods
- A9.5. Robotics and AI
- A9.7. AI algorithmics
- A9.9. Distributed AI, Multi-agent
Other Research Topics and Application Domains
- B1.2.2. Cognitive science
- B4. Energy
- B5.1. Factory of the future
- B5.6. Robotic systems
- B7. Transport and logistics
- B7.2.1. Smart vehicles
- B9.6. Humanities
- B9.6.1. Psychology
1 Team members, visitors, external collaborators
Research Scientists
- Francis Colas [Team leader, INRIA, Researcher, HDR]
- Olivier Buffet [INRIA, Researcher, HDR]
- Bruno Scherrer [INRIA, Researcher, HDR]
Faculty Members
- Amine Boumaza [UL, Associate Professor Delegation, from Sep 2025]
- Amine Boumaza [UL, Associate Professor, until Aug 2025]
- Sophie Lemonnier [UL, Associate Professor Delegation, until Aug 2025]
- Alexis Scheuer [UL, Associate Professor]
- Vincent Thomas [UL, Associate Professor]
PhD Students
- Raphael Boige [UL]
- Aubin Delaveau [AIRBUS, CIFRE, from Feb 2025]
- Salome Lepers [UL, until Nov 2025]
- Antonin Rousseau [UL, from Nov 2025]
- Aya Yaacoub [CNRS, until Sep 2025]
Technical Staff
- Olivier Rochel [INRIA, Engineer]
Interns and Apprentices
- Emna Debbech [UL, from Jun 2025 until Sep 2025]
- Camille Desplas [INRIA, Intern, from May 2025 until Jun 2025]
- Nada El Hanafi [INRIA, Intern, from Jun 2025 until Aug 2025]
- Jarod Galbrun [ENS DE LYON, Intern, from Jun 2025 until Jul 2025]
- Paul Loisil [UL, Intern, from Mar 2025 until Aug 2025]
- Adrien Naigeon [INRIA, Intern, from Apr 2025 until Jul 2025]
- Nicolas Queignec [INRIA, Intern, from Mar 2025 until Jul 2025]
- Bryan Rosenstiehl [UL, Intern, from Mar 2025 until Sep 2025]
Administrative Assistants
- Véronique Constant [INRIA]
- Antoinette Courrier [CNRS]
External Collaborator
- Sophie Lemonnier [UL, from Nov 2025]
2 Overall objectives
The goal of the Larsen team is to move robots beyond the research laboratories and manufacturing industries: current robots are far from being the fully autonomous, reliable, and interactive robots that could co-exist with us in our society and run for days, weeks, or months. While there is undoubtedly progress to be made on the hardware side, robotic platforms are quickly maturing and we believe the main challenges to achieve our goal are now on the software side. We want our software to be able to run on low-cost mobile robots that are therefore not equipped with high-performance sensors or actuators, so that our techniques can realistically be deployed and evaluated in real settings, such as in service and assistive robotic applications. We envision that these robots will be able to cooperate with each other but also with intelligent spaces or apartments which can also be seen as robots spread in the environment. Like robots, intelligent spaces are equipped with sensors that make them sensitive to human needs, habits, gestures, etc., and actuators to be adaptive and responsive to environment changes and human needs. These intelligent spaces can give robots improved skills, with less expensive sensors and actuators enlarging their field of view of human activities, making them able to behave more intelligently and with better awareness of people evolving in their environment. As robots and intelligent spaces share common characteristics, we will use, for the sake of simplicity, the term robot for both mobile robots and intelligent spaces.
Among the particular issues we want to address, we aim at designing robots that are able to:
- handle dynamic environments and unforeseen situations;
- cope with physical damage;
- interact physically and socially with humans;
- collaborate with each other;
- exploit the multitude of sensor measurements from their surroundings;
- enhance their acceptability and usability by end-users without robotics background.
All these abilities can be summarized by the following two major objectives:
- life-long autonomy: continuously perform tasks while adapting to sudden or gradual changes in both the environment and the morphology of the robot;
- natural interaction with robotics systems: interact with both other robots and humans for long periods of time, taking into account that people and robots learn from each other when they live together.
Note that, this year, the Hucebot spin-off team has separated from the Larsen team. The rest of the team is proposing the Magda follow-up team, which is under examination.
3 Research program
3.1 Lifelong autonomy
Scientific context
So far, only a few autonomous robots have been deployed for a long time (weeks, months, or years) outside of factories and laboratories. They are mostly mobile robots that simply “move around” (e.g., vacuum cleaners or museum “guides”) and data collecting robots (e.g., boats or underwater “gliders” that collect data about the water of the ocean).
A large part of the long-term autonomy community is focused on simultaneous localization and mapping (SLAM), with a recent emphasis on changing and outdoor environments 21, 28. A more recent theme is life-long learning: during long-term deployment, we cannot hope to equip robots with everything they need to know, therefore some things will have to be learned along the way. Most of the work on this topic leverages machine learning and/or evolutionary algorithms to improve the ability of robots to react to unforeseen changes 21, 26.
Main challenges
The first major challenge is to endow robots with a stable situation awareness in open and dynamic environments. This covers both the state estimation of the robot by itself as well as the perception/representation of the environment. Both problems have been claimed to be solved but it is only the case for static environments 25.
In the Larsen team, we aim at deployment in environments shared with humans, which implies dynamic objects that degrade both the mapping and the localization of a robot, especially in cluttered spaces. Moreover, when robots stay in the environment longer than the time needed to acquire a snapshot map, they have to face structural changes, such as the displacement of a piece of furniture or the opening or closing of a door. The current approach is to simply update an implicitly static map with all observations, without attempting to distinguish which changes are relevant. For localization in not-too-cluttered or not-too-empty environments, this is generally sufficient since a significant fraction of the environment should remain stable. But for life-long autonomy, and in particular for navigation, the quality of the map, and especially the knowledge of its stable parts, is essential.
A second major obstacle to moving robots outside of labs and factories is their fragility: Current robots often break in a few hours, if not a few minutes. This fragility mainly stems from the overall complexity of robotic systems, which involve many actuators, many sensors, and complex decisions, and from the diversity of situations that robots can encounter. Low-cost robots exacerbate this issue because they can be broken in many ways (high-quality material is expensive), because they have low self-sensing abilities (sensors are expensive and increase the overall complexity), and because they are typically targeted towards non-controlled environments (e.g., houses rather than factories, in which robots are protected from most unexpected events). More generally, this fragility is a symptom of the lack of adaptive abilities in current robots.
Angle of attack
To solve the state estimation problem, our approach is to combine classical estimation filters (Extended Kalman Filters, Unscented Kalman Filters, or particle filters) with a Bayesian reasoning model in order to internally simulate various configurations of the robot in its environment. This should allow for adaptive estimation that can be used as one aspect of long-term adaptation. To handle dynamic and structural changes in an environment, we aim at assessing, for each piece of observation, whether it is static or not.
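As an illustration of the predict/correct cycle shared by the filters named above, here is a minimal sketch of a scalar, linear Kalman filter. It is deliberately simpler than an Extended or Unscented Kalman Filter, and it does not include the Bayesian reasoning layer the team proposes. The function names, the 1D motion model, and the numeric values are assumptions made for this example only.

```python
def kalman_predict(mean, var, motion, motion_var):
    """Prediction step: apply the commanded motion and inflate the uncertainty."""
    return mean + motion, var + motion_var


def kalman_update(mean, var, measurement, measurement_var):
    """Correction step: blend prediction and measurement according to their variances."""
    gain = var / (var + measurement_var)
    new_mean = mean + gain * (measurement - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var


def track(initial_mean, initial_var, steps, motion_var, measurement_var):
    """Run the filter over a list of (motion, measurement) pairs."""
    mean, var = initial_mean, initial_var
    for motion, measurement in steps:
        mean, var = kalman_predict(mean, var, motion, motion_var)
        mean, var = kalman_update(mean, var, measurement, measurement_var)
    return mean, var


if __name__ == "__main__":
    # Hypothetical robot moving 1 m per step along a corridor, with noisy position readings.
    estimate = track(0.0, 1.0, [(1.0, 1.1), (1.0, 2.0)], motion_var=0.1, measurement_var=0.5)
    print(estimate)
```

The variance shrinks after each correction, which is the quantity an adaptive scheme could monitor to decide how much to trust the current estimate.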
We also plan to address active sensing to improve the situation awareness of robots. Literally, active sensing is the ability of an interacting agent to act so as to control what it senses from its environment with the typical objective of acquiring information about this environment. A formalism for representing and solving active sensing problems has already been proposed by members of the team 20 and we aim to use it to formalize decision-making problems for improving situation awareness.
Situation awareness of robots can also be tackled by cooperation, whether it be between robots, between robots and sensors deployed in the environment (sensorized environments), or between robots and humans. This breaks with classical robotics, in which robots are conceived as self-contained. But, in order to cope with environments that are as diverse as possible, these classical robots use precise, expensive, and specialized sensors, whose cost prohibits their use in large-scale deployments for service or assistance applications. Furthermore, when all sensors are on the robot, they share the same point of view on the environment, which limits perception. Therefore, we propose to complement a cheaper robot with sensors distributed in a target environment.
To address the fragility problem, the traditional approach is to first diagnose the situation, then use a planning algorithm to create/select a contingency plan. But, again, this calls for both expensive sensors on the robot for the diagnosis and extensive work to predict and plan for all the possible faults that, in an open and dynamic environment, are almost infinite. An alternative approach is then to skip the diagnosis and let the robot discover by trial and error a behavior that works in spite of the damage with a reinforcement learning algorithm 33, 26. However, current reinforcement learning algorithms require hundreds of trials/episodes to learn a single, often simplified, task 26, which makes them impossible to use for real robots and more ambitious tasks. We therefore need to design new trial-and-error algorithms that will allow robots to learn with a much smaller number of trials (typically, a dozen). We think the key idea is to guide online learning on the physical robot with dynamic simulations. For instance, in our recent work, we successfully mixed evolutionary search in simulation, physical tests on the robot, and machine learning to allow a robot to recover from physical damage 27, 1.
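The idea of guiding a small number of physical trials with simulation results can be sketched as follows. This is a deliberately simplified greedy loop, not the algorithm of the cited work (which, to our understanding, relies on a learned performance model rather than a plain lookup table). The behavior names, the scores, the `try_on_robot` callback, and the stopping rule are all hypothetical.

```python
def adapt(sim_scores, try_on_robot, stop_ratio=0.9, max_trials=12):
    """Greedy trial-and-error seeded by simulation.

    sim_scores: dict mapping a behavior name to its performance predicted in simulation.
    try_on_robot: callable returning the performance measured on the (damaged) robot.
    Stops when the best measured performance reaches stop_ratio times the best
    remaining prediction, or after max_trials physical trials.
    """
    predicted = dict(sim_scores)
    best_behavior, best_real = None, float("-inf")
    trials = 0
    for _ in range(max_trials):
        candidate = max(predicted, key=predicted.get)
        real = try_on_robot(candidate)
        trials += 1
        predicted[candidate] = real  # replace the prediction by the measurement
        if real > best_real:
            best_behavior, best_real = candidate, real
        if best_real >= stop_ratio * max(predicted.values()):
            break
    return best_behavior, best_real, trials


# Hypothetical example: the gait that is best in simulation fails on the damaged robot.
SIM_SCORES = {"tripod": 1.0, "wave": 0.8, "hop": 0.5}
DAMAGED_ROBOT = {"tripod": 0.2, "wave": 0.7, "hop": 0.45}

if __name__ == "__main__":
    print(adapt(SIM_SCORES, DAMAGED_ROBOT.get))
```

In this toy setting the loop discards the gait that only works in simulation after one trial and settles on a working one after two, which illustrates why a good simulation prior can keep the number of physical trials small.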
A key functionality of autonomy is the capacity to make decisions. Our approach is to address it within the framework of sequential decision making, which can be studied using Markov Decision Processes and other derived models. A strong research direction of the team consists in the theoretical study of these models, which involves stochastic (or sometimes deterministic) games.
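To make the Markov Decision Process framework concrete, the following minimal sketch runs value iteration on a two-state toy problem. The MDP encoding (a dictionary of `(probability, next_state, reward)` triples), the state and action names, and the discount factor are assumptions chosen for this illustration.

```python
def q_value(outcomes, values, gamma):
    """Expected return of one action, given its list of (prob, next_state, reward)."""
    return sum(p * (r + gamma * values[nxt]) for p, nxt, r in outcomes)


def value_iteration(mdp, gamma=0.9, tol=1e-9):
    """Synchronous value iteration; returns the value function and a greedy policy."""
    values = {state: 0.0 for state in mdp}
    while True:
        new_values = {
            state: max(q_value(outcomes, values, gamma) for outcomes in actions.values())
            for state, actions in mdp.items()
        }
        delta = max(abs(new_values[s] - values[s]) for s in mdp)
        values = new_values
        if delta < tol:
            break
    policy = {
        state: max(actions, key=lambda a: q_value(actions[a], values, gamma))
        for state, actions in mdp.items()
    }
    return values, policy


# Hypothetical toy problem: advancing reaches the goal with probability 0.8.
TOY_MDP = {
    "corridor": {
        "wait": [(1.0, "corridor", 0.0)],
        "advance": [(0.8, "goal", 1.0), (0.2, "corridor", 0.0)],
    },
    "goal": {"wait": [(1.0, "goal", 0.0)]},
}

if __name__ == "__main__":
    print(value_iteration(TOY_MDP))
```

For this toy problem the fixed point can be checked by hand: the value of "corridor" satisfies V = 0.8 + 0.2 × 0.9 × V, i.e., V = 0.8 / 0.82.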
A final approach to address fragility is to deploy several robots or a swarm of robots, or to make robots operate in an active environment. We will consider several paradigms such as (1) those inspired from collective natural phenomena in which the environment plays an active role for coordinating the activity of a huge number of biological entities such as ants and (2) those based on online learning 24. We envision transferring our knowledge of such phenomena to engineer new artificial devices such as an intelligent floor (which is in fact a spatially distributed network in which each node can sense, compute and communicate with contiguous nodes and can interact with moving entities on top of it) in order to assist people and robots (see the principle in 31, 24, 19).
3.2 Natural interaction with robotic systems
Scientific context
Interaction with the environment is an essential requirement for an autonomous robot. When the environment is sensorized, the interaction can include localizing, tracking, and recognizing the behavior of robots and humans. One specific issue lies in the lack of predictive models for human behavior, and a critical constraint arises from the incomplete knowledge of the environment and the other agents.
On the other hand, when working in the proximity of or directly with humans, robots must be capable of safely interacting with them, which calls upon a mixture of physical and social skills. Currently, robot operators are usually trained and specialized but potential end-users of robots for service or personal assistance are not skilled robotics experts, which means that the robot needs to be accepted as reliable, trustworthy and efficient 36. Most Human-Robot Interaction (HRI) studies focus on verbal communication 32 but applications such as assistance robotics require a deeper knowledge of the intertwined exchange of social and physical signals to provide suitable robot controllers.
Main challenges
We are interested here in building the blocks for a situated HRI addressing both the physical and social dimensions of close interaction, and the cognitive aspects related to the analysis and interpretation of human movement and activity.
The combination of physical and social signals into robot control is a crucial investigation for assistance robots 34 and robotic co-workers 30. A major obstacle is the control of physical interaction (precisely, the control of contact forces) between the robot and the human while both partners are moving. In mobile robots, this problem is usually addressed by planning the robot movement taking into account the human as an obstacle or as a target, then delegating the execution of this “high-level” motion to whole-body controllers, where a mixture of weighted tasks is used to account for the robot balance, constraints, and desired end-effector trajectories 18.
The first challenge is to make these controllers easier to deploy in real robotics systems, as currently they require a lot of tuning and can become very complex to handle the interaction with unknown dynamical systems such as humans. Here, the key is to combine machine learning techniques with such controllers.
The second challenge is to make the robot react and adapt online to the human feedback, exploiting the whole set of measurable verbal and non-verbal signals that humans naturally produce during a physical or social interaction. Technically, this means finding the optimal policy that adapts the robot controllers online, taking into account feedback from the human. Here, we need to carefully identify the significant feedback signals or some metrics of human feedback. In real-world conditions (i.e., outside the research laboratory environment) the set of signals is technologically limited by the robot's and environmental sensors and the onboard processing capabilities.
The third challenge is for a robot to be able to identify and track people using its on-board sensors and computing resources. The motivation is to be able to estimate online the position, the posture, or even the moods and intentions of persons surrounding the robot. The main difficulty is to do that online, in real time, and in cluttered environments.
Angle of attack
Our key idea is to exploit the physical and social signals produced by the human during the interaction with the robot and the environment in controlled conditions, to learn simple models of human behavior and consequently to use these models to optimize the robot movements and actions. In a first phase, we will exploit human physical signals (e.g., posture and force measurements) to identify the elementary posture tasks during balance and physical interaction. The identified model will be used to optimize the robot whole-body control as prior knowledge to improve both the robot balance and the control of the interaction forces. Technically, we will combine weighted and prioritized controllers with stochastic optimization techniques. To adapt online the control of physical interaction and make it possible with human partners that are not robotics experts, we will exploit verbal and non-verbal signals (e.g., gaze, touch, prosody). The idea here is to estimate online from these signals the human intent along with some inter-individual factors that the robot can exploit to adapt its behavior, maximizing the engagement and acceptability during the interaction.
Another promising approach already investigated in the Larsen team is the capability for a robot and/or an intelligent space to localize humans in its surrounding environment and to understand their activities. This is an important issue for both safe and efficient human-robot interaction.
Simultaneous Tracking and Activity Recognition (STAR) 35 is an approach we want to develop. The activity of a person is highly correlated with their position, and this approach aims at combining tracking and activity recognition so that each benefits from the other. By tracking the individual, the system may help infer their possible activity, while by estimating the activity of the individual, the system may make a better prediction of their possible future positions (especially in the case of occlusions). This direction has been tested with a simulator and particle filters 23, and one promising direction would be to couple STAR with decision-making formalisms like partially observable Markov decision processes (POMDPs). This would allow us to formalize problems such as deciding which action to take given an estimate of the human location and activity. This could also formalize other problems linked to the active sensing direction of the team: how should the robotic system choose its actions in order to better estimate the human location and activity (for instance by moving in the environment or by changing the orientation of its cameras)?
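The coupling between tracking and activity recognition can be illustrated with an exact Bayes filter over a joint (location, activity) state, which is also the belief update used in POMDPs. This is a sketch under strong assumptions: the state space is tiny and discrete (whereas the work mentioned above uses particle filters), and the states, the transition model, and the sensor likelihoods are hypothetical.

```python
def belief_update(belief, transition, observation_likelihood):
    """One step of a discrete Bayes filter.

    belief: dict state -> probability.
    transition: dict state -> dict next_state -> probability.
    observation_likelihood: dict state -> P(observation | state).
    """
    predicted = {
        s2: sum(belief[s] * transition[s][s2] for s in belief) for s2 in belief
    }
    unnormalized = {s: observation_likelihood[s] * p for s, p in predicted.items()}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical joint states: where the person is and what they are doing.
COOKING = ("kitchen", "cooking")
RESTING = ("living_room", "resting")

PRIOR = {COOKING: 0.5, RESTING: 0.5}
TRANSITION = {
    COOKING: {COOKING: 0.8, RESTING: 0.2},
    RESTING: {COOKING: 0.2, RESTING: 0.8},
}
# Likelihood of "kitchen motion sensor fired" in each joint state.
KITCHEN_SENSOR_FIRED = {COOKING: 0.9, RESTING: 0.1}

if __name__ == "__main__":
    print(belief_update(PRIOR, TRANSITION, KITCHEN_SENSOR_FIRED))
```

Because location and activity live in the same state, a single location cue updates the activity estimate as well, which is the mutual benefit that STAR aims for.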
Another issue we want to address is robotic human body pose estimation. Human body pose estimation consists of tracking body parts by analyzing a sequence of input images from single or multiple cameras.
Human posture analysis is of high value for human-robot interaction and activity recognition. However, even though the arrival of new sensors like RGB-D cameras has simplified the problem, it still poses a great challenge, especially if we want to do it online, on a robot, and in realistic conditions (cluttered environments). It is even more difficult for a robot to bring together different capabilities at both the perception and navigation levels 22. This will be tackled through different techniques, ranging from Bayesian state estimation (particle filtering) to learning and to active and distributed sensing.
4 Application domains
4.1 Personal assistance
During the last fifty years, many medical advances as well as the improvement of the quality of life have resulted in a longer life expectancy in industrial societies. The increase in the number of elderly people is a matter of public health because, although elderly people can age in good health, old age also causes frailty, in particular on the physical level, which can result in a loss of autonomy. That will lead us to re-think the current model regarding the care of elderly people.1 Capacity limits in specialized institutes, along with the preference of elderly people to stay at home as long as possible, explain a growing need for specific services at home.
Ambient intelligence technologies and robotics could contribute to this societal challenge. The spectrum of possible actions in the field of elderly assistance is very large, going from activity monitoring services to mobility or daily activity aids, medical rehabilitation, and social interactions. This will be based on the experimental infrastructure we have built in Nancy (Smart apartment platform) as well as the deep collaboration we have with OHS 2 and the company Pharmagest and its subsidiary Diatelic, created in 2002 by a member of the team and others.
At the same time, these technologies can help address the increasing development of musculo-skeletal disorders and diseases that are caused by the non-ergonomic postures of workers subject to physically stressful tasks. Wearable technologies, sensors, and robotics can be used to monitor the workers' activity and its impact on their health, and to anticipate risky movements. Two application domains have been particularly addressed in the last years: industry, and more specifically manufacturing, and healthcare.
4.2 Civil robotics
Many applications for robotics technology exist within the services provided by national and local government. Typical applications include civil infrastructure services 3 such as: urban maintenance and cleaning; civil security services; emergency services involved in disaster management including search and rescue; environmental services such as surveillance of rivers, air quality, and pollution. These applications may be carried out by a wide variety of robots and operating modalities, ranging from single robots to small fleets of homogeneous or heterogeneous robots. Often robot teams will need to cooperate to span a large workspace, for example in urban rubbish collection, and operate in potentially hostile environments, for example in disaster management. These systems are also likely to have extensive interaction with people and their environments.
The skills required for civil robots match those developed in the Larsen project: operating for a long time in potentially hostile environments, potentially with small fleets of robots, and potentially in interaction with people.
5 Latest software developments, platforms, open data
5.1 Latest software developments
5.1.1 pepper_driver_ext
- Name: Additional ROS drivers for the Pepper robot.
- Keywords: Pepper robot, Driver
- Functional Description: These drivers extend the robot functionalities accessible in ROS in complement to the original drivers.
Namely they add:
- handling of the autonomous state of the robot,
- (un)loading and (de)activating dialog topics,
- text-to-speech functionality,
- ALMemory access,
- publication of detected landmarks,
- tablet handling,
- LED control,
- fixed /cmd_vel driver.
They are designed for Ubuntu 16.04 with ROS Kinetic and Python 2.7 to work alongside the official ROS drivers. For newer platforms (Ubuntu 20.04 and later), tools and Dockerfiles are available to use the Pepper robot and its drivers from ROS 2.
- Release Contributions: Lower resource consumption via on-demand subscription.
- URL:
- Contact: Francis Colas
- Participants: Francis Colas, Vincent Colotte
5.1.2 PACR Project
- Keywords: Robotics, Simulation, Teaching
- Functional Description: Set of tools and documentation to implement and learn a complex robotic navigation task. The task requires probabilistic planning, path planning, and path following with obstacle avoidance. The software provides a simulated warehouse environment with several robots, together with the libraries and code needed to carry out the project.
This is planned as a 3x2h project lab work for the PACR module on computer science M2 IA²VR at Université de Lorraine.
- Release Contributions: Support for:
- Ubuntu 22.04 with ROS 2 Humble and Gazebo Fortress
- Ubuntu 24.04 with ROS 2 Jazzy and Gazebo Harmonic
- URL:
- Contact: Francis Colas
- Participants: Francis Colas, Alexis Scheuer, Vincent Thomas
- Partner: Université de Lorraine
5.1.3 ILIAR Project
- Name: Autonomous driving simulation and project for computer science teaching
- Keywords: Robotics, Autonomous Cars, Teaching, Machine learning, Computer vision
- Functional Description: Set of tools and documentation defining a simulation of an autonomous car on a circuit so as to implement a controller based on machine learning.
The documentation is a step-by-step tutorial to implement the following steps:
- system discovery,
- teleoperation,
- data collection,
- actual machine learning,
- controller implementation,
- controller evaluation.
This project is the main part of the ILIAR module in computer science M2 IA²VR at Université de Lorraine.
- Release Contributions: Support for:
- Ubuntu 22.04 with ROS 2 Humble and Gazebo Fortress.
- Ubuntu 24.04 with ROS 2 Jazzy and Gazebo Harmonic.
- URL:
- Contact: Francis Colas
- Participants: Francis Colas, Jérémy Fix
- Partner: CentraleSupélec
6 New results
6.1 Algorithms for planning and optimization
A simple random game model for a better analysis of deterministic game-solving algorithms
Participants: Raphael Boige, Amine Boumaza, Bruno Scherrer.
Deterministic game-solving algorithms are conventionally analyzed in the light of their average-case complexity on some random model. We have introduced a new simple probabilistic model that incrementally constructs game-trees using a fixed level-wise conditional distribution. By enforcing ancestor dependencies, a critical structural feature of real-world games, our framework generates problems with adjustable difficulty while retaining some form of analytical tractability. For several algorithms, including AlphaBeta and Scout, we have derived recursive formulas characterizing their average-case complexities under this model. These allow us to rigorously compare algorithms on deep game-trees, where Monte-Carlo simulations are no longer feasible. This work has been published in 6.
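To illustrate what is being measured, the sketch below counts the leaves evaluated by plain minimax and by AlphaBeta on a randomly generated game-tree. It uses the classical setting with independent, uniformly distributed leaf values, which is an assumption made for simplicity and is not the ancestor-dependent model introduced in this work. Function names and default parameters are hypothetical.

```python
import random


def random_tree(depth, branching, rng):
    """Uniform tree as nested lists; leaves are i.i.d. uniform values (an assumption)."""
    if depth == 0:
        return rng.random()
    return [random_tree(depth - 1, branching, rng) for _ in range(branching)]


def minimax(node, maximizing, counter):
    """Exhaustive minimax; counter[0] accumulates the number of evaluated leaves."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    values = [minimax(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing, counter, alpha=float("-inf"), beta=float("inf")):
    """AlphaBeta pruning; returns the minimax value when called with the full window."""
    if not isinstance(node, list):
        counter[0] += 1
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, counter, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing parent will never allow this branch
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, True, counter, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing parent already has a better option
    return value


def compare(depth=6, branching=3, seed=0):
    """Return (minimax value, AlphaBeta value, minimax leaves, AlphaBeta leaves)."""
    tree = random_tree(depth, branching, random.Random(seed))
    mm_count, ab_count = [0], [0]
    mm_value = minimax(tree, True, mm_count)
    ab_value = alphabeta(tree, True, ab_count)
    return mm_value, ab_value, mm_count[0], ab_count[0]


if __name__ == "__main__":
    print(compare())
```

Averaging the AlphaBeta leaf count over many seeds gives a Monte-Carlo estimate of the average-case complexity. The cost of such simulations grows exponentially with depth, which is why closed-form recursive formulas are valuable for deep trees.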
Optimally Solving Two-Player Zero-Sum POSGs
Participants: Olivier Buffet.
Collaboration with Jilles Dibangoye, Erwan Escudie, and Matthia Sabatelli (University of Groningen).
Many robotic scenarios involve multiple interacting agents, robots or humans, e.g., security robots in public areas. After addressing in the past the collaborative setting, where all agents share one objective 2, we have applied a similar approach to the important two-player zero-sum setting, i.e., with two competing agents. We proposed an algorithm for zero-sum partially observable stochastic games (zs-POSGs) that turns the problem into an occupancy Markov game and derives bounding approximators building on two types of continuity properties: Lipschitz continuity, and convexity and concavity properties.
This year, we have introduced a lossless reduction from zs-POSGs to transition-independent zero-sum stochastic games (zs-SGs), enabling the principled application of a broad class of methods based on dynamic programming (DP). We show empirically that point-based value iteration (PBVI) algorithms, applied via this reduction, produce ε-optimal strategies across a range of benchmark domains, consistently matching or outperforming existing state-of-the-art methods. This work and the obtained results have been published in 8.
Partially Observable Monte-Carlo Graph Search
Participants: Yang You, Olivier Buffet, Vincent Thomas.
This work also involves the Oxford Robotics Institute and the UK Atomic Energy Authority where Yang You (former member of Larsen project-team) is pursuing post-doctoral research.
Currently, large partially observable Markov decision processes (POMDPs) are often solved by sampling-based online methods which interleave planning and execution phases. However, a pre-computed offline policy is more desirable in POMDP applications with time or energy constraints, and previous offline algorithms do not scale up to large POMDPs.
We worked on a new sampling-based algorithm, the partially observable Monte-Carlo graph search (POMCGS), to solve large POMDPs offline. Instead of developing a tree while performing Monte-Carlo simulations, POMCGS folds this tree on the fly, thus generating a policy graph, reducing computations and increasing the interpretability of the policy. By adding progressive widening and observation clustering, POMCGS is able to address some continuous POMDPs. This work and the obtained results on classical POMDP benchmarks have been published in 12.
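To show why a policy graph is cheap to execute and easy to inspect, here is a minimal sketch of such a graph as a finite-state controller: each node prescribes an action, and each observation selects the next node. The graph below is hand-written for a Tiger-like toy problem and is not an output of POMCGS. The node names, actions, and observations are hypothetical.

```python
def run_policy_graph(graph, start, observations):
    """Execute a policy graph: act at the current node, then follow the observed edge."""
    node = start
    actions = [graph[node]["action"]]
    for observation in observations:
        node = graph[node]["next"][observation]
        actions.append(graph[node]["action"])
    return actions


# Hand-written controller: open a door only after two consistent noises in a row.
TIGER_GRAPH = {
    "unsure": {
        "action": "listen",
        "next": {"hear_left": "left_once", "hear_right": "right_once"},
    },
    "left_once": {
        "action": "listen",
        "next": {"hear_left": "open_right", "hear_right": "unsure"},
    },
    "right_once": {
        "action": "listen",
        "next": {"hear_left": "unsure", "hear_right": "open_left"},
    },
    "open_right": {
        "action": "open_right",
        "next": {"hear_left": "unsure", "hear_right": "unsure"},
    },
    "open_left": {
        "action": "open_left",
        "next": {"hear_left": "unsure", "hear_right": "unsure"},
    },
}

if __name__ == "__main__":
    print(run_policy_graph(TIGER_GRAPH, "unsure", ["hear_left", "hear_left"]))
```

At execution time the controller needs no belief update and no search, only a table lookup per step, which suits applications with time or energy constraints.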
Post-Hoc Interpretation of POMDP Policies
Participants: Olivier Buffet.
Collaboration with Geoffrey Laforest, Alexandre Niveau, and Bruno Zanuttini (GREYC, Université de Caen Normandie) as part of the ANR project EpiRL.
Dynamic epistemic logic allows reasoning about an agent's knowledge and its evolution given the occurrence of events. Recent work has developed epistemic planning, i.e., seeking a sequence of actions that leads to some state of knowledge (e.g., knowing the value of some variable, or whether some fact is true or not). The EpiRL ANR project aims at performing a similar task through (possibly deep) reinforcement learning, i.e., learning a behavior by trial and error.
In this context, the PhD thesis of Geoffrey Laforest in Caen (supervised by Bruno Zanuttini and Alexandre Niveau) looks in particular at the choice of suitable representations for the state of knowledge (which should be compact, and ideally embed probabilities, due to the stochastic dynamics). These representations can then be used as input of control policies.
We proposed to redescribe policies into mappings defined on features of the current belief state, built in a systematic manner from state features. Such a mapping can in turn be represented by an intelligible object, like a decision tree, thereby providing an interpretable representation of the policy as a whole. We moreover showed how our approach makes it possible to explain the decision taken by an agent at each step of an interaction with the environment. This provides an end-to-end process, starting from a policy computed by any solver, and ending with an explanation of each decision made at execution time. In 14, 9, 15, we formally define our approach, investigate related computational problems, and report on experiments on several families of problems.
6.2 Planning for collaborative and mobile robotics
Task-planning for human robot collaboration
Participants: Yang You, Francis Colas, Olivier Buffet, Vincent Thomas.
This work has been done in part during the former ANR project Flying CoWorker.
This work focuses on high-level decision making for collaborative robotics. When a robot has to assist a human worker, it has no direct access to the worker's current intention or preferences but has to adapt its behavior to help the human complete their task.
Human-robot collaboration often necessitates the robot to adapt to the uncertainty of human objectives and their induced behaviors. This may require the robot to have a human model to anticipate human partners' objectives and predict their actions, which is typically learned by the robot through available human data. However, in complex collaboration tasks, a chicken-and-egg problem arises because human data cannot be collected without a collaborative robot policy in the first place. We had previously proposed to describe the human-robot collaboration task with Markov decision models and to solve the chicken-and-egg problem through a probabilistic planning algorithm. This year, we have contributed an online version of this approach. This online framework can automatically derive a human model without real human data and plan robust robot actions to support human partners with respect to their uncertain objectives and behaviors. Through experiments with a human-robot co-working scenario, we demonstrate that our online method outperforms the previous offline approach in terms of scalability and the ability to plan robot actions within a bounded time. This work has been presented in 11.
Explicability and interpretability in probabilistic planning
Participants: Salomé Lepers, Sophie Lemonnier, Olivier Buffet, Vincent Thomas.
Part of this work is a collaboration with Shuwa Miura and Shlomo Zilberstein from University of Massachusetts (UMass) at Amherst.
In a human-agent collaboration scenario, some properties of the agent behavior can be useful for the human and sometimes allow a better collaboration. These properties include, for instance, legibility (legible behaviors convey intentions, i.e., actual task at hand, via action choices), explicability (explicable behaviors conform to observers' expectations, i.e., they appear to have some purpose), and predictability (a behavior is usually considered predictable if it is easy to guess the end of an on-going trajectory). Recent theoretical frameworks allow formalizing such properties and proposing algorithms to enforce them. In Salomé Lepers' PhD thesis, we build in particular on Miura and Zilberstein's OAMDP framework (observer aware Markov decision process), where an agent interacts with a stochastic environment while trying to optimize a criterion that depends on an external observer's belief.
We have first looked in particular at predictability, where the end of the current trajectory may highly depend on the outcome of each action, and thus proposed that predictability should be about minimizing the number of errors when an external observer is asked repeatedly to guess the next action or next state. This has been formalized in a variant of the observer-aware Markov Decision Process (OAMDP) formalism, naturally coming with simple algorithms that efficiently find optimal solutions. We conducted in silico and in vivo experiments (where actual humans observe the behavior of artificial agents) on simple grid-world problems to validate the approach, as presented in 5.
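To make the criterion above concrete, here is a minimal Python sketch, under strong simplifying assumptions, of planning for the minimum expected number of observer errors. It assumes a fully observable MDP, an observer who knows the dynamics and the agent's policy and always guesses the most likely next state, and a finite horizon. The function name, the data layout, and the toy problem are hypothetical and are not taken from the cited paper.

```python
# Illustrative sketch only: observer model, horizon, and toy problem are assumptions.

def min_expected_errors(transitions, horizon):
    """Finite-horizon value iteration minimizing the expected number of wrong guesses.

    transitions[state][action] is a dict {next_state: probability};
    a state with no actions is terminal.
    Returns the values and the first-step policy for the given horizon.
    """
    values = {s: 0.0 for s in transitions}
    policy = {}
    for _ in range(horizon):
        new_values = {}
        for state, actions in transitions.items():
            if not actions:  # terminal state: nothing left to guess
                new_values[state] = 0.0
                continue
            best_cost, best_action = None, None
            for action, dist in actions.items():
                # The observer guesses the most likely next state,
                # so it errs with probability 1 - max probability.
                error_prob = 1.0 - max(dist.values())
                cost = error_prob + sum(p * values[s2] for s2, p in dist.items())
                if best_cost is None or cost < best_cost:
                    best_cost, best_action = cost, action
            new_values[state] = best_cost
            policy[state] = best_action
        values = new_values
    return values, policy

# Toy problem: a deterministic corridor versus a slippery shortcut.
TRANSITIONS = {
    "start": {
        "corridor": {"goal": 1.0},
        "ice": {"goal": 0.5, "start": 0.5},
    },
    "goal": {},
}
values, policy = min_expected_errors(TRANSITIONS, horizon=3)
```

In this toy problem the agent prefers the deterministic action, since every step along it can be guessed without error.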
The main direction we have then been pursuing as part of Salomé Lepers' PhD thesis is to allow for an observer with partial and noisy observability (the agent knowing exactly what the observer perceives), which led to introducing the PO-OAMDP formalism (partially observable OAMDPs). We have shown that this formalism is a strict generalization of OAMDPs, and that the current state of the system, together with the observer's belief about that state, is a sufficient statistic for optimal planning. This allowed proposing a variant of the heuristic search value iteration algorithm that relies on pointwise and cone approximators, the latter leveraging Lipschitz-continuity. We also demonstrated that, in stochastic shortest-path problems, some information-oriented criteria may not induce policies that reach a terminal state with probability 1, which can be fixed by adding a per-step cost. These results, with illustrations of the resulting behaviors on various problems, are presented in 10, 16.
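The sufficient statistic mentioned above pairs the true state with the observer's belief about it. As a hedged illustration, the Python sketch below performs one standard Bayes-filter update of such a belief after a noisy observation. It assumes the observer uses a fixed state-to-state transition model (for instance, the dynamics under the policy it attributes to the agent); the function name, data layout, and numbers are hypothetical and not taken from the cited papers.

```python
# Illustrative sketch only: models, names, and numbers are assumptions.

def update_belief(belief, transition, observation_model, observation):
    """One Bayes-filter step: predict with the dynamics, then correct with the observation.

    belief: {state: probability}
    transition[state]: {next_state: probability}
    observation_model[next_state]: {observation: probability}
    """
    # Prediction step: push the belief through the assumed dynamics.
    predicted = {}
    for state, p in belief.items():
        for next_state, t in transition[state].items():
            predicted[next_state] = predicted.get(next_state, 0.0) + p * t
    # Correction step: weight each state by the likelihood of the observation.
    weighted = {
        s: observation_model[s].get(observation, 0.0) * p
        for s, p in predicted.items()
    }
    total = sum(weighted.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the current belief")
    return {s: w / total for s, w in weighted.items()}

# Toy example: a static two-state system seen through an 80%-reliable sensor.
TRANSITION = {"left": {"left": 1.0}, "right": {"right": 1.0}}
OBSERVATION_MODEL = {
    "left": {"L": 0.8, "R": 0.2},
    "right": {"L": 0.2, "R": 0.8},
}
prior = {"left": 0.5, "right": 0.5}
posterior = update_belief(prior, TRANSITION, OBSERVATION_MODEL, "L")

# The planner would then reason over pairs (true state, observer belief).
planning_state = ("left", tuple(sorted(posterior.items())))
```

Because the agent is assumed to know exactly what the observer perceives, it can run this update itself and track the observer's belief alongside the true state.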
During this year, Salomé also wrote her PhD thesis manuscript and defended her PhD thesis on December 17, 2025.
Adaptive control of collaborative robots for preventing musculoskeletal disorders
Participants: Aya Yaacoub, Francis Colas, Vincent Thomas.
This work is done in collaboration with Pauline Maurice from the HUCEBOT team.
The use of collaborative robots in direct physical collaboration with humans constitutes a possible answer to musculoskeletal disorders: not only can they relieve the worker from heavy loads, but they could also guide them towards more ergonomic postures. In this context, one objective of the ROOIBOS Project is to build adaptive robot strategies that are optimal regarding productivity but also the long-term health and comfort of the human worker, by adapting the robot behavior to the human's physiological state.
In previous work, we proposed to use Partially Observable Markov Decision Processes (POMDPs) to compute a robot policy taking into account the long-term consequences of the biomechanical demands on the human worker's joints (joint loading) and to distribute the efforts among the different joints during the execution of a repetitive task. The proposed platform merges within the same framework several works conducted in the Larsen team, namely virtual human modeling and simulation, fatigue estimation, and decision making in the face of uncertainties. We also proposed an approach to automatically extract a small discrete set of relevant actions from the continuous action space by identifying and gathering relevant actions through short-term planning phases (greedy-based selection approach).
During this year, we designed an experiment to validate the effectiveness of the proposed POMDP-based planning approach for fatigue mitigation with human subjects, in a repeated human-robot co-manipulation task. Although the protocol has been fully described, the experiments have not yet been conducted because of time limitations.
During 2025, Aya wrote her PhD thesis manuscript, and with the approval of the reviewers, her thesis defense is scheduled for February 3, 2026.
Identifying human movement strategies for human-robot collaboration
Participants: Vincent Thomas, Francis Colas.
This work is done in collaboration with former post-doc Jessica Colombel and with Pauline Maurice from the HUCEBOT team.
In human-robot physical collaboration, the robot must be able to anticipate the whole-body posture of the human co-worker to enable a seamless and efficient collaboration. When co-manipulating an object, the human posture is partly guided by the pose of the robot end-effector. However, owing to the high kinematic redundancy of the human body, an infinity of postures can in theory be adopted for the same hand pose. In practice, human movements are largely stereotyped, which drastically reduces the number of observed solutions. Yet some diversity remains, which we refer to as “movement strategies”. The objective of this work is to develop a methodology to identify human movement strategies in a manual task, and explore the relations between movement strategies and factors such as anthropometry and physical fatigue. During the postdoctoral work of Jessica Colombel, we designed an experiment and conducted a large data collection campaign to acquire human motion data in a repetitive manual task. We started to analyze the data, and explored diffusion methods to cluster the data into different movement strategies. Inverse optimal control is another method that we plan to explore.
This line of work led to a preliminary publication as an abstract and presentation at a French biomechanics conference 4.
7 Bilateral contracts and grants with industry
7.1 Bilateral grants with industry
PhD grant with Airbus
Participants: Olivier Buffet, Aubin Delaveau.
Collaboration with Florent Teichteil-Königsbuch (Airbus).
The thesis is funded by Airbus to contribute to the development of new aircraft with improved hybrid energy management, assuming that both kerosene and hydrogen fuel cells can be used to produce electricity. This involves both high-level energy management for the whole duration of a flight, and low-level (thus real-time) control of a physical (electric) system under various conditions.
8 Partnerships and cooperations
8.1 National initiatives
8.1.1 ANR : EpiRL
Participants: Olivier Buffet.
- Program: ANR
- Project title: Apprentissage par renforcement épistémique
- Duration: February 2023 – February 2027
- Coordinator: François Schwarzentruber (École normale supérieure de Lyon)
- Partner institutions: IRIT CNRS, DAVI, IRISA ENS de Rennes, GREYC Université de Caen-Normandie, ENS de Lyon
- Abstract:
The EpiRL project aims at investigating the combination of epistemic planning and reinforcement learning (RL), by proposing new algorithms that are efficient, adaptive, and capable of computing decisions relying on theory of knowledge and belief. We expect this approach to make the generation of epistemic plans more efficient, while making the decisions of RL algorithms explainable. Moreover, the algorithms of EpiRL will be tested and evaluated within a real application that exploits autonomous agents.
The project will address the weaknesses of both epistemic planning and RL: on the one hand, existing epistemic planning algorithms are costly, do not adapt to the environment, and concepts are hand-crafted and are not learned; on the other hand, in reinforcement learning, agents adapt to their environments but are unable to reason about beliefs of other agents. The newly developed algorithms will combine the strengths of both fields.
Four workpackages are proposed:
- Study representations of states
- Develop RL algorithms
- Study representations of policies
- Validate the algorithms with the industrial partner DAVI, in particular through the development of a debunking chatbot whose use case is raising awareness about environmental issues.
In this project our responsibility lies in the study and definition of representations for the knowledge state and the policy, for reinforcement learning algorithms.
9 Dissemination
9.1 Promoting scientific activities
9.1.1 Scientific events: organisation
Member of the organizing committees
- Amine Boumaza was a member of the organizing committee of DODO-2025 at OrangeLab Lannion.
9.1.2 Scientific events: selection
Member of the conference program committees
- Amine Boumaza was a PC member for GECCO-2025.
- Olivier Buffet was a PC member for AAAI-2025, AAMAS-2025, IJCAI-2025, ECAI-2025, JIAF-2025, UAI-2025.
- Francis Colas was a PC member for IJCAI-2025, ECAI-2025.
- Vincent Thomas was a PC member for JIAF-2025, ECAI-2025.
Reviewer
- Amine Boumaza was a reviewer for Alife-2025.
- Olivier Buffet was a reviewer for EWRL-2025.
- Francis Colas was a reviewer for IROS-2025.
9.1.3 Journal
Reviewer - reviewing activities
- Olivier Buffet was a reviewer for EJAI (European Journal on Artificial Intelligence), JAIR (Journal of Artificial Intelligence Research), and IEEE ToG (Transactions on Games).
9.1.4 Scientific expertise
- Francis Colas was an expert for an ASTRID project.
- Francis Colas was an expert for a CIFRE project.
9.1.5 Research administration
- Amine Boumaza is a member of the Comité de Centre.
- Olivier Buffet is a member of the FSSSCT.
- Francis Colas is a member of the following commissions: Comipers, ComiDoc, Comité de Centre.
- Vincent Thomas is a member of the commission IES du centre.
9.2 Teaching - Supervision - Juries - Educational and pedagogical outreach
9.2.1 Teaching
Vincent Thomas is co-responsible for the parcours “Intelligence Artificielle et ses Applications en Vision et Robotique” of “Master 2 Informatique”, Univ. Lorraine, France.
- Master: Amine Boumaza, “Recherche Locale Stochastique et Métaheuristiques”, 30h eq. TD, M1 Informatique, Univ. Lorraine, France.
- Master: Francis Colas, “Planification de trajectoires”, 17h eq. TD, M2 Informatique “Intelligence Artificielle et ses Applications en Vision et Robotique”, Univ. Lorraine, France.
- Master: Francis Colas, “Ingénierie Logicielle pour l'Intelligence Artificielle et la Robotique”, 15h eq. TD, M2 Informatique “Intelligence Artificielle et ses Applications en Vision et Robotique”, Univ. Lorraine, France.
- Master: Francis Colas, “Situation Intégratrice”, 36h eq. TD, M2 Informatique “Intelligence Artificielle et ses Applications en Vision et Robotique”, Univ. Lorraine, France.
- Master: Alexis Scheuer, “Contrôle d'exécution”, 17h eq. TD, M2 Informatique “Intelligence Artificielle et ses Applications en Vision et Robotique”, Univ. Lorraine, France.
- Master: Alexis Scheuer, “Éléments de robotique”, 4h eq. TD, M2 MEEF, Univ. Lorraine, France.
- Master: Alexis Scheuer, “Robotique autonome”, 30h eq. TD, M1 Informatique, Univ. Lorraine, France.
- Master: Vincent Thomas, “Planification et apprentissage”, 17h eq. TD, M2 Informatique “Intelligence Artificielle et ses Applications en Vision et Robotique”, Univ. Lorraine, France.
- Master: Vincent Thomas, “Game Design”, 37h eq. TD, M1 Sciences Cognitives, Univ. Lorraine, France.
- Master: Vincent Thomas, “Agent Intelligent”, 25h eq. TD, M1 Sciences Cognitives, Univ. Lorraine, France.
- Bachelor: Alexis Scheuer, “Optimisation”, 53h eq. TD, L3 Informatique (FST), Univ. Lorraine, France.
- Bachelor: Alexis Scheuer, “Conception”, 8h eq. TD, L3 Informatique (FST), Univ. Lorraine, France.
- Bachelor: Alexis Scheuer, “Introduction à la robotique”, 40h eq. TD, L2 Informatique (FST), Univ. Lorraine, France.
- Bachelor: Alexis Scheuer, “Bureautique”, 40h eq. TD, L1 Informatique (FST), Univ. Lorraine, France.
- Bachelor: Vincent Thomas, “Conception et Programmation”, 168h eq. TD, BUT Informatique, Univ. Lorraine, France.
- Bachelor: Vincent Thomas, “Optimisation et bases de l'apprentissage automatique”, 46h eq. TD, BUT Informatique, Univ. Lorraine, France.
9.2.2 Supervision
- PhD defended: Salomé Lepers, “Interpretability and Explicability in Probabilistic Planning”, started in October 2022, defended on 2025-12-17, Olivier Buffet (advisor), Vincent Thomas (co-advisor).
- PhD in progress: Aya Yaacoub, “User-specific planning of a collaborative robot behavior to help prevent musculoskeletal disorders”, started in December 2021, Francis Colas (advisor), Pauline Maurice (co-advisor), Vincent Thomas (co-advisor), ROOIBOS project.
- PhD in progress: Raphael Boige, “Tree search algorithms: beyond Monte Carlo tree search”, started in November 2024, Bruno Scherrer (advisor), Amine Boumaza (co-advisor).
- PhD in progress: Aubin Delaveau, “Safe and Robust Management of Energy in Hybrid Aircrafts”, started in February 2025, Olivier Buffet (advisor), in collaboration with Florent Teichteil-Königsbuch (Airbus).
- PhD in progress: Antonin Rousseau, “Robot navigation under exogenous uncertainty”, started in November 2025, Francis Colas (advisor), Alexis Scheuer (co-advisor).
- Master M2: Paul Loisil, “Incertitude de mouvement”, March to August 2025, Francis Colas (co-advisor), Alexis Scheuer (co-advisor).
- Master M2: Nicolas Queignec, “Planification de chemin en environnement inconnu et dynamique”, March to July 2025, Francis Colas (advisor).
- Master M1: Emna Debbech, “Apprentissage profond pour la résolution de -POMDP. Application au problème de recherche de proie dans un labyrinthe.”, July to September 2025, Olivier Buffet (co-advisor), Vincent Thomas (co-advisor), in collaboration with Guénaël Cabanes (co-advisor).
- Master M1: Nada El Hanafi, “Infrastructure de navigation pour l'exploration robotique”, June to August 2025, Francis Colas (advisor).
- Bachelor (L3): Jarod Galbrun, “Preuve formelle pour la planification probabiliste”, June to July 2025, Olivier Buffet (co-advisor), in collaboration with Pierre-Jean Spaenlehauer (EPI CARAMBA, co-advisor).
- Bachelor (BUT3): Adrien Naigeon, “Plateforme d’apprentissage pour la planification automatique”, April to June 2025, Vincent Thomas (co-advisor), Olivier Buffet (co-advisor).
- Bachelor (L2): Camille Desplas, “Algorithmes de recherche de plus courts chemins”, May to June 2025, Francis Colas (advisor).
9.2.3 Juries
- Olivier Buffet was a reviewer for the PhD of Hector Kohler (Cristal/INRIA, Université de Lille) and for the HDR of Guillaume Lozenguez (IMT Nord Europe, Université de Lille).
- Francis Colas was a reviewer for the PhD of Camille Charrier (Université Grenoble Alpes, LPNC).
- Bruno Scherrer was a reviewer for the PhD of Chiara Mignacco (Université Paris-Saclay).
9.2.4 Educational and pedagogical outreach
- Alexis Scheuer participated in the “fête de la science 2025” by organizing and co-leading an introductory robotics workshop for pupils, Univ. Lorraine.
- Vincent Thomas was a member of the organizing committee of the “séminaire de pédagogie universitaire” (2025), Univ. Lorraine.
- Vincent Thomas was a member of the organizing committee of the Game Jam for university educators (2025), Univ. Lorraine.
- Vincent Thomas organized two workshops at the journée ISN for secondary school teachers, at Loria (Nancy).
9.3 Popularization
9.3.1 Specific official responsibilities in science outreach structures
- Amine Boumaza is a member of the editorial board of Interstice.
9.3.2 Participation in Live events
- Amine Boumaza participated twice in Le Procès du Robot.
- Amine Boumaza gave a talk on collective robotics to students from prépa at Lycée Henri Poincaré of Nancy.
- Olivier Buffet participated in the robotics activity for elementary students, during “fête de la science 2025”, Univ. Lorraine.
- Vincent Thomas participated in the robotics activity for elementary students, during “fête de la science 2025”, Univ. Lorraine.
10 Scientific production
10.1 Major publications
- 1 article. Robots that can adapt like animals. Nature 521(7553), May 2015, 503-507.
- 2 article. Optimally Solving Dec-POMDPs as Continuous-State MDPs. Journal of Artificial Intelligence Research 55, February 2016, 443-497.
10.2 Publications of the year
International journals
International peer-reviewed conferences
Conferences without proceedings
Reports & preprints
Other scientific publications
10.3 Cited publications
- 18 article. Prioritized Motion-Force Control of Constrained Fully-Actuated Robots: "Task Space Inverse Dynamics". Robotics and Autonomous Systems, 2014. URL: http://dx.doi.org/10.1016/j.robot.2014.08.016
- 19 inproceedings. Multi-robot taboo-list exploration of unknown structured environments. 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2015), Hamburg, Germany, September 2015.
- 20 inproceedings. A POMDP Extension with Belief-dependent Rewards. Advances in Neural Information Processing Systems (NIPS), Vancouver, Canada, MIT Press, December 2010.
- 21 article. Special Issue on Long-Term Autonomy. The International Journal of Robotics Research 32(14), 2013, 1609-1610.
- 22 inproceedings. Pose Estimation For A Partially Observable Human Body From RGB-D Cameras. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, September 2015, 8.
- 23 inproceedings. Simultaneous Tracking and Activity Recognition (STAR) using Advanced Agent-Based Behavioral Simulations. ECAI - Proceedings of the Twenty-first European Conference on Artificial Intelligence, Prague, Czech Republic, August 2014.
- 24 inproceedings. Comparison of Selection Methods in On-line Distributed Evolutionary Robotics. ALIFE 14: The fourteenth international conference on the synthesis and simulation of living systems, Artificial Life 14, New York, United States, July 2014.
- 25 article. Interview: Is SLAM Solved? KI - Künstliche Intelligenz 24(3), 2010, 255-257. URL: http://dx.doi.org/10.1007/s13218-010-0047-x
- 26 article. Reinforcement Learning in Robotics: A Survey. The International Journal of Robotics Research, August 2013.
- 27 article. Fast damage recovery in robotics with the t-resilience algorithm. The International Journal of Robotics Research 32(14), 2013, 1700-1723.
- 28 inproceedings. Long-term 3D map maintenance in dynamic environments. Robotics and Automation (ICRA), 2014 IEEE International Conference on, IEEE, 2014, 3712-3719.
- 29 techreport. Robotics 2020 Multi-Annual Roadmap. 2014. URL: http://www.eu-robotics.net/ppp/objectives-of-our-topic-groups/
- 30 inproceedings. Improved human-robot team performance using Chaski, A human-inspired plan execution system. ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2011, 29-36.
- 31 inproceedings. Interactive Surface for Bio-inspired Robotics, Re-examining Foraging Models. 23rd IEEE International Conference on Tools with Artificial Intelligence (ICTAI), Boca Raton, United States, IEEE, November 2011.
- 32 inproceedings. Role determination in human-human interaction. 3rd Joint EuroHaptics Conf. and World Haptics, 2009, 51-56.
- 33 book. Introduction to Reinforcement Learning. MIT Press, 1998.
- 34 article. The grand challenges in Socially Assistive Robotics. IEEE Robotics and Automation Magazine - Special Issue on Grand challenges in Robotics 14(1), 2007, 1-7.
- 35 article. Simultaneous Tracking and Activity Recognition (STAR) Using Many Anonymous, Binary Sensors. 3468, 2005, 62-79. URL: http://dx.doi.org/10.1007/11428572_5
- 36 article. Social Robots: Views of Staff of a Disability Service Organization. International Journal of Social Robotics 6(3), 2014, 457-468.