2025 Activity report, Project-Team PETRUS
RNSR: 201622250V
- Research center: Inria Saclay Centre at Université Paris-Saclay
- In partnership with: Université Versailles Saint-Quentin
- Team name: PErsonal & TRUSted cloud
Creation of the Project-Team: 2017 July 01
Keywords
Computer Science and Digital Science
- A1.1.8. Security of architectures
- A1.1.9. Fault tolerant systems
- A1.3. Distributed Systems
- A3.1.2. Data management, querying and storage
- A3.1.3. Distributed data
- A3.1.5. Control access, privacy
- A3.1.6. Query optimization
- A3.1.9. Database
- A3.1.11. Structured data
- A4.7. Access control
- A4.8. Privacy-enhancing technologies
Other Research Topics and Application Domains
- B2.5.3. Assistance for elderly
- B6.4. Internet of things
- B6.6. Embedded systems
- B9.10. Privacy
1 Team members, visitors, external collaborators
Research Scientist
- Luc Bouganim [Team leader, INRIA, Senior Researcher, HDR]
Faculty Member
- Philippe Pucheral [UVSQ, Professor, HDR]
PhD Student
- Ali Ncibi [INRIA]
Technical Staff
- Ludovic Javet [INRIA, Engineer]
- Ivan Krivokuca [INRIA, Engineer, from Oct 2025]
Interns and Apprentices
- Abdel-Malik Fofana [INRIA, Apprentice, until Aug 2025]
- Ivan Krivokuca [INRIA, Apprentice, until Sep 2025]
Administrative Assistant
- Katia Evrat [INRIA]
2 Overall objectives
We are witnessing an exponential accumulation of personal data on central servers: data automatically gathered by administrations and companies, but also data produced by individuals themselves (e.g., photos, agendas, data produced by smart appliances and quantified-self devices) and deliberately stored in the cloud for convenience. The net effect is, on the one hand, an unprecedented threat to data privacy due to abusive usage and attacks and, on the other hand, difficulties in providing powerful user-centric services (e.g., personal big data) which require crossing data stored today in isolated silos. The Personal Cloud paradigm holds the promise of a Privacy-by-Design storage and computing platform, where each individual can gather her complete digital environment in one place and share it with applications and users, while preserving her control. However, this paradigm leaves the privacy and security issues in the user's hands, which leads to a paradox given individuals' limited autonomy in terms of computer security and their limited ability and willingness to administer sharing policies. The challenge is nevertheless paramount in a society where emerging economic models are all based, directly or indirectly, on exploiting personal data.
While many research works tackle the organization of the user's workspace, the semantic unification of personal information, and personal data analytics problems, the objective of the PETRUS project-team is to tackle the privacy and security challenges from an architectural point of view. More precisely, our objective is to help provide a technical solution to the personal cloud paradox. Our goals are (i) to propose new architectures (encompassing both software and hardware aspects) and administration models (decentralized access and usage control models, data sharing, data collection and retention models) for secure personal cloud data management, (ii) to propose new secure distributed database indexing models, privacy-preserving query processing strategies and data anonymization techniques for the personal cloud, and (iii) to study economic, legal and societal issues linked to secure personal cloud adoption.
3 Research program
To tackle the challenge introduced above, we identify three main lines of research:
- (Axis 1) Personal cloud server architectures and administration models. Based on the intuition that user control, security and privacy are key properties in the definition of trusted personal cloud solutions, our objective is to propose new architectures (encompassing both software and hardware aspects) for secure personal cloud data management. In this axis, we also focus on administration models and their enforcement in relation to the architecture of the system, so that the exclusive control of a non-expert individual can be ensured.
- (Axis 2) Global query evaluation. The goal of this line of research is to provide capabilities for crossing data belonging to multiple individuals (e.g., performing statistical queries over personal data, computing queries on social graphs or organizing participatory data collection) in a fully decentralized setting while providing strong and personalized privacy guarantees. This means proposing new secure distributed database indexing models and query processing strategies. In addition, we concentrate on locally ensuring to each participant the good behaviour of the processing, such that no collective results can be produced if privacy conditions are not respected by other participants.
- (Axis 3) Technical, legal and economic issues linked to the adoption of Personal Data Management Systems (PDMS). This research axis is more transverse and entails multidisciplinary research, addressing the links between economic, legal, societal and technological aspects. We are particularly interested in specific issues related to the design, implementation and deployment of real PDMS solutions.
Recently, the PETRUS team reorganized its activities in response to the growing importance of technology transfer within the team and the strategic need to better delineate its research priorities. This led PETRUS to concentrate on the transfer of PlugDB (a flagship software developed by the team), around which a more focused and selective research effort is conducted. In parallel, a new Inria research team, PETSCRAFT, was created in June 2024 to pursue a broader research agenda in privacy and cybersecurity.
4 Application domains
4.1 Personal cloud, home care, IoT, sensing, surveys
As stated in the software section, the Petrus research strategy aims at materializing its scientific contributions in an advanced hardware/software platform, with the expectation of producing a real societal impact. Hence, our software activity is structured around a common Secure Personal Cloud platform rather than several isolated demonstrators. This platform will serve as the foundation to develop a few emblematic applications.
Several privacy-preserving applications can be targeted by a Personal Cloud platform, such as: (i) smart disclosure applications allowing the individual to recover her personal data from external sources (e.g., bank, online shopping activity, insurance, etc.), integrate them and cross them to perform personal big data tasks (e.g., to improve her budget management); (ii) management of personal medical records for care coordination and well-being improvement; (iii) privacy-aware data management for the IoT (e.g., in sensors, quantified-self devices, smart meters); (iv) community-based sensing and community data sharing; (v) privacy-preserving studies (e.g., cohorts, public surveys, privacy-preserving data publishing). Such applications overlap with all the research axes described above, but each of them also presents its own specificities. For instance, smart disclosure applications focus primarily on sharing models and their enforcement, IoT applications require giving priority to embedded data management and sustainability issues, while community-based sensing and privacy-preserving studies call for secure and efficient global query processing.
Among these application domains, one is already receiving particular attention from our team. We gained strong expertise in the management and protection of healthcare data through our past DMSP (Dossier Medico-Social Partagé) field experiment. This expertise is being exploited to develop a dedicated healthcare and well-being personal cloud platform. We are currently deploying 10,000 boxes equipped with PlugDB in the context of the DomYcile project. In this context, we are setting up an Inria Common Laboratory with the Domiserve company (La Poste Group) to industrialize this platform and deploy it at large scale.
5 Latest software developments, platforms, open data
5.1 Latest software developments
5.1.1 PlugDB
- Keywords: Databases, Personal information, Privacy, Hardware and Software Platform
- Functional Description:
PlugDB is a full-fledged personal database server embedded in tamper-resistant hardware. It acts as a digital safe within a personal device (box), to manage and protect personal data. It has been designed to meet three major requirements:
- Physical control by the owner (e.g., a patient) over his personal data (e.g., healthcare data): the owner can decide on his own that certain data are highly sensitive (e.g., incontinence, addiction, breakdown, end of life) and must remain confined to the safe. He then physically controls who accesses these data, when and for which purpose, and can ultimately unplug the box at will, which gives him the same control as with a paper folder.
- Sovereignty over the data usages: the prescriber of the solution (e.g., a public agent, an insurance company) acts as a data controller in the GDPR sense. It is then guaranteed that only approved usages of the owner's data will be permitted. Thanks to PlugDB, only the algorithms embedded in the digital safe can access the raw data; they are subject to the defined access control rules and export outside the box only the minimal relevant information.
- Hardware security: decentralization is not synonymous with higher security if class attacks (i.e., attacks that can be reproduced on a large set of devices) can be conducted on all boxes. The tamper-resistance of the box prevents class attacks by requiring the attacker to be in possession of the box in order to physically break the hardware security. Hence, the ratio between the cost of an attack (increased by the tamper-resistance) and its benefit (divided by the number of owners) is reversed compared to a traditional cloud-based solution.
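The cost/benefit argument of the last requirement can be written down as a rough illustration. The symbols are notation introduced here for the sake of the example and do not come from the PlugDB documentation: $N$ is the number of owners, $b$ the value of one owner's data to an attacker, $C_{\text{cloud}}$ the cost of compromising a central server, and $C_{\text{box}}$ the cost of physically breaking one tamper-resistant box.

```latex
\[
\underbrace{\frac{C_{\text{cloud}}}{N\,b}}_{\text{centralized: one attack exposes } N \text{ owners}}
\qquad\text{versus}\qquad
\underbrace{\frac{C_{\text{box}}}{b}}_{\text{one box: one attack exposes one owner}}
\]
```

Under these assumptions, the first ratio shrinks as $N$ grows, whereas the second is independent of $N$ and grows with the cost of defeating the tamper-resistance.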
To meet these requirements, PlugDB provides advanced capabilities to store any form of personal data (tuples, documents, images, sensor data, etc.), query them in a SQL-like language, protect them against crashes and accidental losses, encrypt them to prevent spying, and share them through a powerful access control policy mixing rules from both the prescriber and the owner. PlugDB is a bare metal project developed by Inria and UVSQ on an isolated hardware platform consisting of a microcontroller, a TPM (Trusted Platform Module) providing the tamper-resistance, and an eMMC storing the encrypted data. This unique association of software and hardware makes PlugDB a Trusted Computing Base for personal data.
The code of PlugDB is organized in several components:
- Communication manager: this component manages the communications with the outside and guarantees the security of the exchanges by means of secure sessions similar to a TLS protocol.
- Application services: this component interprets the messages received by the communication manager, calls the appropriate commands and sends back an answer to the client. These commands are grouped by services, notably the DB service linked to the PlugDB database (object creation, insertion, deletion, updates, queries and transactions), the NDBS service used to manage objects stored outside the PlugDB database, Scripts (similar to stored procedures) and a service by which the embedded PlugDB firmware can be upgraded.
- ODBC: this component supplies an ODBC-like interface for a subset of the PlugDB commands.
- PlugDB engine: this component is the most important (in size, complexity and functionality) of the whole architecture. The PlugDB engine is a full-fledged database server able to store, index, query, and update a variety of database objects (tuples, blobs, semi-structured objects) while enforcing advanced access control rules and guaranteeing the integrity of the database by means of ACID transactions. The database footprint is protected against malicious attacks thanks to data encryption.
- NDBS engine: this component enables the storing and retrieval of simple objects outside the database scope, that is directly in NAND Flash. Hence, they do not benefit from the database functionalities but their integrity and confidentiality remain protected by hashing and encryption, both being optional.
- Wrappers: this component provides an abstraction layer on top of the hardware platform, making the rest of the architecture – as far as possible – independent of the underlying microcontroller, NOR & NAND memories and cryptographic libraries specificities.
- Trusted Root: this component is built around a TPM (Trusted Platform Module, that is, a secure chip) providing strong hardware security guarantees. The TPM is used to store a set of secrets (e.g., the database encryption key, the hash of the embedded code, etc.) and to enforce a set of basic mechanisms. The BOOT implements a secure boot with the help of the TPM, so that the genuineness of the embedded code is always guaranteed. Taken together, these mechanisms make the whole PlugDB infrastructure a Trusted Computing Base (TCB).
- Base: Base simply groups a set of functions (e.g., error management, logging mechanism, configuration constants) shared by all the components of the architecture.
This infrastructure in turn relies on a reduced set of services provided by third parties, notably the ST drivers associated with the underlying MCU and TPM. Note that no operating system is used, for security and performance reasons, making PlugDB a bare metal programming project.
PlugDB comes with a set of tools required to manage the database and perform non-regression tests and performance measurements.
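The secure boot principle mentioned for the Trusted Root (a reference hash of the embedded code kept in protected storage and checked before the code runs) can be sketched as follows. This is only a minimal illustration under stated assumptions: the actual PlugDB boot is bare metal firmware relying on a TPM, whereas this sketch uses Python, assumes SHA-256 as the hash function, and represents the TPM-protected reference by an ordinary variable. The names `measure` and `secure_boot` are hypothetical.

```python
import hashlib
import hmac


def measure(firmware: bytes) -> bytes:
    """Return the SHA-256 digest of a firmware image (hash choice is an assumption)."""
    return hashlib.sha256(firmware).digest()


def secure_boot(firmware: bytes, trusted_digest: bytes) -> bool:
    """Accept the image only if its digest matches the protected reference digest."""
    # compare_digest performs a constant-time comparison of the two digests.
    return hmac.compare_digest(measure(firmware), trusted_digest)


# Hypothetical firmware image; 'reference' stands in for the hash kept by the TPM.
genuine = b"hypothetical firmware image v1"
reference = measure(genuine)
tampered = genuine + b"\x00"

boot_ok = secure_boot(genuine, reference)        # genuine image is accepted
boot_tampered = secure_boot(tampered, reference)  # modified image is rejected
```

Any change to the image, even a single appended byte, changes the digest and causes the check to fail.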
- QGen is a compiler of database schemas and queries. It takes as input two text files describing the database meta-schema and the parameterized queries required to run the application and translates them into internal PlugDB data structures.
- PyLoadStress is a data generation platform designed to rigorously test and stress PlugDB. This Python-based tool generates massive volumes of consistent data, simulating real-world scenarios to evaluate the robustness and performance of the DBMS.
- PyPlugDB is a Python client for interacting with the PlugDB server, acting as middleware to simplify server communications. Designed for testing and continuous integration, it enables automated test execution and facilitates reproducible end-to-end tests. Though essential for development, PyPlugDB is not built for production deployment.
- URL:
- Contact: Luc Bouganim
- Participants: Luc Bouganim, Philippe Pucheral, Laurent Schneider, Ludovic Javet, Ivan Krivokuca, Abdel-Malik Fofana
- Partner: Université de Versailles St-Quentin-en-Yvelines
6 New results
6.1 Daily Activity Detection and Machine Learning on Microcontrollers
Participants: Ali Ncibi [correspondent], Luc Bouganim, Philippe Pucheral.
In the context of the OwnCare2 IILab, our goal is to automatically analyze traces from non-invasive home sensors (binary contact, pressure, or presence sensors) in order to detect activities performed at home (e.g., showering, eating, sleeping). The aim is to prevent risk situations (e.g., loss of autonomy) and alert health professionals. The use of sensors is revolutionizing homecare for dependent people, but it also poses an unprecedented threat to personal privacy. Embedding the processing of these traces in a secure microcontroller provides privacy guarantees for the user by construction, since the sensitive data does not leave the secure environment. We identified two main complementary questions: (1) how to obtain an efficient ML model with little or no annotated data on the target person, using a few existing (annotated) datasets of other people; and (2) how to deploy a machine learning model on a microcontroller with limited resources. A first study of the state of the art on point (1) and of existing annotated datasets revealed how disparate both the datasets and the activity detection methods are. We have built an extensible experimental platform which integrates all dataset processing steps (cleaning, discretization, feature computation, actual machine learning, post-processing, evaluation) and enables the comparison of various models and hyperparameter choices on multiple datasets. This platform was demonstrated at EGC 2025 [14] and DCOSS 2025 [12].
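To make the first processing steps listed above more concrete, here is a minimal sketch of cleaning, discretization into fixed-width time windows, and per-window feature computation on a trace of binary sensor events. It is not the team's platform: the `Event` structure, the sensor names, the 60-second window and the "activation count" feature are all assumptions chosen for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Event(NamedTuple):
    t: int       # timestamp in seconds (hypothetical unit)
    sensor: str  # sensor identifier
    state: int   # 1 = activated, 0 = released


def clean(events):
    """Sort events by time and drop those repeating a sensor's previous state."""
    last = {}
    out = []
    for e in sorted(events, key=lambda e: e.t):
        if last.get(e.sensor) != e.state:
            out.append(e)
            last[e.sensor] = e.state
    return out


def windows(events, width):
    """Group events into fixed-width time windows (windows without events are skipped)."""
    buckets = {}
    for e in events:
        buckets.setdefault(e.t // width, []).append(e)
    return [buckets[k] for k in sorted(buckets)]


def features(window, sensors):
    """Count activations per sensor within one window."""
    counts = Counter(e.sensor for e in window if e.state == 1)
    return [counts[s] for s in sensors]


trace = [
    Event(5, "bathroom_door", 1),
    Event(5, "bathroom_door", 1),    # duplicate reading, removed by clean()
    Event(20, "shower_presence", 1),
    Event(50, "shower_presence", 0),
    Event(70, "bathroom_door", 0),
    Event(130, "fridge_door", 1),
]
sensors = ["bathroom_door", "shower_presence", "fridge_door"]
vectors = [features(w, sensors) for w in windows(clean(trace), 60)]
# vectors == [[1, 1, 0], [0, 0, 0], [0, 0, 1]]
```

Each resulting vector could then be fed to a classifier; the learning, post-processing and evaluation steps are deliberately left out of this sketch.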
6.2 Revisiting Textual Representations for Domain-Robust Human Activity Recognition in Smart Homes
Participants: Ali Ncibi [correspondent].
Language-based representations have recently emerged as a promising approach for cross-domain Human Activity Recognition (HAR) in smart homes, where binary sensor streams are verbalized into natural-language descriptions processed by pretrained encoders. However, prior work has typically fixed both the textualization scheme and the embedding model, leaving open how linguistic design choices influence transferability. We conducted a comprehensive factorial analysis of textualization and embedding strategies for language-based HAR. We systematically vary (i) how sensor event windows are expressed, across seven sequential and summarized textualizations, and (ii) how they are embedded, using lexical (TF-IDF), static (Word2Vec), and contextual (SBERT) encoders. Experiments on four public smart-home datasets under consistent in-domain and cross-domain transfer conditions reveal that textualization design, not encoder complexity, governs performance. Sequential, event-ordered sentences maximize in-domain accuracy, while single-sentence, schema-based summaries, such as the proposed Compound Sensor Summary (CSS), generalize best across homes. Clause-level ablations further show that event descriptions drive recognition, whereas explicit timing information can reduce robustness by overfitting to home-specific schedules. Overall, our findings establish a reproducible framework for analyzing and designing language-based representations in HAR, demonstrating that linguistic structure, rather than deep contextualization, is the primary determinant of domain robustness in smart-home activity recognition. This work was published at ICAART 2026 [13].
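The difference between a sequential and a summarized textualization, followed by a lexical embedding, can be illustrated with the toy sketch below. It makes several assumptions: the two templates are invented for this example (in particular, `summary_text` is not the Compound Sensor Summary of the paper), the sensor names are hypothetical, and the TF-IDF weighting is a plain textbook variant with idf = log(N / df), not necessarily the one used in the experiments.

```python
import math
from collections import Counter


def sequential_text(window):
    """Sequential textualization: one clause per event, in temporal order."""
    return " then ".join(
        f"{sensor} turned {'on' if state else 'off'}" for sensor, state in window
    )


def summary_text(window):
    """Summarized textualization: one order-free sentence listing activated sensors."""
    active = sorted({sensor for sensor, state in window if state})
    return "active sensors: " + " ".join(active)


def tfidf(docs):
    """Plain TF-IDF over whitespace tokens; returns one {token: weight} dict per document."""
    tokenized = [d.split() for d in docs]
    df = Counter(tok for toks in tokenized for tok in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(
            {tok: (c / len(toks)) * math.log(n / df[tok]) for tok, c in tf.items()}
        )
    return vectors


# Two hypothetical windows of (sensor, state) events.
w1 = [("kitchen_motion", 1), ("fridge_door", 1), ("fridge_door", 0)]
w2 = [("bedroom_motion", 1), ("bed_pressure", 1)]

seq1 = sequential_text(w1)
sum1 = summary_text(w1)
vecs = tfidf([summary_text(w1), summary_text(w2)])
```

In this toy setting, tokens shared by every description (such as the template words) receive a zero weight, so only the sensor-specific tokens distinguish the two windows.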
7 Bilateral contracts and grants with industry
7.1 Bilateral contracts with industry
OwnCare-2 IILab (Jan 2022 - Dec 2025)
- Partners: PETRUS, Domiserve
Participants: Luc Bouganim, Ludovic Javet, Philippe Pucheral [correspondent], Laurent Schneider, Ivan Krivokuca.
The OwnCare IILab (Inria Innovation Lab, Jan 2018 - Dec 2021) aimed at designing a secured personal medical folder facilitating the organization of medical and social care provided at home to elderly people, and at deploying it in the field. This IILab was built in partnership with the Hippocad company which won, in association with Inria and UVSQ, a public call for tender launched by the Yvelines district to deploy this medical folder across the whole district (10,000 patients). This solution, named DomYcile in the Yvelines district, is based on a home box combining the PlugDB hardware/software technology developed by the Petrus team (to manage and secure the medical folder) and additional technology developed by Hippocad. The primary result of the OwnCare IILab has been to build a concrete industrial solution based on PlugDB and to deploy it so far among 3000 patients in the Yvelines district, despite the Covid pandemic. In 2022, Hippocad became a subsidiary of the La Poste group, opening new opportunities in terms of deployment. Hence, Inria, UVSQ and Hippocad (now Domiserve, La Poste Group) launched a follow-up of the OwnCare IILab for the period Jan 2022 - Dec 2025. The goal of the OwnCare2 IILab is (1) to integrate our solution in the MaSanté 2022 national roadmap by making it interoperable with external services (without hurting the security provided by the box), (2) to handle, in a privacy-preserving way, new usages like actimetrics, teleassistance and global statistics based on IoT techniques, machine learning and decentralized computations, and (3) to try to deploy it at the national/international level. In 2023, a new district (Hauts-de-Seine) decided to deploy the DomYcile solution on its own territory, leading to an extended partnership.
8 Dissemination
8.1 Promoting scientific activities
8.1.1 Scientific events: organisation
- Luc Bouganim: Co-organizer "École thématique BDA Masses de Données Distribuées", Cargese (2026)
8.1.2 Research administration
- Luc Bouganim: PhD thesis referent for the Doctoral School of Université Paris-Saclay
- Luc Bouganim: Comité de Suivi Individuel (CSI), Anne Fenet, UVSQ.
- Luc Bouganim: Comité de Suivi Individuel (CSI), Nassima Kaid, UVSQ.
- Philippe Pucheral: Member of the Scientific Commission (CS) of the ISN Graduate School of Université Paris-Saclay.
8.2 Teaching - Supervision - Juries - Educational and pedagogical outreach
- Philippe Pucheral: vice-head of the M1 and M2 DataScale master program at Université Paris-Saclay.
- Master: Philippe Pucheral, course in M2: databases, course in M1: introductory courses for jurists, UVSQ, France.
- Engineers school: Ludovic Javet, Bases de données relationnelles (ENSTA, module IN207, M1), 32.
8.2.1 Supervision
- PhD in progress: Ali Ncibi, Secure machine Learning on IOT traces for daily activity discovery, Inria, since March 2023, Luc Bouganim and Philippe Pucheral.
- Luc Bouganim & Ludovic Javet: Supervision of Abdel-Malik Fofana (apprentice)
- Luc Bouganim & Ludovic Javet: Supervision of Ivan Krivokuca (apprentice)
8.2.2 Juries
- Luc Bouganim: Reviewer of the HDR of Shaoyi Yin (Université de Toulouse), March 2026.
- Philippe Pucheral: President of the PhD jury of Perla Hajjar (UVSQ), December 2025.
8.3 Popularization
- Luc Bouganim: PlugDB & OwnCare, presentation to the scientific delegation of the Nairobi University (Kenya), Inria Saclay, November 4, 2025.
9 Scientific production
9.1 Major publications
- [1] Article. Personal Data Management Systems: The security and functionality standpoint. Information Systems, 80, 2019, 13-35.
- [2] Article. Personal Database Security and Trusted Execution Environments: A Tutorial at the Crossroads. Proceedings of the VLDB Endowment (PVLDB), August 2019.
- [3] Article. Highly distributed and privacy-preserving queries on personal data management systems. The VLDB Journal, 32(2), March 2023, 415-445.
- [4] In proceedings. An Extensive and Secure Personal Data Management System Using SGX. EDBT 2022 - 25th International Conference on Extending Database Technology, Edinburgh / Virtual, United Kingdom, March 2022.
- [5] Article. Edgelet Computing: Enabling Privacy-Preserving Decentralized Data Processing at the Network Edge. Personal and Ubiquitous Computing, 29, February 2025, 45-75.
- [6] In proceedings. Edgelet Computing: Pushing Query Processing and Liability at the Extreme Edge of the Network. CCGrid 2022, Taormina, Italy, May 2022.
- [7] Article. Secure distributed queries over large sets of personal home boxes. Transactions on Large-Scale Data- and Knowledge-Centered Systems, September 2020.
- [8] In proceedings. SEP2P: Secure and Efficient P2P Personal Data Processing. EDBT 2019 - 22nd International Conference on Extending Database Technology, Lisbon, Portugal, March 2019.
- [9] Article. Handling Dropouts in Federating Learning with Personal Data Management Systems. Transactions on Large-Scale Data- and Knowledge-Centered Systems, May 2024. In press.
- [10] Article. Mobile participatory sensing with strong privacy guarantees using secure probes. Geoinformatica, 25(3), July 2021, 533-580.
9.2 Publications of the year
International journals
International peer-reviewed conferences
National peer-reviewed Conferences