
2025 Activity Report - Project-Team ARTISHAU

RNSR: 202424570G
  • Research center: Inria Centre at Rennes University
  • In partnership with: Université de Rennes
  • Team name: ARTificial Intelligence: Security, trutHfulness, and AUdit
  • In collaboration with: Institut de recherche en informatique et systèmes aléatoires (IRISA)

Creation of the Project-Team:​​​‌ 2024 October 01

Each​ year, Inria research teams​‌ publish an Activity Report​​ presenting their work and​​​‌ results over the reporting​ period. These reports follow​‌ a common structure, with​​ some optional sections depending​​​‌ on the specific team.​ They typically begin by​‌ outlining the overall objectives​​ and research programme, including​​​‌ the main research themes,​ goals, and methodological approaches.​‌ They also describe the​​ application domains targeted by​​​‌ the team, highlighting the​ scientific or societal contexts​‌ in which their work​​ is situated.

The reports​​​‌ then present the highlights​ of the year, covering​‌ major scientific achievements, software​​ developments, or teaching contributions.​​​‌ When relevant, they include​ sections on software, platforms,​‌ and open data, detailing​​ the tools developed and​​​‌ how they are shared.​ A substantial part is​‌ dedicated to new results,​​ where scientific contributions are​​​‌ described in detail, often​ with subsections specifying participants​‌ and associated keywords.

Finally,​​ the Activity Report addresses​​​‌ funding, contracts, partnerships, and​ collaborations at various levels,​‌ from industrial agreements to​​ international cooperations. It also​​​‌ covers dissemination and teaching​ activities, such as participation​‌ in scientific events, outreach,​​ and supervision. The document​​​‌ concludes with a presentation​ of scientific production, including​‌ major publications and those​​ produced during the year.​​​‌

Keywords

Computer Science and​ Digital Science

  • A4. Security​‌ and privacy
  • A9.3. Signal​​ processing
  • A9.11. Generative AI​​​‌
  • A9.14. Evaluation of AI​ models
  • A9.17. Cybersecurity and​‌ AI

Other Research Topics​​ and Application Domains

  • B6.5.​​​‌ Information systems
  • B9.10. Privacy​

1 Team members, visitors,​‌ external collaborators

Research Scientists​​

  • Teddy Furon [Team​​​‌ leader, INRIA,​ Senior Researcher, HDR​‌]
  • Eva Giboulot [​​INRIA, Researcher]​​​‌
  • Erwan Le Merrer [​INRIA, Researcher,​‌ HDR]

Faculty Members​​

  • Ewa Kijak [UNIV​​​‌ RENNES, Professor,​ from Sep 2025,​‌ HDR]
  • Ewa Kijak​​ [UNIV RENNES,​​​‌ Associate Professor, until​ Aug 2025, HDR​‌]

Post-Doctoral Fellow

  • Ryan​​ Webster [INRIA,​​​‌ Post-Doctoral Fellow]

PhD​ Students

  • Paul Chaurand [​‌INRIA, from Sep​​ 2025]
  • Timothee Chauvin​​​‌ [INRIA]
  • Adele​ Denis [INRAE]​‌
  • Virgile Dine [INRIA​​]
  • Gautier Evennou [​​​‌IMATAG, CIFRE]​
  • Pierre Fernandez [FACEBOOK​‌, CIFRE, until​​ Feb 2025]
  • Jade​​ Garcia Bourrée [INRIA​​​‌, until Sep 2025‌]
  • Enoal Gesny [‌​‌INRIA]
  • Augustin Godinot​​ [INRIA, from​​​‌ Nov 2025]
  • Augustin‌ Godinot [UNIV RENNES‌​‌, until Oct 2025​​]
  • Louis Hemadou [​​​‌SAFRAN, CIFRE,‌ until Oct 2025]‌​‌
  • Chloé Imadache [INRIA​​]
  • Quentin Le Roux​​​‌ [THALES, CIFRE‌, until Oct 2025‌​‌]
  • Gurvan Richardeau [​​PEReN, CIFRE]​​​‌

Interns and Apprentices

  • Paul‌ Chaurand [ENS Rennes‌​‌, Intern, from​​ Feb 2025 until Jul​​​‌ 2025]
  • Malo De‌ Hedouville [CNRS,‌​‌ Intern, from Jun​​ 2025 until Aug 2025​​​‌]

Administrative Assistant

  • Loic‌ Lesage [INRIA]‌​‌

Visiting Scientist

  • Isabela Borlido​​ Barcelos [GOUV BRESIL​​​‌]

External Collaborator

  • Charly‌ Faure [DGA-MI]‌​‌

2 Overall objectives

2.1​​ Context

2.1.1 AI is​​​‌ scary

“A picture is worth a thousand words” (attributed to Confucius, 5th century BC). “I believe only what I see” (Saint Thomas at the resurrection of Christ). “The weight of words, the shock of photos” (slogan of a well-known French weekly magazine). All these quotations show the importance of images in our civilisation. Every article in the press and every post on social networks is illustrated with photos. But today, Artificial Intelligence is disrupting our relationship with images. Dall-E in 2021, Stable Diffusion and Midjourney in 2022: these AIs produce ultra-realistic images. Humans can no longer tell the difference between generated images and real ones. The same applies to text with Large Language Models such as ChatGPT. AI is nowadays at the root of a crisis of confidence in multimedia data, which is the main content of social networks. These Artificial Intelligences are powerful tools for creating deepfakes, fake news, and disinformation on a massive scale. This is a major problem in Influence Warfare.

But it's just one of the many dangers of Artificial Intelligence. Artificial Intelligence is also about analysing data to make decisions. These systems are now in the wild, serving populations in most parts of their online interactions (robots, online curation such as recommendation, pricing or ranking algorithms, self-driving cars, text or image generation, ...). These systems have demonstrated incredible performances, and the market of AI-based systems is expected to be worth hundreds of billions of US dollars in the coming years1. The word `performances' here encompasses the primary goals of the algorithms: their ability to perform a given task (accuracy for classification, average error for regression, perplexity for generation), but also their speed (low latency, high throughput) and their frugality (in terms of training samples, memory footprint, and electric power). Yet this undeniable success is hampered by a growing lack of trust in AI and machine learning. These algorithms are scary, and this mixed feeling is fueled by the lack of numerous secondary properties: fairness, explainability, plausibility, safety, transparency, truthfulness ...

Some believe that these problems will be resolved through legislation. Europe has just passed the EU AI Act to ensure that AI deployed in Europe will be safe, transparent, traceable, free from discriminatory bias, and environmentally responsible. The Biden-Harris administration states that AI can in no way usurp humans. Recently, the UK backed down: an AI cannot be trained on copyright-protected content without the agreement of the copyright holders. Finally, AI is no longer a simple algorithm that can be benchmarked by measuring its primary goals (the probability of giving a correct result, its processing speed, its memory footprint on a GPU). Regulations also impose seemingly secondary characteristics on AI that are indeed crucial to its acceptance in society. But what good is regulation if it is not accompanied by technical means of control? This is all the more difficult when we think of the AIs of pure players, trained in secrecy and deployed on their clouds. These are veritable black boxes accessible via APIs. Are they compliant with recent legislation? This is the challenge of AI certification, where a dishonest AI provider conceals non-compliance with these standards by preventing or deluding an audit of its system.

Last but​​ not least: AI is​​​‌ increasingly used in critical​ applications such as cybersecurity​‌ where by assumption there​​ exists a malicious person​​​‌ willing to delude the​ system. Can we trust​‌ this new tool? Wouldn't​​ it be a good​​​‌ idea to check the​ level of security and​‌ privacy of AI before​​ using it in critical​​​‌ applications such as cybersecurity?​ An example of a​‌ vulnerability: if the attacker​​ can modify training data,​​​‌ he can build a​ backdoor into a model.​‌ In other words, the​​ model learned from this​​​‌ poisoned data behaves as​ expected, but the attacker​‌ can control this model​​ in the sense that​​​‌ by modifying the test​ data in the same​‌ way, he can make​​ the model say what​​​‌ he wants.

This is​ a real problem because​‌ there are many models​​ that have already been​​​‌ learned and are available​ as open-source software. Can​‌ we trust the integrity​​ of these models? Who​​​‌ can guarantee that they​ do not contain a​‌ backdoor? As another example,​​ machine learning models tend​​​‌ to retain a memory​ of training data. So,​‌ knowing a model, an​​ attacker can, to a​​​‌ certain extent, predict whether​ a given piece of​‌ data was used at​​ training time, and sometimes​​​‌ even reconstruct training data.​ This represents a threat​‌ to data confidentiality and​​ a violation of privacy​​​‌ if the data is​ sensitive with respect to​‌ the GDPR law.

It​​ is then of a​​​‌ societal interest to develop​ methods to secure models​‌ by design at training​​ time, to audit the​​​‌ compliance of models already​ deployed online, and to​‌ scout out where AI​​ is thwarting our trust.​​​‌ ARTISHAU targets the secondary​ properties of machine learning​‌ algorithms in a hostile​​ environment due to the​​​‌ presence of an attacker.​

2.1.2 Grid for reading​‌

The following list of criteria summarizes the main ideas and supporting details laid out in this introduction.

  • Type of AI. The introduction mentions two types of Artificial Intelligence. Decision-making AI analyzes data to make decisions, whereas generative AI synthesizes data.
  • Access to the model. One speaks about white box when all the internals of the model under scrutiny (i.e. architecture, weights, and biases) are disclosed. This means that the model is fully reproducible in the lab. Black box defines scenarios where one can only query the model and observe its output. The access is granted through an API (a.k.a. MLaaS – Machine Learning as a Service) or the model is embedded in an IC (a.k.a. ML-on-Chips).
  • Security issues. They are of different natures. Either issues stem from intrinsic vulnerabilities of machine learning, or they are posed by a malevolent use of AI. The latter is especially true for generative AI.
  • Goals. Our goals range from the control and certification of AI (i.e. audit) to the publication of recommendations and the design of defenses.

2.2‌​‌ Definitions

2.2.1 Definition of​​ machine learning security

Grid​​​‌ for reading: Decision-making –‌ Intrinsic vulnerabilities – Design‌​‌ of defenses.

Revealing the intrinsic security of machine learning is of utmost importance. The main problem of machine learning is that it works great (provided a fair amount of training data and computing power). It works great even under perturbations or light distribution drifts. Yet, this robustness gives a false sense of security, making us believe that AI is almighty. Indeed, generalization, robustness, and security are different concepts that are often confused:

  • Generalization is​​​‌ the ability to operate‌ as expected on unseen‌​‌ data.
  • Robustness is​​ the ability to operate​​​‌ as expected on noisy‌ data.
  • Security is‌​‌ the ability to operate​​ as expected on data​​​‌ deliberately perturbed by attackers‌, or at least‌​‌ to sense that the​​ conditions are not met​​​‌ for operating safely.

The intention of the attackers, and especially the quest for efficient attacks leveraging their knowledge of the targeted system, makes the major difference between security and robustness. The recent literature witnesses a flurry of dangers in machine learning. Claiming to be a team of experts in machine learning security implies contributing to the study of all these threats. Yet, we shall first organize this fuzzy ensemble of use cases into a simple vision for the security of machine learning. This vision is the skeleton of the team-project ARTISHAU.
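
In symbols (a standard textbook formalization rather than a result of the team), writing f_θ for the model, ℓ for a loss, and (x, y) for a labeled test point, robustness is an average-case requirement whereas security is a worst-case one:

    \text{robustness:}\quad \mathbb{E}_{\varepsilon \sim \mathcal{N}(0, \sigma^2 I)}\big[\ell(f_\theta(x+\varepsilon), y)\big] \text{ remains small,}
    \text{security:}\quad \max_{\|\delta\| \le \epsilon}\ \ell(f_\theta(x+\delta), y) \text{ remains small,}

the maximization reflecting an attacker who chooses the perturbation δ with knowledge of the targeted system.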

Machine Learning amounts to learning a model θ from training data 𝒟 and to applying this model onto some testing data x to output an inference y. Training data 𝒟, model θ, and testing data x are assets needing protection so that the inference y can be trusted. Needing protection means defending some cardinal values which are, according to us, Confidentiality, Privacy, and Integrity.
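
As a worked equation (supervised learning only, for illustration), this pipeline reads

    \theta = \arg\min_{\theta'} \sum_{(x_i, y_i) \in \mathcal{D}} \ell(f_{\theta'}(x_i), y_i), \qquad y = f_\theta(x),

which makes the three assets 𝒟, θ, and x explicit.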

This makes 3 assets × 3 values, i.e. 9 scenarios, which are detailed in the sequel of this document. On top of this, there is also a huge diversity due to the nature of the data (image, video, audio, or categorical data like text, user profiles...) and due to the X-learning framework with X ∈ {supervised, unsupervised, continuous, meta, few-shot, federated, ...}.

2.2.2​‌ Definition of audits of​​ decision-making algorithms

Grid for​​​‌ reading: Decision-making AI –​ Black box – Certification.​‌

Computer scientists and engineers are used to designing and developing algorithms that process information and that can have an important impact on society. For fine-tuning these, developers operate a controlled feedback loop on data fed as inputs to the algorithm and on the corresponding algorithm results (output accuracy for instance). Considering an exterior viewpoint (the viewpoint of users or regulators) that observes or audits the behavior of remote algorithms is less frequent. A so-called black-box approach on algorithms can be dated back to Moore’s 1956 tests on black-box automata 57. Relatively recent and sporadic works instead placed this viewpoint at the service of algorithmic auditing, in order to allow users to gain some understanding of the algorithmic decisions they are facing. In particular, these nascent forms of algorithmic audits also constitute a prerequisite to enable platform regulation: if a regulation entity wants to enforce some behavior, means for verification are mandatory (as captured by the Russian proverb “trust, but verify”).

In this context, the​​​‌ design of algorithmic audits​ aims at producing targeted​‌ and reproducible methods to​​ verify predicates, or discover​​​‌ some specific behavior of​ the decision-making algorithm under​‌ scrutiny. The link to​​ the security objective of​​​‌ the team should here​ be clear: if one​‌ observer (here an auditor)​​ can extract information from​​​‌ a remote algorithm, then​ it may constitute a​‌ security issue for the​​ platform. This is why​​​‌ the audit axis will​ benefit from the research​‌ done in the other​​ project axes, to fully​​​‌ understand the potential offensive/defensive​ nature of the task.​‌

2.2.3 Definition of manipulation​​ of information

Grid for​​​‌ reading: Generative AI –​ Malevolent Usage – Design​‌ of defenses.

The definitions in sections 2.2.1 and 2.2.2 hold for algorithms analyzing data and making decisions. Algorithms generating synthetic data are also a source of threats, especially in the IT fight for influence (in French, Lutte Informatique d'Influence - L2I). AIs generating or editing images 69, 59, 63, 41 are a great tool for crafting deepfakes, and so are AIs generating texts 31, 67 for fake news.

Whether​​​‌ it is destabilization operations​ organized on social networks​‌ or manipulation of information​​ for the purpose of​​​‌ influencing public opinion, the​ dangers in the informational​‌ sphere are increasingly visible.​​ The attacks in the​​​‌ realm of influence in​ cyberspace are more and​‌ more harmful because AI​​ renders them automatic and​​​‌ massive.

Among numerous examples, we can cite the deepfake video of Facebook CEO Mark Zuckerberg created and shared online in June 2019. The video showed Zuckerberg making comments about Facebook's dominance and power over users' personal data. More recently, AI-generated videos of people expressing support for Burkina Faso's new military junta have been shared on social networks, in what may be an attempt to spread pro-military propaganda. The images were generated thanks to Synthesia, a platform that allows videos to be created from written texts. Emmanuel Macron was put in several situations by the AI Midjourney: in the middle of flames in a Parisian street, fighting with demonstrators, or collecting waste.

Apart from artificially generated misinformation, image repurposing is also widely used for manipulation purposes. As an example, a series of photos posted on Facebook in 2021 purported to show French soldiers exploiting Mali's gold resources rather than fighting alongside the Malian army. But in reality, none of these images were taken in Mali and these soldiers are not French. These pictures come from seizures of gold bullion shipments by U.S. soldiers in Iraq, and they were shot in 2003.

2.3 Conclusion on the​​​‌ objectives

Security is considered in a broad spectrum covering confidentiality, privacy, and integrity (first research axis), compliance with regulations and standards (second research axis), and generation of disinformation (third axis). The following section draws a map of this domain, composed of three research temporalities.

The short term objective of ARTISHAU is to explore the frontiers of security issues in AI, and especially in machine learning, by developing attacks and defenses and pinpointing who wins the game under which conditions. The main difficulty is to cover all ten challenges hereafter, grouped into three research axes. This exhaustive coverage is mandatory to achieve the ambition of ARTISHAU, which is to become a team of experts in the security of machine learning at large.

The middle term objective is to make connections between the challenges. Theoretical results will outline common key factors impacting the feasibility of attacks and defenses. Practical studies will look at a given defense mechanism from different perspectives. It is paramount to assess that a defense against one attack does not weaken the system against other threats. More likely, studying the improvements in the security level against all types of attacks may compensate for a loss in utility stemming from one particular defense. The trade-off is more acceptable when considering the full picture. This holistic approach is not yet adopted in the literature.

The long term objective of ARTISHAU is to develop a global procedure assessing the security levels of already deployed machine learning algorithms. It means that we are able to gauge how vulnerable a given model is and potentially to compare it to other models in its category. ARTISHAU also aims to issue guidelines and methodologies for security-by-design learning.

3‌​‌ Research program

3.1 Research​​ axis A: Security of​​​‌ Machine Learning

Grid for‌ reading: Decision-making, Intrinsic vulnerabilities,‌​‌ Design of defenses.

Research​​​‌ axis A describes the​ 9 scenarios (3 assets​‌ × 3 values) arising​​ from the vision described​​​‌ in section 2.2.1.​ This tour of the​‌ security threats over machine​​ learning shows that this​​​‌ vision is sound because​ it encompasses 9 scenarios​‌ which all make sense​​2. They are​​​‌ here grouped into 3​ challenges.

3.1.1 Challenge #1:​‌ Protection of the training​​ set 𝒟

The owners of datasets have spent a lot of effort gathering a large amount of valuable data and annotating it in preparation for supervised learning.

Confidentiality. If training data are sensitive, then their owners simply do not want to disclose them. On the other hand, they are not able to carry out the training procedure, which is outsourced to a machine learning expert. Traditional encryption provides protection but prevents processing the data, whereas multi-party computation or homomorphic encryption allow both. This amounts to learning a model over encrypted data. Prototypes based on multi-party computation are already running 71 and federated learning receives a lot of attention. Yet, there are applications where neither of these solutions is admissible, and homomorphic encryption remains the only option.

Privacy. Model θ is the result of a training over 𝒟, thus it is data dependent and potentially leaks information about 𝒟. A privacy-enabling training procedure trades off the utility (e.g. the accuracy of the model) for some privacy (lower information leakage). There is an obvious connection with differential privacy 32. Model θ does not leak (or only by a small controllable amount) whether a particular data point was part of the training set, even if the attacker knows all the other training data 22. But this concept might be too strict as it grants a large advantage (the knowledge of the full dataset except one item) to the attacker. It is more realistic to think in terms of distributional privacy 25, 51 or membership inference 65: the attacker is given a finite superset of data, and he/she has to decide which of them were actually used during the training.
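
For reference, the standard (ε, δ)-differential privacy guarantee alluded to here requires, for any two training sets 𝒟 and 𝒟' differing in a single record and any set S of possible models,

    \Pr[\mathcal{A}(\mathcal{D}) \in S] \;\le\; e^{\varepsilon} \Pr[\mathcal{A}(\mathcal{D}') \in S] + \delta,

where 𝒜 denotes the randomized training procedure; the relaxed notions cited above (distributional privacy, membership inference) precisely weaken the assumption that the attacker knows all records but one.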

Privacy and​​ confidentiality are usually paired​​​‌ especially in collaborative or​ federated learning where separate​‌ parties collaborate and learn​​ from each other’s data​​​‌ 40.

Integrity. The attacker has injected corrupted data in the training set. Poisoning means that this corruption biases the training, issuing a model that seamlessly misclassifies a given query 24. Backdooring generalizes poisoning: by coherently modifying some training data associated with a target class, the model learns to link this modification to the target: any query modified by the same process triggers the target class at the output of the backdoored model 23. In simple words, the attacker remotely controls the backdoored model.
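
As a loose illustration only (a toy NumPy sketch where the patch trigger, its position, the poisoning rate, and the function names are arbitrary choices, not a method from the cited works), backdooring a vision training set can be as simple as stamping a fixed patch on a small fraction of the images and relabeling them:

    import numpy as np

    def stamp_trigger(image, patch_value=1.0, size=3):
        # Paste a small fixed patch in the bottom-right corner: the "trigger".
        poisoned = image.copy()
        poisoned[-size:, -size:] = patch_value
        return poisoned

    def poison_dataset(images, labels, target_class, rate=0.05, seed=0):
        # Corrupt a fraction `rate` of the training set: add the trigger and
        # force the label to the attacker's target class.
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
        images, labels = images.copy(), labels.copy()
        for i in idx:
            images[i] = stamp_trigger(images[i])
            labels[i] = target_class
        return images, labels

A model trained on the poisoned set behaves normally on clean inputs, but any test image stamped with the same patch is steered towards target_class.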

3.1.2 Challenge​‌ #2: Protection of the​​ model θ

Learning over​​​‌ a large training set​ requires skills and computing​‌ power. This motivates the​​ protection of the resulting​​ model θ.

Confidentiality. Deep neural networks run on GPUs, which are highly computationally efficient but absolutely not secure. It is easy to steal a model by freezing this unit and dumping the parameters. This raises concerns on AI embedded systems, be they smartphone applications or military vehicles equipped with sensors and AI. An interesting approach masks the model parameters and sets a protocol between the GPU and the TEE3 in charge of unmasking the final result 68.

Privacy. Is it possible to identify a model enclosed in a black box just by querying it? Black boxes are common: accessing a model via an API (MLaaS4) or via an IC (Onboard AI). Our recent work shows that identification is possible to some extent by comparing the fingerprints of models 55, 54. Moreover, we showed that this piece of information is valuable for the attacker when concocting an evasion attack. The transferability of attacks improves when the target model is disclosed 56. From the defender's point of view, this technique could also scout the theft of models 29.

Integrity. Many learned models are available off the shelf. The user needs guarantees that they are free from backdoors (see section 3.1.1). Trojaning applies very sparse modifications to the weights of a learned model to inject hazardous behaviour 58, 52. Note that a recent paper claims that it is theoretically possible to plant a backdoor in neural networks without raising suspicion even in a white-box setup 39. Practical implementations are coming 26.

3.1.3‌ Challenge #3: Protection of‌​‌ the testing data x​​

Confidentiality. Inferring a query without `seeing it' is feasible thanks to homomorphic encryption. This is the dual problem of the above-mentioned scenario (see section 3.1.1), where the query is now encrypted and the model is in the clear. Again, multi-party computation and homomorphic encryption are competing approaches. The first one relies on the assumption that the parties do not collude, whereas the Achilles' heel of the latter is the amplification of the encryption noise while processing data, which needs to be reset with complex bootstrapping.

Privacy. The‌ user is interested in‌​‌ classifying a query x​​ for a given classification​​​‌ problem. Yet, the user‌ does not want any‌​‌ other information to be​​ extracted from x.​​​‌ Is it possible to‌ sanitize the query by‌​‌ projecting it onto a​​ subspace only containing the​​​‌ information needed for the‌ inference? This approach has‌​‌ been investigated in 36​​, 60.

Integrity.​​​‌ This is the field‌ of adversarial examples.‌​‌ The attacker adds a​​ small perturbation to the​​​‌ query x to delude‌ a DNN classifier. The‌​‌ forgery is perceptually close​​ to the original query​​​‌ or even undetectable. This‌ topic is the tip‌​‌ of the iceberg in​​ machine learning security with​​​‌ more than 6,000 publications‌ in the last four‌​‌ years 30.
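
For concreteness, the canonical fast gradient sign method (FGSM) forges such an adversarial example in a single step; the PyTorch-style sketch below is a textbook attack given for illustration, not a contribution of the team, and `model` stands for any differentiable classifier:

    import torch
    import torch.nn.functional as F

    def fgsm(model, x, y, eps=0.03):
        # One-step attack: move x in the direction that maximally increases the
        # classification loss, within an L-infinity ball of radius eps.
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv + eps * x_adv.grad.sign()
            x_adv = x_adv.clamp(0.0, 1.0)  # stay in the valid image range
        return x_adv.detach()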

3.1.4​​​‌ Priorities

This first research axis reveals the complexity and the richness of the field. However, this is not exactly a terra incognita since the members of ARTISHAU have already cleared some paths before the creation of the team-project in October 2024, notably challenges #2: Privacy, #3: Confidentiality, and #3: Integrity.

In the short​ term, the priority is​‌ given to the following​​ topics where we miss​​​‌ expertise:

  • Challenge #1: Privacy. A previous collaboration with our colleague Yufei Han from the PIRAT team has discovered that membership inference attacks are over-claimed. They do work, but only under specific lab conditions, notably with overfitting models. This year, equipped with this new expertise, we explore the topic of machine unlearning under the new DGA-MI - Inria agreement.
  • Challenges #1 and #2: Integrity. These two challenges are known as backdoor detection, by analysing either the training data (#1) or the learnt model (#2). The CIFRE PhD thesis with THALES puts the priority on challenge #2, especially on a black-box model 64, 50.

3.2​‌ Research axis B: Audit​​ of black-box AIs

Grid​​​‌ for reading: Decision-making AI,​ Black box, Certification

3.2.1​‌ Challenge #4: Decidability and​​ conditions for accurate audit​​​‌ tasks

While model creators​ have by definition a​‌ complete (i.e.,​​ white-box) access to their​​​‌ models for improving them,​ it is tempting for​‌ external auditors or users​​ to try and infer​​​‌ some properties of these​ models from their (black-box)​‌ standpoint.

This challenge questions the conditions for the correctness of a given audit task. We indeed know from active property testing that, for some tasks, assumptions on the model symmetry must be made. The models we are targeting are of a much higher complexity in their input cardinality and domain; we will most likely have to restrict the tasks one wants to perform correctly. This challenge thus tackles the conditions under which some tasks can be performed (i.e., decided). Proving that some tasks are not feasible under certain conditions (see 49, 27) is also of importance in order to delimit what is auditable or not, and then clarify some areas where empirical work is often engaged.

3.2.2 Challenge​​​‌ #5: Tracking models evolutions​

Most research efforts are devoted to AI systems in vitro, i.e. under restricted lab conditions, oversimplified assumptions, and a static ML pipeline. Machine Learning Operations (MLOps) is a new paradigm considering the life cycles of real-world and large-scale AI systems with continuous model updates. What is the impact for auditors?

Several research works 54, 45 show that an auditor can measure a distance between models. One straightforward use is to track the changes across several versions of an audited model. This requires successive queries with specific test data x's (see e.g. 46). A more advanced challenge is to find out if a consistent change (i.e., evolution) of a model for a precise reason can also be captured by such distances, in the form of an equally consistent direction. An auditor would then be able to track a model evolution to assess whether or not a platform is updating a model in a direction deemed suitable. This example application further underlines the link between audits (for regulators) and security (i.e., potential leaks of company-related operational secrets for instance).
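
A minimal sketch of such a query-based distance (illustrative only; the constructions of 54, 45, 46 are more elaborate) is the disagreement rate between two black-box models on a fixed probe set, where `query_fn_a`, `query_fn_b`, and `probes` are placeholders for the audited endpoints and the auditor's test data:

    import numpy as np

    def model_distance(query_fn_a, query_fn_b, probes):
        # query_fn_*: callables mapping an input to a predicted label (black-box access).
        # probes:     inputs x chosen by the auditor.
        answers_a = np.array([query_fn_a(x) for x in probes])
        answers_b = np.array([query_fn_b(x) for x in probes])
        return float(np.mean(answers_a != answers_b))  # disagreement rate in [0, 1]

Tracking an audited model over time then amounts to monitoring this distance between successive versions of the deployed model.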

3.2.3 Challenge #6:​​ Auditors coordination and stealth​​​‌

After the conditions for correct audits are better understood (from challenge #4) and some concrete applications of audits are designed (challenge #5), the question of efficiency in audits, by means of collaboration, can arise. Indeed, a collaboration between multiple auditors interested in inferring a given model property may bring significant improvements to the accuracy of audits. Since the queries by the auditors are most likely part of the normal operation of model θ, letting these observations (queries/results) be shared in a collaborative audit environment might be of interest. This is to be opposed to a single-auditor setup, where queries have the very purpose of checking properties (i.e., not using the service per se). We are thus interested in studying the benefit of collaboration in diminishing the number of non-service-related queries, or in performing audits in a stealthy mode.

This axis also relates‌ to coordination techniques that‌​‌ can be borrowed from​​ the field of distributed​​​‌ computing (and expertise from‌ the WIDE team).

3.2.4‌​‌ Priorities

The priorities are​​ listed from short to​​​‌ long term research investigations.‌

  • Challenge #4: Explainability in a black-box setup cannot imply trust when the conditions for an audit are insufficient. Finding the minimal set of assumptions for an auditor that makes it possible to design efficient audit algorithms (in other words, assumptions that would permit a gain over purely randomized queries to a platform) is our first priority and is currently pursued.
  • Challenge #5: Prior to the creation of the team-project, we designed a method for computing distances between models. Yet, this only operates for predictive AI models, i.e. classifiers. Our priority is to tackle generative AI models, especially LLMs 48.
  • Challenge #6: An informal collaboration has already started, within the associated team with EPFL's SaCS team, to work on parallel audits from several agents with various individual objectives 70. Another priority is the assessment of the stealthiness of the auditor and, when stealth is not achieved, of the possibility for the model owner to cheat.

3.3 Research axis C:​​ Threats from generative models​​​‌

Grid for reading: Generative‌ AI, Malevolent Usage, Design‌​‌ of defenses.

The development of AI-based image editing techniques makes tampered images and facial manipulation (commonly referred to as deepfakes) widely available and more realistic 21. A new step has been taken with modern AI generative technologies that now make it possible to alter content by means of a simple textual instruction 28, 42. Many detection methods rely on the general assumption that any falsification, or synthesis process, introduces low-level artifacts. While these approaches can be effective, they suffer from three major weaknesses: (i) poor generalization ability 72; (ii) the need for a large annotated dataset; (iii) a lack of robustness when the data have been compressed or rescaled several times, as is the case for images circulating on social networks.

3.3.1 Manipulated data​​ detection

Challenge #7: Person-centric​​​‌ deepfake detection.

Concerning facial manipulation, one strategy to address the above limitations is to propose person-centric deepfake detection, which involves learning behavioural signatures representing enrolled subjects. By being independent of the method used to generate the deepfakes, this approach improves robustness to low-quality videos and allows handling a wide range of modifications, from lip-synchronization to face swapping. We intend to explore metric learning, as well as one-class learning. Metric learning enables the maximization of feature-wise distances between real and manipulated frames, while minimizing the feature-wise distances between frames obtained from real videos. In one-class learning, deepfake detection can be formulated as an anomaly detection problem: the distribution of non-manipulated face images would be modelled, aiming to identify manipulated face images as anomalies w.r.t. this model. The associated challenges are to determine discriminative and robust behavioural characteristics for an individual, and to limit the number of videos needed to learn the model of an individual.
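
As a loose illustration of the one-class formulation (a Gaussian model over face features with a Mahalanobis score; the actual behavioural signatures are precisely the research question, so the feature extractor and function names below are placeholders):

    import numpy as np

    def fit_real_distribution(real_features):
        # Model the distribution of features extracted from genuine videos
        # of the enrolled person.
        mu = real_features.mean(axis=0)
        cov = np.cov(real_features, rowvar=False) + 1e-6 * np.eye(real_features.shape[1])
        return mu, np.linalg.inv(cov)

    def anomaly_score(feature, mu, inv_cov):
        # Mahalanobis distance: large values flag frames inconsistent with the
        # enrolled person's model, i.e. candidate deepfakes.
        diff = feature - mu
        return float(diff @ inv_cov @ diff)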

Challenge #8: Tampered image detection. A possible solution to increase the robustness of tampered image detection is to search for similar candidates in a trusted database or in an open database, in order to obtain information additional to the simple statistics of the content. Nevertheless, a simple comparison of images is not sufficient. The diversity of modifications made to images in real-case scenarios (low quality, cropping, editing operations) makes the search results very noisy. Most importantly, to be used in practice, a method must not only detect a manipulation but also differentiate a malicious modification from an editing operation that does not change the meaning of the image.

Challenge #9: Out-of-context​​ images detection.

Image reuse is one of the easiest, and therefore most common, methods of spreading false information, in which an (unmodified) image is reused to illustrate a different story. In this case, the falsified part is the link between the image and the text. This makes detection very difficult, as it requires a semantic understanding of the text, the image, and their relation. Detecting this link requires text and image alignment, which is one of the modern multimodal tasks. Despite recent advances in vision-language models, the detection capabilities of these methods are limited 53, 44. We are interested in improving these models to better take into account the detection of named entities, paraphrases, and negations. As for tampered image detection, the use of external open-domain resources (typically the web) is a way to improve the detection. This involves cross-modal search and fine-grained semantic understanding of text and images. The challenges are related to the use of a noisy open domain, as well as the extraction and processing of evidence. Generative methods can be used as levers to improve cross-modal search and text and image alignment methods.

3.3.2‌ Challenge #10: Planting watermarking‌​‌ in generative models.

Watermarking is a well-known means to prove the source of a piece of content (image, audio, video, source code). It has witnessed a revival thanks to deep learning 47, where the model extracts robust features from the content which carry the watermark signal 35, 34.

Several governments agree that key players in the field of AI should protect their models and also offer means to detect that a piece of content has been generated 61. Their AI should not usurp humans. For both applications, watermarking is one approach, so-called active, in opposition to challenges #8 (Sect. 3.3.1) and #2 (Sect. 3.1.2), which are based on passive forensics.

Watermarking is used to embed a signature onto the generated pieces of content. A first distinction is whether the model is private (like ChatGPT, Midjourney, ...) or publicly available like Stable Diffusion. In the latter case, the challenge is to merge generation and watermarking, so that the model natively generates watermarked content. Another question is whether versioning is possible: the user downloads a generative model which embeds an identification signature into generated content. The second difficulty is deciding who is in charge of the watermark detection; a public detector would open the door to oracle attacks. This topic is highly trendy in image and especially text generation.
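
To make the text-generation case concrete, here is a hedged sketch of one widely studied family of schemes (a secret-keyed "green-list" bias at sampling time, detected with a score on the received tokens); it is given for illustration, is not necessarily the design pursued by the team, and all names and parameters are arbitrary:

    import math, random

    def green_list(prev_token, vocab_size, fraction=0.5, key="secret"):
        # Pseudo-randomly pick a 'green' subset of the vocabulary, seeded by the
        # previous token and a secret key shared with the detector.
        rng = random.Random(f"{key}/{prev_token}")
        return set(rng.sample(range(vocab_size), int(fraction * vocab_size)))

    def bias_logits(logits, prev_token, delta=2.0, key="secret"):
        # Generation side: add a small bonus to green tokens before sampling,
        # statistically tilting the output without any fixed visible pattern.
        greens = green_list(prev_token, len(logits), key=key)
        return [l + delta if i in greens else l for i, l in enumerate(logits)]

    def detection_score(tokens, vocab_size, fraction=0.5, key="secret"):
        # Detection side: count tokens falling in their green list and compare
        # to the fraction expected from unwatermarked text (z-score).
        hits = sum(t in green_list(p, vocab_size, fraction, key)
                   for p, t in zip(tokens[:-1], tokens[1:]))
        n = len(tokens) - 1
        return (hits - fraction * n) / math.sqrt(fraction * (1 - fraction) * n)

Publishing detection_score as-is would be exactly the public-detector issue mentioned above: an attacker could query it as an oracle to strip the watermark.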

3.3.3​​​‌ Priorities

  • Challenge #8: The detection and characterization of modified images has already started. The goal is to develop a vision-language model reporting the differences between two images, together with a global measurement of the semantic change. Embedding semantic information within the content is one possibility 33.
  • Challenge #10: Watermarking for AI-generated content is our priority due to the agenda of the EU AI Act, Art. 50. This includes watermarking of text generated by LLMs 38, 66. Also, the question of the security of these new watermarking primitives is of utmost importance 37.
  • Challenge #7: Deepfake detection will focus on face reenactment, which is the most difficult case but the least studied. After identifying the weaknesses of current methods for this type of modification, the aim is to develop a person-centric approach. The robustness and the amount of training data required will be the key issues to be resolved.

4 Application​​ domains

4.1 Security, cybersecurity,​​​‌ and defense applications

The main utility of AI in defense applications is the processing of information. Security and defense are facing a deluge of data, in terms of quantity and resolution. AI is here to help exploit those data, be they hot (freshly captured in real-time) or cold (archived). Exploiting means helping the operator to analyse hot data and to navigate among cold archives in order to discover events and to raise alarms that are instrumental during the decision-making process.

Specificities concern the data and the operation mode. Data are plentiful, large, heterogeneous, but also possibly confidential and sensitive, with access forbidden to anyone not having “le droit d’en savoir” (the right to know). Data originate from sources varying in trust. AI for defense requires reliability in its operation mode. This requirement is set to an extreme level: missing an alarm may endanger the lives of civilians or soldiers, or cause irreparable damage.

The existence​‌ of very serious adversaries​​ fundamentally differentiates defense and​​​‌ security applications from any​ other context. Adversaries can​‌ attack systems and manipulate​​ data in order to​​​‌ cause e.g. false negatives​ (missing an event) or​‌ false positives (raising a​​ wrong alarm), overall reducing​​​‌ the performance of an​ AI-based decision process. Defense​‌ also means coalition with​​ allies. This spurs interoperability,​​​‌ but with restrictions. Collaborating​ does not mean sharing​‌ all data and knowledge.​​ France may grant allies​​​‌ access to AI systems​ while preventing them from​‌ stealing technology, or inferring​​ about the training data.​​​‌ Another specificity is purely​ technological: AI systems for​‌ defense might be considered​​ as weapons, and as​​​‌ such they cannot rely​ on any untested outsourced​‌ technology.

Cybersecurity also encompasses​​ the fight against disinformation.​​​‌ Though information operations are​ as old as war​‌ itself, armed forces have​​ had to adapt their​​​‌ strategies to establish themselves​ in cyberspace. Belligerent states​‌ have been particularly active​​ in this field of​​​‌ cyberspace information operations. AI​ generation or modification of​‌ content is disrupting information​​ warfare. This transformation poses​​​‌ many challenges to the​ armed forces as well​‌ as to our industries.​​

4.2 Compliance with coming​​​‌ regulations

National and European regulations (EU AI Act) classify AI systems according to their risks. The majority of obligations fall on providers of high-risk AI systems. All providers must also conduct model evaluations, adversarial testing, track and report serious incidents, and ensure cybersecurity protections. Yet, even limited-risk AI systems are subject to light transparency obligations. Deployers must ensure that end-users are aware that they are interacting with AI (chatbots and deepfakes).

Beyond self-assessments, governments and the European Commission have recently created entities (INESIA in France, the AI Office in Europe) whose goal is to draft a Code of Practice detailing the technical implementation of the related legal concepts, to assess the compliance of deployed models with respect to this Code through black-box audits, and to investigate present and future systemic risks.

5​ Highlights of the year​‌

5.1 Awards

  • Best Paper Award at SRDS 2025 (Symposium on Reliable Distributed Systems) for "Robust Fingerprinting of Graphs with FING" 7, co-authored by Jade Garcia Bourrée and Erwan Le Merrer.
  • Spotlight paper at ICML 2025 (International Conference on Machine Learning) for "Robust ML Auditing using Prior Knowledge" 4, co-authored by Jade Garcia Bourrée, Augustin Godinot, and Erwan Le Merrer.
  • Teddy Furon received the Prix Innovation Inria-Dassault Systèmes de l'Académie des Sciences.

5.2 Societal impact

Teddy Furon co-founded the startup Label4.ai in January 2025. This company offers means to detect AI-generated content. It is based on the forensic analysis of content (research transfer from CNRS and Univ. Napoli) or on watermarking techniques labelling AI content at the generation step (research transfer from Inria). This covers four modalities: image, video, audio, and text. Label4.ai received the Trophée Start-Up Numérique 2025 award from IMT Starter.

6​​ New results

6.1 Security​​​‌ of machine learning

On‌ the Vulnerability of Retrieval‌​‌ in High Intrinsic Dimensionality​​ Neighborhood

Participant: Teddy Furon​​​‌.

This work investigates‌ the vulnerability of the‌​‌ nearest neighbors search, which​​ is a pivotal tool​​​‌ in pattern analysis, data‌ science, and machine learning.‌​‌ The vulnerability is gauged​​ as the relative amount​​​‌ of perturbation that an‌ attacker needs to add‌​‌ to a dataset point​​ in order to modify​​​‌ its proximity to a‌ given query. The statistical‌​‌ distribution of the relative​​ amount of perturbation is​​​‌ derived from simple assumptions,‌ outlining the key factor‌​‌ that drives its typical​​ values: The higher the​​​‌ intrinsic dimensionality, the more‌ vulnerable is the nearest‌​‌ neighbors search. Experiments on​​ six large-scale datasets validate​​​‌ this model up to‌ some outliers, which are‌​‌ explained as violations of​​ the assumptions. Related publication​​​‌ 3.

Robust Fingerprinting‌ of Graphs with FING‌​‌

Participants: Odysseas Drosis [EPFL]​​, Jade Garcia Bourrée​​​‌, Anne-Marie Kermarrec [EPFL]‌, Erwan Le Merrer‌​‌, Othmane Safsafi [EPFL]​​.

Graphs have become fundamental for carrying invaluable insights into numerous scientific disciplines. Controlling whether they are further shared and modified is essential when sharing such graphs. This control is typically achieved using digital watermarking, by embedding identification information in the graph structure. In this work, we propose the first approach to fingerprinting graphs, by associating with these graphs a characteristic signature that can be extracted later as proof of ownership. This work provides the same guarantees as watermarking while avoiding the need to modify the graph, instead exporting the fingerprint to an external timestamped database. We present the novel fingerprinting scheme FING. FING relies on the Factor-r Sum Subsets problem to create a digital fingerprint. This problem is NP-hard, so the fingerprint is easy to create and extract for the graph originator while being intractable for an attacker. We provide an analysis of the robustness of FING facing a wide range of attacks that aim at removing or extracting the fingerprint. Finally, we empirically show FING's scalability: a fingerprint can be created in around four minutes on a single core for 10-million-node graphs, and it is robust against attacks removing thousands of edges, for instance. Related publication 7.

Queries, Representation &‌​‌ Detection: The Next 100​​ Model Fingerprinting Schemes

Participants:​​​‌ Augustin Godinot, Erwan‌ Le Merrer, Camilla‌​‌ Penzo [PEReN], François​​ Taïani [Inria WIDE],​​​‌ Gilles Trédan [CNRS LAAS]‌.

The deployment of‌​‌ machine learning models in​​​‌ operational contexts represents a​ significant investment for any​‌ organisation. Consequently, the risk​​ of these models being​​​‌ misappropriated by competitors needs​ to be addressed. In​‌ recent years, numerous proposals​​ have been put forth​​​‌ to detect instances of​ model stealing. However, these​‌ proposals operate under implicit​​ and disparate data and​​​‌ model access assumptions; as​ a consequence, it remains​‌ unclear how they can​​ be effectively compared to​​​‌ one another. Our evaluation​ shows that a simple​‌ baseline that we introduce​​ performs on par with​​​‌ existing state-of-the-art fingerprints, which,​ on the other hand,​‌ are much more complex.​​ To uncover the reasons​​​‌ behind this intriguing result,​ this work introduces a​‌ systematic approach to both​​ the creation of model​​​‌ fingerprinting schemes and their​ evaluation benchmarks. By dividing​‌ model fingerprinting into three​​ core components – Query,​​​‌ Representation and Detection (QuRD)​ – we are able​‌ to identify around 100​​ previously unexplored QuRD combinations​​​‌ and gain insights into​ their performance. Finally, we​‌ introduce a set of​​ metrics to compare and​​​‌ guide the creation of​ more representative model stealing​‌ detection benchmarks. Our approach​​ reveals the need for​​​‌ more challenging benchmarks and​ a sound comparison with​‌ baselines. To foster the​​ creation of new fingerprinting​​​‌ schemes and benchmarks, we​ open-source our fingerprinting toolbox.​‌ Related publication 10.​​
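
To fix ideas, a toy decomposition along the three QuRD components (the queries, representations and detectors studied in the paper are of course richer; the function names and the agreement threshold below are placeholders):

    import numpy as np

    def make_queries(reference_data, n=100, seed=0):
        # Query: choose the inputs sent to the suspect model (here a random
        # subset of reference data; actual schemes craft these carefully).
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(reference_data), size=n, replace=False)
        return reference_data[idx]

    def represent(model, queries):
        # Representation: encode the model's behaviour on the queries
        # (here, simply its predicted labels).
        return np.array([model(x) for x in queries])

    def detect(victim_repr, suspect_repr, threshold=0.9):
        # Detection: flag the suspect as a potential copy of the victim when
        # the two representations agree on more than `threshold` of the queries.
        return float(np.mean(victim_repr == suspect_repr)) > threshold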

BAIT: A new DNN​​​‌ backdoor attack using inpainted​ triggers

Participants: Quentin Le​‌ Roux, Yannick Teglia​​ [THALES], Eric Bourbao​​​‌ [THALES], Philippe Loubet-Moundi​ [THALES], Teddy Furon​‌.

Backdoor attacks compromise​​ deep neural networks by​​​‌ injecting them with covert,​ malicious behaviors during training,​‌ which attackers can later​​ activate at test-time. As​​​‌ backdoors become more sophisticated,​ defenses struggle to catch​‌ up. This paper introduces​​ a simple yet effective​​​‌ Backdoor Attack using Inpainting​ as a Trigger, dubbed​‌ BAIT. The attack's trigger​​ relies on a randomly-drawn​​​‌ polygonal patch, filled via​ inpainting with an off-the-shelf​‌ generative adversarial network. Using​​ BAIT, we show that​​​‌ several defenses, including common​ test-time input purification methods,​‌ can be bypassed by​​ patch-based backdoors. To counter​​​‌ this, we propose four​ targeted defense strategies. Related​‌ publication 14.

Survivability​​ of Backdoor Attacks on​​​‌ Unconstrained Face Recognition Systems​

Participants: Quentin Le Roux​‌, Yannick Teglia [THALES]​​, Teddy Furon,​​​‌ Eric Bourbao [THALES],​ Philippe Loubet-Moundi [THALES].​‌

The widespread use of​​ deep learning face recognition​​​‌ raises several security concerns.​ Although prior works point​‌ at existing vulnerabilities, DNN​​ backdoor attacks against real-life,​​​‌ unconstrained systems dealing with​ images captured in the​‌ wild remain a blind​​ spot of the literature.​​​‌ This work conducts the​ first system-level study of​‌ backdoors in deep learning-based​​ face recognition systems. This​​​‌ work yields four contributions​ by exploring the feasibility​‌ of DNN backdoors on​​ these pipelines in a​​​‌ holistic fashion. We demonstrate​ for the first time​‌ two backdoor attacks on​​ the face detection task:​​​‌ face generation and face​ landmark shift attacks. We​‌ then show that face​​ feature extractors trained with​​​‌ large margin losses also​ fall victim to backdoor​‌ attacks. Combining our models,​​ we then show using​​ 20 possible pipeline configurations​​​‌ and 15 attack cases‌ that a single backdoor‌​‌ enables an attacker to​​ bypass a system's entire​​​‌ function. Finally, we provide‌ stakeholders with several best‌​‌ practices and countermeasures. Related​​ publication 20.

Task-Agnostic​​​‌ Attacks Against Vision Foundation‌ Models

Participants: Brian Pufler‌​‌ [Univ. Geneva], Yury​​ Belousov [Univ. Geneva],​​​‌ Vitaliy Kinakh [Univ. Geneva]‌, Teddy Furon,‌​‌ Slava Voloshynovskiy [Univ. Geneva]​​.

The study of security in machine learning mainly focuses on downstream task-specific attacks, where the adversarial example is obtained by optimizing a loss function specific to the downstream task. At the same time, it has become standard practice for machine learning practitioners to adopt publicly available pre-trained vision foundation models, effectively sharing a common backbone architecture across a multitude of applications such as classification, segmentation, depth estimation, retrieval, question answering and more. The study of attacks on such foundation models and their impact on multiple downstream tasks remains vastly unexplored. This work proposes a general framework that forges task-agnostic adversarial examples by maximally disrupting the feature representation obtained with foundation models. We extensively evaluate the security of the feature representations obtained by popular vision foundation models by measuring the impact of this attack on multiple downstream tasks and its transferability between models. Related publication 17.

Multi-modal​​​‌ Identity Extraction

Participants: Ryan‌ Webster, Teddy Furon‌​‌.

The success of​​ multi-modal foundational models is​​​‌ partly attributed to their‌ diverse, billions-scale training data.‌​‌ By nature, web data​​ contains human faces and​​​‌ descriptions of individuals. Thus,‌ these models pose potentially‌​‌ widespread privacy issues. Recently,​​ identity membership inference attacks​​​‌ (IMIAs) against the CLIP‌ model showed that membership‌​‌ of an individual's name​​ and image within training​​​‌ data can be reliably‌ inferred.

This work formalizes‌​‌ the problem of identity​​ extraction, wherein an attacker​​​‌ can reliably extract the‌ names of individuals given‌​‌ their images only. We​​ provide the following contributions​​​‌ (i) we adapt a‌ previous IMIA to the‌​‌ problem of selecting the​​ correct name among a​​​‌ large set and show‌ that the method scales‌​‌ to millions of names​​ (ii) we design an​​​‌ attack that outperforms the‌ adapted baseline (iii) we‌​‌ show that an attacker​​ can extract names via​​​‌ optimization only. To demonstrate‌ the interest of our‌​‌ framework, we show how​​ identity extraction can be​​​‌ used to audit model‌ privacy. Indeed, a family‌​‌ of prominent models that​​ advertise blurring faces before​​​‌ training to protect privacy‌ is still highly vulnerable‌​‌ to attack. Related publication​​ 16.

Improving Unlearning​​​‌ with Model Updates Probably‌ Aligned with Gradients

Participants:‌​‌ Virgile Dine, Teddy​​ Furon, Charly Faure​​​‌.

We formulate the machine unlearning problem as a general constrained optimization problem. It unifies the first-order methods from the approximate machine unlearning literature. This work then introduces the concept of feasible updates as the model's parameter update directions that help with unlearning while not degrading the utility of the initial model. Our design of feasible updates is based on masking, i.e. a careful selection of the model's parameters worth updating. It also takes into account the estimation noise of the gradients when processing each batch of data, to offer a statistical guarantee when deriving locally feasible updates. The technique can be plugged in, as an add-on, to any first-order approximate unlearning method. Experiments with computer vision classifiers validate this approach. Related publication 6.

6.2 Audit of​ black-box AIs

Robust ML​‌ Auditing using Prior Knowledge​​

Participants: Jade Garcia Bourrée​​​‌, Augustin Godinot,​ Sayan Biswas [EPFL],​‌ Anne-Marie Kermarrec [EPFL],​​ Erwan Le Merrer,​​​‌ Gilles Trédan [CNRS LAAS]​, Martijn de Vos​‌ [EPFL], Milos Vujasinovic​​ [EPFL].

Among the​​​‌ many technical challenges to​ enforcing AI regulations, one​‌ crucial yet underexplored problem​​ is the risk of​​​‌ audit manipulation. This manipulation​ occurs when a platform​‌ deliberately alters its answers​​ to a regulator to​​​‌ pass an audit without​ modifying its answers to​‌ other users. In this​​ work, we introduce a​​​‌ novel approach to manipulation-proof​ auditing by taking into​‌ account the auditor's prior​​ knowledge of the task​​​‌ solved by the platform.​ We first demonstrate that​‌ regulators must not rely​​ on public priors (e.g.​​​‌ a public dataset), as​ platforms could easily fool​‌ the auditor in such​​ cases. We then formally​​​‌ establish the conditions under​ which an auditor can​‌ prevent audit manipulations using​​ prior knowledge about the​​​‌ ground truth. Finally, our​ experiments with two standard​‌ datasets illustrate the maximum​​ level of unfairness a​​​‌ platform can hide before​ being detected as malicious.​‌ Our formalization and generalization​​ of manipulation-proof auditing with​​​‌ a prior opens up​ new research directions for​‌ more robust fairness audits.​​ Related publication 4.​​​‌

P2NIA: Privacy-Preserving Non-Iterative Auditing​

Participants: Jade Garcia Bourrée​‌, Hadrien Lautraite [Univ.​​ Québec], Sébastien Gambs​​​‌ [Univ. Québec], Gilles​ Trédan [CNRS LAAS],​‌ Erwan Le Merrer,​​ Benoît Rottembourg [Inria Bordeaux]​​​‌.

The emergence of​ AI legislation has increased​‌ the need to assess​​ the ethical compliance of​​​‌ high-risk AI systems. Traditional​ auditing methods rely on​‌ platforms' application programming interfaces​​ (APIs), where responses to​​​‌ queries are examined through​ the lens of fairness​‌ requirements. However, such approaches​​ put a significant burden​​​‌ on platforms, as they​ are forced to maintain​‌ APIs while ensuring privacy,​​ facing the possibility of​​​‌ data leaks. This lack​ of proper collaboration between​‌ the two parties, in​​ turn, causes a significant​​​‌ challenge to the auditor,​ who is subject to​‌ estimation bias as they​​ are unaware of the​​​‌ data distribution of the​ platform. To address these​‌ two issues, we present​​ P2NIA, a novel auditing​​​‌ scheme that proposes a​ mutually beneficial collaboration for​‌ both the auditor and​​ the platform. Extensive experiments​​​‌ demonstrate P2NIA's effectiveness in​ addressing both issues. In​‌ summary, our work introduces​​ a privacy-preserving and non-iterative​​​‌ audit scheme that enhances​ fairness assessments using synthetic​‌ or local data, avoiding​​ the challenges associated with​​​‌ traditional API-based audits. Related​ publication 9.

6.3​‌ Threats from generative models​​

Reframing image difference captioning​​ with BLIP2IDC and synthetic​​​‌ augmentation

Participants: Gautier Evennou‌, Antoine Chaffin [LightOn]‌​‌, Vivien Chappelier [Imatag]​​, Ewa Kijak.​​​‌

The rise in generative model quality over the past years has enabled the generation of edited variations of images at a large scale. To counter the harmful effects of such technology, the Image Difference Captioning (IDC) task aims to describe the differences between two images. While this task is successfully handled for simple 3D rendered images, it struggles on real-world images. The reason is twofold: training data scarcity, and the difficulty of capturing fine-grained differences between complex images. To address those issues, we propose a simple yet effective framework to both adapt existing image captioning models to the IDC task and augment IDC datasets. We introduce BLIP2IDC, an adaptation of BLIP2 to the IDC task at low computational cost, and show it outperforms two-stream approaches by a significant margin on real-world IDC datasets. We also propose to use synthetic augmentation to improve the performance of IDC models in an agnostic fashion. We show that our synthetic augmentation strategy provides high-quality data, leading to a challenging new dataset well-suited for IDC named Syned. Related publication 8.

Evaluating the security‌ of public surrogate watermark‌​‌ detectors

Participants: Chloé Imadache​​, Eva Giboulot,​​​‌ Teddy Furon.

The omnipresence of generated content has led to an increasing need for multimedia content traceability. Watermarking techniques have been proven to provide both detection guarantees and robustness. However, widespread use of such methods would require disclosing the watermark detector to the public. Such access breaches the watermark security: end users with unlimited access to the detector could easily craft adversarial examples through white-box and black-box attacks.

To circumvent this issue, we suggest providing the public with a surrogate, less accurate detector. Calls to the private detector would be reserved for important or anomalous cases. This paper studies the potential leakage of information from the surrogate detector. We first create a wide panel of images adversarial to the surrogate detector. The efficiency of the private detector is then assessed on this data. This allows us to introduce a metric of the transferability of these attacks from the surrogate to the private detector. Through this metric, we evaluate the security of different designs of surrogate detectors. Related publication 13.
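A minimal sketch of this kind of transferability measurement (function names and the decision threshold are ours, for illustration): craft images adversarial to the public surrogate, then count how many also evade the private detector.

import numpy as np

def transferability(adv_images, surrogate, private, threshold=0.5):
    # adv_images: iterable of images already crafted against the surrogate.
    # surrogate, private: callables returning a watermark-detection score.
    fools_surrogate = np.array([surrogate(x) < threshold for x in adv_images])
    fools_private = np.array([private(x) < threshold for x in adv_images])
    # Fraction of surrogate-evading images that also evade the private detector.
    return (fools_surrogate & fools_private).sum() / max(fools_surrogate.sum(), 1)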

Watermark​​ anything with localized messages​​​‌

Participants: Tom Sander [Meta‌ FAIR], Pierre Fernandez‌​‌, Alain Durmus [CMAP​​ Ecole Polytechnique], Teddy​​​‌ Furon, Matthijs Douze‌ [Meta FAIR].

Image watermarking methods are not tailored to handle small watermarked areas. This restricts applications in real-world scenarios where parts of the image may come from different sources or have been edited. We introduce a deep-learning model for localized image watermarking, dubbed the Watermark Anything Model (WAM). The WAM embedder imperceptibly modifies the input image, while the extractor segments the received image into watermarked and non-watermarked areas and recovers one or several hidden messages from the areas found to be watermarked. The models are jointly trained at low resolution and without perceptual constraints, then post-trained for imperceptibility and multiple watermarks. Experiments show that WAM is competitive with state-of-the-art methods in terms of imperceptibility and robustness, especially against inpainting and splicing, even on high-resolution images. Moreover, it offers new capabilities: WAM can locate watermarked areas in spliced images and extract distinct 32-bit messages with less than 1 bit error from multiple small regions (no larger than 10% of the image surface), even for small 256 × 256 images. Related publication 15.
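For illustration only, and assuming a simplified output format (a binary watermark mask plus per-pixel bit probabilities, which is our rendering rather than WAM's exact interface), localized extraction can be sketched as a per-region majority vote:

import numpy as np
from scipy import ndimage

def messages_per_region(mask, bit_probs):
    # mask: (H, W) boolean map of pixels predicted as watermarked.
    # bit_probs: (H, W, 32) per-pixel probabilities for each message bit.
    labels, n_regions = ndimage.label(mask)
    messages = []
    for r in range(1, n_regions + 1):
        region = labels == r
        # Average bit probabilities over the connected region, then threshold at 0.5.
        bits = (bit_probs[region].mean(axis=0) > 0.5).astype(int)
        messages.append(bits)
    return messages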

6.4 Miscellaneous​‌

Algorithmic curation of News​​ on YouTube: Evidence from​​​‌ the 2022 French Presidential​ Campaign

Participants: Julien Figeac​‌ [CNRS], Erwan Le​​ Merrer, Marie Neihouser​​​‌ [Université Paris 1 Panthéon-Sorbonne]​, Gilles Trédan [CNRS​‌ LAAS].

Debate is​​ growing over how algorithmic​​​‌ recommendations influence news visibility,​ particularly regarding interest and​‌ ideological bias. This work​​ examines YouTube's News Recommender​​​‌ System (NRS), focusing on​ the French "News" section​‌ of the homepage through​​ a large-scale audit using​​​‌ automated browsing agents. Our​ findings show that the​‌ NRS prioritises platform-native creators​​ over established news outlets,​​​‌ favouring content that aligns​ with YouTube's features rather​‌ than traditional editorial standards.​​ Politically charged, opinion-based videos,​​​‌ especially those from prominent​ figures affiliated with extreme​‌ political parties, receive ongoing​​ algorithmic promotion. While centrist​​​‌ and moderate political figures​ remain underrepresented, the algorithm​‌ boosts their visibility once​​ users interact with this​​​‌ type of content. This​ two-part mechanism, which amplifies​‌ already prominent content and​​ appears to overcompensate for​​​‌ rare content, does not​ just reflect engagement-based optimisation,​‌ but is driven by​​ the algorithm's tendency to​​​‌ maintain coherence in a​ highly unbalanced content landscape.​‌ However, this compensatory logic​​ does not counteract the​​​‌ algorithm's broader tendency to​ promote content from radical​‌ political figures while marginalising​​ institutional news outlets. Through​​​‌ this process, the NRS​ actively reshapes news exposure​‌ within the "News" section,​​ privileging political expression over​​​‌ journalistic authority and reproducing​ structural hierarchies of visibility.​‌ Related publication 19.​​

Fast & Fourier: spectral​​​‌ graph watermarking

Participants: Jade​ Garcia Bourrée, Anne-Marie​‌ Kermarrec [EPFL], Erwan​​ Le Merrer, Othmane​​​‌ Safsafi [EPFL].

We address the problem of watermarking graph objects, which consists of hiding information within them to prove their origin. The two existing methods to watermark graphs use subgraph matching or graph isomorphism techniques, which are known to be intractable for large graphs. To reduce the operational complexity, we propose FFG, a new graph watermarking scheme adapted from an image watermarking scheme, since graphs and images can both be represented as matrices. We analyze and compare FFG, whose novelty lies in embedding the watermark in the Fourier transform of the adjacency matrix of a graph. Our technique enjoys a much lower complexity than that of related works (i.e. in O(N² log N)), while performing better or at least as well as the two state-of-the-art methods. Related publication 5.
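A minimal sketch of the spectral embedding idea (illustrative parameters only, not the exact FFG construction): perturb a key-dependent set of Fourier coefficients of the adjacency matrix and binarize the inverse transform; the dominant cost is a 2D FFT, i.e. O(N² log N).

import numpy as np

def embed_spectral_watermark(adj, key=0, strength=2.0, n_coeffs=64):
    # adj: (N, N) binary adjacency matrix of the graph to watermark.
    rng = np.random.default_rng(key)
    spectrum = np.fft.fft2(adj.astype(float))          # O(N^2 log N)
    n = adj.shape[0]
    # Pseudo-random coefficient positions and signs derived from the key.
    idx = rng.integers(1, n // 2, size=(n_coeffs, 2))
    signs = rng.choice([-1.0, 1.0], size=n_coeffs)
    for (i, j), s in zip(idx, signs):
        spectrum[i, j] += s * strength
        spectrum[-i, -j] += s * strength               # mirror coefficient keeps the inverse real
    watermarked = np.real(np.fft.ifft2(spectrum))
    return (watermarked > 0.5).astype(int)             # back to a binary adjacency matrix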

Adapting Without​​​‌ Seeing: Text-Aided Domain Adaptation‌ for Adapting CLIP-like Models‌​‌ to Novel Domains

Participants:​​ Louis Hemadou, Heléna​​​‌ Vorobieva [Safran Tech],‌ Ewa Kijak, Frédéric‌​‌ Jurie [CNRS GREYC].​​

This work addresses the​​​‌ challenge of adapting large‌ vision models, such as‌​‌ CLIP, to domain shifts​​ in image classification tasks.​​​‌ While these models, pre-trained‌ on vast datasets like‌​‌ LAION 2B, offer powerful​​ visual representations, they may​​​‌ struggle when applied to‌ domains significantly different from‌​‌ their training data, such​​ as industrial applications. We​​​‌ introduce TADA, a Text-Aided‌ Domain Adaptation method that‌​‌ adapts the visual representations​​ of these models to​​​‌ new domains without requiring‌ target domain images. TADA‌​‌ leverages verbal descriptions of​​ the domain shift to​​​‌ capture the differences between‌ the pre-training and target‌​‌ domains. Our method integrates​​ seamlessly with fine-tuning strategies,​​​‌ including prompt learning methods.‌ We demonstrate TADA's effectiveness‌​‌ in improving the performance​​ of large vision models​​​‌ on domain-shifted data, achieving‌ state-of-the-art results on benchmarks‌​‌ like DomainNet. Related publication​​ 11.
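The sketch below only illustrates the general idea under strong simplifications (it is not TADA's training objective): verbal descriptions of the source and target domains are encoded with the model's text encoder, and image features are translated along the resulting shift direction. encode_text and alpha are placeholders we introduce for the example.

import numpy as np

def text_guided_shift(image_feats, encode_text, src_desc, tgt_desc, alpha=1.0):
    # image_feats: (N, D) L2-normalized image features from the source domain.
    # encode_text: callable mapping a string to an L2-normalized (D,) text feature.
    shift = encode_text(tgt_desc) - encode_text(src_desc)
    adapted = image_feats + alpha * shift              # move features toward the target domain
    return adapted / np.linalg.norm(adapted, axis=1, keepdims=True)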

Cross-task knowledge​​​‌ distillation for few-shot detection.‌

Participants: Louis Hemadou,‌​‌ Heléna Vorobieva [Safran Tech]​​, Ahmed Nasreddine Benaichouche​​​‌ [Safran Tech], Frédéric‌ Jurie [CNRS GREYC],‌​‌ Ewa Kijak.

While​​ powerful pretrained visual encoders​​​‌ have advanced many vision‌ tasks, their knowledge is‌​‌ not fully leveraged by​​ object detectors, especially in​​​‌ few-shot settings. A key‌ challenge in transferring this‌​‌ knowledge via cross-task distillation​​ is the semantic mismatch​​​‌ between outputs: classifiers produce‌ clean probability distributions, while‌​‌ detector scores implicitly encode​​ both class and objectness.​​​‌ To address this, we‌ propose a lightweight fine-tuning‌​‌ strategy guided by a​​ novel, correlation-based distillation loss.​​​‌ This loss aligns the‌ detector's relative class preferences‌​‌ with those of a​​ strong image classifier, effectively​​​‌ decoupling the learning of‌ class semantics from objectness.‌​‌ Applied to a state-of-the-art​​ detector, our method consistently​​​‌ improves performance in a‌ low-data regime, demonstrating an‌​‌ effective way to bridge​​ the gap between powerful​​​‌ classifiers and object detectors.‌ Related publication 12.‌​‌
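A hedged sketch of a correlation-style distillation term (our rendering of the idea, not the exact loss of the publication): it rewards agreement between the detector's relative class preferences and the classifier's, independently of the objectness scale.

import torch

def correlation_distillation_loss(detector_logits, classifier_probs):
    # detector_logits, classifier_probs: (num_boxes, num_classes).
    d = detector_logits - detector_logits.mean(dim=1, keepdim=True)
    c = classifier_probs - classifier_probs.mean(dim=1, keepdim=True)
    # Pearson correlation across classes for each box, averaged over boxes.
    corr = (d * c).sum(dim=1) / (d.norm(dim=1) * c.norm(dim=1) + 1e-8)
    return 1.0 - corr.mean()   # high correlation -> low loss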

7 Bilateral contracts and​​ grants with industry

7.1​​​‌ Bilateral contracts with industry‌

CIFRE PHD: Certification of‌​‌ Deep Neural Networks

Participants:​​ Quentin Le Roux,​​​‌ Teddy Furon.

Duration:‌ 3 years, ended in‌​‌ November 2025, Partner: THALES​​

This is a CIFRE​​​‌ PhD thesis project aiming‌ at assessing the security‌​‌ of already trained Deep​​ Neural Networks, especially in​​​‌ the context of face‌ recognition.
CIFRE PHD: Watermarking‌​‌ Generative AIs

Participants: Pierre​​ Fernandez, Teddy Furon​​​‌.

Duration: 3 years,‌ ended in February 2025,‌​‌ Partner: Meta FAIR

This​​ is a CIFRE PhD​​​‌ thesis project aiming at‌ designing watermarking techniques dedicated‌​‌ to generative AIs (text​​ to image, text to​​​‌ speech, LLM).
CIFRE PHD:‌ Domain generalization exploiting synthetic‌​‌ data

Participants: Louis Hemadou​​, Ewa Kijak.​​​‌

Duration: 3 years, defended‌ in December 2025, Partner:‌​‌ Safran Tech

This is a CIFRE PhD thesis project aiming at exploiting synthetic data to perform transfer learning in the presence of very little or no real data, in the context of image detection or classification tasks.
CIFRE​ PHD: Detection and explanation​‌ of semantic manipulations in​​ multimedia content

Participants: Gautier​​​‌ Evennou, Ewa Kijak​.

Duration: 3 years,​‌ started in September 2023,​​ Partner: Imatag

This is​​​‌ a CIFRE PhD thesis​ project aiming at detecting​‌ and explaining semantic manipulations​​ in multimedia content, in​​​‌ the context of misinformation.​

8 Partnerships and cooperations​‌

8.1 International initiatives

8.1.1​​ STIC/MATH/CLIMAT AmSud projects

Isabela​​​‌ Borlido Barcelos
  • Status
    PhD​
  • Institution of origin:
    Pontifical​‌ Catholic University of Minas​​ Gerais
  • Country:
    Brazil
  • Dates:​​​‌
    January to December 2025​
  • Project:
    STIC-AMSUD GiMMD (Graph-based​‌ Analysis and Understanding of​​ Image, Video and Multimedia​​​‌ Data)
  • Partners:
    Universidad de​ la República de Uruguay,​‌ Pontifical Catholic University of​​ Minas Gerais, Inria
  • Context​​​‌ of the visit:
    Feature​ learning from image markers​‌ by using graph analysis​​
  • Mobility program/type of mobility:​​​‌
    research stay

8.2 International​ research visitors

8.2.1 Visits​‌ of international scientists

Other​​ international visits to the​​​‌ team
Martijn de Vos,​ Sayan Biswas, Milos Vujasinovic​‌
  • Status
    PhD and post-Docs​​
  • Institution of origin:
    EPFL,​​​‌ SACS team
  • Country:
    Switzerland​
  • Dates:
    13-17 January 2025
  • Context of the visit:​​
    preparation of a submission​​​‌ for ICML 2025
  • Mobility​ program/type of mobility:
    research​‌ stay

8.2.2 Visits to​​ international teams

Research stays​​​‌ abroad
Augustin Godinot
  • Visited​ institution:
    Vector Institute for​‌ Artificial Intelligence and University​​ of Toronto
  • Country:
    Canada​​​‌
  • Dates:
    6 months from​ March to September 2025​‌
  • Context of the visit:​​
    Visit to Nicolas Papernot,​​​‌ Chaire Inria International
  • Mobility​ program/type of mobility:
    Internship​‌

8.3 National initiatives

PEPR​​ Cybersécurité projet COMPROMIS

Participants:​​​‌ Teddy Furon, Eva​ Giboulot, Ewa Kijak​‌, Enoal Gesny,​​ Chloé Imadache, Ryan​​​‌ Webster, Paul Chaurand​.

Duration: 4.5 years,​‌ started April 2024

The COMPROMIS project is based on a modern vision of multimedia data protection, with deep learning at its heart. The project defends the idea that the protection of multimedia data must necessarily be associated with the security of the tools that analyse this data, which nowadays means Artificial Intelligence (AI). The observation is simple: the protection of multimedia data is undoubtedly the area of cybersecurity that has benefited most from AI, but it has neglected to check the level of security of this new tool. AI has become one of the weak links in the protection of multimedia data. The scientific hurdles thus concern both the classic applications of multimedia data protection and the emerging field of deep learning.
DGA-Inria collaboration:​ Machine Unlearning

Participants: Virgile​‌ Dine, Teddy Furon​​, Charly Faure [AMIAD]​​​‌.

Duration: 3 years,​ started in October 2024,​‌ Partner: AMIAD

The project aims at developing algorithms to make computers unlearn. From a model trained over a training dataset, we aim at deriving a second model that ignores some training samples, or some classes of samples, without retraining it from scratch.
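As a point of reference, a common approximate-unlearning baseline alternates gradient ascent on the forget set with ordinary training on the retained data; the sketch below (illustrative hyper-parameters, not the algorithms developed in the project) shows this baseline.

import torch
from itertools import cycle

def unlearn(model, forget_loader, retain_loader, lr=1e-4, steps=100):
    # Gradient ascent on the forget set, gradient descent on the retain set.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    forget_it, retain_it = cycle(forget_loader), cycle(retain_loader)
    for _ in range(steps):
        xf, yf = next(forget_it)
        xr, yr = next(retain_it)
        opt.zero_grad()
        # Push the loss up on samples to forget, keep it low on the rest.
        loss = -loss_fn(model(xf), yf) + loss_fn(model(xr), yr)
        loss.backward()
        opt.step()
    return model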
MinArm-Inria collaboration: EVE4​

Participants: Eva Giboulot,​‌ Teddy Furon.

Duration:​​ 3 years, ended in​​​‌ April 2025. Partners: MinArm,​ CRIStAL Lille, LIRMM, Univ.​‌ Tech. Troyes, Univ. Paris​​ Saclay

Teaching and technology​​ survey on steganography and​​​‌ steganalysis in the real‌ world.
MinArm-Inria collaboration: EVE5‌​‌

Participants: Eva Giboulot.​​

Duration: 18 months, started​​​‌ in December 2025. Partners:‌ MinArm, CRIStAL Lille, GREYC‌​‌

Teaching and technology survey​​ on steganography, steganalysis, watermarking​​​‌ and forensics analysis of‌ multimedia content in the‌​‌ real world.
ANR PACMAM​​ (ANR-24-CE23-7787)

Participants: Erwan Le​​​‌ Merrer, Timothée Chauvin‌.

Duration: 42 months,‌​‌ started in 2024. Partners:​​ PEReN, LAAS-CNRS

The PACMAM project seeks to increase the transparency of algorithmic decisions by laying the foundations for efficient black-box auditing of large-capacity models under budget constraints. The project will focus on active learning strategies for auditing that have recently been introduced in the literature, yet whose applicability to concrete cases remains uncertain. The proposed research is organized in three work packages (WPs), each of which addresses a fundamental challenge in this research area. WP1 aims to understand how audit efficiency is affected by a model's capacity, leveraging measures such as VC dimension and Rademacher complexity. This information will help auditors strike a balance between query budget and accuracy. Building on WP1, WP2 focuses on making active auditing practical for large-capacity models by identifying efficient ways to select optimal inputs and determining what auditors need to know about audited models to succeed. Finally, WP3 explores how models that are frequently updated can be monitored efficiently. The goal is to reduce the query budget needed to continuously monitor an evolving model. Overall, PACMAM will thus provide the foundation for the efficient auditing of evolving high-capacity models. The project will ensure that any developed solution is implemented rapidly thanks to the involvement of PEReN, the French government's department in charge of algorithmic regulation. More details are available at Website.
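For illustration, one simple instance of the active strategies studied in WP2 selects the next audit queries where a surrogate of the audited model is most uncertain; the sketch below (ours, with an entropy criterion) is only meant to fix ideas.

import numpy as np

def select_audit_queries(pool_probs, budget):
    # pool_probs: (N, num_classes) surrogate predictions on a pool of candidate queries.
    entropy = -(pool_probs * np.log(pool_probs + 1e-12)).sum(axis=1)
    return np.argsort(entropy)[-budget:]   # indices of the most uncertain inputs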

8.4 Public‌​‌ policy support

COFRA-funded Ph.D.​​ thesis with PEReN

Participants:​​​‌ Gurvan Richardeau, Erwan‌ Le Merrer.

Duration:‌​‌ 3 years, started in​​ 2024.

PEReN, the Center of Expertise for Digital Platform Regulation, is an interministerial office with national competence placed under the joint authority of the ministers responsible for the economy, culture and digital affairs. This collaboration deals with the audit of LLMs, especially the fingerprinting of the models. It amounts to identifying the model behind a black box with the minimum number of interactions.
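A naive illustration of black-box model identification (ours; the methods studied in this thesis are designed to be far more query-efficient): query the black box on a few discriminative prompts and return the candidate model whose answers match best.

def fingerprint(black_box, candidates, prompts):
    # black_box: callable prompt -> answer; candidates: dict name -> callable.
    observed = [black_box(p) for p in prompts]
    scores = {}
    for name, model in candidates.items():
        # Count exact answer matches; the best-matching candidate is the guess.
        scores[name] = sum(model(p) == o for p, o in zip(prompts, observed))
    return max(scores, key=scores.get)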
EU​​ Artificial Intelligence Act

Participants:​​​‌ Teddy Furon.

Teddy Furon participates in the Transparency Working Group in charge of publishing the Code of Practice related to Art. 50 of the EU AI Act, under the supervision of the EU AI Office. This working group gathers industry representatives, academics, and NGOs. It deals with the future obligation for generative AI providers in the EU to offer means to mark and detect AI-generated content. It specifies the technical solutions and the way to audit them.

Expertise for the Department‌ of Justice

Participants: Erwan‌​‌ Le Merrer.

Erwan Le Merrer is a technical expert for the Department of Justice on ongoing investigations that cannot be publicly disclosed.

9 Dissemination

9.1 Promoting​‌ scientific activities

9.1.1 Scientific​​ events: organisation

General chair,​​​‌ scientific chair
  • Teddy Furon​ is the general chair​‌ of ESSAI, European Symposium​​ on Security of Artificial​​​‌ Intelligence.
Member of the​ organizing committees
  • Teddy Furon​‌ was a member of​​ the organizing committee of​​​‌ the workshop on GenAI​ watermarking at ICLR 2025.​‌
  • Eva Giboulot co-organized the French workshop Detection of AI generated content (GdR IASIS & GdR SI).
  • Erwan Le Merrer was a member of the scientific committee for the organization of the Inria/DFKI IDESSAI European Summer School on AI.

9.1.2 Scientific​​​‌ events: selection

Member of​ the conference program committees​‌
  • Ewa Kijak is a member of the steering committee of the IEEE International Conference on Content-Based Multimedia Indexing (CBMI).
Reviewer
  • Teddy​​ Furon was a reviewer​​​‌ for ICLR 2026, NeurIPS​ 2025, ICCV 2025, ICML​‌ 2025, IEEE ICASSP 2025,​​ IEEE WIFS 2025, Workshop​​​‌ on GenAI watermarking at​ ICLR 2025.
  • Erwan Le​‌ Merrer was a reviewer​​ for ECAI 2025, AAAI​​​‌ 2026, ECML/PKDD 2026.
  • Ewa​ Kijak was a reviewer​‌ for CBMI 2025.

9.1.3​​ Journal

Reviewer - reviewing​​​‌ activities
  • Teddy Furon was​ a reviewer for IEEE​‌ Transactions on IFS, IEEE​​ Transactions on Multimedia.
  • Ewa​​​‌ Kijak was a guest​ editor for Multimedia Tools​‌ and Applications.
  • Ewa Kijak​​ was a reviewer for​​​‌ Multimedia Tools and Applications.​

9.1.4 Invited talks

  • Teddy Furon participated in the Winter School of the PEPR Cybersécurité.
  • Teddy Furon and Charly Faure were invited by the ComCyber to the panel Enjeux de sécurité des systèmes d’information intégrant de l’IA.
  • Ewa Kijak was an invited speaker at the French workshop Detection of AI generated content (GdR IASIS & GdR SI).
  • Ewa Kijak was an invited speaker at the first edition of the scientific days of INESIA (Institut National de l'Evaluation et de la Sécurité de l'IA).
  • Ewa Kijak was an invited speaker at the Academic Day of the Union of Physics and Chemistry Teachers.
  • Teddy Furon participated in the event De pixels à Perception organized by Campus Innovation – Université de Rennes.

9.1.5 Leadership​‌ within the scientific community​​

  • Erwan Le Merrer is​​​‌ the recipient of a​ chair of the SEQUOIA​‌ cluster.
  • Teddy Furon received the Prix Innovation Inria-Dassault Systèmes de l'Académie des Sciences.
  • Jade Garcia Bourrée​‌ and Erwan Le Merrer​​ received the best paper​​​‌ award at SRDS 2025​ for paper 7.​‌
  • Spotlight paper at ICML 2025 (International Conference on Machine Learning) for "Robust ML Auditing using Prior Knowledge" 4, co-authored by Jade Garcia Bourrée, Augustin Godinot, and Erwan Le Merrer.

9.1.6 Scientific​‌ expertise

  • Teddy Furon is​​ the scientific advisor of​​​‌ the startup Label4.ai.​
  • Erwan Le Merrer is​‌ an expert for the​​ Crédit d’Impôt Recherche funding​​​‌ program at the Direction​ Générale des Finances Publiques​‌.
  • Erwan Le Merrer is an expert for the CIFRE thesis funding program at the Association Nationale de la Recherche et de la Technologie (ANRT).

9.1.7 Research administration​

  • Erwan Le Merrer is the president of the scientific board of the Société Informatique de France.
  • Ewa Kijak is a member of the executive committee of the IA Cluster SequoIA.
  • Teddy Furon was a member of the jury for the Prix de thèse of the GdR Sécurité Informatique.
  • Teddy Furon is​​​‌ the president of the‌ Commission des Délégations du‌​‌ Centre Inria de l'Université​​ de Rennes.
  • Teddy​​​‌ Furon is a member‌ of the Commission du‌​‌ Personnel du Centre Inria​​ de l'Université de Rennes​​​‌ / IRISA.
  • Teddy Furon participates in the coordination of the call for proposals on the research effort related to INESIA, managed by the Inria Agence de programmes.

9.2 Teaching -​​​‌ Supervision - Juries -‌ Educational and pedagogical outreach‌​‌

9.2.1 Teaching

  • Eva Giboulot, Rare Event Simulations, 40h, M2, INSA Rennes
  • Ewa Kijak is head of the Image engineering track (M1-M2) of ESIR, Univ. Rennes
  • Ewa Kijak, Information retrieval and Multimodal applications, 24h, M2, ESIR
  • Ewa Kijak, Deep Learning for Vision, 12h, M2, ESIR
  • Ewa Kijak, Supervised machine learning, 20h, M1R, ENS Rennes
  • Ewa Kijak, Machine learning, 12h, M1, ESIR
  • Ewa Kijak, Image processing, 45h, M1, ESIR, Univ. Rennes

9.2.2‌ Supervision

  • PhD Pierre Fernandez, Watermarking Generative AI, defended January 2025, with Teddy Furon
  • PhD Louis Hemadou, Domain generalization exploiting synthetic data, defended December 2025, with Ewa Kijak
  • PhD Quentin Le Roux, Backdoors in DNN applied to face recognition systems, defended November 2025, with Teddy Furon
  • PhD Jade Garcia Bourrée, Trust but verify: robust statistical auditing of ML black-boxes, defended October 2025, with Erwan Le Merrer
  • PhD in progress: Adele Denis, AI-based automated detection and behavior analysis among piglets. Started September 2024, with Ewa Kijak, Caroline Clouard (INRAE) and Céline Tallet (INRAE)
  • PhD in progress: Virgile Dine, Machine Unlearning. Started October 2024, with Teddy Furon
  • PhD in progress: Enoal Gesny, Watermarking of Generative AI. Started November 2024, with Eva Giboulot and Teddy Furon
  • PhD in progress: Chloé Imadache, Security of Deep Learning based Watermarking. Started December 2024, with Eva Giboulot
  • PhD in progress: Gautier Evennou, Detection and explanation of semantic manipulations in multimedia content. Started in September 2023, with Ewa Kijak
  • PhD in progress: Augustin Godinot, Tools for machine learning audits in the presence of deceptive model providers. Started in 2021, with Erwan Le Merrer
  • PhD in progress: Gurvan Richardeau, Audit of evolutions between LLMs. Started in 2024, with Erwan Le Merrer
  • PhD in progress: Paul Chaurand, Zero-shot AI-manipulated content detection. Started September 2025, with Ewa Kijak
  • PhD in progress: Timothee Chauvin, Auditing LLM-based Agents with Computer Interaction Capabilities. Started in 2024, with Erwan Le Merrer

9.2.3 Juries

  • Teddy​​​‌ Furon was president of‌ the PhD jury of‌​‌ Benoit Coquerel, Univ. Rennes,​​ December 2025
  • Teddy Furon​​​‌ was a reviewer for‌ the HDR of Cédric‌​‌ Gouy-Pailler, CEA, May 2025​​
  • Teddy Furon was a​​​‌ reviewer for the PhD‌ of Mohamed Lansari, Univ.‌​‌ Brest, December 2025
  • Teddy​​​‌ Furon was a reviewer​ for the PhD of​‌ Wassim Bouaziz, Institut Polytechnique​​ de Paris, December 2025​​​‌
  • Teddy Furon was a​ reviewer for the PhD​‌ of Lucas Gnecco Heredia,​​ Université Paris Sciences et​​​‌ Lettres, May 2025
  • Ewa​ Kijak was a reviewer​‌ for the HDR of​​ Camille Guinaudeau, LIMSI, Paris-Saclay​​​‌ University, November 2025
  • Ewa​ Kijak was a reviewer​‌ for the PhD of​​ Theo Gigant, CentraleSupelec, Paris-Saclay​​​‌ University, October 2025
  • Teddy​ Furon was a jury​‌ member for the PhD​​ of Matthieu Serfaty, ENS​​​‌ Paris-Saclay, December 2025
  • Ewa​ Kijak was a jury​‌ member for the HDR​​ of Petra Gomez-Kramer, La​​​‌ Rochelle University, June 2025​
  • Ewa Kijak was a​‌ reviewer for the PhD​​ of Felipe Belem, Gustave​​​‌ Eiffel University, February 2025​
  • Erwan Le Merrer was​‌ an invited member of​​ the PhD defense of​​​‌ Jade Garcia-Bourrée, October 2025​

9.2.4 Educational and pedagogical​‌ outreach

  • Teddy Furon presented "What is it to be a researcher in computer science?" to 5 high school classes (program Chiche!).

9.3​​​‌ Popularization

9.3.1 Specific official​ responsibilities in science outreach​‌ structures

  • Erwan Le Merrer​​ leads the scientific board​​​‌ of the Société Informatique​ de France in 2025.​‌

9.3.2 Productions (articles, videos,​​ podcasts, serious games, ...)​​​‌

  • Erwan Le Merrer published four interviews with the Société Informatique de France this year on the impact of AI on jobs, in the Binaire magazine.
  • We welcomed to our team the author and illustrator Marie Spenale, who drew our story on Instagram and LinkedIn.
  • Teddy Furon participated in the proposal Projet de pôle territorial DEMOCRAT’ICC.

9.3.3 Participation in​​ Live events

  • Ewa Kijak participated as a panelist in the Conference on Generative AI and Disinformation organized by the Sorbonne Center for Artificial Intelligence (SCAI) as part of the AI Action Summit, January 2025.
  • Ewa Kijak participated in the Procès de l'IA as part of the West Data Festival, in Laval, March 2025.
  • Ewa Kijak participated in the Fête de la Science, Rennes, October 2025.
  • Erwan Le Merrer proposed and won funding for a 1artist1scientist event for the 50 years of the IRISA laboratory. An art performance was built in the form of a vending machine embedding an AI, to question the choices of users accessing AI-enhanced services. This work was presented at the Fête de la science in Rennes for 2 days, and then at the laboratory party. An intern from INSA was supervised and participated in the work. Several interviews followed, including by a regional TV; these are available here.
  • Eva Giboulot participated in a round table at the Société des auteurs dans les arts graphiques et plastiques (ADAGP) on the subject of Intelligences artificielles génératives et traçabilité, January 2025.

9.3.4 Others science outreach​‌ relevant activities

  • Erwan Le Merrer was interviewed by the French newspaper Le Télégramme about our research on YouTube (3 articles), February 2025.
  • Erwan Le Merrer was interviewed for the Data Skeptic podcast on “LLMs hallucinate graphs too”, January 2025.
  • Teddy Furon was interviewed​​​‌ by the think tank‌ Villa Numeris.

10‌​‌ Scientific production

10.1 Major​​ publications

10.2 Publications of the​​ year

International journals

International peer-reviewed conferences

Conferences without proceedings​​​‌

  • 17 B. Pufler, Y. Belousov, V. Kinakh, T. Furon and S. Voloshynovskiy. Task-Agnostic Attacks Against Vision Foundation Models. 5th Workshop of Adversarial Machine Learning at CVPR 2025, Nashville, United States, March 2025, 1-18.

Reports & preprints‌

10.3​​​‌ Cited publications

  • 21 K. Wiggers (TechCrunch). Deepfakes for all: Uncensored AI art model prompts ethics questions. TechCrunch, 2022. URL: https://techcrunch.com/2022/08/24/deepfakes-for-all-uncensored-ai-art-model-prompts-ethics-questions
  • 22 M. Abadi, A. Chu, I. Goodfellow, H. B. McMahan, I. Mironov, K. Talwar and L. Zhang. Deep learning with differential privacy. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016, 308-318.
  • 23 M. Barni, K. Kallas and B. Tondi. A new backdoor attack in CNNs by training set corruption without label poisoning. 2019 IEEE International Conference on Image Processing (ICIP), IEEE, 2019, 101-105.
  • 24 B. Biggio, B. Nelson and P. Laskov. Poisoning attacks against support vector machines. Proceedings of the 29th International Conference on Machine Learning, 2012, 1467-1474.
  • 25 A. Blum, K. Ligett and A. Roth. A learning theory approach to noninteractive database privacy. Journal of the ACM (JACM) 60(2), 2013, 1-25.
  • 26 M. Bober-Irizar, I. Shumailov, Y. Zhao, R. Mullins and N. Papernot. Architectural backdoors in neural networks. arXiv preprint arXiv:2206.07840, 2022.
  • 27 J. Garcia Bourrée, E. Le Merrer, G. Tredan and B. Rottembourg. On the relevance of APIs facing fairwashed audits. arXiv preprint arXiv:2305.13883, 2023.
  • 28 T. Brooks, A. Holynski and A. A. Efros. InstructPix2Pix: Learning to Follow Image Editing Instructions. arXiv preprint arXiv:2211.09800, 2023.
  • 29 X. Cao, J. Jia and N. Z. Gong. IPGuard: Protecting intellectual property of deep neural networks via fingerprinting the classification boundary. Proceedings of the 2021 ACM Asia Conference on Computer and Communications Security, 2021, 14-25.
  • 30 N. Carlini. A Complete List of All (arXiv) Adversarial Example Papers. URL: https://nicholas.carlini.com/writing/2019/all-adversarial-example-papers.html
  • 31 ChatGPT. Website: https://chat.openai.com
  • 32 C. Dwork, A. Roth and others. The algorithmic foundations of differential privacy. Foundations and Trends® 9(3-4), 2014, 211-407.
  • 33 G. Evennou, V. Chappelier, E. Kijak and T. Furon. SWIFT: Semantic Watermarking for Image Forgery Thwarting. Proc. of IEEE WIFS, Roma, Italy, December 2024, 1-6.
  • 34 P. Fernandez, A. Sablayrolles, T. Furon, H. Jégou and M. Douze. Tatouage Numérique d'Images dans l'Espace Latent de Réseaux Auto-Supervisés. GRETSI 2022 - Colloque Francophone de Traitement du Signal et des Images, Nancy, France, September 2022, 1-4.
  • 35 P. Fernandez, A. Sablayrolles, T. Furon, H. Jégou and M. Douze. Watermarking Images in Self-Supervised Latent Spaces. ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing, Singapore, May 2022, 1-5.
  • 36 C. Feutry. Two sides of relevant information: anonymized representation through deep learning and predictor monitoring. PhD thesis, Université Paris-Saclay, 2019.
  • 37 E. Gesny, E. Giboulot and T. Furon. When does gradient estimation improve black-box adversarial attacks? Proceedings of IEEE WIFS 2024, Roma, Italy, December 2024, 1-6.
  • 38 E. Giboulot and T. Furon. WaterMax: breaking the LLM watermark detectability-robustness-quality trade-off. 38th Conference on Neural Information Processing Systems (NeurIPS 2024), Vancouver, Canada, December 2024, 1-34.
  • 39 S. Goldwasser, M. P. Kim, V. Vaikuntanathan and O. Zamir. Planting undetectable backdoors in machine learning models. arXiv preprint arXiv:2204.06974, 2022.
  • 40 A. Grivet Sébert, R. Pinot, M. Zuber, C. Gouy-Pailler and R. Sirdey. SPEED: Secure, PrivatE, and Efficient Deep learning. Machine Learning 110(4), 2021, 675-694.
  • 41 S. Gu, D. Chen, J. Bao, F. Wen, B. Zhang, D. Chen, L. Yuan and B. Guo. Vector quantized diffusion model for text-to-image synthesis. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, 10696-10706.
  • 42 I. Han, S. Yang, T. Kwon and J. C. Ye. Highly Personalized Text Embedding for Image Manipulation by Stable Diffusion. arXiv preprint arXiv:2303.08767, 2023.
  • 43 S. Hong, Y. Kaya, I.-V. Modoranu and T. Dumitras. A Panda? No, It's a Sloth: Slowdown Attacks on Adaptive Multi-Exit Neural Network Inference. International Conference on Learning Representations, 2020.
  • 44 M. Huang, S. Jia, M.-C. Chang and S. Lyu. Text-Image De-Contextualization Detection Using Vision-Language Models. 2022, 8967-8971.
  • 45 H. Jia, H. Chen, J. Guan, A. S. Shamsabadi and N. Papernot. A Zest of LIME: Towards Architecture-Independent Model Distances. International Conference on Learning Representations, 2021.
  • 46 E. Le Merrer and G. Trédan. TamperNN: efficient tampering detection of deployed neural nets. 2019 IEEE 30th International Symposium on Software Reliability Engineering (ISSRE), IEEE, 2019, 424-434.
  • 47 E. Le Merrer, P. Perez and G. Trédan. Adversarial frontier stitching for remote neural network watermarking. Neural Computing and Applications 32, 2020, 9233-9244.
  • 48 E. Le Merrer and G. Trédan. LLMs hallucinate graphs too: a structural perspective. Complex Networks, Istanbul, Turkey, Springer, December 2024.
  • 49 E. Le Merrer and G. Trédan. Remote explainability faces the bouncer problem. Nature Machine Intelligence 2(9), 2020, 529-539.
  • 50 Q. Le Roux, K. Kallas and T. Furon. A Double-Edged Sword: The Power of Two in Defending Against DNN Backdoor Attacks. EUSIPCO 2024 - 32nd IEEE European Signal Processing Conference, Lyon, France, August 2024, 2007-2011.
  • 51 Z. Lin, S. Wang, V. Sekar and G. Fanti. Distributional Privacy for Data Sharing. NeurIPS 2022 Workshop on Synthetic Data for Empowering ML Research, 2022. URL: https://openreview.net/forum?id=6oVAzFsHLFK
  • 52 Y. Liu, X. Ma, J. Bailey and F. Lu. Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks. Computer Vision - ECCV 2020, Cham, Springer International Publishing, 2020, 182-199.
  • 53 G. Luo, T. Darrell and A. Rohrbach. NewsCLIPpings: Automatic Generation of Out-of-Context Multimodal Media. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021, 6801-6817.
  • 54 T. Maho, T. Furon and E. Le Merrer. Empreinte de réseaux avec des entrées authentiques. Actes de la 4ème Conference on Artificial Intelligence for Defense (CAID 2022), DGA Maîtrise de l'Information, Rennes, France, November 2022.
  • 55 T. Maho, T. Furon and E. Le Merrer. FBI: Fingerprinting models with Benign Inputs. Submitted to IEEE Trans. on Information Forensics and Security, 2022. URL: https://arxiv.org/abs/2208.03169
  • 56 T. Maho, S.-M. Moosavi-Dezfooli and T. Furon. How to choose your best allies for a transferable attack? IEEE Int. Conf. on Computer Vision, ICCV 2023.
  • 57 E. F. Moore. Gedanken-Experiments on Sequential Machines. In: Automata Studies (AM-34), Volume 34, C. E. Shannon and J. McCarthy, eds., Princeton University Press, 1956, 129-154. URL: https://doi.org/10.1515/9781400882618-006
  • 58 A. S. Rakin, Z. He and D. Fan. TBT: Targeted neural network attack with bit trojan. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, 13198-13207.
  • 59 A. Ramesh, P. Dhariwal, A. Nichol, C. Chu and M. Chen. Hierarchical text-conditional image generation with CLIP latents. arXiv preprint arXiv:2204.06125, 2022.
  • 60 B. Razeghi, F. P. Calmon, D. Gunduz and S. Voloshynovskiy. Bottlenecks CLUB: Unifying Information-Theoretic Trade-offs Among Complexity, Leakage, and Utility. arXiv preprint arXiv:2207.04895, 2022.
  • 61 M. Heikkilä (MIT Technology Review). How to create, release, and share generative AI responsibly. 2023.
  • 62 M. Ribeiro, K. Grolinger and M. A. M. Capretz. MLaaS: Machine Learning as a Service. 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA), 2015, 896-902.
  • 63 R. Rombach, A. Blattmann, D. Lorenz, P. Esser and B. Ommer. High-resolution image synthesis with latent diffusion models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, 10684-10695.
  • 64 Q. Le Roux, E. Bourbao, Y. Teglia and K. Kallas. A Comprehensive Survey on Backdoor Attacks and Their Defenses in Face Recognition Systems. IEEE Access 12, 2024, 47433-47468.
  • 65 A. Sablayrolles, M. Douze, C. Schmid and H. Jégou. Radioactive data: tracing through training. International Conference on Machine Learning, PMLR, 2020, 8326-8335.
  • 66 T. Sander, P. Fernandez, A. Durmus, M. Douze and T. Furon. Watermarking Makes Language Models Radioactive. 38th Conference on Neural Information Processing Systems (NeurIPS 2024), Spotlight, Vancouver, Canada, December 2024, 1-35.
  • 67 C. Stokel-Walker. AI bot ChatGPT writes smart essays - should academics worry? Nature, 2022.
  • 68 Z. Sun, R. Sun, C. Liu, A. R. Chowdhury, S. Jha and L. Lu. ShadowNet: A secure and efficient system for on-device model inference. arXiv preprint arXiv:2011.05905, 2020.
  • 69 Y. Viazovetskyi, V. Ivashkin and E. Kashin. StyleGAN2 distillation for feed-forward image manipulation. European Conference on Computer Vision, Springer, 2020, 170-186.
  • 70 M. de Vos, A. Dhasade, J. Garcia Bourrée, A.-M. Kermarrec, E. Le Merrer, B. Rottembourg and G. Trédan. Fairness Auditing with Multi-Agent Collaboration. Frontiers in Artificial Intelligence and Applications, Santiago de Compostela, Spain, IOS Press, October 2024, 1-14.
  • 71 S. Wagh, S. Tople, F. Benhamouda, E. Kushilevitz, P. Mittal and T. Rabin. Falcon: Honest-Majority Maliciously Secure Framework for Private Deep Learning. Proceedings on Privacy Enhancing Technologies 1, 2021, 188-208.
  • 72 Y. Wang and A. Dantcheva. A video is worth more than 1000 lies. Comparing 3DCNN approaches for detecting deepfakes. IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020), 2020.
  1. International Data Corporation (IDC)
  2. It is almost complete, as the only missing use case we are aware of is the accessibility of the model at stake in a denial-of-service attack 43. This is out of our scope.
  3. The Trusted Execution Environment chip, highly secure but not computationally efficient.
  4. Machine Learning as a Service, i.e. in the cloud 62