Deepfake Defences 2 – The Attribution Toolkit

Published: 11 July 2025

Deepfakes are AI-generated audio-visual content that is deliberately designed to misrepresent someone or something.

In our first discussion paper on this subject, Deepfake Defences, we introduced a three-part typology of deepfakes – those that demean, defraud and disinform – and set out (at a high level) a range of interventions that actors across the AI supply chain could take to address the sharing of this type of content.

In this follow-up discussion paper, we look more closely at the merits of so-called ‘attribution measures’. These include watermarking tools, provenance metadata schemes, AI labels, and context annotations. Each of these measures is designed, in one way or another, to attach certain types of information to a piece of content – for example, who created it, how and when it was created, and, in some cases, whether the content is accurate or misleading.
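To make this concrete, the sketch below shows the kind of information an attribution record might carry. It is illustrative only: the field names are hypothetical and are not drawn from any particular watermarking, provenance or labelling standard discussed in the paper.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

# Illustrative only: field names are hypothetical, not taken from any
# specific watermarking, provenance or labelling scheme.
@dataclass
class AttributionRecord:
    creator: str                      # who created the content
    tool: str                         # how it was created (e.g. the generative model used)
    created_at: str                   # when it was created (ISO 8601 timestamp)
    ai_generated: bool                # basis for an "AI-generated" label
    context_note: str | None = None   # optional annotation, e.g. a fact-check note

record = AttributionRecord(
    creator="example-news-org",
    tool="example-image-model-v2",
    created_at=datetime.now(timezone.utc).isoformat(),
    ai_generated=True,
    context_note="Synthetic image produced for illustration.",
)

# In practice, this information might be embedded as an invisible watermark,
# attached as signed metadata, or surfaced to users as a label or annotation.
print(json.dumps(asdict(record), indent=2))
```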

This paper looks in detail at how each measure works, assesses its strengths and weaknesses, and considers what it would take to deploy them successfully. The paper draws on the findings of a literature review, interviews with experts, a survey and a series of interviews with users of online platforms, and our own internal technical evaluations of openly available watermarking tools.

The sharing of certain types of deepfakes is regulated under the Online Safety Act 2023. We will draw on the paper’s insights to inform our policy development and supervision of regulated services.

Discussion papers are intended to shine a light on emerging issues and best practices. They do not constitute formal guidance, and regulated services under the Online Safety Act 2023 are not required to adopt the measures featured in this paper.
