The UK Government has launched a consultation to resolve the ongoing tension between copyright law and artificial intelligence (AI). This consultation follows an intense period of debate over how UK law should balance the rights of publishers, content creators and rightsholders against the interests of AI developers.
The consultation recognises that the application of UK copyright law to the training of AI models is currently disputed. Rightsholders are finding it difficult to control the use of their works in training AI models and are seeking to be remunerated for such use. AI developers require access to vast quantities of high-quality (often copyright-protected) materials to train their AI models and debate the extent to which UK copyright law does (or should) restrict access to works for the purpose of AI training. This legal uncertainty is, in the Government's view, undermining the adoption of AI technology and stifling innovation.
The Government has considered a range of options to clarify copyright law and meet its objectives to placate both AI innovators and the creative industries. As put forward in the consultation, the Government's preferred option is to adopt a new text-and-data mining (TDM) exception to copyright in the UK that will allow AI companies to mine copyright-protected works for AI training purposes provided the rightsholder has not "reserved their rights" (or "opted-out"). This exception would be underpinned by transparency measures to enable rightsholders to understand and manage the use of their works.
This suggested approach will align the UK more closely with the copyright laws of the European Union (EU), with the hope of striking a balanced framework that gives ample protection to the UK's creative industries whilst allowing the AI sector to thrive. We take a closer look at the proposals below:
The 'Preferred Option': A New TDM Copyright Exception
The Government's preferred option in the consultation proposes to update UK copyright law by introducing a new copyright exception for TDM. "TDM" refers to the use of automated techniques to analyse large amounts of information – a practice deployed at mass scale by AI companies to ingest large volumes of content for the purpose of training AI models.
The new TDM exception proposed by the UK government would have the following features:
- It would apply to TDM for any purpose, including commercial purposes.
- It would apply only where the party carrying out TDM has "lawful access" to the relevant works, thereby allowing rightsholders to seek remuneration at the point of access by, for example, placing content behind a paywall.
- It would apply only where the rightsholder has not "reserved their rights" (or "opted-out") from having the work subject to TDM. If a rightsholder has reserved their rights through an agreed mechanism (to be decided and discussed further below), TDM would not be permitted unless a licence was agreed with the rightsholder.
- It would be underpinned by greater transparency requirements on AI companies around the sources of training material, to help ensure compliance with the law and build trust between rightsholders and developers.
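The way these conditions interact can be sketched as a simple decision function. This is a simplified illustration of the proposal as described above, not of any enacted law: the names `TdmUse`, `Outcome` and `assess` are our own, and the transparency requirements (which sit alongside the exception rather than acting as a gateway to it) are not modelled.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    EXCEPTION_APPLIES = "proposed TDM exception applies"
    LICENCE_NEEDED = "rights reserved: licence needed"
    NO_LAWFUL_ACCESS = "no lawful access: exception unavailable"


@dataclass(frozen=True)
class TdmUse:
    """Hypothetical description of one TDM use of a work."""
    lawful_access: bool      # e.g. the content was reached through a paid subscription
    rights_reserved: bool    # rightsholder has opted out via an agreed mechanism
    commercial: bool = True  # recorded for completeness; purpose is not a condition


def assess(use: TdmUse) -> Outcome:
    """Apply the proposed conditions in order (simplified)."""
    if not use.lawful_access:
        return Outcome.NO_LAWFUL_ACCESS
    if use.rights_reserved:
        return Outcome.LICENCE_NEEDED
    # Purpose is deliberately not checked: commercial TDM would be covered.
    return Outcome.EXCEPTION_APPLIES


print(assess(TdmUse(lawful_access=True, rights_reserved=False)).value)
print(assess(TdmUse(lawful_access=True, rights_reserved=True)).value)
```

Note that the `commercial` field never changes the result. That is the key difference from the current regime, under which a commercial purpose takes TDM outside the exception altogether.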
How would this change the UK's existing TDM framework?
If adopted, the proposal would bring about significant change to the UK's current TDM exception found in section 29A of the Copyright, Designs and Patents Act 1988 (CDPA). Presently, TDM is only permitted as an exception to copyright in the UK under very narrow conditions – there must be "lawful access" to the work and the purpose of TDM must be restricted to non-commercial research.
This existing framework has been widely criticised by the tech sector for being overly restrictive, particularly for AI developers and commercial entities seeking to innovate and train AI models (often released on a commercial basis) using large datasets. The Government seemingly shares the view that current UK law fails to sufficiently accommodate TDM for commercial AI training, putting the UK at a competitive disadvantage.
Therefore, the new proposals would broaden the scope of permissible TDM in the UK significantly by allowing TDM to be carried out on copyright works for commercial purposes, subject to the ability of rightsholders to opt out (and backed by new supporting transparency requirements).
Aligning with the EU
It is worth noting that the Government's preferred option would bring UK legislation largely in line with the EU copyright framework on the topic of TDM. Under the Digital Single Market Copyright Directive 2019/790 (DSM Directive), TDM is permissible (operating as an exception to copyright) in the EU under:
- Article 3: which allows TDM for scientific research purposes (akin to the UK's existing TDM exception for non-commercial research); and
- Article 4: which allows TDM for commercial purposes, provided access to the work is "lawful" and the rightsholder has not expressly reserved their rights "in an appropriate manner, such as machine-readable means" (akin to the newly proposed UK opt-out model).
On this basis, the proposed UK opt-out model might not materially impact the burden UK publishers and content owners already face when seeking to opt out of TDM taking place within the EU (as most rightsholders will already have in place technical measures and standards as a means of "reserving their rights" from having their works subject to TDM at EU-level).
However, the extent to which the mechanisms used to opt out at EU level will be deemed adequate for the UK regime is an open question. The consultation expressly highlights issues with the EU 'reservation of rights' model, including the fact that "opt-outs should be machine readable, but the practical application of this has not been consistent". This is discussed further in the next section.
Mechanisms for Opting-Out
The UK Government recognises the current limitations of existing technical standards that function as opt-out mechanisms across international regimes (including the EU), which include:
- robots.txt standard – adopted by over half of news publishers to block web crawlers;
- metadata-level instructions – instructions embedded in the works themselves to specify that TDM is prohibited; and
- Do-Not-Train registries – requiring rightsholders to notify AI firms directly that they do not want specific works to be used for training AI, which is then captured as part of a registry.
Whilst welcoming the development of these technologies and acknowledging their role in facilitating a workable opt-out system, the Government emphasises that, at present, there is a lack of standardisation in this area, with rightsholders often having to deal with multiple different systems that are not always accessible or suitable.
Taking the robots.txt standard (as the most widely adopted standard) as an example – it cannot provide the granular control over the use of works that many rightsholders seek. It allows works to be blocked from web crawling at the website level, but does not recognise reservations associated with individual works. It also does not enable rightsholders to distinguish between scraping for different uses of the works (e.g. a publisher may be happy for web crawlers to use works for search indexing or language training, but not for generative AI).
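To make the robots.txt limitation concrete, the short sketch below uses Python's standard `urllib.robotparser` module to evaluate a hypothetical robots.txt file. The crawler names (`ExampleAIBot`, `ExampleSearchBot`) and the `publisher.example` domain are invented for illustration and do not refer to any real crawler or site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imagined publisher: one named crawler is
# blocked from the whole site, another is allowed everywhere.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: ExampleSearchBot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

article_one = "https://publisher.example/articles/one"
article_two = "https://publisher.example/articles/two"

# Each rule is keyed by crawler name and URL path only. There is no field
# for the purpose of the crawl, nor for the rights attached to a given work.
ai_allowed_one = parser.can_fetch("ExampleAIBot", article_one)
ai_allowed_two = parser.can_fetch("ExampleAIBot", article_two)
search_allowed = parser.can_fetch("ExampleSearchBot", article_one)

print(ai_allowed_one, ai_allowed_two, search_allowed)
```

In this sketch the publisher can only say "this named crawler may not fetch these locations". Whether a permitted crawler then uses what it fetches for search indexing or for generative AI training is not something the file can express, which is the gap the consultation identifies.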
To address these limitations, the Government proposes that rights in works made available online "should be reserved using effective and accessible machine-readable formats, which should be standardised as far as possible". Any UK regulation will need to avoid overly prescriptive provisions and concentrate on outcomes to ensure that it adapts to technical development, with a key aim being to engage with standards initiatives being taken forward by industry and by other international partners, such as the EU. This will all be considered in more detail as part of the consultation.
Transparency Requirements
The UK Government has proposed specific transparency obligations aimed at increasing trust and accountability in AI training practices. These requirements focus on ensuring that rightsholders can verify whether their works have been used for TDM and to what extent. The consultation proposes the following key measures:
- Dataset Disclosure – AI developers may be required to disclose specific information about the datasets used to train their models. This includes details about the sources of content and the specific materials ingested.
- Usage Reporting – AI developers may need to document and report how TDM activities are conducted, including providing clear and auditable records of which copyrighted works were accessed and used in AI training.
- Verification of Opt-Out Compliance – Rightsholders must be able to verify that their opt-out requests have been respected. This includes confirmation that AI developers have adhered to opt-out instructions embedded in machine-readable formats or other agreed mechanisms.
- Granularity and Accessibility – Transparency mechanisms must be practical, allowing rightsholders to access clear and usable information without excessive complexity.
In considering the above transparency requirements, the consultation specifically notes that Article 53(1)(d) of the EU's AI Act recently introduced a requirement for providers of general-purpose AI models to make publicly available a "sufficiently detailed summary" of training content. The EU regime does not require exhaustive disclosure: the obligation can be met by listing the main data collections or 'sets' that went into training a model and summarising other sources, using a template currently under development by the EU AI Office.
It remains to be seen what level of copyright disclosure the Government will demand from AI companies domestically, but discussion in the consultation (and the requirement to disclose specific works and datasets, if implemented) suggests the UK transparency framework could be more stringent on tech companies than the "sufficiently detailed summary" required in the EU under the AI Act. This is all up for discussion and consideration as part of the consultation.
Additional Points for Consultation
The consultation also seeks to address other areas of legal uncertainty surrounding AI and copyright law. The Government is inviting views on whether broader clarifications or reforms are needed to ensure UK copyright law remains fit for purpose in the age of AI. This includes:
- Contracts and Licensing – the Government is seeking views on whether existing contractual frameworks sufficiently address the licensing of content for TDM and AI training purposes, or whether reforms are needed to improve clarity and consistency for both rightsholders and AI developers;
- Copyright Ownership of Computer-Generated Works – revisiting the protection of "computer-generated works" under s9(3) of the CDPA, which currently assigns authorship to the person who "makes the arrangements" necessary for the work's creation. The Government is considering whether this provision remains appropriate, particularly as AI tools become increasingly autonomous and human input diminishes;
- The "temporary copies" exception – the consultation acknowledges uncertainty around whether the "temporary copies" exception (which permits temporary copies made during technological processes) applies to the training of generative AI models and is seeking discussion and clarification;
- Labelling AI Output – the Government is considering whether AI-generated content should carry a label or watermark to differentiate it from human-created works. This could improve transparency and help protect consumers and rightsholders from misleading content;
- Digital Replicas and Deepfakes – AI technology can generate highly realistic replicas of individuals, such as actors or singers, raising concerns about consent, reputational harm, and economic exploitation. The Government is seeking views on whether specific legal protections are required to address these issues, particularly in the context of synthetic media; and
- Other emerging issues – the consultation highlights areas such as AI systems generating outputs during "inference", where user inputs (e.g., prompts) or live data (e.g., retrieval-augmented generation) interact with copyright works. The Government also raises questions about the increasing use of synthetic data to train AI systems and whether the application of copyright law to these use cases is sufficiently clear.
These additional points reflect the Government's broader ambition to modernise copyright law and ensure it keeps pace with evolving AI technologies and practices. Stakeholder feedback will play a critical role in shaping future reforms.
Comment
The UK's consultation marks a significant step towards a more structured domestic framework for TDM, copyright and AI – with IP enthusiasts up and down the country surely salivating at the prospect of greater certainty around some of the burning AI / copyright questions of today.
Certainly, by proposing a broader copyright exception for commercial TDM (subject to an opt-out model and enhanced transparency), the Government is seeking to foster a workable compromise between the interests of AI companies and the creative sector – with an express nod in the consultation towards developing grounds for "collective licensing where appropriate".
Whilst the proposed approach would bring UK copyright law broadly in line with the EU model on TDM, publishers and creatives will be quick to point out that the EU regime is far from perfect and is plagued with uncertainty as to how, where and when to "opt out". These rightsholders will expect any UK regime to provide greater clarity and certainty as to how opt-out mechanisms should work and be respected, with a vested interest in pushing the Government towards an approach which places more stringent obligations on AI developers than we see at EU level.
As always, the devil will be in the detail of any implementation of these plans and the consultation presents a welcome opportunity for stakeholders on both sides of the debate to submit their views.