Copyright exceptions and fair use defences for AI training done for “research” and “learning”: A new academic study

Tags: copyright, artificial intelligence, AI training, exceptions, research

“It is a truth universally acknowledged, that a single man in possession of a good fortune, must be in want of a wife”. 

While this might have been engraved in the minds of many in Jane Austen’s time and may still hold true for some more or less enlightened contemporaries today, in the age of Artificial Intelligence (AI) there is another truth that is perhaps even more universally acknowledged: an AI model must be in want of large amounts of ‘data’ to be trained and, thus, ensure a good fortune to its developer.

Such ‘data’, more often than not, includes content protected by copyright, related and/or sui generis rights (including the database right). It is not difficult to see why: training done on high-quality texts, audio, videos, and images – including newspaper articles, songs and compositions of professional songwriters, audiovisual content of video makers and production companies, works of artists and photographers, etc. – allows the resulting model to generate high-quality, up-to-date, and accurate outputs.

Against this background, the discussion of whether and how to strike a balance between licensing and exceptions under copyright law is one of global relevance. Over the past several years, different countries have:

Adopted statutory exceptions to allow text and data mining (TDM) under specific conditions, including Japan (Article 30-4 of the Copyright Act), Singapore (Section 244 of the Copyright Act 2021), the UK during its tenure as an EU Member State (section 29A CDPA), other EU Member States, and the EU through the adoption of the DSM Directive;
Discussed legal reforms to either introduce exceptions (e.g., Hong Kong) or broaden existing exceptions (e.g., the UK) for TDM; and
Refrained from considering any new legislation. This is the case of the US, where the existing fair use doctrine under section 107 of the Copyright Act 1976 has served and will continue to serve to determine the lawfulness of unlicensed TDM and AI training activities.

At the EU level, much of the attention has so far centred on Article 4 of the DSM Directive, including in the context of a potential UK reform, possibly because Article 4 – as opposed to Article 3 of the directive – is perceived to have broader relevance and application due to the lack of restrictions on the beneficiaries and purposes of the TDM at hand.

In this context, a four-fold observation needs to be made:

First, TDM may be part of AI training processes, but it is neither synonymous with AI training nor is it all that AI training entails, including in terms of acts restricted by copyright and related rights.
Second, from a European (thus including both the EU and the UK) perspective, focusing on Article 4 of the DSM Directive alone when addressing unlicensed TDM and AI training is myopic. As the recent decision of the District Court of Hamburg in LAION demonstrates [IPKat here and here], Article 3 of the DSM Directive also has the potential to be largely – and inappropriately – relied upon in the context of unlicensed TDM and subsequent AI training practices. That is so for two key reasons: the possibility – expressly allowed by the directive itself – for the beneficiaries of the exception to collaborate with third parties, including commercial AI developers; and the potentially (over)broad construction of key notions like ‘research organization’ and ‘scientific research’. The Hamburg court did not consider either aspect in any detail, concluding instead that the defendant organization qualified as a research organization entitled to protection under the German transposition of this EU provision. The court also unduly conflated the notion of TDM, which is all that Article 3 of the DSM Directive covers, with that of AI training. Overall, an exclusive or even predominant focus on Article 4, and on the technical and legal complexities surrounding rights reservation under paragraph 3 therein, distracts from the possibility that commercial AI developers might shift their attention towards Article 3 of the DSM Directive instead. They might do so also to explore and test the extent to which they can collaborate with the beneficiaries of Article 3 without incurring the limitations of Article 4, specifically the rights reservation possibility.
Third, calls advocating a relaxation of EU copyright rules to facilitate ‘research’ have recently been made, seemingly including by the President of the European Commission herself, who also announced forthcoming legislative proposals “to make Europe the home of innovation again.”
Fourth, the UK Government’s Copyright and AI consultation ended relatively recently and the Government’s official response is pending. Should no reform ultimately be undertaken, the application of section 29A CDPA will depend to a large extent on how courts construe the notion of ‘research’ and its ‘non-commercial’ requirement.

New academic study

Building on the above, in an academic study that I undertook at the request of the International Federation of the Phonographic Industry and which has just been published in the European Journal of Risk Regulation, I investigate whether and to what extent unlicensed AI training activities could be undertaken by relying, not on Article 4 of the DSM Directive as transposed into national law or a hypothetical reform of the UK system of exceptions, but rather on defences that appear to have been potentially overlooked so far.

Reference is made specifically to research and education exceptions, notably Article 3 of the DSM Directive and Article 5(3)(a) of the InfoSoc Directive, the latter also read in light of Article 5 of the DSM Directive.

The discussion of other jurisdictions – including the US and countries, like South Korea and Singapore, which have adopted open-ended fair use-style defences – is also undertaken. This is done to determine whether unlicensed AI training, including training seemingly done for the purpose of research or education/learning, might be considered lawful.

Key findings

The study tackles two key questions: 

Whether and, if so, under what circumstances unlicensed AI training may be considered tantamount to ‘research’ or even learning in the context of ‘teaching’, and 
Whether commercial AI developers can take advantage of the provisions above.

Ultimately, both questions are answered in the negative: there is no exception or open-ended defence that fully covers unlicensed AI training activities.

While unlicensed TDM may – under certain conditions – be covered by an exception, the same conclusion may not be correct insofar as unlicensed AI training is concerned. That is not only because the latter engages the doing of restricted acts that exceed the scope of potentially available exceptions under EU and UK laws, but also because it runs contrary to key factors under the fair use analysis.

The study concludes that, to address and remove the legal risk associated with AI training, including training initially undertaken for the purpose of ‘research’ and ‘learning’, the development of a licensing approach is already, and will ultimately be, unavoidable, and not only for developers seeking to launch their models in the EU.

Where to read more

Open access on the website of the European Journal of Risk Regulation here.

[Originally published on The IPKat on 29 July 2025]