DSM Directive: Is it Possible to Reconcile the Development of AI Technologies and Respect for Creators' Intellectual Property?

Text and Data Mining Process – Does Artificial Intelligence Work Like the Human Mind?

Authors of works and data in the public digital space continue to face the risk of their creations being subject to automated text and data mining (TDM) for training artificial intelligence. While TDM appears similar to a human's process of engaging with a work, it raises significant issues regarding the balance between permitted use and infringement of an author's exclusive rights.

If we compare this process to a human data consumer, an author expects their published work to influence and inspire further scholarly efforts. They anticipate that human readers will engage with their work, analyze it, and retain some information. The larger the audience, the greater the potential for inspiration, innovation, and growth—one of the goals of publishing.

AI products like ChatGPT, Bing, and Bard operate on vast databases built from the creative work of others. AI functions in two stages: first, it is fed knowledge from these databases, and then it self-improves, organizing data and creating products. Importantly, it is unclear which data AI uses, to what extent they are human-created, and which are AI-generated. AI does not disclose these sources to the user or to the original author, leaving creators unaware of how their work is used, whether it has been replicated, and whether that use stays within permitted use.

What Do Current EU Regulations Provide for Authors of Data Processed by Artificial Intelligence?

In the marketing, promotional, and financial sectors, AI products are often promoted as replacements for human roles: they plan strategies and produce posts, publications, and other online activity by drawing on extensive human-created knowledge. However, the authors of data processed by AI in TDM receive neither recognition nor compensation for their contributions, because current regulations, including EU Directive 2019/790 (DSM Directive), do not address this issue comprehensively.

The DSM Directive, which EU member states were to implement by June 7, 2021, introduces a new form of permitted use that lets entities with lawful access to protected works and databases reproduce and extract them for text and data mining. Under Article 3, this permitted use is intended for research organizations and cultural heritage institutions, aiming to facilitate scientific research and innovation. For other purposes, including commercial exploitation of works by AI, Article 4 allows TDM unless the rightholders have expressly reserved their works against such use.

While this approach seems reasonable, since it allows published works to support scientific and cultural development, it introduces an imbalance. Commercial entities can profit from TDM on protected works, while authors have no guaranteed earnings from AI training based on their work. Article 3 of the DSM Directive covers permitted use for scientific research and Article 4 covers TDM for other purposes, including commercial ones, but both remain silent on compensation for authors.

The DSM Directive offers two options: authors can either implicitly consent to AI using their works or explicitly object to such use. However, there is no third option where works are subjected to TDM with appropriate recognition and compensation for authors. The current DSM Directive does not fully enable all creators to receive proper remuneration for their contributions to AI training, whether for commercial or research purposes.

Authors' Compensation – Acknowledgment of Copyright in the TDM Process

Interestingly, the DSM Directive does recognize publishers' intellectual property in the digital space, creating a related right under Article 15. This gives press publishers exclusive rights over the online use of their publications by information society service providers, motivated by the need to maintain high journalistic quality and prevent free use of press content on online platforms. Article 15 also ensures that the direct creators of press content receive an appropriate share of the revenues that press publishers obtain from those providers.

  • Article 15(5) of Directive 2019/790: “Member States shall provide that authors of works incorporated in a press publication receive an appropriate share of the revenues obtained by press publishers for the use of their press publications by information society service providers.”

Furthermore, Article 17 of the DSM Directive holds online content-sharing service providers accountable for storing and giving the public access to copyrighted works without proper authorization, requiring them to obtain the necessary permissions and to act promptly on justified objections from rights holders.

These regulations demonstrate existing protections for press publishers and authors in the digital space. AI training, however, remains a harder case, because it is uncertain how far protected works are actually used in commercial training. Given this difficulty, authors might seek a specific legal basis for compensation when their works are used, under permitted use, in AI's TDM processes.

Currently, legislation struggles to keep pace with rapid AI development by major tech companies. As AI increasingly occupies roles traditionally held by humans, it is crucial to anticipate further moves by tech firms and safeguard the value of human intellect. In this scenario, the roles should reverse: legislation should anticipate the technology rather than trail behind it.

