Generative AI & Copyright Law: Infringement & Ownership

THE INPUT SIDE: DATA & COPYRIGHT INFRINGEMENT

AI relies on data. Lots of data. This data is essential to the AI systems being developed, as it is used to train AI systems to create output such as answers to a question, a new artwork, or a new song.

However, how these massive libraries of data are collated and used raises questions, including:

how is the data being collected (e.g. web scraping or purchasing data from third parties)?

where is the data being collected from (e.g. the internet or internal corporate databases)?

what exactly is being collected (e.g. private data or metadata)? and

does the collection and training process involve ‘reproduction’ and/or the exercise of other exclusive rights of copyright owners without permission?

For businesses and creators, a key issue is whether the collection and use of this input data infringes copyright.

Given the quantum of data that has already been collected to create the databases that underpin generative AI, many claim that the collection and use of data must involve the reproduction of copyright materials, such as news articles, artworks, and music. For many copyright owners, however, it has been difficult to determine exactly what data has been collected, owing to the secrecy surrounding training data and the opaque nature of proprietary AI systems more generally.

It comes as no surprise that lawsuits are being filed around the world alleging infringement of copyright by owners of AI systems.

THE OUTPUT SIDE: COPYRIGHT INFRINGEMENT

An AI system that has ingested third party photographs, with or without a licence, could, based on a simple text-based prompt, generate an output that reproduces or communicates a third party’s copyright photograph. Critically, the Copyright Act only requires that a ‘substantial part’ of the work be reproduced or communicated to constitute infringement. Whether a ‘substantial part’ has been taken turns on the quality of what has been reproduced rather than the quantity, and is a question of fact and degree, to be determined according to the circumstances of the case. Assessment of this question in artistic works, for instance, is notoriously complex.

In December 2022, a number of artists started making this point by using AI systems to generate images of characters such as Disney’s Mickey Mouse, Darth Vader, Spider-Man, and Pikachu, and encouraging people to use the images on various forms of merchandise and sell them.

If an AI-generated work does infringe copyright, the question then becomes who is responsible for that infringement: the user of the AI system who provides the prompt, the owner of the AI system itself, or both? Complex questions around joint tortfeasors and authorisation liability in Australian law are likely to be examined when considering output-side infringement.

All of these questions may need to be considered by the creators and owners of AI systems, creators using these AI systems, legislators, and regulators.

THE OUTPUT SIDE: ATTRIBUTION

As only human creators can ‘author’ copyright material, only human authors have a right to be attributed as authors of the work. An AI system that is used in the creation of a work is not recognised as having any moral rights under Australian copyright law, and there is no legal right or requirement in legislation for the AI system to be attributed as an author.

However, it is good practice to acknowledge the use of an AI system to generate a work. This would be in line with the USCO’s new guidelines regarding copyright registration, and more importantly, would also align with Australia’s new non-binding AI Ethics Framework that calls for transparency and responsible disclosure in the use of AI. In the future, we expect some AI systems will require attribution through their Terms of Use.

For those whose works are reproduced without permission in AI outputs, a genuine question arises about whether their moral right of attribution has been infringed — or, in addition, whether a separate actionable false attribution has occurred.

THE AI ITSELF

The AI system itself can be protected by copyright. For example, if the source or object code for an AI system has been written by a human author, that code is automatically protected by copyright in Australia under the Copyright Act 1968 (Cth). However, it is generally understood that the written expression of code is protected by copyright, not the function of the code. Code may be written in any number of ways to perform identical functions without necessarily infringing copyright in the original code.

Where an AI system includes a dataset or datasets compiled from a range of sources, each dataset may also be protected by copyright as a literary work (compilation), but such protection is only available if there has been sufficient ‘independent intellectual effort’ expended on the compilation of that data by a human author. In the case of AI systems, it is untested whether the initial parameters determining how the dataset is collated involve sufficient intellectual effort by a human. Of course, this may differ between AI systems.

However, where an AI system is updated based on inputs from users or instructions to use additional training data (including by using a training algorithm to update the parameters in a neural network), there will be a stronger argument that the resulting evolved system attracts copyright protection. The issue of human authorship may arise, but the Data Access v Powerflex case suggests that this issue might not be insurmountable in all cases.

WHERE TO FROM HERE

Generative AI is only going to get bigger, better, and more powerful. With the emergence of increasingly sophisticated AI technologies, the law is at a crossroads when it comes to the authorship and use of copyright works. Regulatory and legislative bodies around the world are faced with the challenge of how best to update copyright laws in a way that promotes the use of this new technology without harming creative ecosystems and economies.

In Australia, the Attorney-General’s Department is currently completing its review of Australia’s copyright enforcement regime. That review received submissions supporting stronger protection for creatives and other submissions calling for greater flexibility and more fair dealing exceptions on the use side, particularly for activities such as text and data mining.

We will be monitoring closely where the recommendations land in balancing the rights of all stakeholders.

*The authors wish to acknowledge the valuable contributions and insights of our colleagues Cate Nagy, Bryony Evans, Cheng Lim and Kendra Fouracre.

OVERVIEW OF COPYRIGHT LAW & AI

INPUTS

Data & Copyright: Lawsuits have been filed in the US and UK alleging the infringement of copyright works used in the training of AI systems. They remain pending. The EU is attempting to address this issue in its proposed “AI Act” by adding a clause requiring public disclosure of a summary of training data protected by copyright law. The Australian Federal Government has yet to specifically address these issues, although a Copyright Enforcement Review is underway.

OUTPUTS

Authorship: The US Copyright Office has released guidelines outlining when works generated by AI will be protected by copyright. Protection will depend on the extent to which a human had creative control over a work's expression. In the UK, copyright law protects works generated by a computer where there is no human creator, if the work expresses original human creativity. In Australia, an original work generated by AI that has sufficient human input (independent intellectual effort) will likely be protected by copyright.

Infringement by AI-generated works: An AI system can potentially generate works that substantially reproduce an earlier copyright work. Determining who is responsible for this infringement is yet to be addressed by legislators or the courts.

Attribution: Only individual / human authors have a right to be attributed as authors of a work. It is recommended by regulators in the US and Australia that any use of an AI system to generate a work should be acknowledged.

AI ITSELF

Copyright protection: The source code or object code of an AI system itself can potentially be protected by copyright. AI systems that include a dataset or datasets compiled from a range of sources may also be protected by copyright as literary works (compilations).
