It copies, right? Generative AI & copyright law

Only last September and October we published our AI Guides to AI & Copyright Infringement and Ownership of AI Generated Works. Since that time, there have been substantial developments in the use of generative AI and accompanying discussion regarding the application of copyright law.  Internationally, we have also seen the launch of legal challenges to AI systems on copyright grounds.

This Insight provides an update on key developments from across the world. We discuss copyright issues around the inputs of AI (things used when training an AI system) and the outputs of AI (the works generated by an AI system) as well as whether an AI system can itself be protected by copyright.

For an overview of copyright law and AI see here.

THE INPUT SIDE: DATA & COPYRIGHT INFRINGEMENT

AI relies on data. Lots of data. This data is essential to the AI systems being developed, as it is used to train AI systems to create output such as answers to a question, a new artwork, or a new song.

However, how these massive libraries of data are collated and used raises questions, including:

  • how is the data being collected (e.g. web scraping, purchasing data from third parties)?
  • where is the data being collected from (e.g. the internet, internal corporate databases)?
  • what exactly is being collected (e.g. private data, metadata)? and
  • does the collection and training process involve ‘reproduction’ or the exercise of other exclusive rights of copyright owners without permission?

For businesses and creators, a key issue is whether the collection and use of this input data infringes copyright.

Given the quantum of data that has already been collected to create the databases that underpin generative AI, many claim that the collection and use of data must involve the reproduction of copyright materials, such as news articles, artworks, and music. For many copyright owners, it has been difficult, however, to determine exactly what data has been collected due to the secrecy surrounding what training data is being collected and the opaque nature of proprietary AI systems more generally.

It comes as no surprise that lawsuits are being filed around the world alleging infringement of copyright by owners of AI systems. 

One of the first AI-related lawsuits to be filed in the US was the GitHub Copilot litigation in November 2022. The class action was filed against GitHub, Microsoft (the owner of GitHub), and OpenAI, and alleges that the creation of the AI-powered coding assistant GitHub Copilot violated the legal rights of creators who posted code or other work on GitHub under open-source licences that require attribution of the author’s name and copyright.

The class action also alleges that the defendants have violated GitHub’s own terms of service and privacy policies, § 1202 of the US Digital Millennium Copyright Act, which forbids the removal of copyright management information, and the California Consumer Privacy Act among other laws. This litigation remains pending.

The GitHub Copilot lawsuit was quickly followed by a second US class action in January 2023, this time filed on behalf of three artists (Sarah Andersen, Kelly McKernan, and Karla Ortiz) against Stability AI, DeviantArt, and Midjourney for their use of Stable Diffusion (made by Stability AI), an AI image generation tool which the artists claim has been trained on millions of copyright works. The complaint in the Stable Diffusion litigation alleges that ‘Stable Diffusion contains unauthorized copies of millions — and possibly billions — of copyrighted images’ and that these ‘copies were made without the knowledge or consent of the artists.’

On 18 April 2023, DeviantArt, Stability AI and Midjourney filed motions to dismiss the lawsuit. The motion to dismiss by Midjourney demonstrates forensic hurdles that any similar representative claim in Australia may also have to overcome, arguing the claim does not identify a single ‘Work’ that was ‘used’ by Midjourney to train its platform, or any allegedly infringing image generated as output. Tracing specific authors’ specific ‘works’ through the labyrinthine process of generative AI could in many instances prove to be a challenge for plaintiffs. This litigation remains pending.

A few days after the artists’ class action was filed, Getty Images filed its own suit against Stability AI in the UK High Court alleging (among other things) that ‘Stability AI unlawfully copied and processed millions of images protected by copyright and the associated metadata owned or represented by Getty Images absent a license to benefit Stability AI’s commercial interests and to the detriment of the content creators.’ Getty Images CEO Craig Peters stated that Getty Images doesn’t ‘believe this specific deployment of [Stability AI’s] commercial offering is covered by fair dealing in the UK or fair use in the US.’ Getty Images then filed a second lawsuit against Stability AI in February 2023, this time in the US. This litigation remains pending.

Thomson Reuters also brought a lawsuit against ROSS Intelligence in the US District Court in Delaware in May 2020, accusing ROSS Intelligence of using its Westlaw database without permission or compensation to create its own AI-powered database. On 23 February 2023, Thomson Reuters asked for summary judgment against ROSS Intelligence Inc, arguing that copying the work of a competitor’s editors to create an artificial intelligence-powered database is an undisputed violation of copyright. This litigation remains pending.

Whether these suits will overcome the established, albeit complex, defence of fair use in US law, or the text and data mining exception in UK copyright law, remains to be seen.

On 27 April 2023, members of the European Parliament reached a political agreement on the EU AI Act. Reports are indicating that, as part of last minute negotiations on the scope of the AI Act, a new clause has been inserted requiring generative AI models to ‘make publicly available a summary’ disclosing the use of training data protected under copyright law. The AI Act is currently expected to go to committee vote on 11 May 2023 with a plenary vote expected in mid-June.

As far as we are aware, no lawsuits have yet been filed in Australia alleging copyright infringement against owners or users of generative AI systems, but concerns have been raised. For example, Australian artists alleged the popular AI app, Lensa AI, was ‘stealing’ their art to train its AI system to generate hyper-stylised portraits. Lensa’s parent company, Prisma Labs, in turn, defended its use of images, arguing that its AI system learns to create portraits just as a human would, by learning different artistic styles.

Lensa works by allowing subscribers to upload 10-20 photos of themselves to its app, which then turns those photos into 50-100 digital portraits (‘avatars’) in a range of different contemporary art styles. In order to do this, Lensa states that it uses the open-source neural network model Stable Diffusion. The Stable Diffusion model is trained on LAION-5B, which Lensa describes as ‘an enormous uncurated dataset designed to serve as a general representation of the language-image connection on the internet.’ It then generates its avatars from text prompts, which are small pieces of text describing the desired scene in the output images.

Artist Kim Leutwyler says that Lensa is replicating the distinctive, recognisable styles of artists, including ‘brush strokes, colour, composition — techniques that take years and years to refine.’ However, Australian courts have ruled that the general style and technique of an artist over a possibly large range of work is not protected by copyright. [1]

With respect to LAION and its LAION-5B dataset, artists say the dataset has ingested their work without their permission. Some artists have been able to use the website Have I Been Trained, which allows them to see whether their works are in the LAION-5B dataset.

Under Australian copyright law, individual artworks ‘authored’ by humans are generally protected by copyright. However, LAION says that its ‘datasets are simply indexes to the internet, i.e. lists of URLs to the original images together with the ALT texts found linked to those images.’ They also say that while they downloaded images ‘to compute similarity scores between pictures and texts, we subsequently discarded all the photos.’ Lensa states that there is ‘an understanding within the industry about the necessity of offering artists the opt-out option, introduced at the level of the entity that performs initial training of the model.’ This raises complex questions about the technical threshold required to establish whether such downloads constitute ‘reproductions’ within the meaning of Australia’s copyright legislation.

It is not just visual artists who are concerned. In March 2023, News Corporation’s chief executive Robert Thomson raised the issue of AI copyright infringement at a conference, noting that News Corp had started discussions with a ‘certain party who shall remain nameless’ about the use of news articles by AI, adding that while people talk about open source, ‘clearly, they are using proprietary content’ and ‘there should be, obviously, some compensation for that.’  

Similarly, Rod Sims, the former chairman of the Australian Competition and Consumer Commission, said recently that it’s likely large AI models have scraped news publications to generate accurate answers. He said AI companies should be forced to pay for access to content, similar to the Australian news bargaining code with Google and Facebook.

In Australia, any action seeking compensation for infringement of a copyright work by an AI system would most likely need to rely on the Copyright Act 1968 (Cth) (and potentially the Competition and Consumer Act 2010 (Cth) and the common law). It is an infringement of copyright to reproduce or communicate works digitally without the copyright owner’s permission. So, if you make a digital copy of copyright works by scraping images from public websites, you likely need permission to make that copy from the copyright owner.

Australia does not have a general ‘fair use’ defence to copyright infringement, such as that in the United States. As a result, in the event of a copyright infringement claim, AI companies whose datasets are generated through the unauthorised reproduction of copyright content face the significant challenge of establishing that their acts fall within one of Australia’s narrower statutory exceptions to copyright infringement. There are a number of exceptions in the Copyright Act that may be relied upon depending on the context, such as the fair dealing exceptions covering research and study, or parody and satire or the limited exceptions allowing temporary copying of works for technical purposes. Critically however, Australia’s ‘fair dealing’ defences still require that any dealing be ‘fair’, which is a significant hurdle for commercial operators who ingest copyright material without remunerating creators. None of these arguments have yet been tested against AI systems in Australian courts.

Commentators have recommended that concerned creators make an opt-out request directly to owners of AI systems and ask any online hosts of their work to ensure that their work cannot be ‘scraped’ by third party bots. LAION has agreed to allow artists to have their images removed from their dataset, which will mean that future models will not be trained using them. Stability AI agreed to honour this in their training run for Stable Diffusion 3. Similarly, Adobe and Shutterstock, who have also been using contributors’ images to train AI image generation tools, have both announced that they are exploring the possibility of an opt-out in the future.
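By way of illustration only, one widely used mechanism by which an online host can signal that its pages should not be scraped is a robots.txt file. The sketch below uses Python's standard `urllib.robotparser` module to show how a crawler that chooses to honour such a file would check whether it may fetch a page. The bot name `ExampleAIBot`, the robots.txt content and the URL are hypothetical, and compliance with robots.txt is voluntary on the part of the crawler, so this is not a guarantee that a work will not be collected.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a host might publish: it blocks one named
# (made-up) AI crawler from the whole site and allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/gallery/painting.jpg"
    for bot in ("ExampleAIBot", "OtherBot"):
        print(bot, "may fetch:", may_fetch(ROBOTS_TXT, bot, page))
```

A file of this kind only expresses the host's wishes. Whether ignoring it has any legal consequence is a separate question that the sketch does not address.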

The Federal Government may examine this issue in the near future as part of its review of AI and the regulatory settings and systems surrounding it: see our recent Insight, Developments in the Regulation of AI. The broadening of text and data mining exceptions is also currently being discussed in the UK and the EU. The Federal Government has also commenced a wholesale Copyright Enforcement Review, which is now likely to include these contentious issues as areas for discussion. It is likely that the availability of fair use in Australia, or the introduction of a fair dealing exception for data mining, will be highly relevant to that Review given these developments.

THE OUTPUT SIDE: THE AUTHORSHIP & OWNERSHIP OF WORKS

To date, discussions about AI outputs and intellectual property have focused on patents, driven mainly by Dr Stephen Thaler, who filed patent applications around the world designating his AI system DABUS (Device for the Autonomous Bootstrapping of Unified Sentience) as the inventor. Most jurisdictions, including Australia, have rejected those applications, concluding that an inventor must be a human.

However, Dr Thaler has also been attempting to register an artwork created by DABUS in 2012 for copyright in the United States.  Dr Thaler first applied to register the work, ‘A Recent Entrance to Paradise,’ with the US Copyright Office (USCO) in November 2018, noting that it was ‘[c]reated autonomously by machine’ and that he was seeking ‘to register this computer-generated work as a work-for-hire to the owner of the Creativity Machine.’ The USCO rejected his application in 2019 stating that it could not ‘register this work because it lacks the human authorship necessary to support a copyright claim.’

Dr Thaler appealed to the three-person Review Board of the USCO arguing that the requirement for ‘human authorship’ was ‘unconstitutional and unsupported by either statute or case law.’ In February 2022, the Review Board also refused to register the work, holding that ‘copyright law only protects “the fruits of intellectual labor” that “are founded in the creative powers of the [human] mind.”’

In January 2023, Dr Thaler filed a lawsuit against the USCO in the Federal District Court. Dr Thaler argues that the work in question ‘satisfies the requirements set forth in the Copyright Act’, that ‘the AI is entirely controlled by Dr Thaler, the AI only operates at Dr Thaler’s direction, and the AI is owned as property by Dr Thaler’ and that:

‘[u]ltimately, Dr Thaler is unquestionably the “person for whom the work was prepared,” (17 USC 101), and the Creativity Machine was, for all intents and purposes, within the broad conception of the Work for Hire Doctrine acting as an employee.’

Again, it remains to be seen whether Dr Thaler will ultimately be successful in his mission to be recognised as the owner of the copyright in the AI output (the ‘work’) created by DABUS acting as an employee.  

In September 2022, however, the USCO did grant copyright protection to Ms Kris Kashtanova for authorship of the comic book, Zarya of the Dawn. The graphic novel featured AI-generated artwork alongside Ms Kashtanova’s human-generated storyline. Initially, the Copyright Office extended copyright protection to the graphic novel in its entirety, which was the first decision of its kind.

One month later, the USCO became aware that Ms Kashtanova had created the comic book using AI. After consulting further with her, the USCO cancelled the initial registration and replaced it with a registration that was more limited in its scope. Specifically, the new copyright protection extends only to the ‘text’ and the ‘selection, coordination, and arrangement of text created by the author and artwork generated by artificial intelligence’, but not to the individual images within the comic book, as these were ‘not created by a human’. The reversal of the initial decision to grant protection over the graphic novel in its entirety is consistent with the current position of most jurisdictions on the authorship of AI-generated works.

Following the Zarya of the Dawn decision, on 16 March 2023, the USCO issued new policy guidelines on Works Containing Material Generated by Artificial Intelligence. These new guidelines highlight the ‘human authorship requirement’ stating that:

‘it is well-established that copyright can protect only material that is the product of human creativity. Most fundamentally, the term ‘author’, which is used in both the Constitution and the Copyright Act, excludes non-humans.’

The guidelines explain that if a ‘work’s traditional elements of authorship were produced by a machine, the work lacks human authorship and the [USCO] will not register it’ (italics added).  Or, put another way, ‘When an AI technology determines the expressive elements of its output, the generated material is not the product of human authorship’ (italics added). As an example, the USCO stated that when a human author ‘solely’ provides a ‘prompt’ to an AI system to produce a complex written, visual, or musical work in response, the ‘traditional elements of authorship’ are not determined by the user and so the USCO will not register that work.

The USCO then went further and generalised by stating that ‘based on [its] understanding of the generative AI technologies currently available, users do not exercise ultimate creative control over how such [AI] systems interpret prompts and generate material works in response.’ This particular characterisation of generative AI and the creative process has been challenged by some commentators. It also does not consider some of the more sophisticated techniques that artists are currently using to give them greater control over the outputs of these tools, particularly in open source software like Stable Diffusion.

However, the USCO does acknowledge that there will be cases in which a work containing AI generated material does contain sufficient human authorship to support a copyright claim, including where:

  • a human author selects or arranges AI-generated material in a sufficiently creative way that ‘the resulting work as a whole constitutes an original work of authorship’ or
  • a human author modifies material originally generated by AI technology to such a degree that the modifications meet the standard for copyright protection.

The USCO does qualify this by stating that in these particular cases copyright only protects the human-authored aspects of the works, which are ‘independent of’ and do ‘not affect’ the copyright status of the AI-generated material itself. However, the USCO also acknowledged that AI systems could be used as tools in the creative process, noting that the use of guitar pedals in music or Adobe Photoshop by visual artists does not prevent registration of the resulting works. According to the USCO, ‘what matters is the extent to which the human had creative control over the work’s expression and “actually formed” the traditional elements of authorship’.

The United Kingdom has taken a different, and perhaps more modern, approach to authorship.

Like Australia, UK copyright law protects original literary, dramatic, artistic and musical works (as well as films, sound recordings, broadcasts and published editions). For a work to be original though it must be the author’s own intellectual creation. This means the author has made free and creative choices and the work has the author’s ‘personal touch.’

However, unlike the position in Australia or the US, UK copyright law protects works generated by a computer where there is no human creator. The Copyright, Designs and Patents Act 1988 (UK) defines the author of a computer-generated work as ‘the person by whom the arrangements necessary for the creation of the work are undertaken.’ The general rule is that if the work expresses original human creativity it will benefit from copyright protection, even if it has been created by a human with assistance from AI.

Australia, as mentioned above, does not specifically provide protection for ‘computer-generated works’. Generally, works can only be protected by copyright if there is a human author who contributed ‘independent intellectual effort’. As a result, works generated by AI that do not have enough human input (independent intellectual effort) will not be protected by copyright. [2]

The converse may also be true: a work generated by AI that does have enough human input (independent intellectual effort) will be protected by copyright. Whether this is correct, and if so, what threshold of human input (independent intellectual effort) is sufficient for a work to be protected, is yet to be tested in Australian courts in this context.

However, in Data Access Corporation v Powerflex Services Pty Ltd [1999] HCA 49 (Data Access v Powerflex), the High Court of Australia accepted a Full Federal Court finding that a ‘compression table’ (a method of reducing the amount of memory space consumed by data files) that was created by a human by writing a computer program that applied an algorithm to a database file, constituted an original literary work and was protected by copyright. The Full Federal Court had found that this ‘compression table’ was created as a result of substantial skill and judgement employed by a human author, despite the fact that when the program was applied to data, it resulted in the composition of bit strings that were computer generated. Accordingly, how much and what type of input (independent intellectual effort) has been contributed by a human author or authors will be relevant to whether a work generated by AI is protected by copyright. How the existing threshold should apply in this new context is yet to be tested in Australia.

For many types of copyright works, such as music and artistic works where an AI tool is used, the distinction between human and AI content is likely to be difficult to identify from the completed work. Those who use AI to generate works and intend to claim copyright ownership in those works should keep detailed records of the design steps and contributions that authors have made to the work on the input side and the output side (and if applicable, to the AI system itself).
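As a purely illustrative sketch of what such record-keeping might look like in practice, the Python snippet below logs each human contribution with its author, the stage at which it was made, a short description, a timestamp and a SHA-256 fingerprint of the relevant file. The field names, the three stage labels and the JSON log format are our own assumptions and are not drawn from any legislation, guideline or industry standard.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone

# Assumed stage labels: input side, output side, or the AI system itself.
VALID_STAGES = {"input", "output", "system"}


@dataclass
class ContributionRecord:
    author: str           # the human who made the contribution
    stage: str            # one of VALID_STAGES
    description: str      # what the author actually did
    artefact_sha256: str  # fingerprint of the file as it stood at this step
    recorded_at: str      # UTC timestamp in ISO 8601 format


def record_contribution(author: str, stage: str, description: str,
                        artefact: bytes) -> ContributionRecord:
    """Build a record of one human contribution to a work."""
    if stage not in VALID_STAGES:
        raise ValueError(f"unknown stage: {stage!r}")
    return ContributionRecord(
        author=author,
        stage=stage,
        description=description,
        artefact_sha256=hashlib.sha256(artefact).hexdigest(),
        recorded_at=datetime.now(timezone.utc).isoformat(),
    )


def to_log_line(record: ContributionRecord) -> str:
    """Serialise a record as one JSON line, suitable for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


if __name__ == "__main__":
    rec = record_contribution(
        "A. Author", "output",
        "Repainted the background by hand after generation", b"draft-v2",
    )
    print(to_log_line(rec))
```

A log of this kind would not itself establish authorship. It is simply one way of preserving contemporaneous evidence of who did what, and when.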

Going forward, it will be important to have guidelines for when the output of AI will be protected by copyright and who owns that copyright.

INPUTS

Data & Copyright

  • Lawsuits have been filed in the US & UK alleging the infringement of copyright works used in the training of AI systems. They remain pending.
  • The EU is attempting to address this issue in its proposed “AI Act” by adding a clause requiring public disclosure of a summary of training data protected by copyright law.
  • The Australian Federal Government has yet to specifically address these issues, although a Copyright Enforcement Review is underway.

OUTPUTS

Authorship

  • The US Copyright Office has released guidelines outlining when works generated by AI will be protected by copyright. Protection will depend on the extent to which a human had creative control over a work's expression.
  • In the UK, copyright law protects works generated by a computer where there is no human creator, if the work expresses original human creativity.
  • In Australia, an original work generated by AI that has sufficient human input (independent intellectual effort) will likely be protected by copyright.

Infringement by AI-generated works

  • An AI system can potentially generate works that substantially reproduce an earlier copyright work.
  • Determining who is responsible for this infringement is yet to be addressed by legislators or the courts.

Attribution

  • Only individual / human authors have a right to be attributed as authors of a work.
  • It is recommended by regulators in the US and Australia that any use of an AI system to generate a work should be acknowledged.

AI ITSELF

Copyright protection

  • The source code or object code of an AI system itself can potentially be protected by copyright.
  • AI systems that include a dataset or datasets compiled from a range of sources may also be protected by copyright as literary works (compilations).

THE OUTPUT SIDE: COPYRIGHT INFRINGEMENT  

An AI system that has ingested third party photographs, with or without a licence, could, based on a simple text-based prompt, generate an output that reproduces or communicates a third party’s copyright photograph. Critically, the Copyright Act only requires that a ‘substantial part’ of the work be reproduced or communicated to constitute infringement. What constitutes a ‘substantial part’ refers to the quality of what has been reproduced rather than the quantity, and is a question of fact and degree, to be determined according to the circumstances of the case. Assessment of this question in artistic works, for instance, is notoriously complex.

In December 2022, a number of artists started making this point by using AI systems to generate images of well-known characters such as Mickey Mouse, Darth Vader, Spider-Man, and Pikachu, and encouraging people to use the images on various forms of merchandise and sell them.

If an AI-generated work does infringe copyright, the question then becomes, who is responsible for that infringement: the user of the AI system who provides the prompt to the AI or the owner of the AI system itself or both? Complex questions around joint tortfeasors and authorisation liability in Australian law are likely to be examined when considering output-side infringement.

All of these questions may need to be considered by the creators and owners of AI systems, creators using these AI systems, legislators, and regulators.

THE OUTPUT SIDE: ATTRIBUTION

As only human creators can ‘author’ copyright material, only human authors have a right to be attributed as authors of the work. An AI system that is used in the creation of a work is not recognised as having any moral rights under Australian copyright law, and there is no legal right or requirement in legislation for the AI system to be attributed as an author.

However, it is good practice to acknowledge the use of an AI system to generate a work. This would be in line with the USCO’s new guidelines regarding copyright registration, and more importantly, would also align with Australia’s new non-binding AI Ethics Framework that calls for transparency and responsible disclosure in the use of AI. In the future, we expect some AI systems will require attribution through their Terms of Use.

For those whose works are reproduced without permission in AI outputs, a genuine question arises about whether their moral right of attribution has been infringed — or, in addition, whether a separate actionable false attribution has occurred.

THE AI ITSELF

The AI system itself can be protected by copyright. For example, if the source or object code for an AI system has been written by a human author, that code is automatically protected by copyright in Australia under the Copyright Act 1968 (Cth). However, it is generally understood that the written expression of code is protected by copyright, not the function of the code. Code may be written in any number of ways to perform identical functions without necessarily infringing copyright in the original code.  

Where an AI system includes a dataset or datasets compiled from a range of sources, the dataset may also be protected by copyright as literary works (compilations), but such protection is only available if there has been sufficient ‘independent intellectual effort’ expended on the compilation of that data by a human author. In the case of AI systems it is untested whether the initial parameters determining how the dataset is collated involve sufficient intellectual effort by a human.  Of course, this may differ between AI systems.

However, where an AI system is updated based on inputs from users or instructions to use additional training data (including by using a training algorithm to update the parameters in a neural network), there will be a stronger argument that the resulting evolved system attracts copyright protection. The issue of human authorship may arise, but the Data Access v Powerflex case suggests that this issue might not be insurmountable in all cases.

WHERE TO FROM HERE

Generative AI is only going to get bigger, better, and more powerful. With the emergence of increasingly sophisticated AI technologies, the law is at a crossroads when it comes to the authorship and use of copyright works. Regulatory and legislative bodies around the world are faced with the challenge of how best to update copyright laws in a way that promotes the use of this new technology without harming creative ecosystems and economies.

In Australia, the Attorney-General’s Department is currently completing its review of Australia’s copyright enforcement regime. That review received submissions supporting stronger protection for creatives and other submissions calling for greater flexibility and more fair dealing exceptions on the use side, particularly for activities such as text and data mining.

We will be monitoring closely where the recommendations land in balancing the rights of all stakeholders.

*The authors wish to acknowledge the valuable contributions and insights of our colleagues Cate Nagy, Bryony Evans, Cheng Lim and Kendra Fouracre.

References

[1] See Cummins v Vella [2002] FCAFC 218. This aligns with the law in the United States.

[2] This has been affirmed by the High Court of Australia in IceTV Pty Ltd v Nine Network Australia Pty Ltd [2009] HCA 14 and the Full Court of the Federal Court in Telstra Corp Ltd v Phone Directories Co Pty Ltd [2010] FCAFC 149.
