As described in our article, parties training generative AIs on data obtained from third parties are at risk of infringing the rights of those third parties, on the basis that the data used in the training set was copied without authorisation.
This also raises the question of whether the output from such an AI may infringe a third party’s rights. Getty Images have asserted copyright infringement against Stability AI on two bases: first, that Stability AI’s unauthorised use of Getty’s images in its training set infringes Getty’s rights; and second, that (Getty say) given the allegedly infringing input data, outputs using that data must also infringe.
Another illustrative dispute is the ongoing US class action against Microsoft, OpenAI and GitHub regarding their AI-powered programming assistant GitHub Copilot. This AI tool is alleged to reproduce open-source code on which it was trained without crediting its developers, as required under those developers’ open-source licences. Although this dispute is in the USA, it further illustrates the risk that, where an AI is trained on unauthorised data, its output may also be tainted.
Pending the outcome of cases such as these, it is quite possible in principle that generative AI outputs may infringe. However, for rightsholders, it is much more difficult to establish infringement in an output than in an input. Copyright infringement requires:
- similarity between the alleged copy and all or a substantial part of the original work; and
- that the alleged infringer in fact copied the original work, whether directly or indirectly through an intermediate work (rather than the two similar works having been created independently).
Proving that an AI was trained on a rightsholder’s work can be difficult, particularly where the AI in question is proprietary and its dataset is not disclosed. The Getty and GitHub cases referenced above are somewhat unusual in this respect, in that the companies behind the AIs involved had disclosed information about the datasets used.
Another uncertain issue in relation to infringing output is who may be liable. Naturally, this question will depend on the specific circumstances as well as applicable national laws. It seems likely that the company behind the AI, which arranged for its training, will be liable. However, the user of the AI may also be taken to infringe if they prompted in such a way as to draw out an infringing output. Where a user generates text, images, video etc. for public or commercial use, they should take care not to prompt in a manner that asks the AI to copy or otherwise use the works of a third party.
Ultimately, it is up to the national courts to decide these issues in relation to infringing output. We at Potter Clarkson will follow these developments, so stay tuned!
This article forms part of our AI Hub, which you can access here.
Potter Clarkson’s specialist electronics and communications team includes a number of attorneys with extensive experience in software and AI inventions. If we can help you with an issue relating to the protection and commercialisation of innovation in any area of artificial intelligence, please get in touch.