| Disclaimer: This guide is for informational purposes only and does not constitute legal advice. For specific legal concerns or detailed guidance, please consult a qualified legal professional or contact the Free Legal Advice Scheme on HKU Campus. |
Key Issues with Generative AI and Copyright
This online guide offers informational (not legal) insights into critical copyright issues related to generative AI (Gen-AI), including:
The eligibility of AI-generated content for copyright protection
AI-generated works refer to output created by generative AI (Gen-AI) based solely on user prompts without direct human authorship.
The infographic below is derived from Chapter 2 of the Copyright and Artificial Intelligence Consultation Paper, published by the Intellectual Property Department (IPD) in 2024. For more detailed information, we recommend consulting the original document.
Who is the Necessary Arranger?
Under the provisions on computer-generated works in the Copyright Ordinance, the "author" of a computer-generated literary, dramatic, musical, or artistic (LDMA) work is defined as the person who undertakes the arrangements necessary for its creation, i.e., the "necessary arranger". AI-generated LDMA works are considered to fall within the scope of these provisions.
However, the identity of the necessary arranger remains ambiguous. It is unclear whether the necessary arranger should be the AI developers, AI owners, AI licensees, or the user who inputs prompts to the AI system. This question has not yet been tested in court. The Intellectual Property Department (IPD) stated in its Copyright and Artificial Intelligence Consultation Paper that the determination will be fact-specific and considered on a case-by-case basis.
The IPD plans to issue guidelines, including practical suggestions and detailed examples, to clarify copyright protection issues related to generative AI.
Originality Requirement in AI-generated Works
LDMA works must satisfy the originality requirement for copyright to subsist. While the Copyright Ordinance does not explicitly define originality, established case law holds that a human-created LDMA work is original if the human author has exercised sufficient independent skill, labor, and/or judgment in its creation.
Since most AI-generated content involves limited or even no meaningful human involvement, how the originality requirement applies to AI-generated works remains open to interpretation. It is expected to be shaped by future case law.
Contractual Arrangement
Copyright ownership of AI-generated works can also be governed by contractual agreements. Some AI system owners claim copyright ownership of AI-generated works while granting users a non-exclusive license to use them. Conversely, in some cases, users are granted copyright ownership, with the AI system owners retaining a non-exclusive license for usage.
It is advisable to carefully review the terms of use or service agreements of different AI tools to understand the specific copyright ownership arrangements concerning AI-generated works.
Copyright Ownership of Different Types of Works
The table below, adapted from the Copyright and Artificial Intelligence Consultation Paper (pages 9-11), summarizes the copyright treatment of various types of works, including AI-generated works.
Please note that AI-generated LDMA works have a shorter copyright duration and more limited moral rights. Additionally, the current Copyright Ordinance does not contain any provisions specifically addressing computer-generated non-LDMA works.
| Types | Originality requirement | Creator in real life | Author / First copyright owner | Duration of copyright | Moral rights |
|---|---|---|---|---|---|
| Human-Created LDMA Works | Yes | Human author | Human author | Author’s life plus 50 years after death | Right to be identified as the author; right to object to derogatory treatment; right against false attribution |
| AI-Generated LDMA Works | Yes | Computer (without human author) | Person who arranges the creation of the work | 50 years from when the work was made | Right against false attribution of a work |
| Sound Recordings | No | No statutory restriction excludes a computer as a creator. | Producer | 50 years from when the recording was made or released | N.A. |
| Films | No | No statutory restriction excludes a computer as a creator. | Producer and human principal director | 50 years after the death of the last surviving key contributor; or, if none, 50 years from the film’s creation | Right to be identified as the director; right to object to derogatory treatment; right against false attribution |
| Broadcasts | No | No statutory restriction excludes a computer as a creator. | Person making the broadcast | 50 years from when the broadcast was made | N.A. |
| Cable programmes | No | No statutory restriction excludes a computer as a creator. | Person providing the cable programme service | 50 years from when the work was made | N.A. |
| Typographical arrangement of published editions | No | No statutory restriction excludes a computer as a creator. | Publisher | 25 years from when the edition was first published | N.A. |
Copyright Infringement Liability for AI-generated Content
The Intellectual Property Department (IPD) has outlined in its Copyright and Artificial Intelligence Consultation Paper that when an AI-generated work infringes copyright, liability rests with the individual or entity that made the necessary arrangements that caused the AI to infringe. This principle is consistent with how liability is determined for non-AI-generated works.
Example 1
If an AI-generated work infringes copyright without any prompts from the user suggesting such copying, and the AI developer has the ability to prevent this infringement, the developer should be primarily liable. However, the user may also be held accountable if they copy or publicly share the infringing AI-generated work.
Example 2
Conversely, if a user’s prompts explicitly indicate the desire for the AI to produce an infringing copy of a copyrighted work, the end-user could be held liable for copyright infringement alongside any potential liability of the AI developer.
The IPD plans to issue interpretive guidelines that will clarify how existing legal principles for copyright infringement apply to cases involving AI-generated works. These guidelines will include practical suggestions and specific examples.
Contractual Arrangement between AI System Owners and End-Users
You should also carefully review the terms of use or service agreements, particularly those addressing liability related to copyright infringement from AI-generated works.
Some AI system owners may include clauses that limit their liability to users. In these cases, users might be required to indemnify the AI system owners against any third-party claims resulting from the use of AI-generated works. Conversely, some AI system owners provide protections for users, covering legal costs and damages from copyright claims to foster trust in their AI services.
Ingesting Copyrighted Works as Training Data
The integration of copyrighted works into AI training data raises important legal and ethical concerns, particularly when the outputs compete with original works. Critics argue that this practice undermines the legitimate interests of copyright owners, who deserve compensation for the use of their creations. On the other hand, AI developers assert that Gen-AI enables innovative reuses of data embedded in copyrighted works, claiming that these transformative applications can qualify as fair use.
To address the balance between copyright holders and users, many countries have introduced a Text and Data Mining (TDM) Exception. Below is an update on the proposed exception in Hong Kong.
The Coming Text and Data Mining (TDM) Exception in Hong Kong
To support the development of AI models that require extensive use of copyrighted materials, the Hong Kong government plans to amend the Copyright Ordinance.
The proposed TDM Exception will allow the copying of copyrighted works for computational data analysis and processing, specifically for the development, training, and enhancement of AI models, without needing licenses from copyright owners. This exception will apply to both non-commercial and commercial uses, fostering the growth of AI technology.
TDM is not limited to AI development; researchers also use TDM to analyze large volumes of digital resources, aiming to acquire new knowledge, advance research, and uncover novel patterns and trends.
The proposed exception will include specific restrictive conditions to balance the interests of copyright owners and users. Key conditions include:
The Intellectual Property Department (IPD) will formulate codes of practice or guidelines to assist with the implementation of feasible opt-out options, as well as provide guidance on record-keeping and disclosure.
Ingesting Open Access Works under CC Licenses
Note that using open-access works licensed under Creative Commons (CC) licenses as training data is not always permissible. Some licenses impose restrictions such as non-commercial (NC), no-derivatives (ND), or share-alike (SA), which require more than just attribution.
The following table summarizes the usage of CC-licensed works as training data, assuming that the AI-generated output will be shared publicly and that copyright permission is required (e.g., no TDM exception):
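| CC Licenses | Credit the Source | Generate Derivative Works | Share Output under the Same License | Commercial Use |
|---|---|---|---|---|
| CC BY | Required | Allowed | Not Required | Allowed |
| CC BY-SA | Required | Allowed | Required | Allowed |
| CC BY-NC | Required | Allowed | Not Required | Prohibited for Commercial Use at All Stages, Including Training and Model Sharing. |
| CC BY-NC-SA | Required | Allowed | Required | Prohibited for Commercial Use at All Stages, Including Training and Model Sharing. |
| CC BY-ND | Required | Prohibited for Use as Training Data. | Prohibited for Use as Training Data. | Prohibited for Use as Training Data. |
| CC BY-NC-ND | Required | Prohibited for Use as Training Data. | Prohibited for Use as Training Data. | Prohibited for Use as Training Data. |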
For more information, please refer to Using CC-Licensed Works for AI Training and the official flow chart.
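For readers who screen datasets programmatically, the table above can be encoded as a simple lookup. The Python sketch below is illustrative only and is not an official Creative Commons tool. The names `TrainingRule`, `CC_TRAINING_RULES`, and `check_training_use` are hypothetical. The rules mirror the table under its stated assumptions (output shared publicly, copyright permission required). Where the table merges cells, the sketch assumes the same entry applies to each license in the merged group.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TrainingRule:
    credit_required: bool        # "Credit the Source" column
    training_allowed: bool       # False where the table says "Prohibited for Use as Training Data"
    share_alike_required: bool   # "Share Output under the Same License" column
    commercial_allowed: bool     # "Commercial Use" column


# Hypothetical encoding of the table; merged cells are assumed to apply to every row they span.
CC_TRAINING_RULES = {
    "CC BY":       TrainingRule(True, True, False, True),
    "CC BY-SA":    TrainingRule(True, True, True, True),
    "CC BY-NC":    TrainingRule(True, True, False, False),
    "CC BY-NC-SA": TrainingRule(True, True, True, False),
    "CC BY-ND":    TrainingRule(True, False, False, False),
    "CC BY-NC-ND": TrainingRule(True, False, False, False),
}


def check_training_use(license_name: str, commercial: bool) -> List[str]:
    """Return the conditions, or the blocking reason, that the table lists for a planned use."""
    rule = CC_TRAINING_RULES[license_name]
    if not rule.training_allowed:
        return ["prohibited: use as training data is not permitted under this license"]
    if commercial and not rule.commercial_allowed:
        return ["prohibited: commercial use is barred at all stages"]
    conditions = []
    if rule.credit_required:
        conditions.append("credit the source")
    if rule.share_alike_required:
        conditions.append("share output under the same license")
    return conditions


if __name__ == "__main__":
    for name in CC_TRAINING_RULES:
        print(name, "->", check_training_use(name, commercial=True))
```

A lookup like this only flags what the table already says; the license text and the terms attached to each dataset remain the authoritative source.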
Friendly Reminder: Do Not Upload HKU Library E-Resources to Third-Party Platforms
Most e-resources provided by the HKU Libraries are governed by license agreements with publishers and vendors. These agreements often prohibit the uploading of content to third-party platforms, including Gen-AI tools, even for educational purposes. Non-compliance with these license terms may constitute copyright infringement, resulting in liability for any damages incurred.
Licensing Scholarly Content for AI Training
Many academic publishers are either announcing or negotiating licenses to use their scholarly content as training data for large language models (LLMs). They claim that this licensing can enhance the accuracy and relevance of AI models, reflecting a commitment to ensure authors' ideas make the fullest possible contribution.
To keep track of these developments, Ithaka S+R has launched the Generative AI Licensing Agreement Tracker, which documents public agreements and analyzes their impact and underlying strategies. It is important to note that this tracker only includes publicly disclosed agreements, and there may be additional undisclosed deals.
In response to concerns about the use of scholarly content without researchers' knowledge, some publishers are adopting an opt-in approach, actively seeking permission before licensing their content for AI training. While the decision to participate is ultimately personal, consider the following reasons for opting in or opting out.
| Potential Reasons to Opt-In | Potential Reasons to Opt-Out |
|---|---|
|
Understanding Your Rights in Publisher Agreements for AI Training and Usage
When publishing your work, it’s crucial to understand your rights concerning AI training and usage. Start by carefully examining your contracts to determine what you have authorized your publisher to do with your work. Pay close attention to the following areas:
CC Signals: A New Initiative from Creative Commons
CC Signals is a groundbreaking framework that empowers content owners to articulate their preferences for how their content can be used in AI training. This initiative seeks to strike a balance between the needs of creators and the demands of AI development, fostering reciprocity, shared benefits, and openness.
CC signals are designed to be interpretable by both machines and humans. The four signals proposed by Creative Commons are:
| Signal | Description | Proposed Combinations |
|---|---|---|
| Credit | Ensure appropriate credit is given based on the method, means, and context of use. | Credit |
| Direct Contribution | Provide monetary or in-kind support to the content owner for the development and maintenance of assets. | Credit + Direct Contribution |
| Ecosystem Contribution | Provide monetary or in-kind support back to the ecosystem benefiting from the use of the content. | Credit + Ecosystem Contribution |
| Open | The AI system must be open, adhering to standards like the Model Openness Framework or Open Source AI Definition. | Credit + Open |
Credit is a fundamental component of each combination, emphasizing reciprocity and benefiting the broader knowledge cycle. For more information on the signals, click here.
Legal and Ethical Considerations: While CC Signals are not a replacement for copyright licenses, they introduce an ethical layer to content sharing. Participation from AI developers is voluntary but is encouraged through community norms and reputational accountability.
Current Status: CC Signals is in the public feedback phase, with plans to launch an alpha version in November 2025. This initiative primarily targets large dataset holders rather than individual creators.
Concluding Notes
Generative AI technology is advancing rapidly, and ongoing court rulings, legal settlements, and legislative proposals are continuously reshaping the boundaries of copyright protection in relation to AI.
As librarians, we are committed to keeping you informed about the key copyright challenges posed by generative AI. We will continue to monitor technological advancements, legal developments, and policy changes both locally and internationally to provide you with timely and accurate guidance.