LibGuides: Copyright: Copyright and Gen-AI

Disclaimer: This guide is for informational purposes only and does not constitute legal advice. For specific legal concerns or detailed guidance, please consult a qualified legal professional or contact the Free Legal Advice Scheme on HKU Campus.

Key Issues with Generative AI and Copyright

This online guide offers informational (not legal) insights into critical copyright issues related to generative AI (Gen-AI), including:

The eligibility of AI-generated content for copyright protection

AI-generated works refer to output created by generative AI (Gen-AI) based solely on user prompts without direct human authorship.

The infographic below is derived from Chapter 2 of the Copyright and Artificial Intelligence Consultation Paper, published by the Intellectual Property Department (IPD) in 2024. For more detailed information, we recommend consulting the original document.

Who is the Necessary Arranger?

Under the provisions on computer-generated works in the Copyright Ordinance, the "author" of a computer-generated literary, dramatic, musical or artistic (LDMA) work is defined as the person who undertakes the necessary arrangements for its creation, i.e., the "necessary arranger". AI-generated LDMA works are considered to fall within the scope of these provisions.

However, the identity of the necessary arranger remains ambiguous. It is unclear whether the necessary arranger should be the AI developers, AI owners, AI licensees, or the user who inputs prompts to the AI system. This question has not yet been tested in court. The Intellectual Property Department (IPD) stated in its Copyright and Artificial Intelligence Consultation Paper that the determination will be fact-specific and considered on a case-by-case basis.

The IPD plans to issue guidelines, including practical suggestions and detailed examples, to clarify copyright protection issues related to generative AI.

Originality Requirement in AI-generated Works

LDMA works must satisfy the originality requirement for copyright to subsist. While the Copyright Ordinance does not explicitly define originality, established case law holds that a human-created LDMA work is original if the human author has exercised sufficient independent skill, labor, and/or judgment in its creation.

Since most AI-generated content involves limited or even no meaningful human involvement, the originality requirement for AI-generated works remains open to interpretation. This is expected to be shaped by case law development in the future.

Contractual Arrangement

Copyright ownership of AI-generated works can also be governed by contractual agreements. Some AI system owners claim copyright ownership of AI-generated works while granting users a non-exclusive license to use them. Conversely, in some cases, users are granted copyright ownership, with the AI system owners retaining a non-exclusive license for usage.

It is advisable to carefully review the terms of use or service agreements of different AI tools to understand the specific copyright ownership arrangements concerning AI-generated works.

Copyright Ownership of different types of works

The table below, adapted from the Copyright and Artificial Intelligence Consultation Paper (pages 9-11), summarizes the copyright treatment of various types of works, including AI-generated works.

Please note that AI-generated LDMA works have a shorter copyright duration and more limited moral rights. Additionally, the current Copyright Ordinance does not contain any provisions specifically addressing computer-generated non-LDMA works.

Types	Originality requirement	Creator in real life	Author	First copyright owner	Duration of copyright	Moral rights
Human-Created LDMA Works	Yes	Human author	Human author		Author’s life plus 50 years after death	Right to be identified as the author; right to object to derogatory treatment; right against false attribution
AI-Generated LDMA Works	Yes	Computer (without human author)	Person who arranges the creation of the work		50 years from the year in which the work was made	Right against false attribution of a work
Sound Recordings	No	No statutory restriction excludes a computer as a creator.	Producer		50 years from the year in which the recording was made/released	N.A.
Films			Producer and human principal director		50 years after the death of the last surviving key contributor; or if none, 50 years from the film’s creation.	Right to be identified as the director; right to object to derogatory treatment; right against false attribution
Broadcasts			Person making the broadcast		50 years from the year in which the broadcast was made	N.A.
Cable programmes			Person providing the cable programme service		50 years from the year in which the work was made	N.A.
Typographical arrangement of published editions			Publisher		25 years from the year in which the edition was first published	N.A.
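
To make the difference in duration concrete, the following is a minimal arithmetic sketch of the two LDMA rows in the table. The function names and the years used are hypothetical, and the sketch deliberately ignores finer points of the Copyright Ordinance (for example, how the start of the period is counted), so it should be read as an illustration only, not as a way to determine an actual expiry date.

```python
def human_ldma_expiry_year(author_death_year: int) -> int:
    """Human-created LDMA work: author's life plus 50 years after death."""
    return author_death_year + 50


def ai_ldma_expiry_year(year_made: int) -> int:
    """AI-generated LDMA work: 50 years from the year the work was made."""
    return year_made + 50


# Hypothetical example: a work made in 2025 by an author who dies in 2060,
# compared with an AI-generated work made in the same year.
print(human_ldma_expiry_year(2060))  # 2110
print(ai_ldma_expiry_year(2025))     # 2075
```

In this hypothetical case the AI-generated work would leave copyright 35 years earlier than the human-created one.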

Copyright Infringement Liability for AI-generated Content

The Intellectual Property Department (IPD) has outlined in its Copyright and Artificial Intelligence Consultation Paper that when an AI-generated work infringes copyright, liability rests with the individual or entity that made the necessary arrangements that caused the AI to infringe. This principle is consistent with how liability is determined for non-AI-generated works.

Example 1
If an AI-generated work infringes copyright without any prompts from the user suggesting such copying, and the AI developer has the ability to prevent this infringement, the developer should be primarily liable. However, the user may also be held accountable if they copy or publicly share the infringing AI-generated work.

Example 2

Conversely, if a user’s prompts explicitly indicate the desire for the AI to produce an infringing copy of a copyrighted work, the end-user could be held liable for copyright infringement alongside any potential liability of the AI developer.

The IPD plans to issue interpretive guidelines that will clarify how existing legal principles for copyright infringement apply to cases involving AI-generated works. These guidelines will include practical suggestions and specific examples.

Contractual arrangement between AI system owners and end-users

You should also carefully review the terms of use or service agreements, particularly those addressing liability related to copyright infringement from AI-generated works.

Some AI system owners may include clauses that limit their liability to users. In these cases, users might be required to indemnify the AI system owners against any third-party claims resulting from the use of AI-generated works. Conversely, some AI system owners provide protections for users, covering legal costs and damages from copyright claims to foster trust in their AI services.

Ingesting Copyrighted Works as Training Data

The integration of copyrighted works into AI training data raises important legal and ethical concerns, particularly when the outputs compete with original works. Critics argue that this practice undermines the legitimate interests of copyright owners, who deserve compensation for the use of their creations. On the other hand, AI developers assert that Gen-AI enables innovative reuses of data embedded in copyrighted works, claiming that these transformative applications can qualify as fair use.

To balance the interests of copyright holders and users, many countries have introduced a Text and Data Mining (TDM) Exception. Below is an update on the proposed exception in Hong Kong.

The Coming Text and Data Mining (TDM) Exception in Hong Kong

To support the development of AI models that require extensive use of copyrighted materials, the Hong Kong government plans to amend the Copyright Ordinance.

The proposed TDM Exception will allow the copying of copyrighted works for computational data analysis and processing, specifically for the development, training, and enhancement of AI models, without needing licenses from copyright owners. This exception will apply to both non-commercial and commercial uses, fostering the growth of AI technology.

TDM is not limited to AI development; researchers also use TDM to analyze large volumes of digital resources, aiming to acquire new knowledge, advance research, and uncover novel patterns and trends.

The proposed exception will include specific restrictive conditions to balance the interests of copyright owners and users. Key conditions include:

The Intellectual Property Department (IPD) will formulate codes of practice or guidelines to assist with the implementation of feasible opt-out options, as well as provide guidance on record-keeping and disclosure.

Ingesting Open Access Works under CC Licenses

Note that using open-access works licensed under Creative Commons (CC) licenses as training data is not always permissible. Some licenses impose restrictions such as non-commercial (NC), no-derivatives (ND), or share-alike (SA), which require more than just attribution.

The following table summarizes the usage of CC-licensed works as training data, assuming that the AI-generated output will be shared publicly and that copyright permission is required (e.g., no TDM exception):

CC Licenses	Credit the Source	Generate Derivative Works	Share Output under the Same License	Commercial Use
CC BY	Required	Allowed	Not Required	Allowed
CC BY-SA	Required	Allowed	Required	Allowed
CC BY-NC	Required	Allowed	Not Required	Prohibited for Commercial Use at All Stages, Including Training and Model Sharing.
CC BY-NC-SA	Required	Allowed	Required	Prohibited for Commercial Use at All Stages, Including Training and Model Sharing.
CC BY-ND	Required	Prohibited for Use as Training Data.
CC BY-NC-ND	Required	Prohibited for Use as Training Data.

For more information, please refer to Using CC-Licensed Works for AI Training and the official flow chart.
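
For readers who screen datasets programmatically, the table above could be encoded as a simple lookup. This is only a sketch under the same assumptions as the table (output shared publicly, copyright permission required). The names `TrainingTerms`, `CC_TRAINING_TERMS`, and `may_use_for_training` are hypothetical and not part of any Creative Commons tool, and the result is not a substitute for reading the license itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingTerms:
    credit_required: bool          # "Credit the Source" column
    usable_as_training_data: bool  # False for the ND licenses in the table
    share_alike_output: bool       # "Share Output under the Same License"
    commercial_use_allowed: bool   # "Commercial Use" column


# Hypothetical encoding of the table rows. For the ND licenses the last two
# fields are not applicable and are set to False as placeholders.
CC_TRAINING_TERMS = {
    "CC BY": TrainingTerms(True, True, False, True),
    "CC BY-SA": TrainingTerms(True, True, True, True),
    "CC BY-NC": TrainingTerms(True, True, False, False),
    "CC BY-NC-SA": TrainingTerms(True, True, True, False),
    "CC BY-ND": TrainingTerms(True, False, False, False),
    "CC BY-NC-ND": TrainingTerms(True, False, False, False),
}


def may_use_for_training(license_name: str, commercial: bool) -> bool:
    """Return True if the table permits training use for the given purpose."""
    terms = CC_TRAINING_TERMS[license_name]
    if not terms.usable_as_training_data:
        return False
    return terms.commercial_use_allowed or not commercial


print(may_use_for_training("CC BY-NC", commercial=True))   # False
print(may_use_for_training("CC BY-SA", commercial=True))   # True
print(may_use_for_training("CC BY-ND", commercial=False))  # False
```

Even where the function returns True, the credit and share-alike fields still describe obligations that would need to be met.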

Friendly Reminder: Do Not Upload HKU Library E-Resources to Third-Party Platforms

Most e-resources provided by the HKU Libraries are governed by license agreements with publishers and vendors. These agreements often prohibit the uploading of content to third-party platforms, including Gen-AI tools, even for educational purposes. Non-compliance with these license terms may constitute copyright infringement, resulting in liability for any damages incurred.

Licensing Scholarly Content for AI Training

Many academic publishers are either announcing or negotiating licenses to use their scholarly content as training data for large language models (LLMs). They claim that this licensing can enhance the accuracy and relevance of AI models, reflecting a commitment to ensure authors' ideas make the fullest possible contribution.

To keep track of these developments, Ithaka S+R has launched the Generative AI Licensing Agreement Tracker, which documents public agreements and analyzes their impact and underlying strategies. It is important to note that this tracker only includes publicly disclosed agreements, and there may be additional undisclosed deals.

In response to concerns about the use of scholarly content without researchers' knowledge, some publishers are adopting an opt-in approach, actively seeking authors' permission before licensing their work for AI training. While the decision to participate is ultimately personal, consider the following reasons for opting in or opting out.

Potential Reasons to Opt-In	Potential Reasons to Opt-Out
Contribute to AI Development: Shape technologies that may benefit society.	Control Over Content: Retain control over how your work is used.
Royalties and Financial Compensation: Possibility to gain additional financial rewards.	Ethical Concerns: Address apprehensions about the implications of AI training.
Increased Visibility: Enhance the visibility of your work within AI applications.	Lack of Transparency: Protect against undisclosed agreements and seek clarity in the licensing process.

Understanding Your Rights in Publisher Agreements for AI Training and Usage

When publishing your work, it’s crucial to understand your rights concerning AI training and usage. Start by carefully examining your contracts to determine what you have authorized your publisher to do with your work. Pay close attention to the following areas:

Licensing and Royalties: Identify any terms related to royalties and shares of licensing revenue. Make sure you understand how your work can be used and how you will be compensated.
Subsidiary Rights: Look for clauses regarding subsidiary rights, which may permit publishers to license your work for additional purposes. You can negotiate to retain these rights or establish a profit-sharing arrangement for any licensing deals.
Publisher Practices: Stay informed about your publisher’s practices regarding AI licensing. If you find the information in your contracts unclear, don’t hesitate to ask questions or seek clarification.
Rights Assignment: Be aware that if you have assigned all rights to your publisher, they have the authority to license your work for various purposes, including AI training, and can profit from those deals without your consent.
Negotiation for Future Works: For your upcoming publication, consider attaching an author addendum to your agreement that outlines your preferences for AI usage. The Authors Guild offers a model clause that prohibits using your work for AI training without your explicit permission.

CC Signals: A New Initiative from Creative Commons

CC Signals is a groundbreaking framework that empowers content owners to articulate their preferences for how their content can be used in AI training. This initiative seeks to strike a balance between the needs of creators and the demands of AI development, fostering reciprocity, shared benefits, and openness.

CC signals are designed to be interpretable by both machines and humans. The four signals proposed by Creative Commons are:

Signal	Description	Proposed Combinations
Credit	Ensure appropriate credit is given based on the method, means, and context of use.	Credit
Direct Contribution	Provide monetary or in-kind support to the content owner for the development and maintenance of assets.	Credit + Direct Contribution
Ecosystem Contribution	Provide monetary or in-kind support back to the ecosystem benefiting from the use of the content.	Credit + Ecosystem Contribution
Open	The AI system must be open, adhering to standards like the Model Openness Framework or Open Source AI Definition.	Credit + Open

Credit is a fundamental component of each combination, emphasizing reciprocity and benefiting the broader knowledge cycle. For more information on the signals, refer to the Creative Commons website.

Legal and Ethical Considerations: While CC Signals are not a replacement for copyright licenses, they introduce an ethical layer to content sharing. Participation from AI developers is voluntary but is encouraged through community norms and reputational accountability.

Current Status: CC Signals is in the public feedback phase, with plans to launch an alpha version in November 2025. This initiative primarily targets large dataset holders rather than individual creators.

Concluding Notes

Generative AI technology is advancing rapidly, and ongoing court rulings, legal settlements, and legislative proposals are continuously reshaping the boundaries of copyright protection in relation to AI.

As librarians, we are committed to keeping you informed about the key copyright challenges posed by generative AI. We will continue to monitor technological advancements, legal developments, and policy changes both locally and internationally to provide you with timely and accurate guidance.