Learn Copyright in 5 Minutes for AI Users

2024-10-12 / 2026-3-9

An illustration of a high school boy with orange hair studying at a desk while writing in a notebook

Understand the guidelines from the Agency for Cultural Affairs.
Respect the original creator’s views.
Save your prompts for reference.

Introduction

Hello, this is Easygoing.

Today, let's talk about one of the important points to consider when using generative AI: copyright.

Japan is said to be the most lenient country regarding machine learning for AI.

An illustration of a high school boy with blue hair studying at a desk while writing in a notebook

In this post, we’ll take a look at the guidelines from Japan’s Agency for Cultural Affairs, which provides the most detailed interpretation of copyright in Japan.

This article uses machine translations provided by ChatGPT for the English sections, so there may be differences in legal term interpretations.

Please make sure to check the original sources for accurate information.

Agency for Cultural Affairs Guidelines on Copyright

Currently, the most practical interpretation of copyright is presented in the following guideline:

Agency for Cultural Affairs: "Regarding AI and Copyright" (March 15, 2024)

A seminar was held in August based on this guideline, and a video of the event is available.

"Copyright Seminar 2024: AI and Copyright II" (August 9, 2024) - YouTube

Today, I'll summarize the contents of this video so you can read it in five minutes.

This summary is based on my personal interpretation, so please make sure to check the original sources for accurate information.

The Purpose of Copyright Law: Protecting Creators and Fair Use

The purpose of the Copyright Act is stated in Article 1.

Article 1

This law establishes the rights of authors and related rights concerning works of authorship, performances, phonograms, broadcasts, and cablecasts, while paying attention to their fair use, aiming to protect these rights and contribute to cultural development.

Although copyright law is often seen as primarily protecting the rights of creators, it is also designed to promote the fair use of creative works.

An illustration of a high school boy with orange hair studying at a desk while writing in a notebook (second image)

Copyright and Moral Rights of Authors

The rights of authors consist of two main categories:

Copyright: The right to gain economic benefits from a creative work

Copyright gives creators the right to earn financial benefits from their creations. The main rights related to generative AI are:

Reproduction rights
Public transmission rights (e.g., online publication)
Rights to create and use derivative works, etc.

Moral Rights of Authors: Personal rights protecting the creator’s identity

In addition to copyright, the Copyright Act provides for moral rights to protect the personal identity of the author.

While copyright can be transferred to others (e.g., a writer transferring rights to a publisher), moral rights remain with the original creator, even after copyright transfer.

An illustration of a high school boy with black hair studying at a desk while writing in a notebook

Exceptions to Copyright Law

Copyright is a powerful right that restricts reproduction for 70 years after the creator's death. However, there are exceptions allowing fair use of copyrighted works under certain conditions:

Reproduction for personal use (Article 30, Paragraph 1)
Quotation (Article 32, Paragraph 1)
Use for machine learning or other purposes not aimed at enjoyment (Article 30-4)

When it comes to generative AI, Article 30-4 is particularly important.

Article 30-4 of the Copyright Act

Article 30-4

Works may be used, without regard to method, when necessary and not intended to enjoy the thoughts or emotions expressed in the work, as long as this use does not unjustly harm the interests of the copyright holder. Such use includes, but is not limited to:

1. (Omitted)
2. Use for information analysis (extracting, comparing, and classifying data from a large volume of works or information)
3. Other non-enjoyment-based uses, such as the use of works for data processing in computers (excluding the execution of computer programs)

In summary:

If the purpose is not to enjoy the work,
As long as it doesn't unfairly harm the copyright holder’s interests,
The work can be used for information analysis or machine processing.

What Does "Enjoyment" Mean?

Enjoyment refers to the intellectual or emotional satisfaction derived from experiencing a work, like reading a book or watching a movie.

An illustration of a high school boy with blue hair studying at a desk while writing in a notebook (second image)

When Does It Unjustly Harm the Copyright Holder?

A typical example of unjust harm to the copyright holder is taking paid data, such as a database sold for information analysis, and using it for analysis without permission.

In such cases, using the work for training AI without consent would be illegal.

Large-Scale Learning for AI Training → Not Illegal, As It's Not for Enjoyment

In the case of generative AI, training uses large volumes of data. Since this usage is not for enjoyment, it’s not considered illegal.

Fine-Tuning for Generating Similar Works → Potentially Illegal

Generative AI sometimes undergoes additional fine-tuning with smaller datasets.

In image generation AI, this applies to LoRA creation, for example.

If this fine-tuning is done to generate works resembling a specific creator's style or character, it could be considered enjoyment-based and potentially illegal.

An illustration of a young woman using a computer in a dimly lit room — Backlit shots are tricky

The boundary of what counts as enjoyment is hard to draw and will likely become clearer as court cases accumulate.

Copyright Infringement Requires Both "Similarity" and "Reliance"

Similarity: The work resembles the original
Reliance: The creator had knowledge of the original

In generative AI, using a proper name in a prompt indicates reliance.

Similarly, inputting a copyrighted work into an image-to-image process would also be considered reliance.

If both similarity and reliance exist, and the work is shared beyond personal use (Article 30, Paragraph 1), it’s illegal.
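The three conditions above can be written down as a small checklist. This is only a rough sketch of the summary in this post, not a legal test; the class, field, and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class GeneratedWork:
    """Hypothetical record describing one AI-generated work."""
    resembles_original: bool         # similarity
    relied_on_original: bool         # reliance, e.g. proper name in the prompt or image-to-image input
    shared_beyond_private_use: bool  # used outside personal use (Article 30, Paragraph 1)


def needs_caution(work: GeneratedWork) -> bool:
    """Return True only when all three conditions from the summary hold together."""
    return (
        work.resembles_original
        and work.relied_on_original
        and work.shared_beyond_private_use
    )


# A similar, reliant image that is kept private does not meet all three conditions.
private_copy = GeneratedWork(True, True, False)
# The same image posted online does.
posted_copy = GeneratedWork(True, True, True)
print(needs_caution(private_copy), needs_caution(posted_copy))
```

Real cases turn on how a court judges similarity and reliance, so treat the boolean fields as placeholders for those judgments.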

An illustration of a young woman using a computer in a dimly lit room second image — The keyboard is reversed.

Responsibility for Copyright Infringement in AI-Generated Works

The person who generates the content is generally responsible for copyright infringement in AI-generated works.
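Because responsibility generally falls on the person who generates the content, it helps to follow the tip at the top of this post and save your prompts. Below is a minimal sketch of a prompt log, assuming a JSON Lines file; the function name `record_generation`, the field names, and the file layout are my own choices, not part of any tool or guideline.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def record_generation(log_path, prompt, model, image_bytes=None, source_image=None):
    """Append one generation record to a JSON Lines log and return it.

    All field names are hypothetical. `source_image` is meant for
    image-to-image runs, where the input image matters for reliance.
    """
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "prompt": prompt,
        "source_image": source_image,
        # A hash ties the log entry to the exact output file.
        "output_sha256": (
            hashlib.sha256(image_bytes).hexdigest() if image_bytes is not None else None
        ),
    }
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entry
```

Calling `record_generation("prompts.jsonl", "a boy studying at a desk", "example-model")` after each run appends one line to the file, so the log can be read back later if a question about a specific image comes up.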

Copyright Infringement Is an Offense Prosecuted Only on Complaint

Copyright infringement can lead to criminal charges if done intentionally. However, it is in principle an offense prosecuted only on complaint (shinkokuzai), meaning charges cannot be brought unless the rights holder files a complaint.

Civil Remedies for Copyright Infringement

If copyright is violated, the following civil remedies are possible:

Injunction
Return of unjust profits
Compensation for damages
Measures for restoring honor

The judgment for copyright infringement by AI-generated works is based on the same standards as regular copyright infringement.

Illustration of a girl with white hair on a large computer monitor — Japanese people think of "AI"

What If You Don’t Want Your Work to Be Used for AI Training?

If you don't want your work to be used for AI training, here are some actions you can take:

Block Crawlers with robots.txt

When collecting data for AI, bots known as crawlers are used to gather information from the internet.

Editing the robots.txt file on your website tells these bots not to crawl your site.
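As a rough illustration, the sketch below holds a sample robots.txt and checks it with Python's standard `urllib.robotparser`. The user-agent tokens (GPTBot, ClaudeBot, CCBot) are ones the operators have published, but this list is an assumption that may be incomplete or out of date, so check each operator's documentation. The `example.com` URL and the `is_allowed` helper are made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt: block some AI crawlers, allow everything else.
# The crawler names are assumptions; verify them before relying on this.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return whether a crawler that obeys robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/art/1.png"
print(is_allowed(ROBOTS_TXT, "GPTBot", url))       # False
print(is_allowed(ROBOTS_TXT, "Mozilla/5.0", url))  # True
```

The file itself goes at the root of your site (for example `/robots.txt`). The check only shows how a crawler that follows the rules would read it.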

Upload Content to Password-Protected Areas

By uploading your content to areas requiring login credentials, you can prevent it from being used for AI training.

Sell AI Training Data

If you sell your works as AI training data, using them for training without permission would unjustly harm your interests and would therefore be illegal.

It's recommended to combine these approaches rather than relying on just one.

An illustration of a slender woman with white hair using a computer in a dimly lit room — There’s more to come

Personal Research Findings

From this point on, I'll mention some supplementary findings from my own research.

Crawlers Still Collect Data Even If robots.txt Is Used

While the Agency for Cultural Affairs recommends blocking crawlers with robots.txt if you don’t want your work used for machine learning, robots.txt is not legally binding.

AI search services like ChatGPT, Claude, and Perplexity can still summarize websites that block them in robots.txt, which suggests crawlers may still gather information despite being blocked.

AI companies are reportedly still scraping websites despite protocols meant to block them

Since content published online is difficult to restrict, showing it to humans but not to machines is unrealistic. However, using robots.txt is still advisable as a declaration of intent and might be helpful in future copyright disputes.
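If you run your own server, one way to see whether blocked crawlers still visit is to look at the access log. The sketch below counts requests whose user-agent string contains a known AI crawler name. It assumes the common "combined" log format used by Apache and nginx; the crawler names, the regular expression, and the function name are my own assumptions, and a crawler that hides its user agent will not show up at all.

```python
import re
from collections import Counter

# Assumed substrings identifying AI crawlers; this list may be incomplete.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot", "CCBot")

# Combined log format: ... "METHOD path protocol" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_ai_crawler_hits(log_lines):
    """Count log entries per AI crawler name found in the user-agent field."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that do not look like combined-format entries
        agent = match.group("agent").lower()
        for token in AI_CRAWLER_TOKENS:
            if token.lower() in agent:
                hits[token] += 1
    return hits
```

Passing the lines of an access log, for example `count_ai_crawler_hits(open("access.log", encoding="utf-8"))`, returns a count per crawler name. Hits that continue after you added a block can also serve as a record of your declared intent being ignored.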

AI Is Becoming More Closed

The Agency for Cultural Affairs recommends that, in disputes, AI developers disclose the training data used. However, this is unlikely in practice.

In U.S. lawsuits, it has become clear that disclosing training data increases the risk of litigation.

Andersen v. Stability AI Ltd., 3:23-cv-00201 – CourtListener.com

Example: A case in which artists are suing over the use of their works in the LAION-5B dataset, used by Stable Diffusion and Midjourney.

Many recent AI models do not disclose their training data, and as legal disputes increase, AI technology may become more closed.

Fan Works Are Technically Illegal

In principle, creating derivative works without the creator's permission is illegal under copyright law.

Tokimeki Memorial Adult Anime Film Adaptation Case - Wikipedia

While many fan works exist, this is often because rights holders tolerate it. It's important to show respect for creators and understand their stance on generative AI.

Even If Guidelines Permit Fan Works, Copyright Claims Can Still Arise

There is a notable case involving copyright and generative AI. If you're interested, you can search for more information using keywords like "copyright," "Vtuber," and "crowdfunding."

Animated illustration of a Vtuber girl streaming from home

Even if a company has issued guidelines permitting fan works, the original creator may still claim exclusive rights to reproduction and pursue legal action.

Guidelines themselves are not legally binding, so even if fan works are allowed, it's crucial to research thoroughly.