The sudden arrival of generative AI over the past few years – and especially over the past six months, with the introduction of the ChatGPT chatbot – has opened a Pandora’s box of issues that lawmakers, businesses and creators are struggling to address.
Aside from the (admittedly important) questions about what AI could mean for human artists and skilled workers, a number of policy questions have also come to the forefront. One that is particularly relevant to the music business is the issue of copyright.
A number of prominent music businesses have expressed concerns about AI using copyrighted music works to train itself to generate music. At the core of that concern is the fear of AI taking human-created works, and using them to generate music that would compete in the market against those human creators.
Those concerns were perhaps best expressed by Universal Music Group (UMG) Chairman and CEO Sir Lucian Grainge, who laid out the problem while talking with analysts on the company’s Q1 earnings call earlier this year.
“The recent explosive development in generative AI will, if left unchecked, both increase the flood of unwanted content hosted on platforms and create rights issues with respect to existing copyright law in the US and other countries – as well as laws governing trademark, name and likeness, voice impersonation and the right of publicity,” Grainge said.
Grainge continued: “Unlike its predecessors [i.e. earlier generations of AI], much of the latest generative AI is trained on copyrighted material, which clearly violates artists’ and labels’ rights and will put platforms completely at odds with the partnerships with us and our artists and the ones that drive success.”
While AI appropriating an artist’s name and voice – such as the “fake Drake” track that went viral earlier this spring – is a fairly obvious example of a rights violation, Grainge’s second assertion – that training AI on copyrighted materials is a copyright violation – is actually far from clear.
And recently, a government representative in Japan made a pronouncement that could have a tremendous impact on the development of AI, and the future of music and other creative industries. In the view of the Japanese government, training AI on copyrighted materials doesn’t violate copyright law.
During a public hearing in late April, the country’s Minister for Education, Culture, Sports, Science and Technology, Keiko Nagaoka, made it clear that her government’s view is that Japan’s copyright laws don’t forbid the use of copyrighted materials in training AI.
That’s true even if that material exists for expressly commercial purposes, so long as that material isn’t reproduced. AI can even be trained on material hosted illegally online, the minister clarified.
That is as broad and liberal a policy towards AI technology as any country has expressed. And undoubtedly, it poses a challenge to many in the creative industry who have asserted that AI should not have unfettered access to copyrighted material.
A controversial stance
As word spread in recent days of the Japanese government’s stance, it elicited heated responses from both those who support it and those who oppose it.
“Japan has become a machine learning paradise,” declared Yann LeCun, the chief AI scientist at Facebook owner Meta Platforms.
To which one Twitter user responded sarcastically: “Being able to steal intellectual property without repercussions = paradise.”
LeCun defended his position, arguing that “the driving principle” behind intellectual property laws “is to maximize the public good, not to maximize the power of content owners.”
(Notably, LeCun isn’t some sort of AI anarchist — he was one of the signatories on a recent letter warning that AI poses a risk of extinction to humanity if it isn’t properly controlled.)
LeCun’s position might be one that many in the AI community would agree with, but it’s also one that’s likely to run into serious opposition among rightsholders, especially in businesses, like music, that are being flooded with AI-generated content.
EU, China take a different road
However, it’s clear that Japan, at least for now, is diverging from other major jurisdictions in its approach towards AI regulation – something many observers attribute to the country’s desire to be a leader in the field.
In its proposed regulations for AI development, redrafted earlier this year, the European Union struck a sort of compromise, in which AI developers would be allowed to use copyrighted material that they have lawful access to, but would be required to publicly declare what copyrighted content was used in training their AI tech.
The EU has also proposed a carve-out that would allow rightsholders to forbid the use of data-mining on their IP, even if it’s publicly available, so long as they use appropriate mechanisms to make it clear that it’s forbidden – for instance, by including such a clause in their terms of service.
China appears to have gone a step further, declaring that AI developers can’t automatically use protected IP in their training of AI models. As China’s copyright law makes no exemption for data mining, this implies that Chinese AI developers will have to get express permission from rightsholders to use copyrighted materials for AI training.
Both the EU and Chinese regulations could pose a serious problem for AI developers, because of the enormous volume of material that’s used to train generative AI. These AI models are all roughly the same, in that they suck up enormous amounts of data to find patterns in it, in order to generate the output that is likeliest to be what the user is looking for.
(Some academics, such as famed linguist Noam Chomsky, argue that this form of AI is destined to fail, because sucking up enormous amounts of information in order to regurgitate the likeliest pattern is not at all how the human mind works. But that’s a story for another time.)
Take, for example, Baidu’s ERNIE bot. Its training is described by Baidu as using “trillions of web page data, tens of billions of search and image data, hundreds of billions of daily voice data, as well as a knowledge graph consisting of 55 trillion facts and more.” Sifting through all of this data to determine exactly what is copyright-protected and what isn’t would be a Herculean task.
In the US, legislation regulating AI seems to be a little further behind than in these other jurisdictions, but steps are being taken. The US Copyright Office has launched an initiative to examine the copyright implications of AI technology, which will run through the first half of 2023.
Lawsuits to proliferate?
It’s possible that many of these issues could be worked out through litigation, and on this front, the US is no laggard. A number of court cases have recently been launched over the use of copyrighted materials in training AI.
One of the most prominent ones, Andersen v. Stability AI Ltd., is working its way through a federal court in San Francisco and involves a group of artists who sued AI labs Stability AI and Midjourney, along with online art platform DeviantArt, over alleged mass copyright infringement in the development of their generative AI models.
The companies have asked the court to dismiss the plaintiffs’ proposed class-action lawsuit, arguing that their AI-generated work doesn’t resemble the works of the suing artists, and that the lawsuit didn’t specify which works of art were allegedly infringed.
Another lawsuit, again involving Stability AI, was launched against the company by Getty Images in a federal court in Delaware. Getty, a provider of news and stock photos, argues that Stability AI’s tech scraped its website for millions of images, which it then used in its generative process. The lawsuit claims Getty’s logo appears in distorted form in some of Stability AI’s creations. A similar lawsuit, involving Getty and Stability, is working its way through the UK court system.
The outcome of these lawsuits is uncertain, because much of it will depend on how courts interpret existing copyright laws in this new context.
The rights granted to the holder of a copyright are generally limited to:
- The right to create copies
- The right to create derivative works based on the original work
- The right to distribute the work
- The right to display or perform the work in public.
It’s on the first two of these that AI developers could be vulnerable. If training on copyrighted materials means copying that material onto a server where the AI does its learning, that could be interpreted as unauthorized copying of the work. And courts could also rule that the work the generative AI creates is a “derivative” of the original material it learned on.
But AI developers also have a solid defense to work with: The concept of “fair use,” an exception to copyright laws which – depending on the country – allows the use of copyrighted materials for the purposes of certain activities deemed to be in the public interest, such as news reporting, education or research.
Generally, students are allowed to reproduce copyrighted materials, to an extent, for their classes. If so, why shouldn’t AI be allowed to do the same when it learns?
All of which is to say that the outcome of these legal cases, and the right balance in AI regulation, aren’t entirely clear today. All the more reason to pay attention to the precedent that Japan is setting.
In the world of digital technology, “one is a magic number.” If no entities are engaged in a given activity, it doesn’t happen. But if one entity is engaged in an activity, it will happen even if there are no other entities doing the same thing.
In the future, if its rules on AI development are looser than those of other countries, Japan could very well become a “wild west” of AI development, where AI businesses coalesce to do the things they aren’t allowed to do in other jurisdictions.
Nevertheless, even in Japan, some voices are beginning to object to the seeming lack of concern the government has shown in addressing copyright issues in AI. One key voice is Kii Takashi, a parliamentarian representing Fukuoka’s 10th district, who has called for new copyright rules to address the AI revolution.
“There is a problem from the viewpoint of rights protection that it is possible to use [copyrighted work in training AI] even when it is against the intention of the copyright holder, and … new regulations are necessary to protect the copyright holder,” Takashi wrote on his blog.
Additionally, those advocating for stricter rules have warned that Japan’s lack of regulations surrounding AI could mean something other than an AI boom in Japan – it could, instead, lead to a flurry of copyright infringement lawsuits.
Music Business Worldwide