Copyright Proposal Threatens to Undermine Europe’s AI Ambitions

Next week, the European Parliament plenary will vote on one of its most controversial files to date – the Copyright Directive. The new legislation touches upon a number of thorny issues that have been the subject of intense debate, yet its Article 3 – aiming to regulate Text and Data Mining (TDM) – is often overlooked.

The current proposal would create a narrow exception for TDM, exclusively applying to non-commercial research organizations and only when they are engaged in purely “scientific” research. Such an approach would render the EU an international outlier and undermine the Commission’s ambitions for transforming Europe into a global hub for the development of artificial intelligence (AI). If the text is adopted in its current form, European AI researchers would find themselves at a strategic disadvantage to their global competitors. Indeed, recent legal developments in Japan highlight just how out-of-step the Commission’s proposal is with international norms.

With economists estimating that AI will create up to $15 trillion in global value and affect almost every sector, governments around the world are racing to put in place policies to ensure AI benefits their citizens. Over just the past 18 months, more than two dozen countries have published national strategies or roadmaps to outline policy approaches for meeting their AI ambitions. Because data is a critical input for the development of many forms of AI, it is no surprise that many of these plans involve a close examination of how strategic data sets can be made more widely available.

Sound data innovation policies can help spur AI innovation by ensuring information can move freely across borders, facilitating value-added data services, and opening government data assets to AI researchers. Recent news out of Japan demonstrates another critical component of AI development – modernized copyright laws.

On 28 May 2018, the Japanese Diet passed the Copyright Law Amendment Act, a law that some observers have speculated could transform Japan into a “machine learning paradise.” Article 30-4 of the new law clarifies that the Copyright Act allows for the “exploitation” of any copyrighted work for the purpose of performing “information analysis,” including the “extraction, comparison, classification, or other statistical analysis of language, sound, image, or other elements of which a large number of works or a large volume of information is composed.”

While seemingly esoteric, this new provision places Japan at the vanguard of the great AI race. The provision will enable AI developers in Japan to perform machine learning using analytical techniques such as TDM.

Developing algorithms that power AI systems requires researchers to develop mathematical models that are trained using vast quantities of data. TDM is the process by which such training data is generated and used to train AI models. For instance, developers have now created a “Seeing AI” app that helps people who are blind or visually impaired navigate the world by providing auditory descriptions of objects in photographs. Users of the app can use their smartphone to take pictures, and Seeing AI describes the people and objects in the photograph. To develop a model capable of identifying the objects in a picture, the system was trained using data from millions of images depicting thousands of common objects, such as trees, street signs, landscapes, and animals.

You would be forgiven for wondering how any of this implicates copyright and why an exception for TDM is necessary at all. The issue is that the machine learning process may involve the temporary creation of machine-readable reproductions of the material used in machine learning. (In the case of Seeing AI, that would be the millions of photographs used to train the computer vision models that enable the app to identify objects.) Because the incidental copies created as part of the machine learning process are made for the sole purpose of analyzing the factual (i.e., non-copyrightable) information from lawfully accessed content and are unrelated to the creative expression embodied in the underlying works, they do not substitute for the original or in any way undermine the legitimate interests of a copyright owner.

In the United States, reproductions used for analysis or research are considered a fair use. But in legal systems that do not have a flexible fair use provision (civil law systems generally), there can be some uncertainty about the permissibility of such activity. Indeed, in just the last couple of years, we’ve seen the governments of Australia, Canada, Singapore, the European Union, and Japan grapple with the issue. The stakes are quite high, because unnecessarily narrow copyright exceptions can hurt competitiveness in the data economy and discourage AI-related research and development. Fortunately, Japan’s new law provides a great template for other governments as they move forward with their copyright reform efforts.

Japan’s new legal clarifications include three key features that will be critical to helping Japan realize its AI ambitions:

User Agnostic: Article 30-4 of the Japan copyright amendments permits all users to perform machine learning techniques such as TDM, irrespective of whether they are associated with an academic institution or a commercial enterprise. Japan’s exception will unlock the innovative potential of both the public and private sectors and encourage public-private collaborations that are key to driving innovation.
Purpose Agnostic: Article 30-4 likewise permits users to engage in machine learning for any purpose, whether it is a scientific researcher “analyzing” data from newspaper archives to help predict outbreaks of infectious diseases a year before they occur, or a small business “evaluating” social media sentiment data to improve customer service. Japan’s copyright rules will enable every sector of the economy to reap the transformational opportunities of AI.
Technology Agnostic: Crucially, Article 30-4 is also technologically agnostic, allowing users to engage in any type of “exploitation” that is necessary to the performance of machine learning. This is critical because it ensures that any copies necessary to an AI project will be covered by the new rules, including the creation of machine-readable copies that can be digitally analyzed and maintained for data validation purposes.

With countries around the world competing to create policy environments that will enable their economies to reap the full benefits of the AI revolution, Japan has provided a tremendous roadmap.

Learn more about BSA’s AI initiatives at bsa.org/AI

Tags: data emerging technologies EU Europe government

Author: Christian Troncoso

Christian Troncoso is Senior Director, Policy for BSA | The Software Alliance. He works with members to develop BSA policy on a range of legal, legislative, and regulatory issues, including copyright, cybersecurity and privacy. Prior to joining BSA, he served as Senior Counsel for the Entertainment Software Association, where he advocated on behalf of video game publishers in the United States and before foreign governments. Troncoso earned an LL.M. with a focus on intellectual property from The George Washington University, a J.D. from the University of Denver, and a bachelor’s degree from the University of Richmond. He is based in BSA’s Washington, DC, office.
