The Generative AI Fair Use Defense Under Google Books

After the Supreme Court’s decision in AWF v. Goldsmith restored what many of us view as common sense to the fair use doctrine of transformativeness, the flurry of litigation against AI developers will test the same principle in a different light. As discussed on this blog and elsewhere, caselaw has produced two frameworks for considering whether the “purpose and character” of a use is transformative: one focuses on differences in expressive elements, like the use of Goldsmith’s photograph to make Warhol’s silkscreen; the other considers a use made for a distinct purpose, like the millions of scanned books used to produce the Google Books search tool.

In Warhol, the Court affirmed that transformative expression must contain some element of “critical bearing” (i.e., comment) upon the work(s) used. That concept, tied to the different character of the new work, is distinct from the use of copyrighted works to create a tool or product that may be considered transformative because it is novel and beneficial to society. Notwithstanding the possibility that generative AI may prove to be harmful to society, the copyright question of the moment is whether the use of many millions of protected works to “train” these models is transformative under the same reasoning applied in Authors Guild v. Google (2d Cir. 2015).

Because the Google Books search tool could only be developed by inputting millions of digitized books into its database, the argument being made is that this is obviously analogous to ingesting millions of protected works for AI training. And certainly, no one could doubt that generative AIs are novel, even revolutionary. But this may be where the comparisons end under fair use factor one, which considers the purpose of a use, inherent to which is a “justification for the taking.”[1]

The factor one decision in Google Books turns substantially on the court’s finding that the search tool provides information about the works used. “…Google’s claim of transformative purpose for copying from the works of others is to provide otherwise unavailable information about the originals,” the opinion states. While Google Books “test[ed] the boundaries of fair use,” the court held that the search tool furthered the interests of copyright law by providing various new ways to research the contents of books that would otherwise be impossible. Although unstated (because it would have been absurd), the recipients of the information provided by Google Books were/are human beings. And especially if some of those human beings use the information obtained to produce and/or engage with expressive works, the finding of fair use fulfills copyright’s constitutional purpose to “promote progress.”

Generative AI developers may try to argue that the use of creative works for training serves an “informational” purpose, but unlike Google Books, the information obtained from the ingested works only “informs” the machine itself. A generative AI does not, for instance, provide the human user with new ways to learn about Renaissance painting (or point to Renaissance works); instead, the machine is trained to make images that look like works from the Renaissance.[2] Setting aside the cultural debate about the value of such tools, the purpose of the generative AI is clearly distinguishable from the reasoning applied in Google Books.

As discussed in an earlier post, a consideration of AI under fair use should turn on the question of promoting “authorship,” lest the courts become distracted by the broadly innovative nature of these systems, especially for any purpose outside the scope of copyright.[3] In that post, I argued that generative AIs do not promote “authorship,” and I would die on that hill if the developers’ expectation is that these tools will autonomously generate “creative” works without any human involvement.

For instance, if “singer/songwriter” Anna Indiana is a primitive example of what’s to come—and my understanding is that this is exactly what the AI models are designed to do—then the “purpose” of these systems is not to promote authorship, but to obliterate authorship by removing humans from the “creative” process. As such, the fair use defense cannot apply because without the element of authorship, the consideration is no longer a copyright matter.

On the other hand, as stated in my comments to the Copyright Office, it is conceivable that a human author might “collaborate” with an AI tool to produce a work that meets the “authorship” threshold. For instance, by using a set of prompts that articulates sufficient creative choices in the production of a visual work (or by uploading one’s own work and using an AI tool to modify it), one can make a reasonable argument that this constitutes “authorship” under copyright law. This is one potential purpose of generative AI, and one that could favor a finding of transformativeness under principles similar to those articulated in Google Books.

But Google Books did not present the court with so many unknown, relevant questions of fact.

The purpose of the Google Books search tool was clearly defined and fully developed when that case was decided in 2015. By contrast, fair use defenses of AI today are presented on behalf of technologies that are still nascent and evolving rapidly. Simply put, we do not know yet whether a particular generative AI will promote authorship or become a substitute for authorship; the former is favorable to a finding of fair use, while the latter is fatal to such a finding. Here, proponents may argue that so long as there is a mix of uses, resulting in both authored and un-authored outputs, that mix is sufficient to find the purpose of a given AI transformative. But it seems likely that the current docket of cases will be decided before enough determinative facts can be known.

For now, it is worth remembering that sweeping statements alleging that generative AI training is “inherently fair use” are anathema to a doctrine that rejects such generalizations. Fair use remains a fact-intensive, case-by-case consideration, and one of the many difficulties with AI is that relevant facts are not only evolving, but they describe technologies unlike anything that has been examined under the fair use doctrine to date.


[1] Citing Campbell v. Acuff-Rose Music, Inc. (1994), the precedent that informs both Google Books and Warhol.

[2] I recognize that this is an oversimplification of what the AI can do.

[3] i.e., AI’s potential applications in areas like medicine or security should be dismissed as irrelevant to a fair use consideration of generative AIs that make “creative” works.


David Newhoff
