Google has been changing the world, in case you hadn’t noticed.  One of the ways that Google has done this has been to embark, with the help of many academic institutions and libraries, on a mission to scan ALL THE BOOKS.


Google, and the universities and libraries that have the books, would like to scan ALL THE BOOKS for three basic reasons:

1. To enhance research by allowing searches of ALL THE TEXT in ALL THE BOOKS. Search results vary: Google returns “snippets of the text,” while searches through the universities return only page numbers;

2. To make ALL THE BOOKS accessible to the blind and print-disabled; and

3. To preserve and archive – well, let’s just call them ATB for short.

This project is enormous, unprecedented, and paradigm-shifting. But some authors and publishers would have liked to be asked permission, and possibly paid a licensing fee, before their works were scanned. Google and the libraries, however, believed that their scanning project was fair use under copyright law. Hence, two lawsuits have been making their way through the federal courts in New York. The one involving Google (the “Google Books” case) was decided at the District Court level in 2013 and is on appeal to the Second Circuit Court of Appeals. Google won, on summary judgment. In June, the Second Circuit decided the case involving the universities and libraries (this is known as the “HathiTrust” case). HathiTrust won at the District Court, and won again at the Court of Appeals.

This is a good time to review these cases for two reasons: (1) we’d like to take a guess at how the Second Circuit is going to handle the Google Books case in light of what it just did in HathiTrust, and (2) I’m talking to a group of Wyoming librarians about fair use issues in September, so I’d better know what I’m talking about by then.

We’ve talked a lot about fair use (here and here, for example). Briefly, the factors that go into determining whether a use of a work is “fair” and protected by free speech are (1) the purpose and character of the use; (2) the creativity of the original work; (3) the amount of the original work that is used; and (4) the effect that the new use has on the market for the original work. These factors are non-exhaustive, the cases are fact-intensive, and the outcomes are hard to predict. With respect to the first factor, courts in recent years have focused heavily on whether the new use is “transformative,” which more or less means that the use makes the work “do something” that it didn’t do before.

The courts in both Google Books and HathiTrust found that the use was transformative, and ultimately found fair use. Google Books allows users to type in a search term or phrase and call up “snippets” of the book that contain those terms – most of you have probably seen it. The HathiTrust Digital Library, on the other hand, pulls up only the page numbers where the term is located, requiring the user to go pull the book from the library shelf to get the full text. The Court in Google Books (the lower, District Court) thought that the “snippets” were still protected by fair use. The Google Books court also spent five paragraphs talking about the extensive benefits to the public of scanning ATB. The HathiTrust Court of Appeals, which will soon review the Google Books decision, was quite comfortable that a database that returns no content, only page numbers, was fair use. But the HathiTrust court carefully pointed out that merely beneficial uses do not become “transformative” just because they make an “invaluable contribution to the progress of science and the arts.” Never before have we had a case where a Court has found fair use based merely on “public benefit” (Napster was, arguably, a benefit to the public). So the question is whether the “snippets,” under the Second Circuit analysis, will shift the balance.

**I should point out that we are actually talking about doing a fair use analysis on two of the otherwise exclusive rights of the copyright holders – the right of reproduction (to make copies), and the right of display. In both cases, there isn’t much question that if you’re going to create a functional database you pretty much have to copy the whole work, so the fact that Google scanned (copied) the whole work really didn’t matter to either of the Courts. But what might very well matter is how much of the original work shows up in the display. In most searches, Google gives you tens of pages back – perfect scanned copies of the text and images in the book. If there is a different outcome in Google Books than in HathiTrust, it will be based on a finding that Google violated the display right by showing more of the original work in its search results than it needed to in order to execute the fair-use functions of its database.

Next time I’ll talk about the Court’s analysis of making ATB accessible to the blind and print-disabled. (Lest anyone think of me as unsympathetic, I totally think that it is a good thing for the print-disabled to have the same access to ATB as the rest of us.)