Editor’s Note: Via beSpacific, October 13, 2022 – “On Friday, October 7, the Internet Archive filed a reply brief against the four publishers that sued Internet Archive in June 2020: Hachette Book Group, HarperCollins Publishers, John Wiley & Sons, and Penguin Random House. This is the final brief in support of our motion for summary judgment (our previous motions can be found here and here) where we have asked the Court to dismiss the lawsuit because our lending program is a fair use…”
You’d have to be pretty grizzled to understand the reference behind Maria Bustillos’s title “Sell This Book!” for her article in The Nation about online book lending. Bustillos almost certainly was trying to invoke the battle cry “Steal this book!” that adorned works by irrepressible yippie Abbie Hoffman in the 1960s-1970s. (I remember Hoffman’s provocations well, because my parents ran a bookstore when these books were released.) Although neither slogan is well aimed, they both try to express joy in a shared culture that is open to all.
Few institutions carry that joy and openness today as well as the Internet Archive. The Internet Archive is a unique, bold project started by librarian Brewster Kahle, aiming no less than to preserve the digital treasures of the world. In addition to the astounding feat of recording many millions of web pages each week, the Archive contains digitized books, films, radio shows, games, and more. I covered the Archive in an article last year.
But a very different philosophy drives a lawsuit by a group of major book publishers, Hachette v. Internet Archive. We’ll look in this article at what the publishers are trying to protect and why they have to wield a large and heavy cudgel to protect it. The inquiry will lead to a look at how culture has been privatized as it has become digitized—an effect quite opposed to the hopes of most public advocates viewing the Internet and the World Wide Web.
Book lending in a digital age
Most people know that their local library offers digital books as well as physical ones. The digital reading process is rather cumbersome: You have to get an account and make sure you have a device where you can read the downloaded version of the book, which is wrapped with encryption that sometimes resists decoding. Furthermore, the display is rarely as attractive as the physical version of the book, although it certainly offers useful features such as online access to a dictionary.
The location of a digital book is irrelevant, because the Internet has global reach. So publishers can offer digital books themselves, but they choose to license the books to libraries for various reasons: The public is accustomed to going to their local library for resources, and the libraries have an existing infrastructure for managing loans.
For reasons known only to obscure analysts in the depths of each company, the publishers don’t release every book in digital form. The Internet Archive has filled the gap over the past decade by buying physical copies and scanning them into digital formats. According to Kahle, hundreds of other libraries do the same. Currently, 94 such institutions are associated with the archive. Lila Bailey, policy counsel at the Internet Archive, points to a consortium of libraries, Project ReShare, who are developing the digitization of physical books.
These efforts are by no means marginal: The Archive itself currently offers more than 3,250,000 online books through this service. Kahle says the Archive has worked with governments such as India, China, and the U.S. to digitize books.
Under controlled digital lending, libraries make scans of their physical books, and loan the scans in lieu of the print copies under rules that mimic physical lending: Only one person can borrow a digital book at a time, and the scans are encrypted so that no further copying or distribution is allowed. The lending library removes access after a fixed lending period.
A basic principle behind controlled digital lending, which is justified by both first sale and fair use doctrines, is popularly known as a one-to-one “own-to-loan ratio” according to Nancy Sims, an attorney and director of copyright and scholarly communications at the University of Minnesota libraries. If a scanned copy is loaned out online, the physical copy from which it was made is temporarily removed from access.
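The lending rules described above can be sketched in a few lines of Python. This is only an illustration, not the Archive's or any library's actual software: the `Title` class, its method names, and the 14-day lending period are all assumptions made for the example. It shows the one-to-one own-to-loan ratio (no more scans on loan than physical copies owned) and the removal of access after a fixed period.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Title:
    """One scanned title in a hypothetical lending library."""
    owned_copies: int  # physical copies owned and withheld from circulation
    loans: dict = field(default_factory=dict)  # borrower -> time the loan ends

    def _expire(self, now):
        # Drop every loan whose fixed lending period has run out.
        self.loans = {b: due for b, due in self.loans.items() if due > now}

    def borrow(self, borrower, now, period=timedelta(days=14)):
        """Lend a scan only if an unlent physical copy stands behind it.

        The 14-day default period is an assumption for this sketch.
        """
        self._expire(now)
        if borrower in self.loans or len(self.loans) >= self.owned_copies:
            return False  # own-to-loan ratio would be exceeded
        self.loans[borrower] = now + period
        return True

    def give_back(self, borrower):
        # Early return frees the copy for the next reader.
        self.loans.pop(borrower, None)


# One owned copy, so only one reader at a time.
book = Title(owned_copies=1)
start = datetime(2020, 3, 1)
first = book.borrow("alice", start)                        # copy is free
second = book.borrow("bob", start + timedelta(days=1))     # copy is out
third = book.borrow("bob", start + timedelta(days=15))     # alice's loan expired
```

In these terms, relaxing the one-to-one ratio, as the Archive did during the pandemic closures, amounts to dropping the `len(self.loans) >= self.owned_copies` check.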
The COVID-19 pandemic raised the stakes precipitously in March 2020. Thousands of physical libraries had to close. Many books would become totally inaccessible to the general public without the online lending services. In response to the loss of access to print books, the Internet Archive relaxed its strict one-to-one ratio for lending out digital books. This change triggered alarmed reactions from authors and publishers, and forms the backdrop to a dramatic lawsuit that aims to cut off such lending.
Summary of the lawsuit
In June 2020, shortly after the dramatic pandemic-related closure of libraries, four publishing companies sued the Internet Archive not just for distributing books without authorization, but for making copies in the first place. The brief filed by the publishers paints an alarming scene, calling the nonprofit Archive a “major commercial business” (paragraph 92) and suggesting that the Archive’s work “causes substantial harm” to the publishers (paragraph 119). In contrast, the Electronic Frontier Foundation, in a brief supporting the Archive, provided evidence that no commercial harm occurred (section II.D).
“This lawsuit represents a direct challenge to the ability of libraries, as stewards of free information for the public, to provide open, non-discriminatory access to culture in the modern age,” says Kyle K. Courtney, lawyer, librarian, copyright advisor at Harvard Library, and co-founder of Library Futures.
But the impact could be even bigger. If this lawsuit is successful, the increasing number of resources that exist in digital form would be controlled entirely by their publishers. Once the publisher withdrew a work or went out of business, the work would disappear. Such an environment would hollow out not only the principles of first sale and fair use, but potentially the third pillar of free information: the public domain.
A successful lawsuit would also tempt other copyright holders, perhaps including web sites whose contents are preserved for posterity by the Internet Archive, to piggyback on the court ruling and require the destruction of online copies. Sims says that the Archive case “is a huge piece of what we’re dealing with in libraries right now.”
You might think you know the direction this article is heading, but I am about to make a dramatic turn. The negative effects of the lawsuit are amply discussed by a number of journalists, including the article mentioned earlier in The Nation and two others by Bustillos, “Publishers Are Taking the Internet to Court” and “You Can’t Buy These Books.” Slate has also been covering the lawsuit. Libraries themselves have also added their voice to the fray.
My goal in this article is to trace the conflict back 30 years and show why copyright law makes the suit possible. I approached the publishers who launched the lawsuit for comment, and none of them responded. But having worked for a publisher myself for 28 years, and having explored the issue of digital copyright since it first emerged as a public issue, I believe I can express their point of view. As you will see later in the article, though, the publishers’ refusal to engage leaves an awkward hole in our attempts to make sense of their position.
A wave of fear
When my articles about the Internet started appearing in various out-of-the-way publications in the early 1990s, the Internet was largely ignored by mainstream media. If newspapers and magazines did take a peek at the Internet, they tended toward alarmist and distorted reporting that revealed mostly their own trepidation (thoroughly justified, as we now know) about what the Internet might do to their businesses. Typical of that era was Time Magazine’s scandalous Cyberporn cover from July 1995, which fed the hysteria that led to the notorious and unconstitutional Communications Decency Act.
The wall of misinformation was broken by a few savvy journalists, notably Dan Gillmor at the San Jose Mercury News, embedded geographically in Silicon Valley’s dynamic computer culture. Gradually, other reputable publications brought on technically sophisticated journalists who turned out responsible reporting.
As an alternate source of information to established publications, the Internet represented one kind of threat. Gillmor has suggested that an even greater loss to news media was the loss of advertising, including the move of classified advertising to online sites such as craigslist and eBay. But copyright holders also worried about the unique magic of digital media to make effectively infinite copies at near-zero cost. Let’s look at the impact of this unprecedented technological development.
Over several hundred years, copyright has sprouted many odd tendrils to accommodate the needs of authors, performers, publishers, broadcasters, librarians, teachers, and more. Unfortunately, an adversarial, conflict-driven culture grew up to manage these needs.
For instance, copyright holders and their lawyers assumed that the individual indulgence of making a magnetic tape from a phonograph record and sharing it with friends would diminish sales; they tried their best to suppress the practice. Their worldview couldn’t encompass the possibility that their purchasers were engaging in free advertising and might ultimately increase sales by exchanging tapes.
Similarly, software companies complaining of piracy wouldn’t consider that the thousands of unauthorized copies of Microsoft Windows circulating the world could bolster the market for legitimate Microsoft products. And did record or software executives really believe that people had the money to replace all those cassette tapes or “pirated” software with the original products at full cost?
Eventually, compromises were struck. The courts recognized that individuals had the right to tape broadcasts so that they could replay them later. Furthermore, a fee was built into the cost of videotapes and cassette tapes to compensate copyright holders for the copies made on those media.
Fair use, among other things, allows a teacher to distribute copies of a document to their class. But the widespread appearance of photocopiers in the 1970s strained fair use, because suddenly teachers had access to all the latest research in journals. They started distributing packets of photocopied articles instead of textbooks in their classes. The standard copyright response—requiring each student to purchase every journal containing every article they read—would have been ridiculous and unviable. The crisis led to the creation of a new middleman, the Copyright Clearance Center, which accepted payments in bulk from academic institutions for their use of photocopied articles.
One could continue these examples indefinitely. Each technological advance created consternation, resistance, and eventual regulatory resolutions. So today, special rules reward composers and performers when their records are played on the radio. Different payment scales govern playing a DVD to a few friends in your home versus playing the same DVD in the social hall of your church.
One persistent conflict has extended for decades between academics who write research papers and the journals that publish them. Because academics are rewarded for citations and public recognition rather than sales, they often chafe at publisher paywalls. Open access in its various forms represents a compromise or an alternative business model that accommodates the needs of both authors and publishers.
Dave Hansen, executive director of the Authors Alliance, says, “While some rights holders warmly embrace the march toward more extensive, radical control over information, many authors recognize that the siloing of information it results in will provide them little individual benefit, while undermining the historical legal structure that has promoted so much learning and preservation of knowledge.”
What’s different about the digital era? All these compromises are spoiled by universal access to online digital information.
The potential for infinite infringement
July 2007 marked a high point in cultural excitement and anticipation: the release of the final volume in J.K. Rowling’s fabulously popular Harry Potter series, whose first six volumes enthralled children and adults alike as they rolled out during the previous decade. I’m sure the highest possible security reigned at the publisher for the seventh and concluding book. Yet somehow, a copy was leaked or stolen and appeared online before the official release.
This true-life tale of digital wizardry shows just how hard it is to maintain control over media once it takes on a digital form. If you let a single copy get out of your control, it’s like releasing a female cat from your house. Even if she escapes only once, you can end up with kittens.
Digital copies are even more promiscuous (unless you engage in the digital equivalent of spaying, which we’ll examine soon). You can tell a teacher to lecture their students and enjoin them not to share digital copies of their articles. But how many will obey? How many have friends all over Indonesia or Peru who hanker after such copies?
Publishers were naturally worried about releasing books in digital form. As with previous technological advances, the copyright holders focused on the potential negatives even as they tentatively explored the boon that digital media and the Internet offered them. My own employer (O’Reilly Media, which incidentally published a book by the previously mentioned journalist, Dan Gillmor) recognized that digital copies could greatly reduce the costs of selling into markets that were hard to reach before, such as governments and developing countries. But nowadays, O’Reilly Media has stringently de-emphasized the sale of individual books or other resources, offering instead a digital learning site behind an online paywall.
The first sale doctrine has covered physical books and recordings for decades. Under this doctrine, once I buy a physical copy I can do anything I want with it. I can lend it to a library, rip out pages to wrap fish, or make a digital copy for my archive or to lend a friend. But this digital practice worries copyright holders. Once a digital copy is made, the copyright holder has no control over distribution. In fact, if I give the copy to someone, I have no further control either.
Panic over the incompatibility of digital media and control burgeoned among publishers, particularly after the Web became highly popular and raised expectations among the public that information wants to be free, and also wants to be valuable (tip of the hat to Steven Levy and Stewart Brand).
And it’s important to emphasize that “free” refers to “freedom,” not just cost. The four freedoms defined by the Free Software Foundation are amply facilitated by digital media.
Negative effects of this freedom have become obvious. Artificial intelligence makes it easy to transform photos or videos, putting any words you want into the mouths of politicians, celebrities, or annoying coworkers. These problems wouldn’t be solved by more copyright controls, because the makers of fakes can always capture images and videos. I’m just saying that we can celebrate the creativity of artists who remix or filter other people’s material, while acknowledging that a lot of people won’t do so responsibly.
So don’t just dismiss copyright holders as Luddites. They have a point. Given that there are reasons (financial and other) to limit digital copies, let’s pinpoint the historical moment when law and regulation re-asserted control in the name of copyright.
Digital white-out: The 1995 white paper
In the early 1990s, the Clinton administration set up a working group under lawyer Bruce Lehman to find new ways of approaching the challenges of digital media. The resulting 263-page report, titled Intellectual Property and the National Information Infrastructure: The Report of the Working Group on Intellectual Property Rights, came out with a plain white cover. Although nearly forgotten by the public, Lehman’s group has had an enormous impact on copyright law worldwide. The white paper determined key parts of the historic Digital Millennium Copyright Act of 1998, which were then pushed by the U.S. into laws in other countries.
Lehman’s working group was apparently uninterested in the creative potential of mixing and free distribution on the Internet. The working group defined its mandate as embracing and extending copyright holders’ control over their content. And to carry out the mandate, they relied on a strange sui generis doctrine: that you make an infringing copy of a digital work merely by viewing it on your computer (p. 213 of the report). This idea was taken from a single court ruling that didn’t even have anything to do with publishing or other content industries. According to law professor Jessica Litman, author of the oft-cited book Digital Copyright: “In MAI v. Peak, a computer company sued a computer maintenance and repair company for copyright infringement. MAI claimed that when the maintenance company turned on the computer to perform service on it, the copyrighted computer operating system software was loaded into RAM, and the RAM copy infringed the company’s copyright. The court of appeals for the 9th Circuit agreed that that was infringement. Eventually, two or three other courts deciding cases on similar facts went along. Congress narrowly overruled the decision by enacting a privilege to make a temporary RAM copy when maintaining or repairing a computer.”
In 1995, the new definition of copying defied intuition. When you load a document from a CD or the web, certainly you make a copy in computer memory—and indeed, also on your screen. But there is no other way to read anything digital. Consuming content became, through one page of the white paper, copyright infringement.
Let’s think about the audacity of the white paper’s redefinition of copyright. By the same logic, you make an infringing copy of a book when you cast your eyes on it, because the material enters your brain. Placing an open book on a table or placing a recording in a machine to play it back would also be infringing.
To be fair, copyright law had always distinguished between a creative work and the object (book, recording, etc.) that contains it. If you bought a videotape or DVD of The Godfather, you had first sale rights over the object, but you weren’t empowered to take the images or soundtrack and do anything you wanted with them. The videotape was sold to you, but The Godfather was merely licensed.
Now extend this logic to a digital world: If you pay a company to download a copy of a document or recording, you have first sale rights to that copy and can transfer it to others. But the white paper did not draw that analogy. Nor did the scads of courts who later interpreted the white paper. So Lehman’s doctrine broke with centuries of copyright precedent. Litman calls this redefinition a “neat trick” that lets copyright owners have their cake while eating it too: “a digital file may be a copy for the purpose of infringement liability but not a copy for the purpose of transferring ownership” (page 133).
Step back, though, and reconsider the facts in the previous section about digital media. The white paper working group had little choice. Had they allowed people to do whatever they wanted with copies, every cat would have myriad kitties.
The new copyright doctrine went hand-in-hand with legal and technical innovations. Licenses, sometimes with bizarre restrictions, got attached to each piece of digital content. The companies also invented technical controls, variously known as Digital Rights Management or Digital Restrictions Management. Such controls, which I compared to spaying a cat, create the annoyances I mentioned earlier regarding digital library loans.
For critiques of DRM and other copyright-related technology controls, do a search for speeches and articles by author and activist Cory Doctorow. In “20 Years Of Copyright Wars,” for instance, he warns of digital devices becoming “a locus of surveillance and control” and decries the role of monopoly companies in the privatization of Internet services (a topic I cover later in this article).
The working group’s report promised “to recommend only those changes that are essential to adapt the law to the needs of the global information society” (p. 2) and rejected the idea that “a dramatic increase in their [copyright owners’] rights be justified” (p. 17). This article will hopefully help each reader decide whether the resulting expansion of copyright controls was “essential” or “dramatic.”
Lehman is still active. He founded and currently runs a consulting and educational organization called the International Intellectual Property Institute. He did not respond to my request to comment on this article.
Legal basis for the lawsuit
We have to distinguish between the motives of the plaintiffs in Hachette v. Internet Archive and the legal doctrine that gives their suit legs. These are very different things.
Unfortunately, I can’t discern the motives. The publishers wrap their lawsuit in the rhetoric of moral indignation, but don’t actually cite any harm. After all, the Archive is simply doing what libraries all over the country do with publisher approval. The terms of loans are the same, and the tools used to deliver the books (Kindles, mobile devices, etc.) are the same. The only difference is that the books are scanned instead of being licensed from publishers.
This is why I started the article by setting a historical context concerning the fear of losing control that gripped publishers early on during the digital revolution. This lawsuit, as best I can reckon, is driven by the principle of total publisher control, not by any material injury.
But what legal doctrine do they rest their case on? They’re still harping on the sui generis doctrine, spun by the white paper three decades ago, that any digital copy is infringement. The publishers’ brief cites a string of court cases that overrule first sale in the case of digital copies, notably Capitol Records v. ReDigi.
In the article “Copyright, Exhaustion, and the Role of Libraries in the Ecosystem of Knowledge,” Ariel Katz (law professor at the University of Toronto) reveals an alternate chain of rulings (section III, pp. 88-94) that preserve first sale, also known as exhaustion, and that limit the controls companies can place on products through licenses. These precedents have been inadequately considered by courts that uphold the kinds of restrictions desired by the Hachette plaintiffs.
Sims also cited several cases that preserve rights for those making copies in limited circumstances. Most famously, in the Authors Guild lawsuit against Google Books, courts ruled that Google had a right under fair use to copy books to feed search results. The case was complicated by Google’s unclear commercial and data processing agenda. A related, lesser-known case is Authors Guild v. HathiTrust. In that case, the court allowed HathiTrust not only to offer search results but also to give full access to copyrighted books to people whose disabilities denied them access to print books. In yet another case, Google and Amazon.com were allowed to copy images in order to show thumbnail versions in search results.
A recent letter signed by hundreds of authors shows considerable support for the Internet Archive project, which the authors see as benefiting both libraries and the public.
In other words, there are two parallel trends in the legal battle over digital copies: one intolerant and the other more nuanced. But the nuanced one tends to get overlooked by courts.
As we’ve seen, the Lehman working group was trapped between tolerating total freedom of content distribution and transformation on the one hand, and permitting total control by copyright owners on the other. They came down on the side of the latter. But Hachette v. Internet Archive is just one step in the privatization of digital content and the Internet.
How fences got erected
During the first two decades of the twentieth century, AM radio developed in an ad hoc, uncontrolled manner resembling the early World Wide Web. Broadcasts came from small stations run by colleges and other nonprofits. The establishment of the Federal Radio Commission in the U.S. in 1927 initiated a shift to commercial broadcasting, which favored entertainment (interrupted by advertising) over education.
Although we can’t trace the privatization of the Internet to a single regulatory body, it has proceeded along a similar path. Bulletin boards and so-called “news” groups preceded the Internet. They were a shared public resource, carried by anyone who chose to store and transmit them. For many years during the early Internet, and then the early Web, content was provided by individuals or small organizations on their own servers.
The Internet fulfilled the hopes of many free speech advocates as a place where gay people, victims of abuse, the non-neurotypical, and other marginalized segments of the population could find support and hold frank conversations. I’m convinced that the relative openness of modern societies to all these communities (the movement for diversity, equity, and inclusion, in short) can be traced to the power of early Internet conversations.
People are always people, though, so the state of early Internet content was no golden age. Hate speech, misinformation, pornography, and other scourges traveled freely.
Centralization had a certain inevitability. People can find each other more easily through a centralized service. And centralization allows the service to offer a more robust and consistent interface for chatting and sharing information. Numerous experiments in distributed social media have failed to deliver what Facebook does so well.
One would expect governments to possess the resources to create public Internet spaces free from the commercial agendas of social media. Governments have indeed conducted some interesting short-term projects aimed at particular goals, but have by no means replaced private services.
Crypto experts are concocting systems based on distributed payments, blockchains, rich “metaverse” interfaces, and identity services, with the promise of giving agency to individuals without a centralized system. Ironically, these “Web 3” projects replace the freedom-preserving anonymity of the Internet with iron-clad identities. The services are struggling with the same problems that held back earlier distributed services, but if they succeed, they will certainly be a form of private commerce.
Industry concentration is dense at the lowest level of Internet infrastructure: the actual wires (or wireless connections) reaching into our homes. VICE Magazine finds that 40 percent of people in the U.S. have access to only one Internet provider. That market is what industry observers like to call the “last mile”; long-haul lines have a more complicated industry structure but certainly require large corporations to function. The dangers of industry concentration can’t be addressed by frantic appeals for “network neutrality,” which strive to crudely regulate certain traffic behaviors in a concentrated industry.
Movies and television are increasingly siloed. “Watch this on Netflix,” friends tell one another. “And this other show is on Amazon Prime.” Each streaming service draws you deeper and deeper into its proprietary whirlpool with recommendations and special deals. Their business models build on the old cable TV practice of bundling channels, so that viewers have to pay for a lot of junk programs just to get one popular channel such as ESPN sports.
The contrast of siloed streaming content with traditional book publishing couldn’t be more stark. You never had to walk into one bookstore for Simon & Schuster books and another bookstore for Elsevier books. A single bookstore could easily mix books from all different publishers, including obscure publishing houses tucked away in remote places. This open publishing system was not accidental, but required effort: a carefully constructed network of distributors committed to publishing diversity.
Internet services have progressed to the point where key aspects of modern life, such as web searching and videoconferencing, are commercially controlled. You can install a free software videoconferencing service on your own computer, but doing so requires substantial resources for Internet connectivity and administration. Meanwhile, people base entire businesses on commercial services such as Google Docs (in fact, I used it for reviewers of this article), treating the services as an assumed part of the modern digital infrastructure.
So the centralization of content in the hands of large copyright holders is part of a general fencing-off of culture. Publishers were long slighted by commercial Internet services, but recent policies compensate major publishers for the references made to their content by online search and social media services. These payments are reminiscent of the fee imposed on magnetic tape to compensate recording companies. Such attempts to redirect income somewhat closer to the sources of content (ultimately, one hopes, the authors and recording artists) are commendable, but can’t replace the urgent need for public spaces: The compensation mechanisms substitute oligarchy for monopoly.
No softening of the Lehman doctrine
Lehman handed control over digital content to copyright owners because he couldn’t trust the public to limit distribution and alteration of content. Now that copyright owners have virtually full control, one can only hope that they will exert a little forbearance and relax their control where the benefits to society vastly outweigh any realistic prediction of decreased profits.
Hachette v. Internet Archive grievously disappoints in this regard. Even when there is no harm, publishers will spend large sums to extend, not relax, their control. Legislatures and courts have shown little stomach for forcing compromise on them.
No court has explicitly canceled fair use or first sale—nor could one do so, because U.S. law guarantees them. Instead, the courts have employed obscure language to circumscribe fair use and first sale to the point where they barely exist in digital media. Either Congress or a court could re-establish these rights.
Public access to information loses out most of the time when large businesses take their interests to court. But at the Congressional and Administration levels, the story is sometimes more positive. Pressure on the legislative and administrative levels of government may help to restore balance to Internet practices that careen away from sharing and public participation.
Also, technology keeps evolving. Today’s heavy superstructure of digital control will eventually be undermined by some innovation that probably hasn’t been anticipated. New winners and losers will emerge, and we’ll have to climb new ramparts to champion public information.