Information overload might feel like a modern issue, as it coincides with the cultural integration of social media information sources like Instagram, Twitter, Facebook, and Snapchat — not to mention seemingly ancient sources like email and blogs. While it might feel like a new phenomenon, concern about information overload is as old as humanity. Centuries before the internet became part of our daily routine, people complained about dealing with too much information to manage efficiently. The human brain can only remember a finite amount of material, and we can only manage a handful of information sources at a time. Therefore, it’s only natural that we need systems for labelling, storing, and retrieving almost everything that we create or consume.
For centuries, librarians fulfilled this function, at least for materials that were made available for common use. Public and academic librarians built labels for describing books, installed shelves for holding those books, and constructed catalogs for locating which book was on which shelf. These activities were intended to ensure that visitors to the library could easily locate and manage information.
No one was expected to know where each item of information was located. Rather, the crucial knowledge for a library user lay in understanding how to use the library’s retrieval system, whereas the crucial obligation for the librarian was to explain how these systems worked and to improve them over time. This was an effort to alleviate information overload — people needed the skill set to find information, but they did not need to retain unnecessary details in their already overloaded brains.
As long as information exists, information overload will be a concern. So, for a librarian in today’s digital culture, the unique challenge is not solving information overload, but managing it. There are three vital ways that librarians are engaged in improving our digital culture: seeking to increase access to popular e-books as well as the results of scientific research that are published online; determining reliable information in an era of “alternative facts”; and ensuring that online data and web sites are available in the future, in as stable and reliable a manner as books on a shelf are today. Just as they addressed information overload by building an organized collection of printed books, modern librarians are leading efforts on all these digital fronts.
Libraries have always been at the forefront of the effort to democratize information, as their entire purpose is to provide everyone with access to free, unbiased, quality information whether they can afford it or not. The underlying rationale is that enabling an informed community allows people to make thoughtful decisions about everything from personal finance to political affairs. Ability to pay should not determine who enjoys this privilege. In the librarian’s world view, access to authoritative information is a right and not a luxury.
In everyday practice this means that a librarian purchases resources on behalf of a community whose members can then access them for free. For example, a librarian might purchase 20 copies of a popular book to serve a community of several thousand patrons. Systems of defined check-out periods and waiting lists ensure that the book circulates to everyone who wishes to read it, as long as some people are willing to wait.
From a publisher’s perspective, this means only 20 books are sold, as opposed to the hundreds more that could have been sold if everyone bought it individually. To mitigate these losses the library acts as an advertising vehicle for these books, often through author readings at the library and other promotional events. In some cases, these efforts drive publisher sales, and always increase awareness of the availability of new books.
Over time, this dynamic between librarians and publishers resolved into a settled equilibrium, with respect to print books. There was socialized access to resources for everyone, funded by tax dollars to the public library district. Crucially, the library owned any books it bought, so it could loan them out to an infinite number of patrons. In addition, a robust market of booksellers — both for new releases and secondhand products, brick-and-mortar and online — enabled personal book purchases.
The rise of e-books challenged this equilibrium, causing librarians and publishers to renegotiate this deal. Libraries circulate e-books via a download to a patron’s device, which disappears once the loan period concludes. Just as with a print book, someone can renew the e-book if no one else has placed it on hold. However, there is a crucial difference: a print book is tangible and the library owns all of its copies, while an e-book file lives on a server owned by the publisher/distributor, and is merely licensed to the library rather than sold outright. Now the question becomes, how many e-book downloads are reasonable, and at what point should the library license another instance of the book? Naturally, librarians want to allow the maximum number of downloads possible, ideally an infinite number. Publishers, understandably, see the matter differently. This is an ongoing negotiation that will (hopefully) reach another settled equilibrium. Both librarians and publishers have defensible points of view. The key point to note here is that public librarians continue their historic role of maximizing access to information, well into the digital age.
Academic librarians face similar challenges in today’s context. “Open access” publishing refers to a movement to make the results of taxpayer-funded research — such as that funded by the US National Institutes of Health — freely available to everyone online. Currently, at the time of publication, much of this research is legally available only to individuals affiliated with a college or university whose library provides it to them. The word “legally” matters here, because file-sharing services such as Sci-Hub provide access to copies of research articles in a way that violates current copyright laws. Open access relies on the fact that the online distribution of research articles is much cheaper than printing and mailing issues of scholarly journals. In this publishing model, rather than asking subscribers or librarians to pay, research funders or authors pay for publication costs. The result is a product that is legally and freely available to everyone, at the moment of publication.
Discussions of open access began in 2003, as the reality of how the internet could revolutionize scientific communication began to sink in. At the time, it was a laughable idea given that scientific journals had existed on a subscription or members-only basis since 1665. However, as of 2017, several open access publishers are flourishing, and subscription-based publishers are now offering many open access publications as well. With that said, the large majority of scholarly publications are still published in subscription-based journals and are licensed by librarians just as they always were. Back in 2003, publishers rightfully perceived open access as a threat to their established business models, which counted on library revenues to remain consistent. For many years, there was much hostility between librarians and publishers about this issue, but most of that has dissipated with the broad consensus that open access will become the default publishing model for the sciences in the near future. Researchers and funders alike perceive open access publishing as a valuable asset to increasing the awareness of their work. Although the tipping point toward open access has not yet arrived, it is now in view, which is a testament to the tremendous advocacy of open access by librarians in the years since 2003.
As this discussion demonstrates, librarianship is not all shushing, reading, and making dumb Dewey Decimal System jokes. Contrary to popular belief, librarianship is anything but dull. These examples — settled negotiations with publishers on availability of print books, ongoing negotiations about the loan terms of e-books, and advocacy for open access — point to the essentially political nature of the profession and its goal of maximizing access to quality information as a public good rather than a private right. Librarianship focuses on ensuring equal access to knowledge, which is an inherently high-stakes pursuit in today’s digital culture, and will remain so forever.
Evaluating the Reliability of Information Online
The internet is probably the most democratic of all human inventions. Many people have experienced unimaginable success and built entire businesses by using self-publishing platforms like YouTube, SoundCloud, Medium, Instagram, and Tumblr as vehicles for self-expression. Self-publishing is a godsend to creatives who do not have the social connections or desire to secure an agent, but do wish to build an audience and express their views. Personally, I have maintained my own blog since 2005 — ages ago in internet years — precisely because it provides an unfettered channel of expression. I spend a great deal of time and effort on some posts; others are much more tossed off. The quality may vary, but the venue is mine to control.
Prior to the web, it would have been much harder for me to have such a publication channel, as editors, publishers, and broadcasters controlled almost all the content offered to the public. While I understand the appeal of today’s free rein online, it is hard to deny that it comes at a cost. Simply put: much of what we see online is clickbait with no basis in reality, designed only to increase web traffic and advertising revenues rather than to inform. This is certainly true of the so-called “fake news” that pollutes the web today — such as claims that a child sex ring was operating in a Washington DC pizzeria (pizzagate), or that Pope Francis endorsed Donald Trump for President.
“The yellow press” (Flickr)
Just as we should not assume that information overload is an entirely new problem, we should recognize that fake news is old news. “Yellow journalism” was a phenomenon at the beginning of the 20th century, in which newspapers printed exaggerated and lurid stories to drive sales. With that said, in this case, information overload manifests itself differently than before. In the days of yellow journalism, the news cycle was predictable — numerous morning papers, supplemented by a handful of afternoon outlets. Only a few people controlled the printing presses and there was no such thing as tweeting and retweeting. Today, a baseless lie can circulate effortlessly and instantly, while rigorous fact-based reporting takes longer.
It would be one thing if everyone recognized such obvious lies, and treated them as cheap entertainment rather than fact. Indeed, some people do. But as a recent study from the Stanford History Education Group points out, many students actually are not very good at distinguishing true from false. And as chain emails from your wacky uncle have evolved into unhinged Facebook memes on his wall, it is clear that many full-grown adults are not good at this either.
Enter the librarians. In an extension of their traditional role in offering guidance about credible sources for research projects of all types (from one-page papers to dissertations), many librarians now teach information literacy, which includes the ability to recognize the marks of a credible source. And it’s not just news that requires this ability, as much of the non-news information that we consume, such as marketing for the latest miracle weight loss pill, is also highly dubious. This is an increasing concern with the rise of targeted marketing strategies and algorithms implemented across the web.
One handy solution is the CRAAP test, devised by librarians at California State University, Chico. This is a straightforward approach to evaluating the reliability of an online source. CRAAP is short for:
Current — Is the source current?
Relevant — Is the source relevant to the question at hand?
Authoritative — Is the individual or group making the claim authoritative about the topic?
Accurate — Is the information accurate? This can usually be gleaned by cross-checking against other sources to see if the same claim appears in more than one place, and by looking for links that support the original claim.
Purpose — And finally, what is the purpose of the information? Strident and emotional language, particularly if no links to supporting evidence are available, is a clue that the purpose is probably to inspire outrage and rash thinking and not to enlighten and inform.
At first glance, the tips in the CRAAP test might seem straightforward, perhaps even obvious. Nonetheless, it is very easy to be seduced by an erroneous source, particularly if it confirms a belief we already hold. This is known as “confirmation bias,” and it is a universal human weakness. So, the CRAAP test is a helpful tool to prevent being misled by inaccurate information.
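For readers who think in code, the five questions above can be written down as a simple checklist. The following is only a minimal sketch in Python, not an official version of the test: the class name, the field names, and the all-or-nothing pass rule are assumptions made for illustration, and the answers still have to come from a human reader's judgment.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class CraapCheck:
    """One reader's yes/no answers to the five CRAAP questions (hypothetical structure)."""
    current: bool
    relevant: bool
    authoritative: bool
    accurate: bool
    purpose_is_to_inform: bool


def failed_criteria(check: CraapCheck) -> list[str]:
    """Return the names of the criteria the source did not meet, in CRAAP order."""
    return [f.name for f in fields(check) if not getattr(check, f.name)]


def passes_craap(check: CraapCheck) -> bool:
    """Assumed rule for this sketch: a source passes only if every criterion is met."""
    return not failed_criteria(check)


# Made-up example: a recent, on-topic post from an anonymous account that
# cites no supporting links and relies on outraged language.
viral_post = CraapCheck(
    current=True,
    relevant=True,
    authoritative=False,
    accurate=False,
    purpose_is_to_inform=False,
)

print(failed_criteria(viral_post))
print(passes_craap(viral_post))
```

Listing the failed criteria, rather than returning a bare yes or no, keeps the focus on why a source falls short, which is the habit the test is meant to build.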
Even if something you see online passes all aspects of the CRAAP test, there is no guarantee that it is a rock-solid and completely accurate source. True critical thinking requires evaluating all the underlying assumptions and evidence that support an argument, and then justifying the conclusion you reach. The CRAAP test is not a substitute for that hard work, but it does offer a way to filter out irrelevant, inaccurate content. Think of it as a contemporary tool to alleviate information overload, brought to you by your friendly local librarian.
Preserving Websites and Data Sets
Both examples offered so far — increasing access to e-books and scientific research results, and offering tips for separating the wheat from the chaff online — have deep roots in the history of librarianship. As a whole, the profession exists to increase access to information, and librarians have long assumed an educational function about what is credible and what is not. Our digital culture brings these issues into focus in new ways, but they are not new concerns entirely.
Although librarians and archivists have long worked together to preserve physical content, one entirely new challenge is the preservation of digital information published on web sites. Printed materials, particularly those published on acid-free paper, can be remarkably resilient for centuries. A web site, on the other hand, can go online today and disappear tomorrow. The reasons for this are numerous — for example, the web manager may simply stop paying the hosting fee, or maybe they keep paying the fee but most of the links on their site become unavailable, or “break,” over time. Whatever the cause, such “link rot” leaves our collective digital record incomplete.
Built by the librarians of Harvard Library’s Innovation Lab, Perma.cc is a scalable and flexible solution to the challenge of link rot. This tool creates a permanent link to an archived version of a web page, so that it is always available even if the original web host disappears. In addition, the Internet Archive offers a similar service with its Wayback Machine, which regularly crawls the web and stores cached versions of web pages to enable indefinite access to and searching of these resources. While Perma.cc and the Wayback Machine are currently in full effect, the long-term solution to link rot may involve the development of code that is automatically embedded into every new web page as a means of long-term preservation. In the meantime, these services perform an essential function of archiving the web.
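To make the idea of looking up an archived page concrete, here is a rough Python sketch built around the Wayback Machine's public availability lookup at archive.org/wayback/available. The function names are hypothetical, the sample response is hand-written for illustration rather than captured from the service, and the response layout shown is my understanding of the format, so check the Internet Archive's own documentation before relying on it. The sketch makes no network request; it only builds the query address and reads a response that has already been fetched.

```python
from __future__ import annotations

from typing import Optional
from urllib.parse import urlencode

# Public lookup address for archived snapshots.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url: str, timestamp: Optional[str] = None) -> str:
    """Build the lookup address for a page, optionally near a given date (YYYYMMDD)."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(response: dict) -> Optional[str]:
    """Return the archived copy's address from a parsed JSON response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Hand-written sample in the assumed response layout (not real captured data).
sample_response = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20170101000000/http://example.com/",
            "timestamp": "20170101000000",
            "status": "200",
        }
    }
}

print(availability_query("example.com", "20170101"))
print(closest_snapshot(sample_response))
print(closest_snapshot({"archived_snapshots": {}}))
```

A page with no archived copy comes back with an empty snapshot record, which is why the second function returns None instead of raising an error: a rotted link with no backup is an expected outcome, not an exceptional one.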
A challenge related to link rot is that of preserving digital data sets. Before scientists can write papers that describe the results of their research, they must produce the experimental data that leads them to their conclusions. This could be any type of data — earth science models that show how global warming affects the earth’s temperature; oceanographic data that tracks changes in biodiversity in our oceans; or test tube data that shows the effect of different reagents on chemical processes. For the most part, such records are kept on scientists’ hard drives or are served up on rickety web sites. Loss of access to these raw materials would directly harm future research efforts. This is where organizations like Data Refuge, which has librarians among its senior leaders, enter the picture. With a focus on federal climate and environmental research, Data Refuge provides tools and guidance that enable this data to be made available permanently. This is a particularly urgent concern now, given that many of our current political leaders deny the reality of human-caused climate change. Once again, librarianship is explicitly and unabashedly political.
As our culture evolves, librarianship is also evolving. Much of the technical expertise required to facilitate web archiving and data preservation lies in the field of computer science. In fact, some librarians are also computer scientists. There’s even a code4lib community of librarians who code. For the most part, though, computer science and librarianship are two distinct fields with many intersecting concerns. Hopefully, the librarian coder will become a much more common sight in future decades. Whether or not this occurs, the essential values of librarianship — maximizing access to quality information, facilitating critical thinking, and ensuring preservation of information regardless of format — are certain to endure. Taken collectively, these values represent the librarian’s hard-fought efforts to create a brighter, more thoughtful, and more manageable digital culture.