There's A Lot Going on in the World of High-Tech Libraries
Kumar Percy is the Head of Reserve and Media Services for the University of Texas Tarlton Law Library. He is a member of the State Bar of California with a law degree from Tulane Law School and an MLIS from San Jose State University.
Information Today's 16th Annual Computers in Libraries conference was held at the Washington Hilton and Towers from March 14-16, 2001. The event was designed to discuss current uses of technology in libraries and upcoming innovations. With that goal in mind, Information Today organized an exhibit hall, pre-conference workshops, and three full days of programs. For those interested in school libraries, there was another half day of post-conference sessions. As if that were not enough, receptions at the end of the first two days let participants network.
The conference succeeded in bringing together speakers from the information industry and the library world. Together they represented both sides of many library-related controversies such as the future of copyright and database protection.
Search Engines Under a Microscope
Greg Notess is an authority on Internet search engines and the host of one of the most comprehensive websites on the topic, Search Engine Showdown (http://www.searchengineshowdown.com/). For several years, Notess has hosted an annual review of the state of Internet searching. This year Gil Elbaz of Oingo.com, Shawn McCarthy of Lycos, Tim Bray of Antarcti.ca, and Stephen Arnold of Arnold Information Technology joined him.
Notess started the program by discussing recent trends. The biggest one is that many search engines have died out. Infoseek is gone. Ultraseek, the Open Text Index, Webcrawler, and Magellan are all dead. The Alta Vista Usenet database is gone as well.
Search features are also disappearing. For example, Infoseek used to search the "alt" tags of webpages. "Alt" tags are used to explain pictures to the visually impaired. They are often the best descriptions of pictures on the web. Now that Infoseek is gone it is no longer possible to find webpages using "alt" tag descriptions.
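As a rough illustration of what searching "alt" tags involves, the sketch below collects the alt text of each image on a page and matches a search term against it. It is only a sketch under stated assumptions: the names AltTextCollector and find_images are made up for this example, and nothing here reflects how Infoseek actually built its index.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collects (image source, alt text) pairs from a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.descriptions = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            # An alt attribute may be missing or empty; skip those images.
            alt = (attributes.get("alt") or "").strip()
            if alt:
                self.descriptions.append((attributes.get("src"), alt))


def find_images(html, term):
    """Return the sources of images whose alt text contains the term."""
    collector = AltTextCollector()
    collector.feed(html)
    collector.close()
    return [src for src, alt in collector.descriptions
            if term.lower() in alt.lower()]
```

Given a page with an image described as "Reading room of a law library", a call such as find_images(page, "library") would return that image's source, while images without alt text are never found this way.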
Another trend is that more search engines are charging for inclusion in their web databases. Yahoo and Looksmart charge for consideration. Inktomi has some paid inclusion, and GoTo has paid positioning.
The good news is that search engines are still free or cheap, and they are getting better every year. Search engines are now indexing PDF documents. Until recently PDFs were considered part of the "invisible web" because search engines would not find them.
Google is also indexing hypertext links. This feature lets users search for sites based upon how others link to them. The text of a hyperlink often describes a webpage differently than the page's own author does. Hyperlink searching gives you the chance to find useful pages that are not well described by their authors.
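The idea of searching by link text can be sketched in a few lines. The example below gathers the words other pages use when linking to a target and files the target under each word. It is a toy illustration, not Google's method: AnchorCollector, build_link_index, and search are hypothetical names, and a real engine would add ranking, stemming, and much more.

```python
from collections import defaultdict
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects (href, link text) pairs from one page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = " ".join("".join(self._text).split())
            self.links.append((self._href, text))
            self._href = None


def build_link_index(pages):
    """Map each word of link text to the set of pages it points to."""
    index = defaultdict(set)
    for html in pages:
        collector = AnchorCollector()
        collector.feed(html)
        collector.close()
        for href, text in collector.links:
            for word in text.lower().split():
                index[word].add(href)
    return index


def search(index, word):
    """Return the link targets that other pages describe with this word."""
    return sorted(index.get(word.lower(), set()))
```

A page that never uses the word "comparison" itself can still be found with search(index, "comparison") if someone else links to it with that word.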
Gil Elbaz demonstrated Oingo.com (http://www.oingo.com), which features concept searching. Requests for “jobs” would also lead to employment agencies and resume services. Faced with a query about “dc publications,” Oingo would ask the users if they mean DC Comics, articles about direct current, or publishing houses in Washington DC.
Tim Bray later discussed the difficulties of concept searching. Incorporating an index into a search engine requires thousands of decisions by catalogers. It is even harder if more than one language is indexed. For example, he commented that many languages do not use spaces between words.
Bray also spoke about Antarcti.ca (http://www.antarcti.ca/). Antarcti provides graphical representations of databases. The system is based upon Edward Tufte’s theories that maps are the best way to convey information. To see an example of Antarcti’s work, look at its map of the Internet (http://maps.map.net/start). For more about Tufte’s ideas, look at his books, including The Visual Display of Quantitative Information.
To learn more about the current state of Internet searching, look at Notess' site (http://www.searchengineshowdown.com/). He is always on the cutting edge of the topic.
Dynamic Websites -- Databases, XML, and ASPs
One of the recurrent themes of the conference was that webmasters should consider creating database-driven websites. They are more dynamic and easier to administer. In these systems the text of the website is housed in a database. A program grabs the information out of the databases and creates a new web page every time someone looks at the site. For example, a library can create a database with library hours for the year. A webmaster in the library could then use a program to make a webpage from the information in that database. Each time someone looked at the site it would automatically generate a page with the current hours.
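The library hours example can be sketched as follows. This is a minimal illustration under stated assumptions, not a recommended production setup: it uses Python's built-in sqlite3 module rather than any product discussed at the conference, and the table layout, sample hours, and function names (build_hours_db, render_hours_page) are invented for the example.

```python
import sqlite3
from html import escape


def build_hours_db():
    """Create a small in-memory database of opening hours (sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hours (day TEXT PRIMARY KEY, open_time TEXT, close_time TEXT)"
    )
    conn.executemany(
        "INSERT INTO hours VALUES (?, ?, ?)",
        [
            ("Monday", "8:00 am", "10:00 pm"),
            ("Saturday", "10:00 am", "6:00 pm"),
        ],
    )
    conn.commit()
    return conn


def render_hours_page(conn, day):
    """Build the page from whatever the database holds at request time."""
    row = conn.execute(
        "SELECT open_time, close_time FROM hours WHERE day = ?", (day,)
    ).fetchone()
    if row is None:
        body = f"No hours are listed for {escape(day)}."
    else:
        body = f"{escape(day)}: {escape(row[0])} to {escape(row[1])}"
    return f"<html><body><h1>Library Hours</h1><p>{body}</p></body></html>"
```

Because the page is built on each request, changing a row in the hours table changes the next page served, and nobody has to edit HTML by hand.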
The main advantage of this system is that non-technical staff can update the website. It also means that the technical staff can redesign the look and feel of the site without touching the information that the library wants to display. Marshall Breeding, from Vanderbilt University, stated it best when he said that the information of a website should be separate from the presentation. In his program Breeding discussed the technical aspects of creating a database-driven website. To work effectively with a web-based database you need database software, an interface layer to generate the webpage, and middleware to pass data between the two.
Breeding made the following suggestions. For database software, he recommended Oracle, MySQL, SQL Server, or DB/TextWorks. Web application servers included ColdFusion, IBM WebSphere, and Oracle Web Application Server. He also suggested using Perl, Java, C++, PHP, or Visual Basic to create the webpages. ODBC, DBI, or SQL can address the database access layer. Dynamic websites do not have to include all of these layers of control, but there must be some way to get the information from the database onto the web browser.
Creating a dynamic website can be done cheaply if the organization has someone with the technical knowledge. Alternatively, a library could outsource the project to an application service provider (an ASP). An ASP can design and maintain the dynamic website, freeing the authors of the site to create the content.
Portability is a key concern with any project. Many ASPs put data into proprietary software, which is not compatible with other systems. In that case, it would be very difficult to end a relationship with an ASP because the data could not be easily transferred into another database.
One way to ensure portability is to use only XML-ready systems. The World Wide Web Consortium (W3C) designed XML, Extensible Markup Language, to allow for more interaction between databases. Because it is a standardized language, XML systems are more portable. Stephen Arnold of Arnold Information Technology talked about these and other content management concerns. Roy Tennant also discussed several of XML's advantages during a half-day workshop program. For more information about XML look at Yahoo's XML page (http://search.yahoo.com/bin/search?p=xml), or W3C's own XML website (http://www.w3.org/TR/REC-xml).
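To make the portability point concrete, the sketch below writes a few records out as XML and reads them back, which is roughly what moving data from one system to another requires. It is an illustration only: the element names (hours, day, open, close) and the functions records_to_xml and xml_to_records are assumptions made for this example, not part of any standard schema or vendor product.

```python
import xml.etree.ElementTree as ET


def records_to_xml(records):
    """Export records as an XML string that another system could read."""
    root = ET.Element("hours")
    for rec in records:
        day = ET.SubElement(root, "day", name=rec["day"])
        ET.SubElement(day, "open").text = rec["open"]
        ET.SubElement(day, "close").text = rec["close"]
    return ET.tostring(root, encoding="unicode")


def xml_to_records(xml_text):
    """Import the same records on the receiving side."""
    root = ET.fromstring(xml_text)
    return [
        {
            "day": day.get("name"),
            "open": day.findtext("open"),
            "close": day.findtext("close"),
        }
        for day in root.findall("day")
    ]
```

If the data survives this round trip unchanged, the library is not tied to whichever program produced it.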
Useful Tools: Gadgets in Libraries
Another theme of the program was how to effectively harness the power of high-tech gadgets. The programs discussed e-books and handheld PDAs in detail. Susan Gibbons of the University of Rochester reported on her study of e-books. She asked library patrons to read a book with an e-book reader. To her surprise she discovered that patrons quickly embraced the books; 67% finished one or more books. None of them complained about eyestrain. Thirty-five percent preferred e-books over print, and another 23% had no preference between print and e-books. Participants enjoyed the ability to increase the font size of the book. They also liked being able to read in the dark with the backlighting. Those who traveled found the e-book easier to carry than most paper books. Other advantages of e-books are the ability to hyperlink to a dictionary and to create bookmarks.
Gibbons suggested several uses for e-books. Those with disabilities might like the ease of use. The e-book's hyperlink ability might help people learn a language by linking a translation to a foreign language work. The voice e-book is an innovation that will read books aloud. It may help those with learning disabilities because they could listen to the e-book while reading along with a book.
During a segment on wireless libraries, Denise Watkins and Nancy Carroll-Klein from SmithKline Beecham Pharmaceuticals discussed PDAs, such as Palms. They commented that the little computers are more than expensive schedulers. The power of PDAs is the ability to easily transport literature. Newer models allow wireless delivery of information. The speakers discussed the possibility that libraries could deliver current awareness news to a patron's PDA. They also listed several information providers that already create content for PDAs. These include newspapers, magazines, trade journals, map makers, and travel publications.
However, there are significant problems with PDAs. They are expensive to purchase. Wireless access may be convenient, but it comes with a large monthly bill. Finally, there is very little security. It is not currently possible to encrypt information on a PDA.
Sandy Schlosser of Consumers Union (creators of Consumer Reports) discussed how her library creates e-mail alert services for its Intranet. The librarians continually scan the relevant literature and create news updates, which are regularly delivered to interested staff members. The information is archived and searchable on the company's Intranet. One big advantage is that the librarians have become subject-matter experts in the organization.
Napster in ILL Land: Docster
Daniel Chudnov wrote an article entitled “Docster: Instant Document Delivery” (http://shylibrarian.com/ebooks/articles/docster.htm), in which he discusses the possibility of using Napster-like technology in libraries. Computers in Libraries hosted two programs based upon this work. During those sessions, Chudnov explained that Napster works much like an Internet Relay Chat (IRC) channel. IRC protocols allow the server administrator to control who can log on and what they can transfer over the channel. Napster searches for songs using metadata nametags embedded in each MP3 file. The artist’s name, the song’s title, and other metadata are easily found in a standardized format at the beginning of each file.
Chudnov suggests that we use this technology to share electronic documents through a Napster-like server. Libraries already share documents by scanning them and sending them electronically with the Ariel system. Docster would take ILL one step further. The main advantage of this system is the ease of administration. Administrators could limit usage to only authorized users. They could also block unauthorized file transfers. Additionally, libraries could track usage of each document and then pay copyright fees accordingly. In order to better find a specific document, libraries would have to embed identifying metadata in each one, just like MP3 files. With Napster anyone can listen to all of the commercial recordings in the world. Even the most obscure recordings can be found there. Docster would allow libraries to go one step further and share all the knowledge of the world.
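The administrative features described above (authorized users only, searchable metadata, and usage counts for copyright fees) can be sketched in miniature. This is a hypothetical illustration, not Chudnov's design or any real Docster software: the DocumentServer class, its methods, and the identifiers in the usage note are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class DocumentServer:
    """A toy central server that controls access and counts requests."""

    authorized_users: set
    catalog: dict = field(default_factory=dict)    # doc_id -> metadata
    usage: Counter = field(default_factory=Counter)  # doc_id -> request count

    def add_document(self, doc_id, title, author):
        # The metadata plays the role of the tags embedded in an MP3 file.
        self.catalog[doc_id] = {"title": title, "author": author}

    def search(self, term):
        """Find documents whose title or author contains the term."""
        term = term.lower()
        return sorted(
            doc_id
            for doc_id, meta in self.catalog.items()
            if term in meta["title"].lower() or term in meta["author"].lower()
        )

    def request(self, user, doc_id):
        """Hand over a document's record only to authorized users."""
        if user not in self.authorized_users:
            raise PermissionError(f"{user} is not an authorized user")
        if doc_id not in self.catalog:
            raise KeyError(doc_id)
        self.usage[doc_id] += 1  # tallied so copyright fees can be paid later
        return self.catalog[doc_id]
```

For example, a server created with authorized_users={"member-library"} would fill a request from "member-library", refuse one from anyone else, and leave a per-document tally in usage that could be reported to a rights holder.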
Tomas Lipinski and Roy Tennant rounded out the Docster programs. Lipinski gave the audience a legal overview by explaining secondary and vicarious copyright infringement. He explained that Napster itself does not infringe on copyrights; rather, it is liable for leading people to material that infringes on copyrights. He suggested that libraries should also avoid referring people to online sites with infringing materials.
Tennant finished the program by talking about Gnutella and peer-to-peer networking. Gnutella has no central point of control. Instead people share access to files over the web's HTTP protocols. Roughly 40,000 computers use Gnutella every day. The only way to stop them would be to destroy the web. He explained that he found everything he searched for on Gnutella, including a full text copy of the Harry Potter novels. Tennant believes that Gnutella will end copyright as we know it. He also believes that copy protection software cannot work. For example, hackers recently broke the copy protection programs embedded in DVD films. (For more about DVD copy protection see the Copyleft organization’s website, http://www.copyleft.net/index.phtml.)
Tennant sees libraries playing a key role in a world without effective copyright rules. Much of the information online is inaccurate or could be falsified. Free online books could be fakes. On the other hand, libraries only provide authorized copies of books. In the world of peer-to-peer sharing libraries will remain the comprehensive source of authority.
The Real Dirt about the Conference
As a whole, Computers in Libraries 2001 provided a valuable overview of library innovations. Like any conference, some of the programs were dull, and many of the speakers had viewpoints very different from mine. Still, it was very informative. The conference highlighted those who are experimenting with cutting-edge ideas. It provided me with an opportunity to see what non-law libraries are doing with technology. By the end of the conference I felt that I had learned about new technologies and applications.