The most recent version of this article is available at http://www.llrx.com/features/engine3.htm.
Diana Botluk is an online legal information professional who lectures, teaches and writes about finding law-related information in an online environment. She is the author of The Legal List: Research on the Internet, and a columnist for Internet Law Researcher newsletter, where she writes a column called Finding Information on the World Wide Web. She teaches basic, advanced and online legal research at the University of Maryland, and Internet classes at CAPCON Library Network. She has lectured at many professional conferences and is actively involved in the Law Librarians Society of D.C. and the American Association of Law Libraries. She is a reference librarian at Catholic University Law School, where she earned her J.D. in 1984.
Click Here for a PDF Search Engine Information Chart
(Archived July 15, 1998)
Using a search engine is not really difficult. Type in a couple of search words, press the button, and retrieve a list of thousands of web pages containing at least one of those words. A couple of those pages will be real gems. And hundreds will be pure junk. There must be a way to deal with the deluge of information we retrieve.
A serious Web researcher is on a constant quest for more precision in Web searching. Many of us are used to searching premium databases like Lexis and Westlaw, and have learned how to use them with maximum efficiency. It is frustrating when we don’t get the same stellar results from a Web search engine. Although we appear to be going through the same motions, some underlying differences cause this frustration. First, Lexis and Westlaw are premium services. Their subscribers pay top dollar for the information they provide, and that information does not include junk.
The Web, on the other hand, has a high proportion of junk. While you can find many quality resources on the web, they are interspersed with millions of web pages filled with nothing but fluff. Thus, while performing a Web-wide search, it is inevitable that some junk will be retrieved, even though many search engines try to avoid indexing the outside fringe.
Secondly, those of us used to searching premium services are also used to having maximum control over the searches we perform. These services were designed to put the researcher in control. The search language is extremely flexible to allow for more precision, and the databases are a known quantity – we know what we’re searching through.
World Wide Web search engines, on the other hand, were designed at first for the novice, with the idea that the user knew nothing about database searching. The earliest Web search engines had the researcher relinquish all but the slightest control to the search engine program itself, on the theory that the engine could construct a better search than the user. But the search engine powers that be have been heeding the voices of researchers who desire more control over their searching. Search engines have been evolving to include more features that let us create more precise and efficient searches.
Of course, no hungry spider awaits those lost in cyberspace. Yet knowing the sundry ways in which search services collect and index data, and how these methods affect retrieval of information, may avoid frustration like that experienced by my colleague and friend.
Common Features of Search Construction
Alta Vista, Excite, HotBot, Infoseek, Lycos, Northern Light, WebCrawler
Alta Vista is a powerful search engine designed to give its users maximum control in search construction. It provides for truncation of words with a *, and for proximity of words with a ~, which means that search terms must appear within 10 words of each other in the resulting documents.
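To make the truncation and proximity ideas concrete, here is a minimal sketch in Python. It is not Alta Vista's actual code; the function names and the sample sentence are assumptions made up for illustration. It only shows the logic described above: a trailing * matches any word beginning with the stem, and two terms are "near" each other when they fall within 10 words.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into a list of words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def matches_term(word, term):
    """A trailing * on the term matches any word that begins with the stem."""
    term = term.lower()
    if term.endswith("*"):
        return word.startswith(term[:-1])
    return word == term

def near(text, term_a, term_b, window=10):
    """True if term_a and term_b occur within `window` words of each other."""
    words = tokenize(text)
    pos_a = [i for i, w in enumerate(words) if matches_term(w, term_a)]
    pos_b = [i for i, w in enumerate(words) if matches_term(w, term_b)]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

# Hypothetical sample page text.
page = "The court held that the negligent driver was liable for all damages."
print(near(page, "neglig*", "damages"))   # stem matches "negligent"
print(near(page, "neglig*", "contract"))  # "contract" never appears
```

The truncated term neglig* would equally match "negligence" or "negligently", which is why truncation broadens a search while the proximity test narrows it.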
One of Alta Vista’s most distinctive qualities is its Refine feature. Refine provides the researcher with a list of related or alternative terms. The researcher can then pick and choose among those terms, including some and excluding others from the resulting documents. Researchers using Alta Vista can also give certain search terms added weight by placing them in the ranking field on the advanced search form.
Field restrictors provide the researcher with advanced search precision by allowing a search for terms that appear only in certain parts of a document, rather than the entire document. For example, to search for pages entitled The Legal List, use the title: field restrictor followed by the terms: title:"the legal list". That way, only pages that have that title will be retrieved, rather than all Web pages mentioning The Legal List somewhere in their text.
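The difference between a field-restricted search and a full-text search can be sketched in a few lines of Python. This is an illustration only, not how any search engine is actually built; the Page structure, the search function, and the example.com addresses are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Page:
    url: str
    title: str
    body: str

def search(pages, phrase, field=None):
    """Return pages containing the phrase.

    With field="title", only the title is examined; otherwise the
    title and the body text are both searched.
    """
    phrase = phrase.lower()
    hits = []
    for page in pages:
        if field == "title":
            haystack = page.title
        else:
            haystack = page.title + " " + page.body
        if phrase in haystack.lower():
            hits.append(page)
    return hits

# Two made-up pages: one is titled The Legal List, one merely mentions it.
pages = [
    Page("http://example.com/a", "The Legal List", "A guide to law-related resources."),
    Page("http://example.com/b", "Research Tips", "See The Legal List for more sources."),
]

print([p.url for p in search(pages, "the legal list", field="title")])
print([p.url for p in search(pages, "the legal list")])
```

The title-restricted search returns only the first page, while the unrestricted search returns both, which mirrors the gain in precision described above.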
Alta Vista does some interesting things with language, too. First of all, it allows searches to be restricted to results in a particular language. Secondly, it will translate any of the resulting Web pages into the researcher’s choice of languages.
Excite is the search engine to use when you don’t know anything about search construction. It doesn’t provide as many features for constructing search statements as some of the other search engines, and therefore, can sometimes prove a little frustrating to more advanced searchers. But for those who like to type a few words and see what happens, Excite has some programming working in the background that is designed to help you get what you need without sophisticated search construction.
Excite’s default method of searching is called concept searching, or Intelligent Concept Extraction. Concept searching takes a search term and automatically generates a list of alternatives, synonyms and related terms, and searches for those, too. Thus, you might search for the term “collie” but Excite will also search for other forms of the word, like “collies,” as well as related terms, like “dog.” Concept searching only works on Excite when you type in a single word or a simple string of words. Once you add Boolean operators, like AND or OR, your search turns into a strict keyword search, and the alternative terms are not automatically searched.
Excite also has a feature called Search Wizard that appears on the results screen. The Search Wizard suggests other related terms you may wish to add to your search.
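The behavior of concept searching can be pictured with a small Python sketch. Excite's real method is not documented here, so everything below is an assumption for illustration: the RELATED_TERMS table stands in for whatever Excite generates automatically, and terms_to_search is a made-up name. The sketch only captures the rule stated above, that a plain query is expanded while a query containing AND or OR is searched strictly as typed.

```python
# Hypothetical stand-in for the related terms Excite generates on its own.
RELATED_TERMS = {
    "collie": ["collies", "dog"],
}

BOOLEAN_OPERATORS = {"AND", "OR"}

def terms_to_search(query):
    """Return the list of terms that would actually be searched."""
    words = query.split()
    if any(w in BOOLEAN_OPERATORS for w in words):
        # Boolean operators present: strict keyword search, no expansion.
        return [w.lower() for w in words if w not in BOOLEAN_OPERATORS]
    expanded = []
    for w in words:
        w = w.lower()
        expanded.append(w)
        # Add alternative forms and related terms, if any are known.
        expanded.extend(RELATED_TERMS.get(w, []))
    return expanded

print(terms_to_search("collie"))
print(terms_to_search("collie AND puppy"))
```

The first query is widened to include the related terms; the second is left alone because it contains a Boolean operator.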
Excite’s “More Like This” feature allows searches for other web pages that are similar to a particular result. Once you have a list of search results you can choose a page that most suits your needs, then click on “More Like This.” Excite automatically formulates a search designed to retrieve similar pages.
HotBot is a favorite search engine of many experienced researchers. The beauty of HotBot is its form-based interface. HotBot’s advanced search form allows searchers to construct a sophisticated search without remembering complicated search language. Just fill in the appropriate portions of the form, and HotBot will put it all together for you. HotBot is an ideal search engine to use when you want to restrict parts of a search to particular fields, since the field restrictors are built into the search form. The form allows a search to be restricted by date, location, media type, or page depth. For example, to search for audio files containing the theme song from Perry Mason, type perry mason theme song in the search box, and select audio as the media type.
Infoseek is another search engine that permits field restrictions to be built into its searches. However, Infoseek’s most distinctive feature is its ability to narrow searches in levels. Infoseek allows searchers to begin with a broad search and retrieve a certain set of results. Searchers can then perform a second search with the option of searching through only the results from the first search, rather than searching the entire database of Web pages all over again. Searches can continue to be narrowed in this fashion until you get the results you want.
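Narrowing in levels amounts to feeding the results of one search into the next. The Python sketch below illustrates the idea under simplifying assumptions: pages are represented by made-up title strings, and the search function is a hypothetical substring match, not Infoseek's actual retrieval method.

```python
def search(pages, term):
    """Return the pages whose text contains the term (case-insensitive)."""
    term = term.lower()
    return [p for p in pages if term in p.lower()]

# A tiny made-up "database" of page titles.
pages = [
    "Copyright law and the Internet",
    "Copyright registration forms",
    "Internet domain name disputes",
    "Trademark law basics",
]

# Level 1: a broad search over the whole database.
first = search(pages, "copyright")

# Level 2: search only within the first set of results.
second = search(first, "internet")

print(first)
print(second)
```

Because the second search runs only over the first result set, the page about domain name disputes is never considered, even though it mentions the Internet.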
Infoseek’s advanced search form assists with search construction. It allows the search to be restricted by domain or location, and also allows it to be focused to one of several topical categories.
One of Lycos’ most distinctive features is its wide variety of flexible proximity connectors. It allows its users to specify how near two words must be to each other, how far apart they must be, and even the order in which they should appear. Be sure to check the chart for the various proximity options.
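The three kinds of proximity control mentioned here (maximum distance, minimum distance, and word order) can be sketched in Python as follows. The actual Lycos connector names are listed in the chart and are not reproduced here; the proximity function and its max_gap, min_gap and ordered parameters are hypothetical names chosen only for this illustration.

```python
import re

def positions(words, term):
    """Indexes at which the term occurs in the word list."""
    return [i for i, w in enumerate(words) if w == term]

def proximity(text, first, second, max_gap=None, min_gap=None, ordered=False):
    """Test whether two words satisfy the requested proximity constraints.

    max_gap: the words must be no more than this many words apart.
    min_gap: the words must be at least this many words apart.
    ordered: `first` must appear before `second`.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    for i in positions(words, first.lower()):
        for j in positions(words, second.lower()):
            gap = j - i
            if ordered and gap <= 0:
                continue
            distance = abs(gap)
            if max_gap is not None and distance > max_gap:
                continue
            if min_gap is not None and distance < min_gap:
                continue
            return True
    return False

# Hypothetical sample sentence.
text = "The judge instructed the jury before the jury retired."
print(proximity(text, "judge", "jury", max_gap=3))                # near
print(proximity(text, "jury", "judge", max_gap=3, ordered=True))  # wrong order
print(proximity(text, "judge", "jury", min_gap=5))                # far apart
```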
Additionally, Lycos’ advanced search form allows a search to be restricted by language, domain or title. It provides search construction options directly on the form, making sophisticated searching a little easier.
Northern Light is a search engine for serious researchers, although their online search ability need not be at an advanced level. Northern Light is really a hybrid that allows simultaneous searching through both World Wide Web pages and a special collection of journal articles and reference sources. The search engine part of Northern Light is free, although there is a small fee for each of the special collection articles retrieved, if the researcher chooses to read the full text. Northern Light also allows its searches to be restricted by industry category.
One of Northern Light’s most distinctive features is its organization of results into customized search folders. Northern Light really provides two results lists. One lists results by relevancy score, like most search engines. But the other divides results into similar categories, like pages from a single domain or a certain type of domain, and organizes them into easy-to-use customized search folders.
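Grouping results by domain, or by type of domain, is easy to picture in code. The Python sketch below is not Northern Light's method, which surely involves more than this; it simply sorts a list of made-up addresses into folders keyed by host name and by top-level domain. The function names and URLs are assumptions for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse

def folders_by_domain(urls):
    """Group result URLs into folders, one per host name."""
    folders = defaultdict(list)
    for url in urls:
        folders[urlparse(url).netloc].append(url)
    return dict(folders)

def folders_by_domain_type(urls):
    """Group result URLs by the type of domain (edu, gov, com, ...)."""
    folders = defaultdict(list)
    for url in urls:
        host = urlparse(url).netloc
        folders[host.rsplit(".", 1)[-1]].append(url)
    return dict(folders)

# Hypothetical search results.
results = [
    "http://www.example.edu/law/library.html",
    "http://www.example.edu/law/journals.html",
    "http://www.example.gov/courts.html",
    "http://www.example.com/legal-forms.html",
]

print(folders_by_domain(results))
print(folders_by_domain_type(results))
```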
WebCrawler is another search engine that provides a proximity connector for more flexible searching. Additionally, WebCrawler, like Excite, will automatically construct a search based on one of the pages from the initial results. Just choose a result that suits your needs, and click on “Similar Pages.” WebCrawler does the work and puts the new search together for you.
Is Yahoo! a search engine? While Yahoo! is often compared to search engines and appears on lists of search engines, Yahoo! is really a subject directory that classifies web sites into topical categories. It is easy to use. A main menu of major categories appears on the first screen, and as you point and click, the categories get narrower until you find the web site that suits your needs.
While Yahoo! is a subject directory, it does have a search engine component. The search box appearing on any Yahoo! page allows you to search through the Yahoo! directory to locate categories or Web site descriptions within Yahoo! that contain the key words you search. If no category or description satisfies the search request, then and only then, will Yahoo! go beyond its own directory pages to try to locate web pages that meet your search criteria. At that point Yahoo! links to Alta Vista and provides results through that service.
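The "directory first, then the Web" sequence described above can be sketched in Python. This is a simplified illustration, not Yahoo!'s code: the directory contents are invented, and fake_web_search is a stand-in for the hand-off to an outside search engine.

```python
def search_directory(directory, keyword):
    """Return category names whose name or description contains the keyword."""
    keyword = keyword.lower()
    return [
        name
        for name, description in directory.items()
        if keyword in name.lower() or keyword in description.lower()
    ]

def search_with_fallback(directory, keyword, web_search):
    """Search the directory; go to the Web engine only if nothing matches."""
    hits = search_directory(directory, keyword)
    if hits:
        return ("directory", hits)
    return ("web", web_search(keyword))

# Hypothetical directory entries: category name -> description.
directory = {
    "Law Schools": "Web sites of accredited law schools",
    "Courts": "Federal and state court sites",
}

def fake_web_search(keyword):
    """Stand-in for the outside search engine; returns a placeholder result."""
    return [f"web page mentioning {keyword}"]

print(search_with_fallback(directory, "courts", fake_web_search))
print(search_with_fallback(directory, "bankruptcy", fake_web_search))
```

The first search is satisfied by the directory itself, so the outside engine is never consulted; the second finds nothing in the directory and falls through to the Web search.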
Many of the search engines have a subject directory component. Excite Channels, HotBot’s Wired Cybrarian, Infoseek Channels, Lycos Web Guides, and WebCrawler Channels all attempt to classify information into subject categories like Yahoo! They allow researchers to restrict their searches to certain subject areas by using these topical directories.
Other Special Features
Finally, most of the search engines offer access to other special features, from free e-mail accounts, to stock quotes, to UPS package tracking, to weather and travel information. Many will allow you to search through Usenet postings and recent news articles. You will also be able to search for e-mail addresses and company information. See the PDF chart for many of the features offered at each of the specific search engines.