Features - Getting It Right: Verifying Sources on the Net

by Sabrina I. Pacifici

Sabrina I. Pacifici is the Founder, Editor, Publisher, and Web Manager of LLRX.com. For the past 23 years, Sabrina has worked as a library director and legal researcher in Washington, D.C. She is the author of numerous articles on legal-tech issues, many of which are available on LLRX.

Introduction

This guide provides strategies and tools to assist you in the task of evaluating web site content. Determinations about quality, value and reliability are certainly subject to the discretion of the reader, and can be based on personal or professional perspective. When deciding whether to use a site with confidence, consider that the criteria for evaluating web site content are similar to those used for print publications, including newspapers, magazines and newsletters.

However, choosing to rely on content published on web sites does require an extra level of due diligence on the part of the reader. Web sites exist in a more fluid and time-sensitive environment than do print publications. Sites are subject to rapid changes in the marketplace. New sites are constantly introduced while the ‘lights are being turned off’ at hundreds of established, content-rich sites each year, often with no warning. For example, the abrupt closing of the well-regarded Industry Standard site (http://www.thestandard.com) in September 2001 was a real loss to its loyal readers. Unusually, the site left its archives online. More often, content from defunct sites disappears without a trace (other than perhaps via Google’s cached pages or the Internet Archive). On the other hand, sites of dubious value continue to flourish despite the fact that they do not meet quality standards acknowledged by professional organizations, or even readers at large. The web provides access to content that spans the spectrum from the good, to the bad, to the downright ugly. Out of necessity, researchers have to be diligent in their efforts to evaluate sites with a critical eye.

With this caveat in mind, I offer the following information for your consideration. It is by no means a comprehensive analysis, but rather a resource to assist you in the process of separating the wheat from the chaff online. I will not address the issue of e-commerce sites or search engines, as these topics present challenges that are so extensive that they warrant their own article. However, many of the resources I recommend can also be useful in evaluating these types of sites.

Beginning Your Evaluation

In evaluating sites for which there is no immediate recognition factor, it is a good idea to start at the beginning - the web site address. The domain address itself will in many cases provide you with important information: .edu indicates the site should be sponsored by an academic institution, .gov or .us indicates sponsorship by the federal government or a state, .mil is a U.S. military site, .com or .net indicates a commercial site, .org is supposed to be for nonprofits (apparently not always the case), .info is intended for informational sites (registration is open to anyone), and sites located outside the U.S. use a two-letter designation (e.g., .uk for United Kingdom; see http://www.norid.no/domreg.html or http://www.iana.org/cctld/cctld-whois.htm for a list of domain registries worldwide). Web users will have different expectations of a site’s content based on the designation associated with each of these domains. However, researchers should be aware that the regulation of domain names can be inconsistent, resulting in variations from specified designations. A recent article of interest on the newest domain names is “New suffixes help increase Internet's population,” from the February 18, 2002 NandoTimes.
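As a rough illustration of the reading described above, the following Python sketch maps a site's address to the expectation its suffix suggests. The function name `describe_domain` and the wording of the labels are assumptions made for this example; as noted, actual registrations do not always match the designation, so the result is a starting hint and not a verdict.

```python
from urllib.parse import urlparse

# Expectations attached to each suffix, paraphrased from the text.
# The labels are illustrative; domain regulation is inconsistent in practice.
SUFFIX_HINTS = {
    "edu": "academic institution",
    "gov": "federal or state government",
    "us": "federal or state government",
    "mil": "U.S. military",
    "com": "commercial",
    "net": "commercial",
    "org": "nonprofit (not always the case)",
    "info": "information site",
}


def describe_domain(address: str) -> str:
    """Return the expectation suggested by the final suffix of the address."""
    # urlparse only fills in the host when a scheme is present.
    if "://" not in address:
        address = "http://" + address
    host = urlparse(address).hostname or ""
    suffix = host.rstrip(".").rsplit(".", 1)[-1].lower()
    if suffix in SUFFIX_HINTS:
        return SUFFIX_HINTS[suffix]
    if len(suffix) == 2 and suffix.isalpha():
        return "country-code domain (site registered outside the U.S.)"
    return "unrecognized suffix"


if __name__ == "__main__":
    for site in ("http://www.sec.gov/", "www.llrx.com", "http://www.nic.uk/"):
        print(site, "->", describe_domain(site))
```

The explicit table is checked before the two-letter rule so that .us is reported the way the text describes it and not as a foreign country code.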

According to NetFactual.com (http://www.netfactual.com), there are more than 30 million sites with top-level domains (TLDs), and of these, over 75% are dot-coms. [New TLDs are currently being introduced (see http://www.internic.net/faqs/new-tlds.html).] Thirty million is a staggering number, but to put it into perspective, you will probably use somewhere between 10 and 30 sites on a daily basis. In making this estimate, I am not including search engines.

A web site’s name may or may not be useful in helping to establish its credibility, but a well recognized name often does confer a higher level of trust. For example, web sites owned by news media organizations and newspapers such as CNN, NBC, the New York Times and the Washington Post, among many others, trigger a positive response from users as these organizations are already well entrenched in the marketplace. In addition, content on these sites is subject to a reasonably high level of editorial review, thereby increasing the confidence of readers.

Chances are that you will also be inclined to trust the reliability of web site content published by academic institutions and federal government or state agencies in much the same manner. This is not to say that the aforementioned sites never publish content that contains errors, displays a bias of some kind, or supports a specific agenda. In general, however, verifying sources from these sites is less of an issue than with sites that indicate no discernible affiliation, and whose domain addresses offer little or no indication concerning the site’s content and origin.

That being said, in addition to the heavyweight players on the web, researchers often choose to rely on sites maintained by advocacy groups, individuals or companies who have worked diligently to establish their credibility and reputation online. Researchers are attracted to such sites due to their narrower focus, often encompassing just a few clearly defined content areas. Such sites have earned their readership by consistently publishing comprehensive, focused materials that are authoritatively authored, well edited, and concisely presented over a period of several years (in ‘web time’, that means a lot!).

Is Time of the Essence? Ask Your Librarian

After noting the designation provided by the domain address, the chances are that you will probably be dealing with a .com site. Consequently, you need to decide how much time and effort you want to spend conducting research about a specific site. There are many ways to investigate a site, but if you are fortunate enough to have a law librarian in your firm or organization, then he/she will be an invaluable resource in assisting you with this task. If your organization also supports a research intranet, then you are really in luck.

With extensive experience using the web, law librarians routinely make determinations about the value of sites on behalf of their user communities. These sites generally represent subject matter that includes government and court documents, current awareness and news, marketing, professional activities, continuing education and e-commerce. In addition, law librarians are often responsible for maintaining and updating research intranets that provide links to fee and free web sites that they have carefully vetted for inclusion. Their intranet related work generally includes continually refreshing content and eliminating sites that no longer exist, or those whose ownership has substantially changed, thereby rendering them no longer useful. As an example of how sites can become unreliable, in an article originally published in the January 14, 2002 issue of the Baltimore Sun, the author said that “a cyberspeculator has bought the rights to baltimoremaryland.com and a host of other dot-com city names, and linked all these seemingly geographic domain names to a pornographic Web site” (see http://www.sunspot.net/technology/balte.md.internet14jan14.story?coll=bal%2Dtechnology%2Dheadlines).

The Internet Corporation for Assigned Names and Numbers (ICANN) finally responded to the rising chorus of complaints about cybersquatting voiced by web users and site owners by proposing a "grace period" to allow legitimate domain name holders to renew registrations that may have lapsed. ICANN's White Paper on this issue, Redemption Grace Periods for Deleted Names, was posted on February 14, 2002.

Librarians are acutely aware of the importance of researching domain addresses, especially in light of a flurry of cybersquatting by porn sites that escalated in early 2002. This is a trend that concerns seasoned web researchers as well as casual Internet users. It was precipitated by legitimate web site owners who unfortunately let their registrations lapse. Their addresses were then quickly purchased by so-called ‘porn-nappers’ who then redirected traffic through these portals to their sites. To make matters worse, the porn-nappers then offered to sell back the addresses to the previous owners at inflated costs. This is a real racket, and it has ruined the reputation of some sites.

Unsuspecting users who went to one of these sites, believing it to be a legitimate resource, actually landed in another world altogether, so to speak. News of this new type of cybersquatting spread on listservs and in online news updates, and librarians quickly advised their user communities to change their ‘favorites’ or ‘bookmarks’ to avoid wasting time on what once may have been useful resources.

Test Driving A Site:  A Checklist

First impressions are often the most important, so your immediate, visceral reaction to a web site is often a fairly good indicator as to whether you want to undertake a more thorough review to concretely establish its reliability and value. Initially, it is always advisable to take the site out for a test drive. This first look should take you several levels down into the site so that you can adequately put it through its paces.

While you are reviewing the site, consider the following questions, the answers to which can contribute to your final decision to add the site to your “must-read” list.

1. Is it a site that primarily publishes opinions or commentary?
   - Is it a vanity site?
   - Is it a site sponsored by a law firm or corporation?
   - Is it a blog?
   - Is it an index that links to other sites and content?
   - Is it a webzine that publishes original content?
   - Is it a news media or newspaper site?
   - Is it an advocacy site?
   - Is it an e-commerce site?
2. What domain address does the site have (commercial, academic, government, military, etc.)?
3. Is there a privacy statement?
4. Does the site require registration?
   - Does the site collect basic or extensive data on users?
   - Do you feel comfortable providing such data?
5. Is there a copyright notice on the site and its content?
6. Is the site unique?
7. Is the site always available and are the internal and external links reliable?
8. Does the design, organization, navigation and overall “look and feel” of the site indicate a professional effort?
9. Does the site publish original content, and if so, is it well researched, appropriately documented, and updated regularly as necessary?
10. Is the content comprehensive in terms of coverage of the respective topics, well written and authoritative?
11. Is information clearly posted throughout the site indicating the date on which content was published/updated?
12. Is the content primary or secondary?
13. Does the site support an archive?
14. Does the site provide an effective search engine to locate and access all the content?
15. What are the credentials of the authors, editors, and publishers?
16. Is there an “About” area on the site that provides data about its ownership, editorial policy, authors, publication schedule, targeted user community, and fee or free status?
17. Does the site represent a specific cause, issue, bias, or clearly maintain an affiliation with another entity?
18. What is the focus of the site’s content, and does it meet the parameters of this focus effectively and consistently? Is the content tailored to a specific user community, such as legal, marketing, IT, etc.?
19. Does the site load quickly and completely, regardless of browser?
20. Is there contact information for owner(s), editors and authors?

That is a lot of questions to ask, but it may well be worth your time to do so if you intend to rely on web site data for important research tasks. To condense this long checklist into one that is easier to remember, you can use the acronym for a web site evaluation system created by professor and author Robert A. Harris, called CARS (Credibility, Accuracy, Reasonableness, Support) [1]. These four areas identify the core factors that will help you verify sources on the Web. Please remember that these touchstones are not infallible, and it is not always possible to establish with absolute certainty the integrity of all content that you read and use in any medium.

Investigating a Site: Start With the Domain Registration and the Authors

The one certainty about the Web is that it is always changing, with established sites failing while other sites are regularly introduced. It is indeed a challenge to stay abreast of all the sites that may be useful to you, let alone effectively assess their reliability. As a result, researchers may be inclined to stick with a group of “tried and true” veterans; sites upon which they have come to rely for their timeliness, high quality content, and focus. There is nothing wrong with this approach, but there are many sites that could prove useful to you were you made aware of them, and in turn felt confident referring to them.

Let’s start with some initial steps you can take to locate basic information about a site. You have identified a web site that you find interesting and informative. You have reviewed the questions above and answered many of them, but think it is appropriate to do some background checking. Where should you start?

There are quick, free ways to locate information about a site, or there are more time consuming methods that involve some cost. The path to pursue is up to you, and may also involve considerations such as whether you are recommending specific sites to clients, want to use them for research, or are just investigating them for casual, personal use. In any case, the more informed you are, the more confident you will feel about using a site.

Begin with the free resources that will often provide domain registration information, which includes the following: name, postal address, telephone number, fax number, email address, domain server data, the date the registration was entered and the date it expires, and in some cases, the latest date on which the registration was updated.

1. VeriSign: http://www.netsol.com/cgi-bin/whois/whois
2. ARIN: http://www.arin.net/whois/
3. InterNIC: http://www.internic.net/whois.html
4. For the UK, try Nominet.uk: http://www.nic.uk/
5. SamSpade.org: http://www.samspade.org/
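The lookup pages listed above are front ends to the WHOIS service, which can also be queried directly: a client opens a TCP connection to port 43, sends the domain name followed by a line break, and reads back plain text. The sketch below is a minimal illustration under stated assumptions and is not how any of the listed sites is implemented. The default server `whois.verisign-grs.com` only answers for .com and .net names, the function names are invented for this example, and the field labels it looks for are examples that differ from registry to registry.

```python
import socket

# Label fragments to look for. These are assumptions: each registry
# formats its replies differently, so adjust them to the server you query.
FIELDS_OF_INTEREST = ("registrant", "registrar", "creation date",
                      "expir", "updated date")


def whois_query(domain: str, server: str = "whois.verisign-grs.com",
                timeout: float = 10.0) -> str:
    """Send one WHOIS query over TCP port 43 and return the text reply."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def registration_details(whois_text: str) -> dict:
    """Pick out 'Label: value' lines whose label mentions a field of interest."""
    details = {}
    for line in whois_text.splitlines():
        label, sep, value = line.partition(":")
        label, value = label.strip(), value.strip()
        if not sep or not value:
            continue
        if any(key in label.lower() for key in FIELDS_OF_INTEREST):
            # Keep only the first occurrence of each label.
            details.setdefault(label, value)
    return details
```

With network access, `registration_details(whois_query("llrx.com"))` would return whatever matching fields the server chooses to publish. Many registrations now withhold the owner's name and address, so an empty or partial result is not by itself a sign of a problem.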

It is important to note that there are instances where web site ownership is purposefully kept anonymous, resulting in no negative ramifications in terms of the site’s popularity and credibility.  For example, in the New York metropolitan area there are sister sites, launched in late 2001, that focus on local politics. PoliticsNY.com (http://www.politicsny.com/) and PoliticsNJ.com (http://www.politicsnj.com/) indicate mutual ownership, but said owners and all the authors have chosen to use pseudonyms. Nevertheless, these sites have managed to scoop authoritative, veteran publications on political stories and have gained a loyal following. It just goes to show, there are exceptions to most every rule. It certainly is advantageous to identify who owns a site, and the credentials of the authors whose works appear therein. But in some instances you may choose to proceed on faith in the site’s content based on your experience of the accuracy and value of their data, and in the general professional consensus.

In the case of many sites however, you will be able to identify the name of the site’s owner(s), and the respective authors of content on the site. Such knowledge may in turn lead you to conduct a series of searches for available data on the owners and the authors. There are a number of tools that can assist you in this task, some of which are free, and others fee-based. The resources I recommend include sites that review web site content, search engines, newspapers, journals and newswire stories, a site that allows you to search through public records, a biography meta-site, professional listservs, and sites that monitor fraud on the Web. Using a combination of several of the resources listed below will provide you with assistance in your quest for background on specific sites.

Here are some suggestions for resources to check:

1. Run searches on Google (http://www.google.com), iLor (http://www.ilor.com/), and AllTheWeb.com (http://www.alltheweb.com) for the web site owner(s) and/or authors whose articles are published on the site you are evaluating. If you prefer other search engines, try them as well. It is sound advice to always use more than one search engine (well worth the time and effort in terms of better results).
2. Use the Google Advanced Search Group feature (http://groups.google.com/advanced_group_search) to search the 700 million record Usenet archive for any postings made by persons associated with a site.
3. As long as you are using Google, check to see how many sites are linked to the site you are researching: [link:siteurl - example: link:www.pewinternet.org].
4. Author biographies that include links to their previous works are worth reviewing to verify their expertise in specific areas, as well as to evaluate their overall skill as writers.
5. Buy, or borrow, a copy of The Essential Guide to the Best and Worst Legal Sites on the Web [2] and The Lawyer’s Guide to Internet Research [3]. (The author of this second book, attorney and professional writer Kathy Biehl, also maintains a web site associated with her book at http://www.fortunaworks.com/fortunaworks/contact.htm). Both of these books provide a wealth of detailed information on hundreds of sites, sponsored by the government, states, courts, commercial organizations, law firms, associations and many more entities, that have been vetted by two lawyers proficient with web research.
6. Subscribe to, or take the time to regularly review, web sites and email update services that specifically review web sites in your areas of interest, and have an established track record. I recommend three sources that provide a search engine to enable quick and easy location of specific site reviews: LLRXBuzz (http://www.llrx.com/cgi-bin/llrx.cgi?function=browsecol2&id=6); LLRX Latest Links (http://www.llrx.com/cgi-bin/llrx.cgi?function=browsecol2&id=5); and ResearchBuzz (http://www.researchbuzz.com/).
7. Run the name of the web site owner through databases available via SearchSystems.net (http://www.searchsystems.net), host to over 5500 searchable public record databases, the majority of which are free.
8. If you subscribe to LexisNexis or Westlaw, use their comprehensive news libraries to conduct a search on the web site name, its owners and/or authors.
9. Post a question to one or more professional listservs to which you subscribe enquiring about members’ experiences with and/or opinions of a specific site.
10. Use the Librarians’ Index to the Internet (http://lii.org/) to locate reviews of the web site.
11. Check the Argus Clearinghouse for a web site review at http://www.clearinghouse.net/.
12. Try the Internet Scout Report (http://scout.cs.wisc.edu/) to check on reviews and recommendations from this experienced group of researchers.
13. Scholarly sites are reviewed on a terrific site called INFOMINE (http://infomine.ucr.edu/).
14. Use online directories and bar association web sites to locate information about legal professionals who own web sites. A detailed list of such resources is available at http://www.llrx.com/columns/reference32.htm.
15. Refer to biography resources - Finding information on the famous, infamous, and obscure, at http://www.ala.org/acrl/resjan02.html.
16. If you are concerned with the issue of fraud as it applies to web site content, check out the SEC’s resource called Internet Fraud: How to Avoid Internet Investment Scams at http://www.sec.gov/investor/pubs/cyberfraud.htm.
17. Also take a look at the FTC Consumer Protection site (http://www.ftc.gov/ftc/consumer.htm).
18. Go to the National Consumers League’s site, Internet Fraud Watch (http://www.fraud.org/internet/intset.htm), where you can search their database.
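The search steps at the top of this list (items 1 and 3) can be prepared in a few lines of Python. This is only a sketch: the helper names are invented for the example, the `link:` operator is used as the list describes it and may not be honored by every search engine or in every era, and the search address shown is Google's ordinary query form.

```python
from urllib.parse import quote_plus, urlparse


def owner_query(name: str) -> str:
    """Quote a person's or organization's name so it is searched as a phrase."""
    return '"' + name.strip() + '"'


def link_query(site_url: str) -> str:
    """Build the link: query from item 3 for the host part of a site address."""
    if "://" not in site_url:
        site_url = "http://" + site_url
    host = urlparse(site_url).hostname or ""
    return "link:" + host


def google_search_url(query: str) -> str:
    """Return a Google search address for the given query text."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    for query in (owner_query("Sabrina I. Pacifici"),
                  link_query("http://www.pewinternet.org/")):
        print(query, "->", google_search_url(query))
```

Because the query text is built separately from the search address, the same queries can be pasted into a second or third search engine, in keeping with the advice to use more than one.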


The Final Analysis

The resources I recommend are ones to which I refer on a regular basis, as I often visit up to 50 different web sites each day to stay abreast of the best of the new, and the tried and true, on the web. However, in the final analysis, many of these resources rely on the opinions of others as a determining factor in evaluating web site content. So in a sense, you are back to square one; evaluating web sites is highly dependent on someone’s perspective. That someone can be another lawyer, a law librarian, a scholar, a friend, colleague, or even a client. As we all know, the opinions of experts are very useful, but we can, and do, disagree with them with varying degrees of frequency. If you have been using the web for the past several years, you probably already have a good sense of what sites you can trust. But when you are interested in sites that are new to you, take the time to ask a core group of questions about the sites, and refer to reliable resources to help you decide whether you are indeed comfortable using these sites.  Feeling confident with sources that you consult daily makes the process of using the web more efficient, more effective, and in the end, less time consuming.


Footnotes

1. Robert A. Harris, A Guidebook to the Web (WebQuester Series. Guilford, CT: Dushkin McGraw-Hill, 2000. ISBN 0-07-235083-02).

2.  http://www.amazon.com/exec/obidos/ASIN/0970597037/lawlibraryresourA

3.  http://www.amazon.com/exec/obidos/ASIN/0810838850/lawlibraryresourA/