Dennis Kennedy is a lawyer in the Intellectual Property and Information Technology Department of Thompson Coburn, LLP in St. Louis. Many of his articles on Internet and technology topics may be found at his web site.
See the LLRX.com Marketing Resource Center for all previous issues of the Internet Roundtable.
Web Site Analysis, PC Magazine, June 27, 2000, (five web site analysis tools compared)
The Results Are In, PC Magazine, March 10, 1998, (introduction to web site analysis tools)
Web Site Statistics: How, Why and What to Count
Charles Morris, There’s Gold In Them There Log Files
Webalizer (as used by VisaLaw.com)
Jerry Lawson (JL): Our guest panelist this month is LaVern Pritchard, one of the country’s true “E-Lawyers.” LaVern operates Pritchard Law Webs in Minneapolis and is featured in the ALI-ABA CLE Internet for Lawyers instructional video. One of his innovative projects is the LawMoose! legal search engine and Minnesota legal community knowledge server, an MVP Site Award winner this year. Not so coincidentally, he is an expert in this month’s topic, how lawyers can benefit by analyzing web site logs, or “traffic reports.” Before we get into too much heavy-duty analysis, let’s make sure everyone knows what they are. Brenda?
Brenda Howard (BH): Web site logs are statistics that are kept in a database. These statistics are then converted into reports that let a Web site owner know a great deal about the activity on their Web site. The tools range from a basic client-side script that tells you how many “hits” your site is getting all the way up to server-side software that provides so much information that you need a professional to assist in analyzing the data.
LaVern Pritchard (LP): I think firms are spread all along a continuum. Savvy legal web publishers have always valued the insights they can get only from careful analysis of their traffic reports. Some firms have been on the web five or six years or more. Others are joining the legal web community every day. You’ll find novices and veterans. This is one area, though, where novices can quickly get up to speed and gain the same insights others have long valued. It’s hard to get a broad overview of what law firms as a whole are doing, since there is no centralized collection point for this sort of information.
JL: Operating in a vacuum, not knowing what others are doing, is a handicap for many law firms. Have you come across any specific law firm web site usage information?
LP: We conducted three surveys in 1999 based on data from Minnesota law firm web traffic reports. I suspect if we repeated them today, we would see about the same thing. We wanted to identify the most visited law firm web pages in Minnesota. The only criteria for participation were (1) you had to be a Minnesota law firm with a web site, (2) you had to have data, and (3) you had to be willing to share just enough of that data so we could produce rankings. We found some firms up to speed on their statistics and eager to participate. Others were interested but did not have statistics. Some had them but did not know how to read them very well. So we ended up asking a good number of follow-up questions to get the data we needed. One large firm declined to participate, saying it lacked confidence in its statistics. Maybe our discussion today can help more firms more quickly come up to speed on this subject. It’s not really a technical topic. It’s a site management topic.
BH: That’s true. It isn’t rocket science. However, it is like a new language and it contains measurements that are foreign. I generally work with my clients for 3 months in analyzing their data before they start analyzing it themselves. I find it easier to have them study five areas of information and then add on to that. We keep adding until they are able to get a complete picture of the performance of their web site. I also caution clients about using one month’s worth of data. They really need to track the statistics for a 3-month period before they will see viable trends that provide valid feedback.
DK: Great observation. Let’s go back to the basics for a minute. By site analysis or traffic statistics I do not mean one of those little web counters that you see on some sites. Using a web counter establishes very quickly that you are an amateur on the web.
JL: There are technical problems with visible web counters. For one, they tend to make your page download more slowly.
LP: That’s right, but there are other problems. It seems like when I visit a site with a hit counter, I almost always find a site that is telling the world it isn’t visited very often. Unfortunately, that tends to scare people away. It’s like walking down a deserted street when you know everyone else is going another direction. I think visitors click onto a site, and give it about five seconds to convince them they ought to stay a while. A hit counter is usually a signal to leave or at least not to return. You don’t see a hit counter on Yahoo’s home page. If it was cool, they’d do it. Serious sites have more sophisticated and private means, through their statistics, of getting the information they want. Hit counters seem stuck in Web time at circa 1996.
BH: All true, and there is another drawback: visible web counters are not accurate. That’s an obvious problem, but there’s another one. No one has said it diplomatically, so I’ll be the one to say it. Using a visible web counter is simply not professional. One does not post their financial statements on the window of their office and it’s not professional to show people how many – or how few – people actually visit your site.
DK: Finding a better source of traffic data is a necessity if you are serious about your web site. If a third party hosts your site, some form of regular traffic reporting is usually part of the standard hosting plan. Check your hosting agreement. Most web hosts will provide statistics upon request or for nominal additional costs. Some web hosts even will provide real-time access to your web statistics over the Internet.
LP: While there are many factors to consider in choosing a web site hosting company, one of them is clearly the particular web statistics package. It’s one area that can distinguish one hosting company from another. You want a host that has customers who demand good data about their sites. Be sure to look at a sample report before you sign up.
BH: As a general rule, the Web hosting company will provide the basic statistics and these are good enough for small to medium sized sites. If you have a super cheap monthly Web-hosting package, you might not have access to any Web statistical data, so this is one of the considerations when weighing price over quality.
DK: There are free traffic analysis tools such as HitBox, but they usually have some limitations, mainly that you can track traffic only for one page. They also have a reasonably-priced “premium” version that overcomes the limitations. That seems like a great deal if your hosting provider doesn’t provide this service.
JL: Another free service is eXTReMe Tracking.
BH: These are good and I’ve used HitBox before. It’s easy to add to your site and does provide valuable information. However, and it’s a big one, keep in mind that these free services get to “keep” your data. Meaning that they are collecting it for you and they also have access to it. HitBox actually aggregates all their statistical data and generates larger reports. They then sell this information to others. It won’t say, creativewriting.com has “x” amount of visitors, but they do use the data to track Internet-wide traffic, usage, trends and patterns. This isn’t a “bad thing” and I’m glad that someone is trying to consolidate the statistics on many web sites, but you should know that others would use your data.
LP: Greg Siskind, publisher of Visalaw.com, makes his traffic statistics public. Every year or so, I seem to develop some curiosity about how his site is doing or want to direct people to his statistics. Last time I visited I noticed he was using a free program called Webalizer that produced some nice reports. There are more out there, too, of course. Just do a web search or find a helpful review on the Web.
BH: LaVern, I would post mine too – if I had as many visitors as Greg’s site. Some sites deserve “bragging rights”. (laugh) Seriously, there are times when this is appropriate, but only a few. The government has a requirement to provide performance related data and we post the Web statistics on intranet sites to meet this performance requirement. In an agency that has 7 branch offices, they can see which office is using the Web based manuals, applications and information the most. This is an incentive for each branch to utilize the resources available to them. It generates a little “friendly” competition within the organization.
DK: Law firms hosting their own web sites (there are some such firms, but not many) can analyze traffic data in a number of ways, including custom database reports. I once was involved with a self-hosted site that used Microsoft’s Internet Information Server software. IIS provides great capabilities for analyzing traffic and creating custom reports, but IIS users must be very aware of the historic pattern of security risks with using IIS and be very vigilant about new security issues. The “big name” in web traffic analysis software is WebTrends and most people are used to seeing standard WebTrends reports. Someone once told me that they were at an evening meeting and noticed several people checking standard WebTrends reports during the boring parts of the meeting. Those standard reports are pretty comprehensive and provide a lot of useful information.
BH: WebTrends is a leader in the market. It’s relatively expensive – coming in at just under $1,000 for a single server license. This is definitely worth the price based upon the information that you obtain. One client recently told me that he gets a headache every time he has to analyze the reports – just because there is so much data. It’s really a statistical tracking program for larger sites, though, and it’s simply not cost effective for smaller sites – unless your Web hosting company provides it as part of their package. Many do.
LP: Here’s another statistics program to keep in mind. While we’ve had WebTrends in the past, our current hosting company provides LiveStats from DeepMetrix (formerly MediaHouse.) I like the ability to see what is happening at the site in real time and to manipulate the time period and level of detail of reports to get just the view of data that I want. It’s a great tool both for current and trend analysis. It is offered by many hosting companies and is well worth looking at.
JL: WebTrends seems to be particularly popular. Funnel Web and Flashstats are other alternatives to it, as well as LiveStats as LaVern mentioned. To make this more concrete for people who haven’t worked in this area, see the illustrations below. The first is an example of what a single raw web log entry looks like, along with an explanation. A real web log will contain thousands of such entries. The second is an example of just one of the many charts that a program like WebTrends creates to help you make sense of the raw data.
DK: The traffic reports let you know if your site is working, how well it is working and what parts of it are working. The typical traffic report will let you know how many hits you’ve gotten, how many page views and how many unique visitors. It will also tell you what sites are referring traffic to you, paths through your site, some demographic information and even the browsers being used. Everyone tends to be most concerned with the number of hits, but it’s important to understand the difference between hits, page views and unique visitors and what the numbers tell you and what number may be the most important. Jerry, would you tell us the difference?
JL: Good idea. People can’t really understand what the log numbers mean without knowing what these terms mean:
- Hits are requests to the server for a particular file. Because most web pages are made up of numerous files, a single person, visiting a single page of a web site, could easily run up 20, 30 or more “hits.”
- A Page View is one person looking at one web page. This is sometimes referred to as a “Page Impression.” This is measured based on the basic text file, and doesn’t count images, CGI scripts, Java applets, etc.
- A Unique Visitor means one person, visiting any section of a web site.
- A Session means one person visiting one or more pages at a site during one visit. There is no generally agreed period of time that makes up a “session.”
The key point to keep in mind is that “hits” are a misleading measurement of site popularity. Brenda, are there any other terms we need to define?
BH: No. You’ve just provided an excellent tutorial on the basics. You’ve also made an excellent point about “hits”. Yahoo and other large sites don’t even count the number of “hits”. Advertisers are only interested in how many people really visited the site and saw their advertisement. Therefore, hits aren’t counted and Unique Visitors or Page Views become the numbers to watch.
LP: I never accept reports of how many “hits” a site receives at face value. I’m always concerned that the report is inaccurate or that “hits” are reported to inflate traffic. If you want to boost your hits, just add some images to your pages. You won’t have any more visitors and they won’t view any more pages. But you’ll have more “hits”.
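To make the difference between these terms concrete, here is a minimal Python sketch that counts hits, page views and unique visitors from a handful of log records. Everything in it is an assumption made for illustration: the records are invented, a "page" is taken to be any path ending in .html, .htm or a slash, and unique visitors are estimated by counting distinct IP addresses, which is only a rough proxy for distinct people.

```python
# Hypothetical, already-parsed log records: (visitor IP, requested path).
# The addresses are reserved documentation addresses, not real visitors.
RECORDS = [
    ("192.0.2.10", "/index.html"),
    ("192.0.2.10", "/images/logo.gif"),
    ("192.0.2.10", "/images/photo.jpg"),
    ("192.0.2.10", "/attorneys.html"),
    ("198.51.100.7", "/index.html"),
    ("198.51.100.7", "/images/logo.gif"),
    ("198.51.100.7", "/images/photo.jpg"),
]

# Assumption: these suffixes identify a "page" as opposed to an image or script.
PAGE_SUFFIXES = (".html", ".htm", "/")


def summarize(records):
    """Return hits, page views, and an IP-based estimate of unique visitors."""
    hits = len(records)  # every requested file is one hit
    page_views = sum(1 for _, path in records if path.endswith(PAGE_SUFFIXES))
    visitors = len({ip for ip, _ in records})  # distinct IPs, a rough proxy
    return {"hits": hits, "page_views": page_views, "unique_visitors": visitors}


print(summarize(RECORDS))
# {'hits': 7, 'page_views': 3, 'unique_visitors': 2}

# Adding one more image request raises hits and nothing else.
with_banner = RECORDS + [("192.0.2.10", "/images/banner.gif")]
print(summarize(with_banner))
# {'hits': 8, 'page_views': 3, 'unique_visitors': 2}
```

The second call illustrates the point about images: the hit count goes up while page views and visitors stay the same.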
DK: The most useful number for me is actually the ratio between page views and unique visitors. That gives me a rough idea of how many pages that people are looking at. Some people refer to this ratio as the “stickiness” of your site. Do people stay on your site for a while or do they look at a single page and move on? The higher the ratio the better. On the other hand, you might expect this ratio to be low on a “links” page and would not be too concerned with a low figure.
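The ratio described here is simple arithmetic. A small Python sketch, with invented monthly totals standing in for the figures a traffic report would supply:

```python
def pages_per_visitor(page_views, unique_visitors):
    """Average number of pages viewed per unique visitor."""
    if unique_visitors == 0:
        return 0.0  # no visitors in the period, so there is no ratio to report
    return page_views / unique_visitors


# Hypothetical monthly totals, not taken from any real site.
print(pages_per_visitor(4500, 1500))  # 3.0
```

A result near 1.0 would suggest that most visitors look at a single page and move on.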
JL: Another thing to watch out for is “404 errors,” for “page not found.” Are you getting a lot of these? If so, figure out what is causing the problem, most usually bad links due to moved or deleted pages, and stop it. Few things do more to make a bad impression on visitors.
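One way to track down the cause of 404 errors is to list each missing page together with the page that linked to it. The Python sketch below assumes the log has already been parsed into (path, status code, referring page) records; the records, paths and the example.com address are invented for illustration.

```python
from collections import Counter

# Hypothetical parsed records: (requested path, HTTP status, referring page).
# "-" is what a log normally shows when no referring page was sent.
RECORDS = [
    ("/index.html", 200, "-"),
    ("/old-newsletter.html", 404, "http://www.example.com/links.html"),
    ("/old-newsletter.html", 404, "http://www.example.com/links.html"),
    ("/practice/estates.html", 200, "/index.html"),
    ("/staff.htm", 404, "/index.html"),
]


def broken_links(records):
    """Count 404 responses by (missing path, referring page), most frequent first."""
    misses = Counter(
        (path, referrer) for path, status, referrer in records if status == 404
    )
    return misses.most_common()


for (path, referrer), count in broken_links(RECORDS):
    print(f"{count} x {path} (linked from {referrer})")
```

A missing page whose referrer is one of your own pages is a link you can fix yourself; one referred from another site may call for a redirect or a note to that site's owner.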
BH: I also like to track the “path” that visitors take as they travel through a site. If they all go to a certain area, but never visit another area, I know that we need to evaluate the less visited area and figure out what’s wrong with that section of the site. This is a good way to evaluate the effectiveness of each section of the site. If the section turns out to be the copyright page, then I understand why it’s not visited, but if it’s a new area of law that I’m trying to emphasize, then I have a problem. LaVern, how do you use the statistics?
LP: Here are a couple of things I like to review. First, what are the most popular and least popular pages on the site? Has that changed over time? Second, I’m interested in knowing which pages serve as the main points of entry for traffic. You might think your main entry point would be your home page, but that is not always true. The more entry points, the better. Sites should be porous, letting the outside world in at many points according to interest.
JL: Dennis’s site, http://www.denniskennedy.com, is a great example of this. He told us in an earlier Roundtable that only 15% of his visitors come in through his home page. Imagine how much less popular his site would be if he were not effectively using your “porousness” theory.
BH: It is surprising to many people when visitors don’t come in through the home page. I visited a site last week and they prevented anyone from coming in through any page except the home page. Then you had to keep clicking on “next” to get to other pages. I shut down the browser window and gave up on that site. They obviously wanted to control the distribution of information that a user received. That’s simply not what a user wants.
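Entry points can be worked out from the logs by noting the first page of each visit. The following Python sketch is one possible approach under stated assumptions: the page-view records are invented, visitors are identified by IP address, and a new visit is assumed to begin after 30 minutes of inactivity (an arbitrary choice, since there is no agreed length for a session).

```python
from collections import Counter
from datetime import datetime, timedelta

# Assumption: 30 minutes of inactivity ends a visit. There is no agreed standard.
SESSION_GAP = timedelta(minutes=30)

# Hypothetical page views: (visitor IP, timestamp, page requested).
PAGE_VIEWS = [
    ("192.0.2.10", "2001-09-23 17:57:29", "/articles/checklist.html"),
    ("192.0.2.10", "2001-09-23 17:59:02", "/index.html"),
    ("198.51.100.7", "2001-09-23 18:05:00", "/index.html"),
    ("192.0.2.10", "2001-09-23 20:15:40", "/articles/checklist.html"),
    ("203.0.113.5", "2001-09-23 21:00:00", "/articles/checklist.html"),
]


def entry_pages(page_views, gap=SESSION_GAP):
    """Count how often each page is the first page of a visit."""
    last_seen = {}  # most recent request time per visitor
    entries = Counter()
    # Timestamps in this format sort correctly as plain strings.
    for ip, stamp, path in sorted(page_views, key=lambda record: record[1]):
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        previous = last_seen.get(ip)
        if previous is None or when - previous > gap:
            entries[path] += 1  # first page of a new visit
        last_seen[ip] = when
    return entries


entries = entry_pages(PAGE_VIEWS)
total = sum(entries.values())
for page, count in entries.most_common():
    print(f"{page}: {count} of {total} visits began here")
```

In this invented data the home page is the entry point for only one visit in four, the kind of pattern described above.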
DK: Two other stats that I like to look at are the referral sites and the international audience. It will surprise you how many hits you get from countries around the world. The referral site information is useful to help you assess the value of your placement in search engines, the value of links from other sites and the value of links from ads or from articles or other materials published elsewhere on the web. Statistics are cool, but the important thing is what you do with them. Let’s talk about that.
LP: I study referring URL traffic closely for new developments. I want to know what drives traffic to a site. Pages that start sending traffic to your site are the most interesting. The Web is a big ecosystem. These pages elsewhere on the Web that add links to your site are real time evidence of how the ecosystem is evolving. Referral business is important to lawyers. Referral traffic is just as important to web publishers. I like to say that links are a new form of “currency.” You count and classify your wealth by studying referrer data.
JL: Referrers can be static pages or search engines. The logs will tell you not merely which search engines are referring visitors to your site, but what search terms they are using. Knowing what is working can help you fine tune your site.
BH: These are also great for evaluating link exchanges. If you’ve agreed to put someone else’s link on your site, check to see if anyone is coming to your site from their site. It won’t take long to figure out whether or not the link exchange was beneficial.
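Referrer data of this kind can be summarized with a few lines of Python. This is a sketch under assumptions rather than a description of how any particular statistics package works: the referring URLs are invented, and the query-string names treated as search terms ("q", "p" and "query") are a guess at common conventions that would need adjusting for the search engines that actually send you traffic.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Assumption: search engines pass the visitor's search phrase in one of these
# query-string parameters. Adjust the list for the engines in your own logs.
QUERY_PARAMS = ("q", "p", "query")

# Hypothetical referrer values as they might appear in a log.
REFERRERS = [
    "http://search.example.com/find?q=minnesota+probate+lawyer",
    "http://directory.example.net/search?p=minnesota+probate+lawyer",
    "http://www.example.org/links.html",
    "http://search.example.com/find?q=lawmoose",
    "-",
]


def referrer_report(referrers):
    """Return (visits per referring host, count per search phrase)."""
    hosts = Counter()
    terms = Counter()
    for referrer in referrers:
        parts = urlsplit(referrer)
        if not parts.netloc:
            continue  # "-" or a blank value: no referring page was sent
        hosts[parts.netloc] += 1
        query = parse_qs(parts.query)  # decodes "+" and %xx escapes
        for name in QUERY_PARAMS:
            for phrase in query.get(name, []):
                terms[phrase.lower()] += 1
    return hosts, terms


hosts, terms = referrer_report(REFERRERS)
print(hosts.most_common())
print(terms.most_common())
```

To evaluate a link exchange, look up the partner site's host name in the first result and see whether it is sending any visitors at all.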
DK: Traffic reports give you feedback on what is working and what is not working. You might identify problems with navigation or the structure of your sites when you see how people actually use the sites. From time to time a page on your site will surprise you and be more popular than you guessed. The traffic reports can help you decide whether to put more effort into a particular page or create similar pages. I know of one attorney who started to put more checklists on his site after, to his surprise, a simple one page checklist became the most-visited page on his site.
JL: That’s a perfect example of how traffic analysis is supposed to work.
BH: Here’s another example: what would you do if your logs show that you have a large number of visitors leaving your main entry point after 30 seconds or less?
JL: The first thing I would look at in that situation is the size of the home page. It sounds like the page could be taking too long to load, which tends to drive visitors away fast.
LP: Here’s just one example of how you can use your statistics to determine if a change in your site actually improves it. A couple of years ago, we decided to improve the search function on our site. Each page had been displaying a prominent link to a special search page. But we thought it would be better to actually display the search box on every page. That’s a small change, and it’s not a big deal, but it shows why statistics are so useful. We put up our new site, then watched our statistics for the next few days. Visitors immediately started searching twice as often as they had before. And it stayed that way. Our statistics told us right away that we more than achieved the goal for this redesign.
JL: It is very smart to put a search engine box on every page of a content-heavy site. I reached this conclusion a long time ago, but on the basis of experience and judgment, not numerical analysis, like LaVern used. It’s good to hear that the numbers back up the theory.
LP: Once you develop some comfort with your statistics and a baseline for what is “normal” at your site, you can answer all sorts of interesting questions. For example, say you send out an E-mail newsletter with links to articles on your web site. There’s no need to speculate about whether the newsletter “worked” as you hoped. Just study your web site statistics for the period right after it goes out. If there is an impact, the statistics will show exactly what it is. The point is that you can use this analysis technique time and time again to make your site and your overall marketing program better.
JL: Well, you’re right. General impressions, experience and judgment will only take you so far. Log analysis can show useful patterns that otherwise would not occur to you. Here’s another example: Traffic analysis can show you the last page that people visit at your site, the “exit page.” Suppose this shows that a high percentage of people are leaving your site from a “Favorite Links” page? Unless you are operating your site solely as a public service, the link page that is draining off so many of your visitors would be a good candidate for elimination.
LP: I agree with you about exit pages. On the Web, you can’t really control where people go or what interests them. But if you know their interests, you can respond accordingly. Maybe you can figure out what’s so attractive on that “Favorite Links” page and substitute your own information in its place.
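Exit pages can be tallied the same way as any other page statistic. In this Python sketch the visits are invented and are assumed to be already grouped into ordered lists of pages, one list per visit; a real report would first have to build those lists from the raw log.

```python
from collections import Counter

# Hypothetical visits, each an ordered list of the pages viewed.
SESSIONS = [
    ["/index.html", "/practice.html", "/links.html"],
    ["/articles/checklist.html"],
    ["/index.html", "/links.html"],
    ["/index.html", "/attorneys.html", "/contact.html"],
]


def exit_page_shares(sessions):
    """Return each exit page's share of all visits, largest share first."""
    exits = Counter(pages[-1] for pages in sessions if pages)  # last page viewed
    total = sum(exits.values())
    return {page: count / total for page, count in exits.most_common()}


for page, share in exit_page_shares(SESSIONS).items():
    print(f"{page}: {share:.0%} of visits ended here")
```

In this made-up data half of all visits end on the links page, which is the situation discussed above.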
BH: It’s also a way to determine your return on investment. For example: You might have paid to print off 100 brochures the year before. Now, your Web statistics show that you have 500 people a month visiting your site. What would it cost to provide print versions of your brochure for 6,000 people over the same 1 year period? Tracking your exposure to visitors on the Web can translate into direct “hard cost” savings. At some point, you have to evaluate the effectiveness of your investment and your Web statistics are the key to that evaluation.
DK: Without traffic analysis, you simply cannot answer the question of whether your site is working or whether your promotion of the site is working. As Brenda’s example shows, the numbers will help you get a handle on return on investment and help you improve the site. Since so many hosts give you statistics for free or for a modest cost, I can only shake my head in disbelief when someone tells me they don’t know what kind of traffic they are getting. Know your numbers.
A Raw Log File Entry
126.96.36.199 - - [23/Sep/2001:17:57:29 -0400] "GET /chess/index.html HTTP/1.1" 200 13900 "http://www.chesslinks.org/chess/hof/reinfeld.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
What It Means
“GET /chess/index.html HTTP/1.1”
“Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)”
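The sample entry appears to follow the widely used "combined" log layout, so its fields can be pulled apart with a short Python sketch. The pattern below is an assumption about that layout, not the method used by any particular analysis program, and the sample line is the entry shown above retyped with plain hyphens and straight quotation marks.

```python
import re

# Assumed "combined" log layout: host, identity, user, time, request line,
# status code, bytes sent, referring page, browser ("user agent").
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

SAMPLE = (
    '126.96.36.199 - - [23/Sep/2001:17:57:29 -0400] '
    '"GET /chess/index.html HTTP/1.1" 200 13900 '
    '"http://www.chesslinks.org/chess/hof/reinfeld.html" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"'
)


def parse_entry(line):
    """Split one log line into named fields, or return None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])  # 200 means the request succeeded
    # "-" in the size field means no bytes were reported.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


entry = parse_entry(SAMPLE)
for name, value in entry.items():
    print(f"{name:>8}: {value}")
```

Read this way, the entry says that a visitor at the given address requested the page /chess/index.html, the server answered with status 200 (success) and sent 13,900 bytes, the visitor arrived by following a link on the chesslinks.org page, and the browser identified itself as Internet Explorer 6.0 on Windows 98. The two hyphens are placeholders for identity fields that are usually left blank.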