The Idea Dude

CONNECTING THE DOTS ONE AT A TIME

Friday, April 06, 2007

Why blog statistics don't tell the full story

Mark Evans had a post about web statistics. Of course, we're all obsessed with statistics. Given no other measure of success or reputation, we use our visitor count as the ultimate meter stick. AWStats and some of the other popular open source web stats packages do a great job of keeping track of that information. But stats do lie, or perhaps it is our misguided interpretation of what the numbers really mean that misleads us. My beef is that most of these stats are there for webmasters so that they can manage their websites. They are interested in issues such as bandwidth, bots and anything else that may degrade their website's performance. Here's the truth about web stats:

  • Why the numbers are too high. Perhaps you're reading the column that counts pages served, which includes images, plugins and anything else you may have on your page. So your main page, together with several images and 2 or 3 CSS files, will easily run to 10-50 hits on your server, inflating what you take to be your visitor count to 10 to 50 times what it really is.
  • Why unique isn't really unique. So you figured that pages served is not a good measure of visitor traffic (but still useful to tell how hard your web server is working). So you look at the number of uniques. Well, 'uniques' is a misnomer: a unique is counted per IP address for each time period in which that IP requests a page. Assuming the time period is 15 minutes and you have a long post, someone who takes more than 15 minutes to read it, or who comes back after a phone call and clicks on another link, counts as 2 uniques instead of 1. Or take the bot that checks once an hour: that's 24 uniques a day instead of 1. On the other side, if your readers are on a home or office network with only 1 external IP, you're being robbed of multiple uniques because 5 people on that network will look like 1 to you.
  • The trouble with bots. Most people will be surprised at the number of bots that actually hit your site every day. From Google, to Technorati, to Feedburner and a couple dozen smaller ones, each one crawls your site. Again, if they do that with regularity, they boost both your page count and your uniques. Your page count goes up dramatically because they crawl as many pages as possible and your uniques because they may hit your site several times a day. Not many packages allow you to easily remove or filter out those statistics.
  • Boosting numbers artificially. Say you have a network of blogs with different domains on the same server. They reference each other using images and even CSS files. Because they have different domains, each page view generates extra hits for the images and CSS files that live on a different domain, and each of those requests is seen as a different unique visitor by your web server. So your numbers could double or triple depending on how many cross-references you have.
  • The single pixel image. Many sites and email marketers do this. Add a single pixel image to the page and only count downloads of that image. That is a good way of keeping all the other files served out of your statistics. Remember to serve this file with the proper expiry settings, otherwise it may be cached on the browser side and repeat views won't be counted.
  • Using external sources. Barring server availability and installation issues, using an external source like Google Analytics or Sitemeter gives you a far better indication of visitor traffic. Although they suffer from the same issues regarding uniqueness, they generally filter out the bots. Bots will usually only follow pages on the same domain, otherwise they would end up crawling the entire web. The other caveat is to make sure you add the tracking code to all your pages. Sometimes blog platforms create a page for each blog entry in addition to the main page, and these pages may not call the external code or plugin if you don't configure your template correctly.
  • Time spent on a page. This is only an estimate of what the server thinks the user is doing on your site. Stats packages either track the time between page requests or use a meta-refresh in the header that tells the server every x seconds that your page is still being displayed in a browser. However, short of tapping into someone's webcam, you really have no clue whether someone is actually looking at your page or turned around to speak to someone for 15 minutes and then came back and clicked away.
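
To make the difference between these numbers concrete, here is a rough Python sketch that computes hits, pages and "uniques" from the same few log lines. It is only an illustration, not how AWStats or any other package actually does it. The assumptions are mine: the log is in the Apache "combined" format, static files are recognised by a short list of extensions, bots are spotted by a few substrings in the user agent, and a "unique" is an IP that has been quiet for more than the 15 minute period used in the example above.

```python
import re
from datetime import datetime, timedelta

# Assumption: Apache/NCSA "combined" log format.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?:GET|POST|HEAD) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)
# Assumptions: illustrative lists only, real packages use far longer ones.
STATIC_EXT = (".css", ".js", ".png", ".gif", ".jpg", ".jpeg", ".ico")
BOT_HINTS = ("bot", "crawler", "spider", "feedburner", "technorati")
WINDOW = timedelta(minutes=15)  # the time period from the example above


def summarize(lines, window=WINDOW):
    """Count hits, pages, non-bot pages, windowed 'uniques' and distinct IPs."""
    hits = pages = human_pages = uniques = 0
    last_seen = {}  # IP -> time of its most recent non-bot page request
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        hits += 1  # every file served counts as a hit
        path = m["path"].split("?", 1)[0].lower()
        if path.endswith(STATIC_EXT):
            continue  # images, CSS and scripts are hits but not pages
        pages += 1
        if any(hint in m["agent"].lower() for hint in BOT_HINTS):
            continue  # bots inflate pages, keep them out of the rest
        human_pages += 1
        ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        prev = last_seen.get(m["ip"])
        if prev is None or ts - prev > window:
            uniques += 1  # same IP after a long pause counts again
        last_seen[m["ip"]] = ts
    return {
        "hits": hits,
        "pages": pages,
        "human_pages": human_pages,
        "uniques": uniques,
        "distinct_ips": len(last_seen),
    }


# One reader loads a post (plus CSS and an image), comes back 20 minutes
# later for a second page, and a bot fetches the post once.
SAMPLE = [
    '10.0.0.1 - - [06/Apr/2007:10:00:00 +0000] "GET /post.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [06/Apr/2007:10:00:01 +0000] "GET /style.css HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [06/Apr/2007:10:00:01 +0000] "GET /logo.gif HTTP/1.1" 200 300 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [06/Apr/2007:10:20:00 +0000] "GET /about.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [06/Apr/2007:11:00:00 +0000] "GET /post.html HTTP/1.1" 200 5120 "-" "Googlebot/2.1"',
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

With these five lines, one real reader shows up as 5 hits, 3 pages and 2 uniques, which is exactly the kind of inflation described in the list above. Matching on the user agent will miss bots that pretend to be browsers, so treat the filtered numbers as an estimate too.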


BTW: FeedBurner is in some ways a better measure of loyal readers, because these are the folks who took the time to add you to their RSS feeds, although you have very little indication whether they continue to read your feed or are just too lazy to remove it from their lists.

At the end of the day, the only real way to know who your real readers are is through the real-world dialog you have with them via email, forums or comments on your page.

2 Comments:

Blogger R said...

GoStats can help you better understand and quantify the traffic to your site.

9:41 AM  
Blogger The Idea Dude said...

Thanks, will definitely take a look.

12:43 PM  
