Thursday, January 15, 2009

How Visitor Tracking Works

If you maintain a blog or a website, you most likely want to find out who is visiting it. The most common and easiest way to monitor visitors is to install a visitor meter. Many free visitor meters (sometimes called trackers or counters) are available today, and almost all of them rely on JavaScript and/or HTML tracking code invoked from the client side. (The other primary method of visitor tracking is server-side log analysis.) In this article, we'll take a brief look at how these tracking systems work.

When you sign up with one of these tracking services, you get a piece of code, typically called the tracking code. You then have to install it in every web page you wish to track. In the case of Blogger blogs, an HTML/JavaScript widget can be used to embed this tracking code in your blog. As Blogger widgets load on all blog pages (unless you limit them to specific pages), you can easily track your entire blog this way, including posts you write in the future. Given below is the tracking code for this blog, provided by Site Meter.

<!-- Site Meter -->
<script type="text/javascript" src="http://s44.sitemeter.com/js/counter.js?site=s44idssl">
</script>
<noscript>
<a href="http://s44.sitemeter.com/stats.asp?site=s44idssl" target="_top">
<img src="http://s44.sitemeter.com/meter.asp?site=s44idssl" alt="Site Meter" border="0"/></a>
</noscript>
<!-- Copyright (c)2006 Site Meter -->

In the above code, the <script> element refers to a JavaScript file (type="text/javascript") named counter.js located at http://s44.sitemeter.com/js/. When someone visits a page in The Blogger Guide blog, the visitor's browser requests this script with the argument site=s44idssl in the query string and then executes it. This argument carries the codename (or ID) given to this blog by Site Meter. The code inside the <noscript> element comes into play when the visitor's browser has JavaScript disabled or does not support JavaScript.

Once installed, this tracking code does two things every time a tracked page loads. First, it fetches the relevant JavaScript code from the tracking service's web server and executes it. When this script executes, it gathers data such as the referrer (i.e. the page from which the visitor reached your tracked page), the visitor's IP address, the ISP, browser type, OS, screen resolution, etc. Second, the collected data is sent to the tracking service, piggybacked on another HTTP request. This second request typically downloads some web resource such as a dummy image (e.g. a transparent 1x1 px image) or an image showing the cumulative total of visitors. Given below is such a request sent to Site Meter when a page from this blog loads.

http://s44.sitemeter.com/meter.asp
?site=s44idssl
&refer=http://groups.google.com/group/…
&ip=124.43.143.75
&w=1680
&h=1050
&clr=32
&tzo=-330
&lang=en-US
&pg=http://bguide.blogspot.com/
&js=1
&rnd=0.16926656137301965

Note that all of this data is sent in a single line; the line breaks are added for clarity. The request shown above is sent to a web page called meter.asp, located at http://s44.sitemeter.com. The &refer parameter says that the visitor arrived via a link on a Google Groups page. An application running on the tracking service's web server extracts the data sent with the request and populates the service's database. It is this data that you see in various summarized forms when you later log in to view the visitor statistics.
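
To make the round trip more concrete, here is a minimal JavaScript sketch of both ends: a function that assembles a tracking URL from collected values, and a function that unpacks it again the way a server-side application might. This is not Site Meter's actual code. The function names buildTrackingUrl and parseTrackingUrl and the example.com referrer are made up for illustration, and the browser properties named in the comments are only a guess at where such values probably come from. The parameter names are copied from the sample request above.

```javascript
// Sketch only: parameter names mirror the sample request shown earlier.
// How Site Meter's real script and server work internally is not known.

// Client side: assemble the tracking URL from the collected values.
function buildTrackingUrl(endpoint, visit) {
  const url = new URL(endpoint);
  for (const [name, value] of Object.entries(visit)) {
    // append() percent-encodes values such as the referrer URL.
    url.searchParams.append(name, String(value));
  }
  // A random value makes every URL unique, presumably so that the
  // request is never answered from the browser cache.
  url.searchParams.append("rnd", String(Math.random()));
  return url.toString();
}

// Server side: unpack the query string back into a plain object.
// All values come back as strings.
function parseTrackingUrl(requestUrl) {
  const url = new URL(requestUrl);
  return Object.fromEntries(url.searchParams.entries());
}

// In a browser, these values would probably come from document.referrer,
// screen.width, screen.height, screen.colorDepth,
// new Date().getTimezoneOffset(), navigator.language and location.href.
// Fixed values are used here so the sketch also runs outside a browser.
const visit = {
  site: "s44idssl",
  refer: "http://example.com/some-page", // hypothetical referrer
  w: 1680,
  h: 1050,
  clr: 32,
  tzo: -330,
  lang: "en-US",
  pg: "http://bguide.blogspot.com/",
  js: 1,
};

const trackingUrl = buildTrackingUrl("http://s44.sitemeter.com/meter.asp", visit);

// In a browser, the request could be triggered by loading the URL as an image:
//   new Image().src = trackingUrl;

const received = parseTrackingUrl(trackingUrl);
console.log(received.site, received.w, received.pg);
```

Requesting the URL as an image is what lets the data ride along on an ordinary resource download, which is why a 1x1 px dummy image is enough to carry it.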

Another common requirement of bloggers/webmasters is to exclude their own visits to the blogs/sites they maintain. Chances are that you visit your blog many times a day, and you don't want those visits counted as actual visits. Most tracking services offer a simple cookie-based method of achieving that. For instance, in Site Meter, the ignore visits option in the manager section offers a simple one-click method of excluding your own visits. Feedjit has a similarly simple method. However, it is not that simple in certain services (e.g. Google Analytics). (See this article to learn how to exclude your visits from Google Analytics.)
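
As a rough illustration of how a cookie-based exclusion might work, the JavaScript sketch below checks a cookie string for an "ignore" flag before counting a visit. The cookie name ignore_visits and the functions parseCookies and shouldCountVisit are hypothetical. Real services pick their own cookie names and may do this check on their servers instead.

```javascript
// Hypothetical cookie name; real tracking services choose their own.
const IGNORE_COOKIE = "ignore_visits";

// Parse a cookie string of the form "a=1; b=2" (the format of
// document.cookie and of the HTTP Cookie header) into a plain object.
function parseCookies(cookieString) {
  const cookies = {};
  for (const pair of cookieString.split(";")) {
    const index = pair.indexOf("=");
    if (index === -1) continue; // skip empty or malformed fragments
    const name = pair.slice(0, index).trim();
    const value = pair.slice(index + 1).trim();
    cookies[name] = value;
  }
  return cookies;
}

// A visit is counted unless the ignore cookie is present and set to "1".
function shouldCountVisit(cookieString) {
  return parseCookies(cookieString)[IGNORE_COOKIE] !== "1";
}

console.log(shouldCountVisit("theme=dark"));                  // true
console.log(shouldCountVisit("theme=dark; ignore_visits=1")); // false
```

Because the cookie lives in one browser, an exclusion of this kind would only apply to visits made from that browser. Visits from another browser or computer would still be counted until the cookie is set there as well.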