Client-Side (Cookie) Tracking vs. Server-Side (Log File) Visitor Tracking

Prelude
I’m a wordy person. Email me about something I care about or have a conversation with me about most topics and you’ll find that out – perhaps to your dismay. Anyhow, this is what I’d like to do a bit more of… Use this blog to publish comments that I make in emails or other communications that may only be seen by one or a few people, and leverage that time I spend constructing those ideas by making them available here for my small but very valued readership 🙂

As such, I’m pasting some recent comments I made on a post on Jason Dowdell’s Marketing Shift blog, entitled “Cookies vs Flash for Client Side Storage”.

Cookies vs. Server Logs in Tracking

I agree with Eric’s comment. In fact, I’ve seen even greater than a 25% difference in the few large-sample-size cases I’ve studied. More like 30-45% in some instances when comparing a basic log file record versus cookie-based client side tracking.

I just switched over one site last week that gets between 6,000 and 12,000 unique visitors per day, depending upon which method you go by. The difference really was that big, which surprised me – right about a 45% lower figure with the cookies vs. the log files only. When I looked at the log files I could indeed account for maybe 20-40% of that difference as search engine bots and spiders, but couldn’t really figure out the rest.
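To put rough numbers on that, here is a small back-of-the-envelope sketch in Python. The function name and the inputs are purely illustrative (a round 12,000 log-file visitors, a 45% lower cookie figure, and bots explaining 20% or 40% of the gap); they are not the site's actual data.

```python
def gap_breakdown(log_visitors, cookie_shortfall_pct, bot_share_pct):
    """Split the log-vs-cookie gap into a bot part and an unexplained part.

    All inputs are illustrative assumptions, not measured values.
    Integer arithmetic keeps the results exact for round inputs.
    """
    # Cookie count is the log count reduced by the shortfall percentage.
    cookie_visitors = log_visitors * (100 - cookie_shortfall_pct) // 100
    gap = log_visitors - cookie_visitors
    # Portion of the gap attributed to search engine bots and spiders.
    bots = gap * bot_share_pct // 100
    return {
        "cookie_visitors": cookie_visitors,
        "gap": gap,
        "bots": bots,
        "unexplained": gap - bots,
    }


# Low and high ends of the "20-40% of the difference is bots" estimate.
low = gap_breakdown(12000, 45, 20)
high = gap_breakdown(12000, 45, 40)
print(low)
print(high)
```

With those assumed inputs, the gap is 5,400 visitors a day, and somewhere between 3,240 and 4,320 of them are still unaccounted for after taking out the bots.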

Using the same two measurements on another fairly large site about 6 months ago, I noticed a 35-40% difference, with the cookie tracking again giving the lower figure. I also used the cookie tracking to measure actual sales (a metric we know for sure, which makes it a useful benchmark), and it was only off by 3-4%.

Thus, it’s my experience that cookie methods (client-side data collection) might be off from the “real” number of visitors by maybe 5% or so (understating totals), but log-file-only methods (by which I mean logs compiled from requests to the server, not client-side data) seem to overstate traffic by as much as 40% in some cases. That’s almost too high to be even moderately meaningful!
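For anyone who wants to see how much of a raw log count is bots, here is a minimal Python sketch. It assumes the server writes the common "combined" access log format, counts unique visitors by IP address (an assumption; log analyzers differ on this), and uses a short, hypothetical list of user-agent markers to spot crawlers. It is not how any particular stats package does it.

```python
import re

# Combined log format:
# ip ident user [time] "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"$'
)

# Hypothetical, deliberately short list of crawler markers.
BOT_MARKERS = ("bot", "spider", "crawl", "slurp")


def count_unique_ips(lines):
    """Return (all unique IPs, unique IPs whose user agent is not bot-like)."""
    all_ips, human_ips = set(), set()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that are not in the assumed format
        ip, agent = match.group(1), match.group(2).lower()
        all_ips.add(ip)
        if not any(marker in agent for marker in BOT_MARKERS):
            human_ips.add(ip)
    return len(all_ips), len(human_ips)


# Made-up sample lines using documentation-reserved IP addresses.
sample = [
    '203.0.113.5 - - [10/Oct/2005:13:55:36 -0700] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (Windows; U)"',
    '203.0.113.5 - - [10/Oct/2005:13:56:01 -0700] "GET /about HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows; U)"',
    '198.51.100.7 - - [10/Oct/2005:13:57:12 -0700] "GET /robots.txt HTTP/1.1" 200 64 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]
total, humans = count_unique_ips(sample)
print(total, humans)
```

Even with the bots filtered out this way, a log-only count can still include things a cookie-based tracker would never see, so treat the filtered number as a rough upper bound rather than the "real" figure.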