Jetpack’s Misleading Statistics

Jetpack is an extremely helpful plugin for WordPress installations, and I’m grateful that Automattic makes all this fine software available for free. That said… Jetpack has one feature that seems to be all but nonfunctional, and worse, that fact is not obvious. I’m talking about post view statistics. (That’s on-site views; Jetpack does not track feed views.)

Let’s take my recent announcement post, Weblog Move Complete. This post did not exist on my older WordPress.com blog, so Jetpack shouldn’t get confused on that account. My local Apache server logs, as evaluated by WebLog Expert, show a total of 116 views by an estimated 92 visitors. And how many raw views did Jetpack log? Sixteen.

[Screenshot: Jetpack’s all-time post views]

These results are identical for my local WordPress admin panel and the WordPress.com statistics view. Now my server logs do record some views that Jetpack correctly ignores, such as my own editing actions or perhaps web crawlers that WLE doesn’t recognize as bots. But could that account for such a huge discrepancy? Let’s look at the server logs for Wednesday, 10 April 2013, when Jetpack recorded no views for our example post.

aaa.aaa.aaa.aa - - [10/Apr/2013:06:48:23 +0200] "GET /2013/04/06/weblog-move-complete/ HTTP/1.1" 200 28218 news.kynosarges.org "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)" "-"

bbb.bbb.bbb.b - - [10/Apr/2013:12:04:51 +0200] "GET /2013/04/06/weblog-move-complete/ HTTP/1.1" 200 28218 news.kynosarges.org "http://news.kynosarges.org/2013/03/31/cost-overruns-in-public-projects/" "Mozilla/5.0 (Windows NT 6.1; rv:19.0) Gecko/20100101 Firefox/19.0" "-"

ccc.cc.ccc.cc - - [10/Apr/2013:16:10:47 +0200] "GET /2013/04/06/weblog-move-complete/ HTTP/1.0" 200 28218 news.kynosarges.org "http://plus.url.google.com/url?[…]" "Mozilla/5.0 (Windows NT 5.1; rv:10.0.12) Gecko/20100101 Firefox/10.0.12" "10.111.10.92"

ddd.dd.dd.ddd - - [10/Apr/2013:17:11:52 +0200] "GET /2013/04/06/weblog-move-complete/ HTTP/1.1" 200 28218 news.kynosarges.org "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.112 Safari/534.30" "-"

ee.eee.eee.eee - - [10/Apr/2013:17:30:21 +0200] "GET /2013/04/06/weblog-move-complete/ HTTP/1.1" 200 28218 news.kynosarges.org "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.112 Safari/534.30" "-"

The aa/bb/… blocks are five different IPs which I’ve anonymized for privacy. My server actually recorded ten views, but the other five were spiders. So we’re left with five legitimate views at different times from different systems, none of them my own and none of them web crawlers. And Jetpack did not record a single one of them.
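For readers who want to repeat this check on their own logs, here is a minimal Python sketch of the kind of filtering described above. It is not what WebLog Expert does internally. It assumes the log line layout shown in the excerpts (with the virtual host after the byte count), and the `BOT_HINTS` list, the `count_views` function and the final sample line with its Googlebot user agent are my own illustrative assumptions.

```python
import re

# Assumed layout, matching the excerpts above:
# ip - - [time] "METHOD path PROTO" status bytes vhost "referer" "agent" ...
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ (?P<host>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical, deliberately crude list of user agent substrings.
# Bots that imitate a browser user agent will slip through.
BOT_HINTS = ("bot", "crawl", "spider", "slurp")


def count_views(lines, path, day):
    """Return (views, distinct IPs) for successful non-bot GETs of path on day."""
    views = 0
    visitors = set()
    for line in lines:
        m = LOG_PATTERN.match(line)
        if m is None:
            continue  # skip lines that do not fit the assumed layout
        if m["method"] != "GET" or m["path"] != path or m["status"] != "200":
            continue
        if not m["time"].startswith(day):
            continue
        agent = m["agent"].lower()
        if any(hint in agent for hint in BOT_HINTS):
            continue
        views += 1
        visitors.add(m["ip"])
    return views, len(visitors)


# The first two lines are taken from the excerpts above;
# the third is a made-up crawler request for illustration only.
SAMPLE = [
    'aaa.aaa.aaa.aa - - [10/Apr/2013:06:48:23 +0200] '
    '"GET /2013/04/06/weblog-move-complete/ HTTP/1.1" 200 28218 '
    'news.kynosarges.org "-" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)" "-"',
    'bbb.bbb.bbb.b - - [10/Apr/2013:12:04:51 +0200] '
    '"GET /2013/04/06/weblog-move-complete/ HTTP/1.1" 200 28218 '
    'news.kynosarges.org '
    '"http://news.kynosarges.org/2013/03/31/cost-overruns-in-public-projects/" '
    '"Mozilla/5.0 (Windows NT 6.1; rv:19.0) Gecko/20100101 Firefox/19.0" "-"',
    'fff.ff.ff.fff - - [10/Apr/2013:13:00:00 +0200] '
    '"GET /2013/04/06/weblog-move-complete/ HTTP/1.1" 200 28218 '
    'news.kynosarges.org "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-"',
]

views, visitors = count_views(
    SAMPLE, "/2013/04/06/weblog-move-complete/", "10/Apr/2013"
)
print(views, visitors)
```

Matching on user agent substrings is the weakest part of this sketch: it only removes crawlers that identify themselves, so the resulting count is an upper bound on human views rather than an exact figure.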

This is not an exception. Jetpack missed non-spider views on every day I checked. As for other posts, their total views showed a similar discrepancy, by a factor of five or more, between Jetpack and my server logs. If Jetpack routinely drops multiple legitimate views per day, I must assume that WLE’s evaluation is closer to the real totals than Jetpack’s. Indeed, Jetpack statistics appear so badly broken that they are best ignored.

How can this happen? Maybe Jetpack got confused by the fact that my WordPress installation folder is accessed through a subdomain (news.kynosarges.org), but that’s a very common setup. Besides, Jetpack does record some views – just very few of them. I wonder if statistics for WordPress.com blogs are just as erroneous? Without access to the server logs it’s impossible to find out. All I can say is: don’t trust WordPress statistics.

2013-04-18: It was suggested that the dropped views might come from known spam IPs, but that can’t be the explanation either. Akismet catches far fewer spammers – currently less than one per day – than the number of views ignored by Jetpack. Why would a mass of spammers only look at posts but never try to comment? Moreover, some of the dropped views were referred internally, i.e. people browsing the blog, and others from my Google+ profile where I announce new posts. That looks like legitimate readers to me, not spammers.

2013-04-20: Turns out WordPress statistics use client-side scripting. See my follow-up post, Investigating WordPress Statistics, for an explanation based on that fact.

2013-09-16: After more analysis I found one major source of logged requests that WordPress correctly ignores: preloads of pages reached by the previous & next post links at the bottom of each page! Some browsers fetch those automatically, and that can distort statistics quite a bit. There also seem to be a great many more bot requests hidden in the server logs than I had first thought. For whatever reason, many bots apparently just send perfectly benign requests, without any attempt to break in or leave spam. While WordPress definitely underestimates legitimate visitors, it’s likely closer to the truth than my server logs.
