I was taking a stroll through this site’s Google Analytics account when I noticed that my recent post Follow Back: How I Choose Who to Follow on Twitter had attracted a whole lot more visitors than I typically get. I was curious to see where they had come from, so I checked out the referring traffic to find out where people were finding the link and discovered an apparent breakdown in the usefulness of Analytics.
For those not in the know: your web browser reports the last page you were on to the next page you visit as the “referrer”, which Analytics uses to track the source of traffic to your site. When someone types your URL (jaygoldman.com) into their browser without coming from another page, they will have no referrer and therefore count as a ‘direct’ visitor. This also applies if they click on a link in an external application like TweetDeck or Tweetie (my current choices for desktop and iPhone Twitter clients), since they land in your browser and on your site without coming from another web page. A large portion of Twitter users tweet from third-party apps (anyone know the percentage?), which means that a large portion of the people who find your content from Twitter leave no referrer and look like direct traffic. All direct traffic gets lumped into one big pool, so there’s no way to tell if a visitor came through Twitter, a non-web-based RSS reader, a link in an email not read through webmail, manually typing in your URL, an iPhone app with your link in it, etc. Let’s call this Analytics Problem #1: the increasing number of specialized apps that run outside of your web browser all get counted as direct traffic.
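To make Problem #1 concrete, here is a rough Python sketch of how a referrer-based report might bucket visits. The `classify_visit` function and the sample referrer values are my own illustration, not how Analytics actually does it internally; the point is only that every visit without a referrer ends up in the same "direct" bucket, whatever app it really came from.

```python
from urllib.parse import urlparse


def classify_visit(referrer, own_host="jaygoldman.com"):
    """Label a visit using only the referrer the browser reported.

    Hypothetical helper for illustration; real analytics tools apply
    many more rules (search engines, campaigns, and so on).
    """
    if not referrer:
        # Typed URLs, desktop Twitter clients, email apps and feed
        # readers all arrive with no referrer and look identical.
        return "direct"
    host = (urlparse(referrer).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if not host:
        return "direct"
    if host == own_host:
        return "internal"
    return "referral:" + host


# Made-up sample visits: the comments say where each one really came from.
sample_referrers = [
    "http://twitter.com/home",            # link clicked on twitter.com
    None,                                 # link clicked in a desktop client
    "",                                   # URL typed by hand
    "http://www.google.com/search?q=x",   # search result
]

for referrer in sample_referrers:
    print(repr(referrer), "->", classify_visit(referrer))
```

The second and third visits have completely different origins, but nothing in the request lets the report tell them apart.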
That got me thinking about other places that Analytics might fail to provide accurate tracking as we move deeper into the realm of social media. As a data and analytics junkie, I find Conversion Goals are one of the most powerful ways you can track your online presence, particularly if you have an ecommerce site or webapp. The basic idea is that your site converts different types of users into other types of users (e.g.: catalogue browsers into paying customers, casual readers into RSS subscribers, etc.), and that tracking those conversions helps you to optimize for your end goal (e.g.: more ecommerce revenue, more exposure, etc.). Goals are usually measured at the end of a ‘funnel’, which allows you to track a specific path to a goal and then compare different paths to find the most effective (e.g.: clicked on newsletter link, browsed catalogue page(s), checked out vs. landed on a page from Google, added to cart, checked out). See this excellent Conversion Goals four-part tutorial from WorkHappy.net if you’d like to know more about Conversion Goals. I started thinking about how I could measure Conversion Goals for this site, which made me very quickly realize that my dependence on third-party services means I don’t control key pages in the funnels and therefore can’t instrument them. A quick example: I’d like to measure the number of people who follow a link to my blog from Twitter and end up becoming RSS subscribers. This falls down in two places: I can’t distinguish them from other direct traffic if they come from third-party applications, and I can’t set a Goal on the final page because it happens on FeedBurner. Let’s call this Analytics Problem #2: your ability to track Conversion Goals decreases as the number of non-web-based traffic sources and third-party utilities involved in your site increases.
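If funnels are new to you, this small Python sketch shows the basic arithmetic. It is a simplified model of my own, not the way Analytics computes its reports: the `funnel_report` function, the page paths and the sessions are all made up for illustration. A session counts toward a step only if it visited every earlier step first, in order.

```python
def funnel_report(steps, sessions):
    """Count how many sessions reach each funnel step, in order.

    `steps` is the ordered list of page paths that make up the funnel.
    `sessions` is a list of sessions, each a list of page paths visited.
    """
    counts = [0] * len(steps)
    for pages in sessions:
        position = 0
        for step_index, step in enumerate(steps):
            try:
                # Look for this step only after the previous step's page.
                position = pages.index(step, position) + 1
            except ValueError:
                break  # the visitor dropped out of the funnel here
            counts[step_index] += 1
    return list(zip(steps, counts))


# Hypothetical funnel and sessions.
steps = ["/newsletter-link", "/catalogue", "/checkout"]
sessions = [
    ["/newsletter-link", "/catalogue", "/checkout"],
    ["/newsletter-link", "/catalogue"],
    ["/newsletter-link"],
    ["/catalogue", "/checkout"],  # never entered at the first step
]

for step, reached in funnel_report(steps, sessions):
    print(step, reached)
```

Problem #2 shows up the moment one of those steps lives on a page you don't control: if the last step happens on someone else's server, you never see the pageview and the final count is simply missing.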
What’s a poor data-hungry blog writer to do? I can think of a few things that might work, though none are particularly awesomesauce:
- Inbound Interstitials. Part of the solution could lie in the use of interstitial pages that are inserted in the flow between the first click and the target. For Analytics Problem #1, third-party app developers could direct all traffic to a page on their own server that could then send users to their ultimate destination with a referrer in place. This isn’t ideal because… it will annoy users, there are definite privacy concerns, some browsers may not track the referrer if a page auto-redirects rather than following a clicked-on link, and it will be inconsistently implemented by third parties. Verdict: no dice.
- Browser Tracking. It should be possible for the browser itself to receive a request to open a URL and track the referring application. When you click (or tap) on a link or button in one application that ultimately opens a page in your browser, the operating system steps in to handle the communication between them. The inbound request to your browser might have the name of the app that sent it included, so browsers could start using it as the referrer. It wouldn’t follow a standard URI scheme, but they could cheat and make it look like one (e.g.: macos://tweetdeck or something similar). This isn’t ideal because… it requires a change in browser behaviour across the board (or, at least, by Mozilla and Microsoft) and that’s a full-time lobbying job. It also may not be possible on some platforms if the requesting app’s name isn’t included in the request. Verdict: iffy. Might be testable with a Firefox extension.
- Unique URLs. If you’re particularly concerned with tracking those direct visitors, you could borrow a page from an old junk mail handling playbook and use a different inbound URL for every source of traffic you list your posts on. I used to sign up for things like Columbia House CD club with a slightly different spelling of my name (or a fake middle initial) so that I could track who they sold my mailing address to when the junk started pouring in (and it sure did). You could take a URL like http://jaygoldman.com/2009/01/15/follow-back-how-i-choose-who-to-follow-on-twitter/ and turn it into http://jaygoldman.com/2009/01/15/follow-back-how-i-choose-who-to-follow-on-twitter/source/twitter, and then use something like mod_rewrite on your server to strip out the source bit and still serve the right page. Analytics will record the pageview on the pre-strip-out URL so you can still track it in your reports. This isn’t ideal because… the requests wouldn’t get tracked as referrers, so you’d have to count each of the pages in the Content section to get a total count. You would have to remember to use a different source every time you linked to the post (e.g.: “source/twitter” when you tweet about it, “source/rss” in your RSS feed, etc.). Your stats will be off if other people link to your post (yay!) but strip out the source or use the wrong one (boo!). Verdict: answer unclear, try again later. This would work but the logistics are almost more effort than the payoff.
- Outbound Interstitials. The second problem is a little easier to solve (at least on the tail end of the funnel) by either calling Analytics’
<link rel="alternate"> tag that gets handled by the browser without a click. This should be almost entirely invisible to your readers and you’ll still be able to track your funnel as long as you use it everywhere you would have just linked to FeedBurner. This isn’t ideal because… you have an interstitial of your own to maintain, but it should be almost unnoticeable. Verdict: should work well for outbound links from your site, provided the funnel ends there and doesn’t require tracking beyond the first link.
- Your Chocolate in their Peanut Butter. Twitter (and other services) could give you the ability to insert your analytics tracking code into their page. This is a bit of an unorthodox idea (in that I’ve never seen it done), but since Analytics supports tracking across multiple domains, it should be possible for Twitter to insert the modified tracking code listed there into your profile page and record a pageview in your analytics. This isn’t ideal because… service providers need to jump through (small) hoops to get it working. The reports in Analytics don’t show the domain on requests by default, so you would need to follow the instructions listed on that page to set up an advanced filter or Twitter would need to log the request as being something obvious (e.g.: /twitter.com/profile). Verdict: this would actually be pretty great if it worked, since you could track how many people are viewing your profile in addition to using it as the start of a conversion goal.
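To make the Unique URLs option a bit more concrete, here is a rough Python sketch of the two jobs involved: stripping the source bit off the path so the right post gets served, and adding the tagged pageviews back up per post. I haven't run this against real reports; the `split_source` and `count_by_source` names, the `/source/<name>` convention and the sample paths are assumptions for illustration, and on a real server the stripping would more likely be a rewrite rule than Python.

```python
import re
from collections import Counter

# Matches a path that ends in /source/<name>, with an optional trailing slash.
SOURCE_SUFFIX = re.compile(r"^(?P<path>/.*?)/source/(?P<source>[A-Za-z0-9_-]+)/?$")


def split_source(path):
    """Return (canonical_path, source); source is None for untagged paths."""
    match = SOURCE_SUFFIX.match(path)
    if match is None:
        return path, None
    return match.group("path") + "/", match.group("source")


def count_by_source(paths):
    """Tally pageviews per (canonical post, source) pair."""
    totals = Counter()
    for path in paths:
        canonical, source = split_source(path)
        totals[(canonical, source or "untagged")] += 1
    return totals


post = "/2009/01/15/follow-back-how-i-choose-who-to-follow-on-twitter/"

# Made-up pageview paths, as they might appear in a content report.
pageviews = [
    post + "source/twitter",
    post + "source/twitter",
    post + "source/rss",
    post,  # someone linked to the post without the source bit
]

for (canonical, source), views in sorted(count_by_source(pageviews).items()):
    print(source, views, canonical)
```

The last sample pageview is the weak spot I mentioned: once someone shares the bare link, those visits fall back into the untagged pile.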
There are many smarter people than your humble scribe who read this blog. How can we solve the Terrible Twosome of Analytics Problems and restore order to the world? Maybe I’m missing something obvious?