Thursday 12 July 2012

Cookies and Google Analytics

Recent changes to the law as it relates to the use of web site cookies have focused attention on Google Analytics. If by some freak chance you haven't met Analytics, it's a free tool provided by Google that lets web site managers analyse in depth how their site is being used. It can do lots that simple log file analysis can't, and many web site managers swear by it.

Analytics uses lots of cookies, and there's quite a lot of confusion about it. In the UK, the Information Commissioner has been quite clear that cookies used for this sort of purpose don't fall under any of the exemptions in the new rules (see the final question in 'Your questions answered'):
"The Regulations do not distinguish between cookies used for analytical activities and those used for other purposes. We do not consider analytical cookies fall within the ‘strictly necessary’ exception criteria. This means in theory websites need to tell people about analytical cookies and gain their consent."
However he goes on to say:
"In practice we would expect you to provide clear information to users about analytical cookies and take what steps you can to seek their agreement. This is likely to involve making the argument to show users why these cookies are useful."
and then says:
"Although the Information Commissioner cannot completely exclude the possibility of formal action in any area, it is highly unlikely that priority for any formal action would be given to focusing on uses of cookies where there is a low level of intrusiveness and risk of harm to individuals. Provided clear information is given about their activities we are highly unlikely to prioritise first party cookies used only for analytical purposes in any consideration of regulatory action."
which looks a bit like a 'Get out of jail free' card (or 'Stay out of jail' card) for the use of at least some analytics cookies. The recent Article 29 Data Protection Working Party Opinion on Cookie Consent Exemption seems to have come to much the same conclusion (see section 4.3). They even suggest:
"...should article 5.3 of the Directive 2002/58/EC be re-visited in the future, the European legislator might appropriately add a third exemption criterion to consent for cookies that are strictly limited to first party anonymized and aggregated statistical purposes."
Which is fine, but there's that reference to 'first party cookies' in both sets of guidance, and the ICO's reference to "a low level of intrusiveness and risk of harm to individuals".

Now that should be OK, because Google Analytics really does use first party cookies - they are set by JavaScript that you include in your own pages with a scope that means their data is only returned to your own web site (or perhaps sites, but still yours).
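To make 'scope' a little more concrete, here is a minimal sketch of the sort of domain matching a browser does before attaching a cookie to a request. It is loosely modelled on the domain-matching rule in RFC 6265 and ignores paths, the Secure flag and public-suffix checks; the function names and the example.org host names are my own inventions for illustration, not anything taken from Google's code.

```python
import ipaddress


def is_ip_address(host):
    """Return True if host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def domain_matches(request_host, cookie_domain):
    """Simplified domain match: would a cookie scoped to cookie_domain
    be sent on a request to request_host?"""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    if host == domain:
        return True
    # The cookie domain must be a suffix of the host on a label boundary,
    # and the host must be a name rather than an IP address.
    return host.endswith("." + domain) and not is_ip_address(host)


if __name__ == "__main__":
    # Hypothetical site: cookies scoped to example.org go back to its own hosts...
    print(domain_matches("www.example.org", ".example.org"))
    # ...but are not attached to requests for Google's servers.
    print(domain_matches("www.google-analytics.com", ".example.org"))
```

On this reading a cookie scoped to your own domain is never sent directly to anyone else's server, which is what makes it 'first party'.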

But there's a catch. The information from those cookies still gets sent to Google - it rather has to be, because otherwise there's no way Google can create all the useful reports that web managers like so much. But if they are first party cookies, how does that happen?

Well, if you watch carefully you'll notice that when you load a page that includes Google Analytics your browser requests a file called __utm.gif from a server at www.google-analytics.com. And attached to this request are a whole load of parameters that, as far as I can tell, largely consist of information taken from those Google Analytics cookies. __utm.gif is a one pixel image, as typically used to implement web bugs. And the ICO is clear that:
"The Regulations apply to cookies and also to similar technologies for storing information. This could include, for example, Local Shared Objects (commonly referred to as “Flash Cookies”), web beacons or bugs (including transparent or clear gifs)." (emphases mine).
So while the cookies themselves may be first party, the system as a whole seems to me to be more like something that's third party. And a third party that uses persistent cookies into the bargain (some of the Analytics ones have a 2 year lifetime), and one that gets my IP address on every request.
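If you want to see this for yourself, the query string on the __utm.gif request can be picked apart with a few lines of Python. What follows is only a sketch: the URL and every value in it are made up, and the parameter names (utmwv, utmhn, utmp and, for the cookie values, utmcc) are what I believe the tracking code uses, so treat them as assumptions and check them against a real request captured from your own browser.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical, heavily shortened request URL with made-up values.
EXAMPLE_REQUEST = (
    "http://www.google-analytics.com/__utm.gif"
    "?utmwv=5.3.3"
    "&utmhn=www.example.org"
    "&utmp=%2Fblog%2Fcookies.html"
    "&utmcc=__utma%3D1.2.3.4.5.6%3B%2B__utmz%3D1.3.1.1.utmcsr%3D(direct)%3B"
)


def beacon_parameters(url):
    """Return the query parameters of a tracking-pixel request as a dict."""
    query = urlsplit(url).query
    # parse_qs decodes percent-escapes and returns a list of values per key;
    # keep the first value of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


def cookies_in_beacon(url, cookie_param="utmcc"):
    """Split the cookie-carrying parameter (assumed to be utmcc) into
    a dict of cookie name to cookie value."""
    raw = beacon_parameters(url).get(cookie_param, "")
    cookies = {}
    for part in raw.split(";"):
        # Entries after the first appear to be prefixed with '+'.
        part = part.strip().lstrip("+")
        if "=" in part:
            name, value = part.split("=", 1)
            cookies[name] = value
    return cookies


if __name__ == "__main__":
    for name, value in cookies_in_beacon(EXAMPLE_REQUEST).items():
        print(name, "=", value)
```

Run against a genuine request, something like this should show the contents of your 'first party' cookies travelling to www.google-analytics.com as ordinary URL parameters.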

But it's not all bad. There's some suggestion that Google do understand this and are committing not to be all that evil. For example here they explain that they use IP addresses for geolocation, and that "Google Analytics does not report the actual IP address information to Google Analytics customers" (though I note they don't mention what they might do with it themselves). They also say that "Website owners who use Google Analytics have control over what data they allow Google to use. They can decide if they want Google to use this data or not by using the Google Analytics Data Sharing Options." (though the subsequent link seems to be broken - this looks like a possible replacement).

Further, the Google Analytics Terms of Service have a section on 'Privacy' that requires (section 8.1) anyone using Analytics to tell their visitors that:
"Google will use [cookie and IP address] information for the purpose of evaluating your use of the website, compiling reports on website activity for website operators and providing other services relating to website activity and internet usage.  Google may also transfer this information to third parties where required to do so by law, or where such third parties process the information on Google's behalf. Google will not associate your IP address with any other data held by Google."
which seems fairly clear (or as clear as anything you ever find in this area).

So what do I think? My current, entirely personal view is that Google Analytics is probably OK at the moment, provided you are very clear that you are using it. It might also be a good idea to make sure you've disabled as much data sharing as possible. But I do wonder if the ICO's view might change in the future if he ever looks too closely at what's going on (or if someone foolishly describes it in a blog post...), so it might be an idea to have a plan 'B'. This might involve a locally-hosted analytics solution, or falling back to 'old fashioned' log file analysis. Both of these could probably be supplemented by cookies, but those cookies wouldn't be exempt either, so you'd still need to get consent somehow. That should be easier, though, if they were truly 'first party' cookies and the data in them wasn't being shipped off to someone else. Trouble is, most good solutions in this area cost significant money. There is, as they say, no free lunch.