
TimNash.co.uk

Dev/Sec/Ops with a splattering of humour

Removing Google Analytics

Today I removed Google Analytics from my site and I thought it was worth discussing why. I actually came to the decision a while back after a stray thought from a tweet.

When I relaunched the site I was already pondering whether including Google Analytics was the right choice. I kept it because I was scared to turn it off. After all, it’s important, right? I’m all for data, so for me to be turning it off, well, I mean, what the!

What is Google Analytics really?

Ultimately it’s a tracker of people. There is some attempt at anonymisation, but the data it’s storing is about people and their browsers: location, how long folks are on site, whether they have reading difficulties or visual impairments. I can learn a great deal about folks visiting my site, far more than the stated purpose of looking at traffic patterns to content. When you start to look closely, the amount of data that you mine, that every site collects and sends to Google, is scary. Yet for some sites it is super useful and, it could be argued, needed. So let’s assume that turning off Google Analytics is, for most, a silly idea without an appropriate alternative.

What do I need from my Analytics?

Currently I don’t make any serious use of my analytics. At best I use them to answer the following questions:

  • How many visitors are on my site right now?
  • How many visitors visited my site over x period?
  • How many visitors visited x page over y period?
  • Did people stay on the site for some time?
  • Where are the majority of my visitors from?

These are pretty much all the questions I ask. Maybe I occasionally want to know browser versions or some other random thing, but it’s clear my “drilling down” is probably a light rub with sandpaper.

One obvious question though: why?

Why do I need to know how many people are on my site right now? Will this in any way help me become a better writer or person? Does it matter that most of the people who come to my site are from the UK? When I start questioning why I need the data it becomes hard to justify its collection in the first place. Knowing which content is popular might be a legitimate interest.

Really, Google Analytics is creepy and actually I don’t think I have a need for it. However, I’m vain, and it’s pretty, has graphs, and I like data and…

GDPR

You have to have been living under a brick in Antarctica, avoiding penguins at all costs, not to have heard the words GDPR (General Data Protection Regulation). GDPR is EU-wide legislation regarding how anyone handles personal data. I’m not a company, merely an individual, but as with all scary legislation the simplest way to avoid getting involved in the nitty-gritty is to get out of scope of it as much as possible. Now I’m NOT SAYING using Google Analytics is in any way going to be a breach of GDPR. It does, however, provide an obvious place where you are going to need to work out what’s being collected and why. Removing it removes a lot of pesky paperwork and mental effort.

It also simplifies my Privacy Policy (or is that now a notice?), which, you know, is a good thing. There is a broader privacy aspect too. Let’s face it, the folks at Facebook have done a great job of turning even my parents into tin-foil-hat-wearing privacy folks. The fact that I already sat in that camp and still had Google Analytics was a teeny weeny bit hypocritical.

Security

Much like GDPR, I’m not saying Google Analytics is in itself a security risk, but it is a huge target. Now, the Google Security team is one of the best in the world and as far as I know Google Analytics has never been hacked. But just imagine if a bad actor managed to get code into the Google Analytics JavaScript. How terrifying would that be?

My little site is nothing in such an event. It’s a truly terrifying thought: how much of the web would have issues? For me, turning off Google Analytics means reducing the attack surface. It also allows me to tighten my Content Security Policy header by reducing the number of whitelisted domains, and it reduces the amount of JavaScript and third-party code that I can’t provide SRI hashes for.
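To make the SRI point concrete: a Subresource Integrity value is just a base64-encoded digest of the exact bytes of a file, so it only works for scripts whose content you control or that are pinned to a fixed version. The following is a minimal Python sketch, not something I run on this site. The function name `sri_hash`, the script body and the `/js/site.js` path are all made up for illustration.

```python
import base64
import hashlib


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return a Subresource Integrity value such as 'sha384-...'."""
    # SRI only permits these three SHA-2 variants.
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError("SRI only allows sha256, sha384 or sha512")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


if __name__ == "__main__":
    # Hypothetical script body; in practice read the bytes of the file you serve.
    script = b"console.log('hello');"
    print(f'<script src="/js/site.js" integrity="{sri_hash(script)}"></script>')
```

If the served file changes by even one byte the hash no longer matches and the browser refuses to run it, which is exactly why a remote script that can be updated at any time is hard to pin this way.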

Performance

For my homepage, Google Analytics was 33% of the page size. About a third of the homepage was in fact tracking code…

Even with asynchronous loading and HTTP/2 it slows the page load down, as you would expect. While I have used tricks such as serving the initial JS locally so I can leverage browser caching, it’s still overhead. Removing it removes a lot of the bloat on many pages.

It’s also third-party code loading from a remote domain which, while high performing, does occasionally slow down or, worse, go down. Maintaining performance and security is one of my goals, and I would rather have a fast site than one where I know every click a user has ever made.

The final straw

Well, I haven’t looked at my stats for nearly 3 months, which shows how totally pointless they are for me. Even when I realised I hadn’t looked, I didn’t go and check. I hope I haven’t been super popular!

So what now?

Well, I did look at Matomo (formerly Piwik) and a bunch of others as potential alternatives, but most had similar issues to Google Analytics with no massive benefits.

I am, however, looking forward to testing Kownter by my friend Ross Wintle when it’s in a beta state, as I think we have many similar thoughts. In a similar vein, Fathom looks interesting and worth playing with when it’s a bit further along.

For now I’m relying simply on my access logs for some basic analytics, with GoAccess as a client rather than reading the log files themselves. At the moment my logs are rotated weekly and stored for 28 days max, meaning I have very short-term analytics. I was pondering slurping the data out, anonymised, with IPs removed and replaced simply with country as per MaxMind, into an ELK stack for future use.
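As a rough idea of what that anonymising step could look like, here is a minimal Python sketch that swaps the client IP at the start of each access log line for a country code before the line goes anywhere else. It is an illustration under assumptions, not my actual pipeline: the log line is a made-up example in the common log format, and `demo_country_of` is a hypothetical stand-in for a real GeoIP lookup (for example against a MaxMind database).

```python
import re
from typing import Callable, Optional

# The client address is the first whitespace-delimited field of a
# common/combined log format line.
LOG_LINE = re.compile(r"^(?P<ip>\S+) (?P<rest>.*)$")


def anonymise_line(line: str, country_of: Callable[[str], str]) -> Optional[str]:
    """Replace the leading client IP with a country code.

    Returns None for lines that cannot be parsed, so the caller can drop
    them rather than risk passing a raw IP downstream.
    """
    match = LOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    return f"{country_of(match.group('ip'))} {match.group('rest')}"


def demo_country_of(ip: str) -> str:
    # Hypothetical lookup table; a real version would query a GeoIP database.
    return {"203.0.113.7": "GB"}.get(ip, "ZZ")


if __name__ == "__main__":
    # Made-up log line using a documentation-only IP address.
    sample = '203.0.113.7 - - [01/Jan/2018:00:00:00 +0000] "GET / HTTP/1.1" 200 512'
    print(anonymise_line(sample, demo_country_of))
```

Passing the lookup in as a function keeps the sketch self-contained and means the IP never needs to be stored once the country has been resolved.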

I’m not a hundred percent comfortable with that, but it’s a potential solution. First, however, I want to see if I can live with just the logs as a temporary measure.

Should you dump Google Analytics?

Of course not, that would be stoooopid. Being serious for a second, this is a personal choice made on a site where the ramifications of removal are minimal. I help manage e-commerce sites, and Google Analytics is staying on them, at least in the mid term. Why? Because for e-commerce, knowing where sales are coming from, what’s driving traffic, how much is being spent, and being able to tie advertising spend to customer purchases is vital for an online store to succeed. Going blind would be silly, and just using access logs won’t cut it. In this scenario and many others, having analytics is important and not something you just dump. However, it is worth looking at the options. Matomo is one such alternative, and others are out there.

Like most things, there is a balancing act. While I wrote this article I was for the most part sitting in a hipster Leeds cafe. Sitting two seats down were three gents discussing their new site and marketing plan. The youngster of the group was extolling the virtues of using Google Analytics and discussing how important such data was for a range of reasons; he was so excited at the idea.

The irony of what I was typing made me smile. I think they were just a little freaked out as I sipped my coffee, listening.