We should start looking for Google Analytics alternatives
Jun 21, 2022 - 20 min read
Briefly about Google Analytics and cookie-based tracking
The most important takeaway from this post: In 2022 and beyond I strongly believe that it is key to own the data yourself. Not being married to a system or a service, but having full control of what you collect, where you store it, who has access to it, and what you do with it. This is relevant both in terms of being GDPR compliant and getting the most out of your data in a responsible way. The goal is to get rid of the data-handling middleman and reach a higher level of transparency and control.
Google Analytics has been the go-to method for tracking traffic on websites for years. It has been implemented on ~28 million websites, and 87% of the 10,000 most popular sites are using it. (source: https://trends.builtwith.com/analytics/Google-Analytics)
Those numbers are crazy. But if you look at the product, it makes sense. It's very easy to implement, it's free, and it just works. It might not be 100% accurate, but it has been good enough for a lot of website owners for well over a decade. You do get a lot of insights completely free of charge and without any effort.
The big question is: is it still good enough? I don't think so.
It is no longer good enough, and that is for several reasons:
- When you get something for free, you should ask yourself: "What's the catch?"
- With the new cookie law, you need consent from the users before you can track them with cookies. (Google Analytics is cookie-based tracking)
- Even though Google Analytics sets a 1st party tracking cookie, some would argue that it is a marketing cookie.
- Personal data is potentially transferred to "insecure" third countries.
- Adblockers can block the tracking.
- You might lack control over the data, what's collected, where it's stored, who it's distributed to, and so on if you don't own the system.
To get the most out of your data, you need to have more control. You need to be able to select which data points you collect, where you store them, and how to access them. It is important to be able to pull the data out of the analytics tool and utilize it elsewhere in the business. Being able to do that opens up a wide range of possibilities, where you can combine data from different sources and channels.
So the 2 big problems with Google Analytics:
- GDPR compliance
- Data quality and control
New Cookie law (or ePrivacy Directive)
With the rollout of GDPR, the cookie law (the ePrivacy Directive) came into focus as well. It made marketing life harder because it is no longer possible to follow users around the internet like before.
So suddenly the users have a choice. Before this change websites usually had a popup saying it was using cookies and your only option would be to click "ok".
The rules have been there for a long time, but it's only recently that they have started to be enforced. The Google Analytics cookie is a statistics cookie, so some users will say ok, but it's not a necessary cookie. The thing is that a lot of users will block it, so the number of trackable users for Google Analytics will drop significantly.
Another issue that is not directly related to cookie-based tracking, but is still relevant, is the rules of the General Data Protection Regulation (GDPR).
For Google Analytics, there's an option to track users anonymously. It helps, but the website owner still depends on a 3rd party system.
To reach GDPR compliance you need to be able to honor the users' rights: the right to be informed, the right of access, the right to rectification, and the right to erasure.
This means that you need to be able to send and/or delete the data you store. So before you depend on a 3rd party system, you need to make sure that these features are supported.
With 3rd party tracking systems, like Google Analytics, you do not know for sure what is being tracked, and you cannot retrieve it or delete it for individual users. Remember that an IP address is personally attributable data.
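When you own the storage, sending a user their data or deleting it becomes a plain database operation. Here is a minimal sketch, assuming pageviews are kept in a SQLite table keyed by a visitor ID. The table, its columns, and the function names are made up for illustration and do not come from any particular tool:

```python
import json
import sqlite3


def create_store(path=":memory:"):
    """Create a tiny pageview table in a database we control."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pageviews ("
        "visitor_id TEXT NOT NULL, path TEXT NOT NULL, viewed_at TEXT NOT NULL)"
    )
    return conn


def record_pageview(conn, visitor_id, path, viewed_at):
    """Store one pageview for a visitor."""
    conn.execute(
        "INSERT INTO pageviews (visitor_id, path, viewed_at) VALUES (?, ?, ?)",
        (visitor_id, path, viewed_at),
    )
    conn.commit()


def export_visitor_data(conn, visitor_id):
    """Access request: return everything stored about one visitor as JSON."""
    rows = conn.execute(
        "SELECT path, viewed_at FROM pageviews "
        "WHERE visitor_id = ? ORDER BY viewed_at",
        (visitor_id,),
    ).fetchall()
    return json.dumps([{"path": p, "viewed_at": t} for p, t in rows])


def erase_visitor_data(conn, visitor_id):
    """Deletion request: remove every row for one visitor, return the row count."""
    cursor = conn.execute(
        "DELETE FROM pageviews WHERE visitor_id = ?", (visitor_id,)
    )
    conn.commit()
    return cursor.rowcount
```

The point is not SQLite itself, but that both operations are a single query once the data lives somewhere you can reach.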
If you choose Google Analytics and set it up to only track users anonymously, then you might be GDPR compliant, but you can't know for sure.
That is due to another small twist: data that does not look like personally attributable data can turn out to be personal data if mixed with other data. An example could be if you collect unique IDs on users, but without any reference to their email, IP, or whatever. You might collect this to distinguish between unique visitors and returning visitors. If this data is mixed with other data, where you can match the ID to an email, then this unique ID will in fact become personally attributable data.
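To make that twist concrete, here is a small illustrative sketch. All IDs, paths, and the email address are invented, and the second dataset stands in for any system that stores the same ID next to an email:

```python
# Analytics rows: only a random-looking ID and a path (hypothetical data).
pageviews = [
    {"visitor_id": "a91f", "path": "/pricing"},
    {"visitor_id": "a91f", "path": "/checkout"},
    {"visitor_id": "c07d", "path": "/blog"},
]

# A second dataset, e.g. from a login or order system, that happens to
# store the same ID next to an email address (also hypothetical).
accounts = [{"visitor_id": "a91f", "email": "jane@example.com"}]


def link_pageviews_to_emails(pageviews, accounts):
    """Join the two datasets on visitor_id, which ties browsing to a person."""
    email_by_id = {a["visitor_id"]: a["email"] for a in accounts}
    return [
        {"email": email_by_id[pv["visitor_id"]], "path": pv["path"]}
        for pv in pageviews
        if pv["visitor_id"] in email_by_id
    ]


linked = link_pageviews_to_emails(pageviews, accounts)
```

On its own, "a91f" says nothing about who the visitor is. After the join, every pageview with that ID belongs to a named person, while "c07d" stays unlinked only as long as no matching record shows up.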
What options do we have?
All the above points us in the direction that we need to look for alternatives. Google Analytics is no longer a viable option.
There are a few technical alternatives we should look into:
- Tracking on the server-side
- Manual tracking
- GDPR compliant cookie-based tracking
One option I see is to move the collection of pageviews from the client-side to the server-side. Traditional tracking is collected through the user's browser, while with server-side tracking you do not depend on the client's browser but can track requests on your web server.
When you track on the server side you will get a more accurate dataset compared to cookie-based tracking, simply because you do not rely on the user's browser settings, adblockers, cookie consent, and so on.
One issue with server-side tracking is that you do not have the same data points available as you do on the client side. So your data might be accurate but not as detailed.
When a client hits the web server, the request log on the webserver will contain information about the client:
- URL / path
The above is actually enough to track page views with different sub-values like device and location. So very basic tracking is available by using server-side tracking alone. With basic server-side tracking it is hard to distinguish between unique visitors and returning visitors. A solution could be to generate a unique ID based on the data above and use this ID to determine if it's a new or returning visitor.
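Here is a rough sketch of that idea. It assumes the web server writes the widely used "combined" access log format, which includes the client IP and the user agent, and it derives the ID by hashing those two values together with a salt. The hashing scheme, the salt, and the function names are my own assumptions, not how any specific tool does it:

```python
import hashlib
import re

# Regex for the "combined" access log format (an assumption about the server setup).
COMBINED_LOG = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return the fields of one access log line as a dict, or None if it does not match."""
    match = COMBINED_LOG.match(line)
    return match.groupdict() if match else None


def visitor_id(ip, user_agent, salt):
    """Derive a short pseudonymous ID from request data plus a salt we control."""
    digest = hashlib.sha256(f"{salt}|{ip}|{user_agent}".encode("utf-8"))
    return digest.hexdigest()[:16]


def count_visitors(lines, salt):
    """Count pageviews per derived visitor ID; a count above 1 means a returning visitor."""
    counts = {}
    for line in lines:
        fields = parse_log_line(line)
        if fields is None:
            continue  # skip lines that are not in the expected format
        vid = visitor_id(fields["ip"], fields["user_agent"], salt)
        counts[vid] = counts.get(vid, 0) + 1
    return counts
```

Changing the salt (for example once a day) means the same visitor gets a new ID afterwards, which trades long-term recognition for less traceability. Keep in mind that the ID is still derived from the IP address, so the questions about personal data apply here too.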
Using cookies to track events in the frontend gives us a few more options for what to track. We can create events that will track certain behavior.
This is also possible with manual tracking, where you fire an API call with the relevant data from the frontend on an event. (That could be pressing a button, watching a video, etc.) We do have to be aware of the performance hit our site can potentially take by implementing this.
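The receiving end of such an API call can be very small. Below is a minimal sketch of a collection endpoint written as a plain WSGI app, using only the Python standard library. The event names, the JSON shape (`event` and `path`), and the in-memory list used as storage are all assumptions made for the example:

```python
import json

ALLOWED_EVENTS = {"button_click", "video_play"}  # hypothetical event names
collected_events = []  # stand-in for a real database table


def track_event_app(environ, start_response):
    """Minimal WSGI app: accept POSTed JSON events and keep only known fields."""
    if environ.get("REQUEST_METHOD") != "POST":
        start_response("405 Method Not Allowed", [("Allow", "POST")])
        return [b""]
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length))
        name = payload["event"]
    except (ValueError, KeyError, TypeError):
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"invalid event"]
    if not isinstance(name, str) or name not in ALLOWED_EVENTS:
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [b"unknown event"]
    # Store only the data points we decided to collect, nothing else.
    collected_events.append({"event": name, "path": str(payload.get("path", ""))})
    start_response("204 No Content", [])
    return [b""]
```

For a local try-out, the app can be served with `wsgiref.simple_server.make_server("", 8000, track_event_app).serve_forever()`. Anything the frontend sends beyond `event` and `path` is dropped, which keeps the decision about what gets collected on the server where you control it.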
So a combination of 1st party cookies or manual tracking and server-side tracking could be the way to go. You can store the data in the same place, and control what is being collected, and how and where it is stored.
An example of the combination could be a webshop where a user logs in or places an order. With a cookie in the browser or some manual tracking, we would then be able to see the pages that a particular user has browsed. This could also be achieved by pairing analytics data with data from the CRM. But it does require that we attach some kind of unique ID to each pageview that will determine if it's a new or returning user.
One thing is very important though: every time you collect a piece of data to store, you need to ask yourself the following questions:
- Is this personal data?
- If I mix the data with other data, is the person identifiable then?
- If it is and it's ok, then ask yourself, why do I store this? Do I have a legit reason to do so? And where do I store it? Can I retrieve it easily?
- What kind of consent did your user give?
So how do we implement server-side tracking? And combine it with 1st party cookie tracking?
A closer look at server-side tracking and alternatives
There are quite a few ways to achieve this. I do not have experience with all of them, but I did try out a few different ones.
I will briefly go through the ones I've tried. This is by no means an exhaustive list, just a few cheap or open-source tools I've stumbled across.
One thing to keep in mind when looking into tools like this is whether they offer a self-hosting option and whether they are GDPR compliant.
Snowplow Analytics, Google BigQuery, and Data Studio
I added Snowplow Analytics to a site with a lot of traffic. It was a very basic implementation, where data is collected with Snowplow, stored in Google BigQuery, and visualized in Google Data Studio. The data is collected from the caching/web server combined with a client-side tracker.
It is pretty manual. You do get some stuff out-of-the-box, but it is very flexible and you can choose to store it wherever you like. They also have a cloud service where you can store it. The options for what to store and how to build your data models seem endless.
Snowplow is open source and free. You do have to pay for storage no matter if you select their cloud solution or store it elsewhere.
- Initial implementation went pretty smoothly
- Very customizable
- You can use whatever visualization tool you want since you have the raw data in the BigQuery database.
- Can combine client-side and server-side tracking
- Storing the data can be a bit expensive if you have a lot of traffic (and store a lot of data)
- Visualizing the data in Google Data Studio seems to work fine, but with limited options.
- The great flexibility has a cost in manual work and implementation complexity.
I tested Matomo on another site with a fair amount of traffic.
It is free if you host it yourself. There is a cloud hosting option at Matomo for those who would like that. And they have added extra things you can purchase, like Activity Log, WooCommerce Analytics, Search Engine Keywords Performance, Funnels, Users Flow, Whitelabel, Heatmap & Session Recording, and so on.
- Seems very feature-rich
- Free to use
- Initial implementation seemed easy
- Great initial dashboard/visualization
- Combines client-side and server-side tracking
- You can access the raw data through their live reporting API
- I can't seem to figure out if it's as flexible as Snowplow
- Building a detailed and useful tracking setup takes time
- Requires a lot of manual work to utilize it fully.
Manual tracking in Rails with Ahoy and Blazer
I did a server-side tracking test in a Ruby on Rails app, where I implemented a tracking gem called Ahoy, with Blazer for visualization. It is very easy to set up, but a bit hard to use. Blazer can do a very basic visualization of the data if you know your SQL queries.
It is free and open-source.
- Easy to set up
- Very flexible
- Very manual
- Not a lot of extra features/options
Besides the ones above, I've briefly looked into a few others, which I, unfortunately, haven't had the time to try out on real websites with traffic.
Woopra is interesting too. They combine client and server-side tracking if you implement their SDK.
I did notice that their tracking cookie gets caught by adblockers though. It has a very nice interface, and the visualization is pretty customizable. They also have a lot of integration options for different apps, like Facebook, ActiveCampaign, BigQuery, MailChimp, Salesforce, Slack, and so on.
It seems like you can get the data out of Woopra, but it is unclear where it is stored by default. Before choosing Woopra, I would make sure I have full control of the data. I would check if I can access the raw data and if I can choose to store it in my database or at a cloud service where I am the only one who has access.
Fair Analytics is a very lightweight tracking tool with a heavy focus on privacy and GDPR. It's very easy to implement and is very similar to a lightweight version of Google Analytics. And it is not using cookies!
They do store the data for you, so you have to be aware of that. But the servers are hosted in Germany, on green electricity, and certified according to ISO 27001.
This would be a good choice for websites that need a GDPR-compliant, privacy-by-default alternative to Google Analytics that is easy to implement.
This is a very lightweight solution.
I am biased in terms of this one because I simply love Cloudflare. It's just an awesome product! Their analytics tool, Cloudflare Web Analytics, is a privacy-first service that is very easy to set up even if you are not using Cloudflare already. Their tracker did get caught by my adblocker, but it does not set a cookie in the browser. Cloudflare is measuring traffic on your website at the edge, so it is supposed to be very accurate.
I did not see an option to access the raw data or choose where to store it. It is privacy first, but your options are limited.
An even shorter list of tools that I want to try out.
- Plausible: https://plausible.io/ - I heard about them in episode #2 of The Craft of Open Source podcast. As a privacy-first Google Analytics alternative, this seems like a very good option.
- Netlify analytics for my Jamstack sites: https://www.netlify.com/products/analytics/ - I love Netlify, it's awesome. So far I've only used it for small hobby project sites, but I want to take a deeper dive where I investigate their analytics too.
- Poeticmetric: https://www.poeticmetric.com/ - a privacy-first, no cookie, and no personal data tracking tool.
What to choose
The answer is of course: it depends... The list above only scratches the surface of a few of the options out there. It is a field in progress with new tools popping up every day.
Fair Analytics is an easy-to-implement lightweight solution for those who are just interested in the very basic data. You can see page views, devices, and so on. There are not a lot of integration options, but the basic features are enough to cover the most basic needs.
Matomo and Snowplow fall into the category of heavier alternatives. I would advise you to look into them, but only implement one if you have a plan and a data strategy that supports the extra complexity it adds. It is possible to do a lightweight implementation of the two though, where you just utilize the features that work out of the box.
Manual tracking also fits some use cases. If you are using one of the bigger CMSs for your site, it might provide data collection as a module or extension, which can be a viable option too.
Plausible.io seems to be the one that fits in between categories as a flexible lightweight solution. It is open-source, so you are free to inspect the code. You can self-host the solution, so you do have the option of complete ownership of the data. It does also seem very flexible and is getting a lot of attention.
No matter what, I would advise everyone to do some research in terms of needs and afterward look at the options, before selecting a tool for tracking.
It is important to have some clear goals defined before selecting your analytics tool.
Make sure you understand the needs and do not overengineer.
It can be very costly and very complex, especially without a clear strategy, plan, a strong team, and so on. Like any other project, really. So some of the questions you could ask yourself are: Do we need in-depth tracking with a lot of custom options? And connections to different marketing channels? Or do we just need to track pageviews on the website? Do we have data elsewhere, where it would create value to combine the data? Like a CRM or something else.
It might not matter that much which tool you choose, as long as it gets the job done in a responsible manner. Again: it is very important to have control of what is being collected and own the data yourself.
Conclusion and next steps
I have done some brief research on a range of different analytics and tracking tools. Getting my hands dirty is always the first step when I try out new tools. So the next step is gathering even more (and even more relevant) data. The goal of the next steps is to reach a point where the data can be visualized and used in other areas of the business.
But how do we get the data to be usable in the business? Besides visualization, there is a need for integrations with other systems. Lately, I've seen CDPs (Customer Data Platforms) popping up, and they do seem to promise to solve a lot of the problems I am facing.
Collecting data from a wide range of sources (web, app, CRM, whatever) in ONE platform, from where you can make integrations to your marketing tools, economic tools, personalization tools, and so on. Does it sound too good to be true?
I will continue my journey and explore a few different CDPs and try to do it the manual way. It usually gives me a lot of insight and helps me innovate.
So what do we expect from a CDP? Visualization of data from different sources, but combined. So I would like to see a customer's orders and which URLs/products he visited. Is he getting the newsletter AND reading it? Does he ever click on anything? And if so, what kind of products? Also, it would be great to label users with tags and create segments. Maybe even define your segments and then auto-add users to them based on historical data.
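To show what I mean by auto-adding users to segments, here is a small sketch of the manual way. It assumes the data from the different sources has already been merged into one profile per customer. The profile fields, segment names, and thresholds are all invented for the example:

```python
from dataclasses import dataclass, field


@dataclass
class CustomerProfile:
    """One customer with data merged from several (hypothetical) sources."""
    customer_id: str
    orders: int = 0
    visited_paths: list = field(default_factory=list)
    newsletter_opens: int = 0
    tags: set = field(default_factory=set)


# Segment rules: a name plus a predicate over the merged profile.
SEGMENT_RULES = {
    "repeat_buyer": lambda p: p.orders >= 2,
    "engaged_reader": lambda p: p.newsletter_opens >= 3,
    "window_shopper": lambda p: p.orders == 0 and len(p.visited_paths) >= 3,
}


def assign_segments(profiles):
    """Tag each profile with every segment whose rule matches its historical data."""
    for profile in profiles:
        for name, rule in SEGMENT_RULES.items():
            if rule(profile):
                profile.tags.add(name)
    return {p.customer_id: sorted(p.tags) for p in profiles}
```

Because the rules only look at stored history, adding a new segment later and re-running the function tags existing customers retroactively.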
I hope to get to a level where I can seamlessly create different segments based on all the data points I have from the different systems. And from there, I would like to be able to expose these users to either Facebook adverts of a certain type, a newsletter, maybe a push message, and if possible some personalization on their app/web. Maybe even understand their pattern for purchasing and visits, so I get the most value out of both the marketing money and the time the users spend on my website.
Do I have too high expectations? I don't think so. The data is there and we have the tools. It is just about combining the 2... and in a responsible way of course!
Times are changing and we need to adapt. For ages, we have been used to a free web analytics tool that just works. Now that this is in the past, we need to readjust and figure out how to turn the new expense of a web analytics tool into an advantage and a valid business case.
Thank you for reading my unstructured thoughts about the next steps for tracking and the first steps for data setups. I hope I managed to stir up some thoughts or even inspire some new ideas.