One of the concerns we keep hearing from our clients is, “why don’t GA4 and Universal Analytics match?!?” The first thing that enters my head when I hear this question is, “did you actually think Google Analytics was the truth?”
I wrote a blog post that gets into the mechanics of why web-based tracking is never an accurate reflection of what actually happened, but the TL;DR version is that digital analytics is good at capturing events that get sent to the analytics server. Metrics like “users” and “session duration” are fictions based on choices made by developers. Engage in a technical conversation with an analyst, and before long you are likely to hear, “it’s directionally accurate.”
That said, there are some practical reasons why you are likely to see differences between GA4 and Universal. Some are fixable and some are not. Following are the top causes of discrepancy we’ve encountered as we have set up and troubleshot hundreds of GA4 properties.
- GA4 and UA are configured differently – people often assume that GA4 is the problem, but in my experience the number one cause of discrepancy is that the two properties are simply not set up the same way.
- Filters – GA4 doesn’t have views like Universal Analytics, and only supports filtering by IP address. Actually filtering traffic by IP in GA4 is also quite confusing.
- Tag coverage – it’s fairly common for Universal Analytics tags to have been implemented directly on a site rather than through Tag Manager. We often find gaps between the pages covered by UA tags and the pages covered by GA4 tags.
- Different event triggers and rules – since GA4 is an entirely new platform and requires a fresh start, it’s easier to get things right there, while an older UA setup tends to carry triggers and rules that no longer match how the site works. I’m working with one client right now whose UA setup predates the Internet, I’m pretty sure.
- 3rd-party add-ons – in my experience, chat widgets, ecommerce platforms, form plugins, etc. that have built-in GA tracking features behave a little or a lot differently. For example (and I still can’t believe this), Shopify calculates revenue differently for GA4 vs. Universal Analytics with its native integration.
- Cookie-consent – I’ve been troubleshooting a number of cookie-consent setups recently and so far I haven’t encountered one that’s set up properly. Different consent behavior between UA and GA4 can result in significant differences.
- GA4 and UA calculate things differently – sessions and pageviews are not that different between the two, but how users are identified has changed and any metric related to engagement is totally different. These two videos cover a lot of the changes:
- Thresholding – GA4 applies thresholding when it deems there is a risk of compromising personal information.
For example, dimensions such as search queries, age and gender could theoretically be used to narrow a report down to individual users. I can live with thresholding when reporting on demographic data, which makes sense to me, but thresholding can also result in event counts zeroing out even when you are not using demographic dimensions.
This video has a good explanation of thresholding and a clever trick for getting around it (sometimes):
- Spam filtering – I’ve found that Universal Analytics does a better job of spam filtering, especially if the ‘Exclude all hits from known bots and spiders’ setting is enabled. I’ve seen situations where there is a huge traffic spike in GA4 resulting from spam, and no corresponding spike in UA. Sometimes, you can apply a report-level filter to get rid of it, if the traffic comes from a specific browser, location or other identifiable dimension.
- Cardinality – cardinality can be an issue when you have a dimension with a lot of unique values, such as Page path. Google indicates that a row limit will kick in when a dimension has more than 500 values, grouping values below row 500 into (other), but I’ve seen an (other) row show up with a lot fewer rows than that.
- Sampling – reports based on sampled data were an annoying problem in Universal Analytics, but I rarely run into sampling in GA4. Google says that reports only get sampled when they are based on more than 10 million events, which is pretty hard to reach for most sites. If you do run into sampling, you can usually shorten your date range to make it go away.
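If you want to check the tag coverage point and put a number on “directionally accurate,” a quick comparison of exports from each property can help. The following is only a minimal sketch in Python: the function names `coverage_gaps` and `relative_difference` are hypothetical, and the page paths and session counts are made-up sample values, not real client data. It assumes you have already exported the list of page paths and a session total from each property for the same date range.

```python
from typing import Dict, Set


def coverage_gaps(ua_paths: Set[str], ga4_paths: Set[str]) -> Dict[str, Set[str]]:
    """Return the page paths that were recorded by only one property."""
    return {
        "ua_only": ua_paths - ga4_paths,    # pages with UA data but no GA4 data
        "ga4_only": ga4_paths - ua_paths,   # pages with GA4 data but no UA data
    }


def relative_difference(ua_value: float, ga4_value: float) -> float:
    """Difference of the GA4 value from the UA value, as a fraction of UA."""
    if ua_value == 0:
        raise ValueError("UA value is zero; relative difference is undefined")
    return (ga4_value - ua_value) / ua_value


# Hypothetical page paths, as if exported from each property's pages report.
ua_paths = {"/", "/pricing", "/blog", "/contact"}
ga4_paths = {"/", "/pricing", "/blog", "/checkout"}

gaps = coverage_gaps(ua_paths, ga4_paths)
print(sorted(gaps["ua_only"]))    # ['/contact']
print(sorted(gaps["ga4_only"]))   # ['/checkout']

# Hypothetical session totals for the same date range.
print(round(relative_difference(1000, 940), 3))  # -0.06
```

Pages that show up on only one side are good candidates for a missing or duplicated tag. A small, steady relative difference is usually nothing to worry about, while a large or erratic one points back to the causes in the list above.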
A different issue that often gets bunched together with data discrepancies is the fact that the GA4 data source for Looker Studio is not nearly as robust as the Universal Analytics data source. There are quite a few metrics and dimensions that exist in GA4, but not in Looker Studio. I’ve got a lot of tricks up my sleeve for getting around this issue, so reach out if this is a problem you are struggling with.