How Conversions are Modeled in Google Analytics 4
In July 2021, Google's analytics and advertising products started applying machine learning models to attribute online conversions to a marketing channel. In this post I will explain when conversion modeling is applied, and how it impacts the reporting in Google Analytics 4.
The Future of Digital Marketing Data
Before we get into the weeds of conversion modeling, it's helpful to understand what the leaders at Google think about the future of digital marketing data. At Google Marketing Live in May 2021, a common message among speakers was that all data should fit into one of two categories:
- Consented, first-party data that has not been fragmented by browser restrictions
- Modeled data
Vidhya Srinivasan (Google’s VP of Ads Buying, Analytics and Measurement) put it this way:
The future is based on first-party data.
The future is consented.
And the future is modeled.
All of the data that Google uses for reporting, optimizing campaigns, and automated bidding will fall into one of those two categories. And, if you are running a legacy version of Google Analytics today (such as Universal Analytics), the majority of data that appears in your reports does not fit into either of those two categories.
For example, how many of your customers using Safari or any iOS device appear as new users whenever more than 7 days pass between their visits to your website? How many customers who use multiple devices appear as different users in your reports? And what about the customers who have not consented to advertising cookies, who leave a black hole in your data that you know nothing about?
So How Does Conversion Modeling Work?
Conversion modeling is different from attribution modeling. The purpose of conversion modeling is to fill the gaps in your data that are created by cookie restrictions and cross-device behavior.
Google has been talking about conversion modeling since August 2020 (two months before Google Analytics 4 was officially released). In the article "Why conversion modeling will be crucial in a world without cookies", Philip McDonnell described conversion modeling this way:
Conversion modeling refers to the use of machine learning to quantify the impact of marketing efforts when a subset of conversions can’t be observed.
It works by analyzing the subset of your users who generated high quality conversion data (what makes up the orange half of the pie above) to identify correlations and trends between key data points, and then using the behavior of this subset to fill data gaps in the larger population.
If you've read my post on Google Signals, you may remember that Google receives high quality conversion data from users who meet the following criteria:
- Using Chrome
- Consented to analytics and ad storage (if required)
- Logged into a Google account with ads personalization enabled
Where You See Modeled Conversions in Google Analytics
In the past, Google Analytics conversions that could not be attributed to a marketing channel defaulted to “direct”. With conversion modeling, Google Analytics 4 is able to apply the trends observed within the high quality conversion data to conversion events that cannot be connected with previous engagement events.
Conversion modeling will not change the total number of conversions collected by Google Analytics 4, but it will change the channels that those conversions are attributed to.
| Channel | Before conversion modeling | After conversion modeling |
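To make the reattribution idea concrete, here is a minimal sketch in Python. Google does not publish the details of its model, so this is not its method: it assumes a simple proportional allocation, where conversions that cannot be linked to a channel are spread across channels according to the mix seen in the high quality conversion data. The function name `reattribute`, the channel names, and all of the numbers are hypothetical.

```python
def reattribute(attributed, unlinked, reference_counts):
    """Spread unlinked conversions across channels in proportion to the
    channel mix seen in a reference (high quality) subset.

    This is an illustrative assumption, not Google's actual model.
    """
    reference_total = sum(reference_counts.values())
    if reference_total <= 0:
        raise ValueError("reference_counts must contain at least one conversion")

    modeled = dict(attributed)
    for channel, count in reference_counts.items():
        share = count / reference_total
        modeled[channel] = modeled.get(channel, 0.0) + unlinked * share
    return modeled


# Hypothetical figures: 80 conversions with a known channel, 20 without.
attributed = {"paid_search": 40.0, "organic_search": 30.0, "direct": 10.0}
unlinked = 20.0

# Channel mix among users with complete, high quality data (hypothetical).
reference_counts = {"paid_search": 50, "organic_search": 30, "direct": 20}

# Before modeling, every unlinked conversion defaults to "direct".
before = dict(attributed)
before["direct"] += unlinked

after = reattribute(attributed, unlinked, reference_counts)

for channel in before:
    print(f"{channel}: before={before[channel]:.0f}, after={after[channel]:.0f}")
```

In this toy example the total stays at 100 conversions in both columns. Only the split between channels changes, with "direct" shrinking and the other channels growing.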
How Modeled Conversions Benefit Marketers
Because modeling moves conversions out of "direct" and into the channels that most likely drove them, modeled conversions provide marketers with a more accurate view of campaign performance.
For example, if you spent $1,000 on a paid search campaign, you might receive 40 observed conversions attributed to your campaign, and another 40 observed conversions attributed to "direct". This would mean that you paid $25 per conversion.
However, when the conversion model evaluates the data, it may determine that 20 of the conversions attributed to "direct" should actually have been attributed to paid search. That would raise the number of conversions attributed to paid search to 60 (the overall total stays at 80) and reduce the cost per conversion to $16.67.
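The arithmetic in this example can be checked with a few lines of Python. The figures come from the example above. The helper name `cost_per_conversion` and the variable names are just for illustration.

```python
def cost_per_conversion(spend, conversions):
    """Return spend divided by the number of conversions."""
    if conversions <= 0:
        raise ValueError("conversions must be positive")
    return spend / conversions


spend = 1000.0             # paid search budget in dollars
paid_search_observed = 40  # conversions attributed to the campaign
direct_observed = 40       # conversions attributed to "direct"
reassigned = 20            # conversions the model moves from "direct" to paid search

cpc_before = cost_per_conversion(spend, paid_search_observed)

paid_search_modeled = paid_search_observed + reassigned
direct_modeled = direct_observed - reassigned
cpc_after = cost_per_conversion(spend, paid_search_modeled)

print(f"Before modeling: ${cpc_before:.2f} per conversion")  # $25.00
print(f"After modeling:  ${cpc_after:.2f} per conversion")   # $16.67
print(f"Total conversions: {paid_search_modeled + direct_modeled}")  # 80
```

The total number of conversions is 80 before and after. The cost per conversion drops only because more of them are credited to paid search.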
Privacy Implications of Conversion Modeling
Conversion modeling does not rely on individual user data. Instead, models are trained on aggregated data, such as historical conversion rates, device type, browser, and location, to predict the likelihood of conversions.
Executives at Google have repeatedly announced that they will not build alternative identifiers to track individuals as they browse across the web once third-party cookies are phased out in 2023. Conversion modeling is a tool that allows them to keep that commitment as cookie restrictions tighten over the next few years, while continuing to enable marketers to measure performance.
How to Start Using Conversion Modeling
There is no toggle to enable or disable conversion modeling in Google Analytics 4 at this time (or in any other Google product that relies on this feature). Instead, reports in Google Analytics 4 will automatically apply conversion modeling to data that meets the criteria defined above.
Reports that incorporate modeled data are indistinguishable from any other report, and you do not have the ability to separate modeled data from strictly observed data.
Modeling Beyond Conversions
The idea of modeled data showing up in Google Analytics makes many marketers uncomfortable, because it requires us to trust a process that we do not see or understand. In the future, however, modeled data is going to start appearing more and more in Google Analytics 4.
In an article published in September 2021, Senior Director of Product Management Saurabh Sharma described several other areas where machine learning is currently, or will soon be, applied to Google Analytics 4 data. The most exciting of these is called "Behavioral Modeling" with consent mode.
Behavioral Modeling with Consent Mode
Behavioral modeling with consent mode is very similar to conversion modeling, but it could be applied to a wide range of behavioral metrics.
With consent mode in place, when a user denies analytics storage within the consent management tool, no additional cookies will be written or modified, and no existing cookies will be used. Google Analytics 4 will continue to send events from the browser, but these will be flagged and will no longer be linked to this user (the device/client ID is randomized). As I write this in September 2021, any data collected while the user has denied analytics storage is filtered out of your reports in Google Analytics 4.
When behavioral modeling rolls out, however, Google Analytics 4 will begin modeling the flagged events to help marketers approximate how users are likely behaving within the website or app. The process is similar to conversion modeling, where inferences will be drawn from the activity of a subset of similar users with complete data, and applied to those users where data is incomplete. The goal of behavioral modeling is to enable marketers to answer questions like, “How many new users did I acquire from my last campaign?” even if a portion of those users have chosen not to allow analytics cookies.
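As a rough illustration of the idea, the sketch below estimates how many users sit behind the flagged events by borrowing the events-per-user ratio from users with complete data. This is a deliberately naive ratio estimate under my own assumptions, not Google's behavioral model, which has not been described in detail. The function name `estimate_total_users` and all of the numbers are hypothetical.

```python
def estimate_total_users(consented_users, consented_events, unconsented_events):
    """Estimate total users by assuming users who denied analytics storage
    generate the same number of events per user as those who consented.

    A toy assumption for illustration only, not Google's method.
    """
    if consented_users <= 0 or consented_events <= 0:
        raise ValueError("the consented subset must contain users and events")

    events_per_user = consented_events / consented_users
    modeled_users = unconsented_events / events_per_user
    return consented_users + modeled_users


# Hypothetical campaign: 800 identifiable users sent 4,000 events, and another
# 1,000 flagged events arrived without any user identifier.
estimated = estimate_total_users(
    consented_users=800,
    consented_events=4000,
    unconsented_events=1000,
)
print(f"Estimated users acquired: {estimated:.0f}")  # 1000
```

A real model would presumably account for differences between the two groups (device, location, and so on) rather than assume identical behavior.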
It is unclear when behavioral modeling will roll out to Google Analytics 4, so sign up for my newsletter to stay on top of product releases as they are announced.
Where to Read More
- Google's official help documentation on conversion modeling
- Get comfortable with more types of machine learning models – Saurabh Sharma, September 2021
- Why conversion modeling will be crucial in a world without cookies – Philip McDonnell, August 2020