At first glance the User ID seems simple enough, but it can get a bit complex when you start pulling reports. The purpose of this post is to explain how the User ID is used to generate reports. If you need help installing the User ID and making sure that it works properly, take a look at my post on Properly Setting the User ID.
User ID in Summary
To explain how the User ID works in Google Analytics 4, I’m going to break it down into two categories: data collection and reporting.
On the client side, data is collected and stored differently in a mobile app than on a website. On a website you must explicitly set the User ID with every event that fires, but in a mobile app the User ID persists automatically after it is set once. In this way, mobile apps treat the User ID as a special User Property that continues to be sent with every event after it has been set.
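To make the difference concrete, here is a minimal sketch of the two behaviours described above. It is a toy model, not the Google Analytics SDK: the `WebTracker` and `AppTracker` classes, their methods, and the ID values are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WebTracker:
    """Toy website client: the User ID is only sent when passed explicitly."""
    pseudo_id: str
    sent: list = field(default_factory=list)

    def log_event(self, name: str, user_id: Optional[str] = None) -> None:
        self.sent.append(
            {"event_name": name, "user_pseudo_id": self.pseudo_id, "user_id": user_id}
        )


@dataclass
class AppTracker:
    """Toy mobile client: the User ID is stored once and reused afterwards."""
    pseudo_id: str
    user_id: Optional[str] = None
    sent: list = field(default_factory=list)

    def set_user_id(self, user_id: str) -> None:
        self.user_id = user_id

    def log_event(self, name: str) -> None:
        self.sent.append(
            {"event_name": name, "user_pseudo_id": self.pseudo_id, "user_id": self.user_id}
        )


web = WebTracker(pseudo_id="web-1")      # hypothetical pseudo ID
web.log_event("page_view", user_id="12345")
web.log_event("page_view")               # User ID not passed, so it is missing

app = AppTracker(pseudo_id="app-1")      # hypothetical pseudo ID
app.log_event("screen_view")             # before authentication
app.set_user_id("12345")
app.log_event("screen_view")             # carries the stored User ID
app.log_event("screen_view")             # still carries it

print([e["user_id"] for e in web.sent])  # ['12345', None]
print([e["user_id"] for e in app.sent])  # [None, '12345', '12345']
```

The web client loses the User ID as soon as one call omits it, while the app client keeps sending it until it is changed or cleared.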
All events that are generated for a user (with a User ID or not) will pass a User Pseudo ID (read more about the various IDs in my other post about Properly Setting the User ID). If the user authenticates and begins passing a User ID, Google Analytics 4 will use the User Pseudo ID to attribute the User ID to previous events where the User ID was not set.
In the example above, the User Explorer report will only display one “Device ID”, set to “12345”, for both of these events. However, if later events arrive from the same User Pseudo ID without the User ID set, they will not be attributed to the User ID.
In this example, the User Explorer report will show you two Device IDs: “12345” and “abc”. This next part can be counterintuitive: the first two events can be attributed to both of these Device IDs. If you drill down into “12345” you will see two events, but if you drill down into “abc” you will see all 3 events.
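The attribution rule described above can be sketched in a few lines of Python. This is a toy model of the reported behaviour, not GA4's actual implementation, and it assumes that “12345” is the User ID and “abc” is the User Pseudo ID in the two examples. The function name `user_explorer_rows` and the event dictionaries are hypothetical.

```python
from collections import defaultdict


def user_explorer_rows(events):
    """Return {row id: [event numbers]} under the toy attribution rule."""
    # Position at which each pseudo ID first reported a User ID.
    first_seen = {}
    for pos, ev in enumerate(events):
        if ev["user_id"] is not None and ev["user_pseudo_id"] not in first_seen:
            first_seen[ev["user_pseudo_id"]] = (pos, ev["user_id"])

    rows = defaultdict(list)
    unattributed = set()
    for pos, ev in enumerate(events):
        pseudo = ev["user_pseudo_id"]
        if ev["user_id"] is not None:
            rows[ev["user_id"]].append(ev["n"])
        elif pseudo in first_seen and pos < first_seen[pseudo][0]:
            # Earlier anonymous event: back-filled to the later User ID.
            rows[first_seen[pseudo][1]].append(ev["n"])
        else:
            # Later anonymous event: stays with the pseudo ID.
            unattributed.add(pseudo)

    # A pseudo ID with leftover anonymous events gets its own row,
    # and that row lists every event the pseudo ID sent.
    for pseudo in unattributed:
        rows[pseudo] = [ev["n"] for ev in events if ev["user_pseudo_id"] == pseudo]
    return dict(rows)


two_events = [
    {"n": 1, "user_pseudo_id": "abc", "user_id": None},
    {"n": 2, "user_pseudo_id": "abc", "user_id": "12345"},
]
three_events = two_events + [{"n": 3, "user_pseudo_id": "abc", "user_id": None}]

print(user_explorer_rows(two_events))    # {'12345': [1, 2]}
print(user_explorer_rows(three_events))  # {'12345': [1, 2], 'abc': [1, 2, 3]}
```

With two events there is a single row. Once a third event arrives without the User ID, a second row appears for the pseudo ID, and the first two events show up under both rows.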
Using the User ID Across Sessions & Platforms
Let’s walk through a more detailed example. Say that a user follows these steps to create 4 events over 3 unique sessions:
- The user opens your iOS app and views the home screen without being authenticated.
- The user authenticates and views a second screen (User ID is set to “ItsMyNewDevice”).
- The user closes the app and returns 2 hours later without authenticating again (no User ID is set).
- Finally, the same user opens your website where she is already authenticated and views the homepage (User ID is set again).
Here’s an approximation of what these events will look like in BigQuery:
|Event Number|Session Number|Event Name|Platform|User_ID|User Pseudo ID|
There are two things to notice in this chart:
- The User Pseudo ID is different on each device. Again, you can learn more about the various identifiers in my post on Setting the User ID.
- The User ID will persist across sessions on a mobile app automatically. Recall that the User ID was not set in event #3, but it still appears in the data because it persisted on the mobile device. My recommended best practice is still to set the User ID once per authenticated session, but as long as the app data is not deleted, the User ID will persist in this way.
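As a rough illustration, the four events from the walkthrough can be written out as plain data and checked against both observations. Only the User ID “ItsMyNewDevice” and the sequence of steps come from the scenario. The event names, platform labels, and User Pseudo ID values below are placeholders, not real export values.

```python
USER_ID = "ItsMyNewDevice"

# (event number, session number, event name, platform, user_id, user_pseudo_id)
# Event names, platform labels and pseudo IDs are illustrative placeholders.
events = [
    (1, 1, "screen_view", "IOS", None,    "ios-pseudo-id"),
    (2, 1, "screen_view", "IOS", USER_ID, "ios-pseudo-id"),
    (3, 2, "screen_view", "IOS", USER_ID, "ios-pseudo-id"),  # persisted, not set again
    (4, 3, "page_view",   "WEB", USER_ID, "web-pseudo-id"),
]

# Observation 1: each device has its own User Pseudo ID.
pseudo_ids_by_platform = {}
for _, _, _, platform, _, pseudo_id in events:
    pseudo_ids_by_platform.setdefault(platform, set()).add(pseudo_id)

# Observation 2: the User ID is present on event 3 even though the
# user did not authenticate again in that session.
events_with_user_id = [n for n, _, _, _, user_id, _ in events if user_id is not None]

print(pseudo_ids_by_platform)  # {'IOS': {'ios-pseudo-id'}, 'WEB': {'web-pseudo-id'}}
print(events_with_user_id)     # [2, 3, 4]
```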
So How Many Users Will the Example Scenario Create in My Reports?
The standard reports will all show a single user. You can verify this by creating a test property and firing only those example events for a single date.
If you create an Exploration report and set your dimensions to “user_id” and “Event Name” (remember to create a User Property called “user_id”), you will see a list that looks like my table above. All of the events you recorded appear, but the User ID is only listed next to those events that were sent after it was set. However, the “Totals” row will only show 1 active user.
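The gap between the per-row view and the “Totals” row can be mimicked with a small sketch. It is a simplified model of the behaviour described here, not GA4's real user-counting logic, and the event names and User Pseudo ID values are placeholders.

```python
from collections import Counter

# The four example events; pseudo IDs and event names are placeholders.
events = [
    {"n": 1, "event_name": "screen_view", "user_id": None,             "user_pseudo_id": "ios-pseudo-id"},
    {"n": 2, "event_name": "screen_view", "user_id": "ItsMyNewDevice", "user_pseudo_id": "ios-pseudo-id"},
    {"n": 3, "event_name": "screen_view", "user_id": "ItsMyNewDevice", "user_pseudo_id": "ios-pseudo-id"},
    {"n": 4, "event_name": "page_view",   "user_id": "ItsMyNewDevice", "user_pseudo_id": "web-pseudo-id"},
]

# Row view: the user_id dimension is taken from each event as it was sent,
# so the first event falls under "(not set)".
rows = Counter((ev["user_id"] or "(not set)", ev["event_name"]) for ev in events)

# Totals view: resolve every event to one user before counting.
# A pseudo ID that ever carried a User ID is mapped to that User ID.
pseudo_to_user = {ev["user_pseudo_id"]: ev["user_id"] for ev in events if ev["user_id"]}
active_users = {
    ev["user_id"] or pseudo_to_user.get(ev["user_pseudo_id"], ev["user_pseudo_id"])
    for ev in events
}

for (user_id, event_name), count in rows.items():
    print(user_id, event_name, count)
print("active users:", len(active_users))  # active users: 1
```

The rows still show a “(not set)” entry for the first event, while the resolved user count is 1.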
User Explorer Report
Next, if you create a User Explorer report, your dimension will be automatically set to “Device ID”, and you’ll notice that there is now only 1 row because “(not set)” has been removed. Hmm… what about the first event without a User ID?
If you click on the test User ID you created, you will see that the event that previously had a “(not set)” User ID is now correctly attributed to this user! This means that GA used the User Pseudo ID to attribute the User ID to the events that fired before it was set, as I described above.
Warning about the User Explorer report:
If you replicate this test you might be tempted to create a filter on your user_id to find yourself, but you cannot do this. The filter removes all events where the User ID was null, so they will not appear in the detailed User Activity report.
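A short sketch shows why the filter loses data. It treats the filter as a plain equality check on each event's own user_id value, which is an assumption about how the filter behaves, and the pseudo ID values are placeholders.

```python
# The four example events; pseudo IDs are placeholders.
events = [
    {"n": 1, "user_id": None,             "user_pseudo_id": "ios-pseudo-id"},
    {"n": 2, "user_id": "ItsMyNewDevice", "user_pseudo_id": "ios-pseudo-id"},
    {"n": 3, "user_id": "ItsMyNewDevice", "user_pseudo_id": "ios-pseudo-id"},
    {"n": 4, "user_id": "ItsMyNewDevice", "user_pseudo_id": "web-pseudo-id"},
]

# Filtering on the raw user_id value keeps only events that carried it.
kept = [ev["n"] for ev in events if ev["user_id"] == "ItsMyNewDevice"]
dropped = [ev["n"] for ev in events if ev["user_id"] != "ItsMyNewDevice"]

print(kept)     # [2, 3, 4]
print(dropped)  # [1]
```

Event 1 belongs to the same user but was sent before the User ID existed, so a filter on user_id drops it.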