How to Flatten a BigQuery Table with UNNEST
As mentioned in my post on Using BigQuery and Looker Studio with GA4, the Google Analytics data is stored as a JSON object in BigQuery (the same is true for Firebase Analytics data collected on a native app). Unfortunately this structure is not good for visualizing your data.
The UNNEST operator allows you to run queries that flatten the data into the simple format required by your data visualization tools (such as Looker Studio).
How Google Analytics 4 Events Are Structured in BigQuery
A single event that you send to Google Analytics 4 will have multiple parameters, and each parameter is made up of a key-value pair (example: “content_author”: ”Jen”). When you view this event in BigQuery, you’ll notice a column called “event_params” that contains an array of all the parameters set with this event (BigQuery calls this a “repeated record”, but I like to use the word “array”).
As an example, let’s say that you have a news app, and your authors create content in three categories: “sports”, “politics”, or “fashion”. Every time a user shares an article, you fire an event called “share” with the parameters “content_type” and “content_author”. Here is a very simplified example of a single row in your BigQuery table:
event_name | event_params | |
---|---|---|
key | value | |
share | content_author | Jen |
content_type | politics |
How the UNNEST operator Works
UNNEST allows you to flatten the “event_params” column so that each item in the array creates a single row in the table with two new columns: “event_params.key” and “event_params.value”. These results can then be CROSS JOINED with your table.
So, this query…
…will create two rows (one for each item in the event_params array). Then when the unnested table CROSS JOINS with the existing table, the results look like this:
event_name | event_params | event_params.key | event_params.value | |
---|---|---|---|---|
key | value | |||
share | content_author | Jen | content_author | Jen |
content_type | politics | |||
share | content_author | Jen | content_type | politics |
content_type | politics |
Now, if you would like to pull a count of “share” events by content_author, you simply write a query that uses a WHERE clause to identify the event name and parameter that we need. Also, you can simplify this a bit by replacing “CROSS JOIN” with a comma as shown below.
Author | eventCount |
---|---|
Jen | 39 |
Bill | 27 |
Want to See It In Action?
Check out my posts on how UNNEST allows you to view multiple parameters on the same event, or how UNNEST allows you to view a parameter across multiple events.
Other Resources for BigQuery and GA4
Here are a few other posts that I’ve created for using BigQuery with data from Google Analytics 4:
- Cross-Stream First-Touch Attribution with Google Analytics 4 and BigQuery
- How the WITH Clause Can Simplify Your SQL Queries
- BigQuery + GA4: How To Get the First or Most Recent Event for a User
- How to Use the Over clause to Create a Moving Average in BigQuery
- Looker Studio + BigQuery + GA4: Viewing a Parameter Across Multiple Events
- Looker Studio + BigQuery + GA4: How to View Multiple Parameters on the Same Event
- How to Flatten a BigQuery Table with UNNEST