Integrating BigQuery with Google Analytics

Integrating GA4 with BigQuery.

Let me start by saying that you need to do this. BigQuery is such a fundamental part of Google Analytics 4, that neglecting to enable the integration is like eating nachos without chips. You don’t want to be that guy at the superbowl party just spooning queso into your mouth, do you?

BigQuery is Google’s fully managed data warehouse, and the Google Analytics 4 (or Firebase) integration gives you access to all of your data in a format that can be analyzed and joined with other sources.

  • Updated recommendations for companies who are not using Firebase.

  • Added an explanation of why the data will not match what is presented in the user interface. Updated some of the language and images.

OPTION 1: How to Set Up the Integration from Firebase

If you’ve read my post about How to Set Up a Google Analytics 4 Property then you know that I recommend everyone with an app to create projects in the Google Cloud Platform and Firebase when building your GA4 property. For websites only, skip down to OPTION 2.

If you've followed these instructions, then you will want to enable the BigQuery integration from the Firebase Console. This is because the enabline from the Firebase console passes data from all of your Firebase services (not just the Analytics data).

Simply choose “Project Settings” > “Integrations” > “BigQuery” (NOTE: You do not want to use the extension “Export Collections to BigQuery” because this is something different) and follow the instructions.

Integrate Firebase with BigQuery

Wait! There’s One More Step for Web Streams. If you have only set up a web stream in your Firebase project, it will not export any data and you’ll see this message on the “Integrations” page.

Project not linked

This is sort of a bug that only exists because web streams are new, and surely it will be fixed soon. But, to get around this all you need to do is follow a few short steps:

  1. Go to the “Project Overview” page, click “Add an App” and select “iOS”
add an app
  1. Create an iOS bundle ID (it can be anything. I like to use “com.this.isDumb” as shown below) and click “Register App”.
Create an iOS bundle ID
  1. That’s it. Now just click the Back button to skip the rest of the setup steps and return to the “Project Overview” screen. Then navigate back to “Integrations” and click “Manage” in the BigQuery card. You’ll now see your iOS app with a radio button next to it. You must flip this on to enable the export.
Enable streams radio button

OPTION 2: How to Set Up the Integration from Google Analytics

If you have not created a Firebase project and linked it to your Google Analytics 4 property, then you will need to follow the standard integration steps.

BEWARE
The standard integration is limited to 1 million rows per day if you are using the free version of Google Analytics. To get your full data into BigQuery you will need to either upgrade to New GA360 (starting at $50K per year) or stream data into BigQuery from a server-side GTM container.

See Google's Official Documentation HERE or watch the video:

When does data show up in BigQuery?

Tomorrow. After about 24 hours you’ll see a new dataset in BigQuery with two groups of tables.

BigQuery tables
  • Daily Tables [follow the naming convention “events_(date)”]: Similar to the BigQuery integration with GA360, every day the prior days data will be added as a new table with the date as the suffix of the table name.

  • Intraday Tables [follow the naming convention “events_intraday_(date)”]: If you need real-ish-time data, the intraday tables will give you a glimpse of the current day. However, beware that your web streams currently do not appear in the intraday tables.

Set Table Expiration

By default your data will be set to delete after 60 days, but you can modify this in BigQuery by clicking on your dataset name ("analytics_") and selecting the pencil next to "Dataset Info".

Set table expiration in BigQuery

From now on, new tables will be created daily without an expiration date. But if you already have tables in your dataset then you'll need to click on each one and manually change the expiration.

What does the data look like?

Each row in the table represents an event that has undergone some light processing:

  • The IP address is removed, but geographic information has been added
  • The user agent is removed, but device information is added

Also, the data you see in BigQuery will not match what is presented in the GA4 user interface. Google has a great article about why HERE, but the primary differences are:

  • The UI can apply Google Signals to identify users, but this is not available in BigQuery.
  • The UI can apply behavioral and conversion modeling to fill gaps in the data caused by cookie restrictions or declined user consent. This will impact a whole host of metrics, including traffic attribution and metrics such as "active users".
  • The UI uses an algorithm called HyperLogLog++ to estimate metrics with low processing requirements, and this is not necessary in BigQuery where the processing costs are covered by you and not Google (READ MORE).
  • Data collection delays (mainly in Firebase and when yo uare using Measurement Protocol) can cause data to be updated up to 72 hours after it is originally collected (READ MORE).
  • Cardinality limits can cause reports in the UI to create the "other" label that is not applied in BigQuery (READ MORE).

Since an event can have multiple parameters each row is stored in the JSON format you may be familiar with from Universal Analytics.

I've written another article called Using BigQuery and DataStudio with Google Analytics 4 to help get you started. You can also see the full export schema HERE.

What does it cost?

Billing is calculated by the amount of data you store and the processing required to run your queries (more here). This is extremely vague, which is a big blocker for many companies, but just know that there is a very generous free tier (I've never been charged for my blog) and my average client probably pays about $200 per month.

It’s worth it — and you can test it out on a small site to take advantage of the free tier, and you can set billing alerts so you will not get a surprise bill.

Previous
Previous

Looker Studio + BigQuery + GA4: Viewing a Parameter Across Multiple Events

Next
Next

Validating Google Analytics 4 With Debug Mode