How the WITH Clause Can Simplify Your SQL Queries
This is a quick post to highlight how you can simplify your SQL using the WITH clause. I’ve used this technique a lot lately to prepare queries for clients that are easy to understand and work with (the queries, not necessarily the clients).
Benefits of using the WITH Clause
The WITH clause allows you to reduce joins and logically sequence your code. It does this by creating temporary tables (technically they are views) that are usable by your main SQL statement so that you can break your code into smaller and easier to understand snippets.
Syntax of the WITH Clause
NOTE: As always I am working with Google Analytics 4 data in BigQuery for this post, but the following will be applicable to many other database systems.
The WITH clause is a prefix for your main query. You can use it to create one or more SELECT statements, give each a name, and then reference them in your main query. As an example, here is how you might pull the unique ID’s of users who visited your site yesterday and then came back again today:
# Step 1: Get a list of users who visited the site yesterday
WITH yesterday AS (
SELECT
user_pseudo_id AS yesterday_user_pseudo_id
FROM `<project name>.<dataset name>.events_08012021`
)
# Step 2: Get a list of users who also visited the site today
SELECT
user_pseudo_id
FROM `<project name>.<dataset name>.events_09012021`,yesterday
WHERE
yesterday.yesterday_user_pseudo_id = user_pseudo_id
In the example above I’m defining a temporary table with the name “yesterday”. You could also put a comma at the end to add other temporary tables if you choose, like this:
WITH yesterday AS (
SELECT
user_pseudo_id AS yesterday_user_pseudo_id
FROM `<project name>.<dataset name>.events_08012021`
), last_week AS (
SELECT
user_pseudo_id AS yesterday_user_pseudo_id
FROM `<project name>.<dataset name>.events_02012021`
)
Performance
Typically the WITH clause is not executed until it is referenced, and the results are cached in case the reference is used multiple times within the query. However, I have not confirmed whether or not BigQuery operates this way (I’ll update this post when I do).
Examples and Use Cases
As I publish samples that utilize the WITH clause I will link them here. Additionally, feel free to add any other links to the comments and I will move them here as well:
- Coming soon…
Other Resources for BigQuery and GA4
Here are a few other posts that I’ve created for using BigQuery with data from Google Analytics 4:
- How the WITH Clause Can Simplify Your SQL Queries
- BigQuery + GA4: How To Get the First or Most Recent Event for a User
- How to Use the Over clause to Create a Moving Average in BigQuery
- Data Studio + BigQuery + GA4: Viewing a Parameter Across Multiple Events
- Data Studio + BigQuery + GA4: How to View Multiple Parameters on the Same Event
- How to Flatten a BigQuery Table with UNNEST