GA4 & BigQuery: Benefits, Drawbacks, and Cost

GA4 & BigQuery: Benefits, Drawbacks, and Cost

In this post, we’ll provide a primer on BigQuery and GA4, why it’s beneficial, what are its drawbacks, and costs to use it.

What is BigQuery in GA4?

BigQuery is a type of data warehouse solution running in the cloud, specifically on the Google Cloud Platform.

With the new Google Analytics 4, Google is now allowing you to connect your GA4 account directly with your BigQuery account, which allows you to have your raw GA4 data exported to a BigQuery cloud database.

Why use BigQuery with GA4?

This is extremely powerful, because if you know how to read and write in SQL (Structured Query Language), you can use that to transform and manipulate the data in any way you want.

Previously, with Google Analytics 3 (Universal Analytics), we did not have this option. You would have needed to connect to the Google Analytics API separately and provide your own code to extract data from GA and load it into a data warehouse.

With GA4, with just a couple of steps, you can easily and quickly have raw GA data flowing into a warehouse.

Benefits of Using GA4 with BigQuery

The benefits below can be reaped if you know how to model the raw data and transform it into tables that are easy to query and extract insights from.

1. You will have access to individual session data

How it works:

GA4-BigQuery connection includes a field named ga_session_id. When combined with the field user_pseudo_id, you can create a unique key called “session_key”. With this session_key, you’ll be able to see attributes (different characteristics) for any single session.

Different characteristics of a single session you can lookup:

  • the date that session started
  • the time that session started
  • number of pageviews
  • number of signups
  • total duration on site
  • converted or not
  • how did the user enter the website in that one session

Insights you can extract from a sessions table:

After creating a “sessions dimension table”, you can query it to find out:

  • How many sessions did you get in a specific date or date range
  • Which sessions converted, and how many average pageviews did it take for them to convert
  • Of the sessions that did convert, how long did they spend on the site during their session?
  • What were the traffic channels (and/or campaigns) that produced the highest number of conversions?

2. You will have access to individual user data

Similar to the sessions point above, you can also see characteristics of each individual user.

Note: Google Analytics does not collect any PII (personal identifiable information).

How it works:

GA4-BigQuery connection includes a field named user_pseudo_id, so even though you don’t have any tracking setup to track individual users, this field is provided by default. And if you do have your own user_id’s that you’re tracking, GA4 provides mechanisms where you can stitch your user_id field and the user_pseudo_id field to get a complete view of who is who.

Different characteristics you can lookup for a single user:

  • all the sessions, the corresponding session key, and all the corresponding data points for those individual sessions that is connected with a single user
  • if the user has converted into a lead or sale or not
  • the types of conversions that user has triggered
  • the value in dollars of that user (how many times has he/she have bought from you and the total dollar amount he/she has spent on you)
  • how many times did that user have to visit your site before they converted

3. You can create new characteristics that describe any one session or user

The points above are just starting ideas. You can essentially create new characteristics and attributes that describe any one user or session.

Drawbacks of Using GA4 and BigQuery

The power of using GA4 and BigQuery is there. But with great power also comes… (responsibility)… just kidding.

In this case, with great power, there are also some drawbacks with using it.

1. You Need to Know SQL:

You need to have a knowledge of SQL in order to transform the raw data and get it into a format where you can extract insights from it and turn it into reports and dashboards.

2. You Should Have a Basic Understanding of Dimensional Modeling:

Ralph Kimball is the very well known author of dimensional modeling and has several books out that describe it. With this knowledge, you can turn a raw data table into a strong structural data foundation where you can build reports, dashboards, analysis, predictions, forecasts, and models on top of it.

3. Takes Time to Build Out the Models (Tables in the Data Warehouse):

In addition to the time it takes to build out the models, you should also have tests developed to ensure your data models don’t have any bugs or misses in expectations. Since every thing else (dashboards, reports, analysis, etc.) will be built on top of these models, you need to be 110% sure that your data models are correct and are outputting the correct data.

Is BigQuery free with GA4? What is the cost?

A common question asked is… is BigQuery free to use?

Regarding the GA4 and BigQuery connection – that is free. There is no fee to connect and have data flowing from your GA4 property to your BigQuery database.

Regarding the use of BigQuery – there is a sandbox version, and that is also free. There are just some limitations to it, including:

  • 10 GB of active storage
  • 1 TB of processed query data per month (it takes data to run SQL queries)
  • All tables have a default expiration time of 60 days
  • Does not support streaming data (aka. real-time data)

In order to get past the limits (like the 60 day expiration time), you can remove the sandbox limit by entering your payment card details in your Google Cloud Platform billing section. After that, you can use the free tier of BigQuery.

Regarding the free tier of BigQuery – the usage limits are similar to the sandbox version:

  • First 10 GB of storage per month is free
  • First 1 TB of processed query data per month is free

After the free tier of BigQuery, (if you go over the usage limits), your card would be charged for any data you store and query after those limits.

Regarding the paid version of BigQuery – the two main components to BigQuery pricing are 1) storage costs and 2) querying proccesing costs.

  • For querying cost – anything over 1 TB processed in a month will be charged at $5 per TB. (Resets per month, and you can still use up to 1 TB for free in a month.)
  • For storage cost – you pay for both 1) active storage and 2) long-term storage.
    • Active storage is any table that has been modified in the last 90 days
    • Long term storage is any table not modified in last 90 days. Price of long term storage drops by about 50%.

There is no difference in performance, durability, or availability between the two types of storage.

Storage pricing is anywhere between $0.01 per GB and $0.04 per GB (after the first 10 GB free each month.)

Note: this pricing is subject to change at any time, at Google’s discretion. You can go here to view their current pricing details for storage.

What has been my cost so far with using GA4 and BigQuery?

After doing some data modeling and frequent querying, I am still ranging at about $0.05 cents per month in both storage and usage costs, so there is definitely some room to grow.

The good thing about cloud data warehouses nowadays is that storage and compute costs are so cheap.

The thing that makes it expensive nowadays are the costs of a developer’s time to develop the data models.

How to Export Your Universal Analytics (GA3) Data the Fastest & Easiest Way [w/ Free Templates Included]
A Beginner Data Engineer Project: Extract Random Jokes via API with Python (Part 2)
A Beginner Data Engineer Project: Extract Random Jokes via API with Python (Part 1)
What Are Python Lambda Functions?
How to Make a BigQuery Real-Time MTD Job Cost Monitoring Dashboard (Part 1): Setting Up the Data
3 Things You Need to Get Paid More
How to Connect BigQuery with dbt
How to Model GA4 Data Easier in BigQuery with this Helpful Tool
When to Use IF NULL THEN 0 (How to Handle Null Values in SQL)
GA4 & BigQuery: A SQL Model for a Base Events Table (How to Unnest Fields)

Related Posts