Google Data Studio Aggregation Gotcha #1

I was just asked to fix a Google Data Studio dashboard that was trying to calculate a cost per lead.

Cost per lead (CPL) is the most important metric in many digital marketing campaigns and getting it right is critical to making good marketing decisions. We love it when leads cost us nothing and hate it when leads are super expensive.

The formula is super simple:

CPL = Cost / Number of Leads

When I looked at the dashboard I saw a table that looked like this:

table of poorly aggregated data

Huh?

What's going on here? This certainly doesn't pass the smell test.

The math isn't even close to what I'd expect to see.

Fixing aggregation issues for an accurate table.

If your super spidey sense is firing on all cylinders, you're not alone.

I was befuddled when I looked at the calculated field formula:

Wrong CPL Calculated field formula CPL = cost/lead

I saw the Cost / Leads and I saw that Sum was selected below.

This is aggregated properly right?

NOPE!

Headscratching ensued.

Before we get rolling with a fix we need to answer an important question.

What is data aggregation?

According to Wikipedia, data aggregation is the compiling of information from databases with intent to prepare combined datasets for data processing.

For our example, we wanted a sum of all the cost and a sum of all the leads. We wanted to aggregate each field as a sum.

Hold your horses, Leads you say?

Glengarry Glen Ross Gif

My first clue was this metric named Leads. I knew that that was not a standard field coming out of Google Analytics so I wanted to take a closer look at the formula for how the calculated field was built.

Hold on a second.

What are Calculated fields?

We need to take a quick peek at Google's documentation for clarity.

A calculated field is a formula that performs some action on one or more other fields in your data source. Calculated fields can perform arithmetic and math, manipulate text, date, and geographic information, and use branching logic to evaluate your data and return different results. The output of a calculated field can then be displayed for every row of data in charts that include that field. How this new data is displayed depends on how it's used.

Not only that, but we learn that there are two types of calculated fields.

There are 2 kinds of calculated fields, determined by where you create them: in the data source, or in specific charts in a report. Each kind of calculated field offers certain advantages over the other.

I navigated to Manage Added Data Sources > Google analytics data source and found the calculated field named Leads that was built in the data source. I then looked at the formula for how it was built.

Lead formula that does not have aggregation

So I thought I'd add aggregation inside the formula by adding a sum to each field. This produced an error.

Field error when trying to add in sum

No Joy.

And more headscratching.

I'm so confused

I then remembered a section of Google's Docs that said that, "You can't use a data source calculated field with blended data." which made me think I'd need to look at fixing it with the other type of calculated field, a chart-specific calculated field.

Give us the fix already!

It turns out the fix is pretty simple.

This chart was built from a blend of a Google Sheet and the Native Google Analytics connector.

This meant that I had to make a calculated field in the right hand side of the canvas (a chart specific calculated field). This is because you can't create a new field any other way for blended data.

This also told me that this was where I had to add the aggregation.

In other words, when creating a field Like Cost per Lead in a calculated field, you need to aggregate in the calculated field like this:

proper calculated field aggregated CPL = sum(cost)/ sum(leads)

Which thankfully yields a table chart with clean accurate data like this:

proper cacluclated table chart

Which of course made me feel like this:

Rocky Victory at top of steps

Takeaways

  • This issue is just another reminder of how Blends in Google Data Studio are like crack. On the surface they look like the answer, but once you use them you realize that agregating data outside of Google Data Studio is way less error prone.
  • If you're building a reporting solution for an agency, relying on chart specific calculated fields makes the report building process more error prone. See above. Anything you have to customize and modify each time you duplicate a report from client A to client B is an opportunity to make a mistake. Aggregate outside of Data Studio.
  • My preferred method is to aggregate in either Google Sheets or in BigQuery, depending on how the data warehouse is built, and to just visualize the data in Data Studio.


Google My Business Q + A Add-on for Google Sheets

We built this Add-on to help you manage GMB Questions and Answers in an easy familiar way with the aid of Google Sheets.

Get the Add-on

Join the official Agency Automators Slack Group

Share and learn about automating your digital marketing agency.

AgencyAutomators is a Bike Shop SEO and Kogneta company devoted to helping your agency become more profitable with automation.

AgencyAutomators © 2020
Sitemap
Terms of Service
Privacy Policy