Case Study: Google Broad Match Experiment & Results

If you’re like me, you may have dabbled in testing regular Broad match keywords without stellar results over the past few years. However, with the continued rollout and adoption of auto-bidding, where Broad match is recommended for use, we’ve been retesting it through experiments to see what kind of results we can generate, and some of the results so far have been surprising.

Test Methodology

For this particular account, we launched Experiment campaigns with only regular Broad match keywords. Our Control campaigns contain Phrase and Exact match keywords. Here are some key strategy points to note (summarized in the quick sketch after this list):

  • All of the same negative keyword lists were applied to the Experiment campaigns.
  • These tests were launched in both the US and UK.
  • All campaigns are using tCPA bidding.
  • These experiments were only launched for our non-Brand campaigns. This client has both a B2B and a B2C side. Since our efforts focus on B2B, and the Brand campaign has the greatest tendency to pull in Consumer queries, we stuck with just non-Brand for now.
  • We set up the Experiments to split traffic 50/50 with the Control campaigns.
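To keep the setup easy to reference, here’s a minimal sketch of the test design as a plain Python structure. The field names are purely illustrative, not an actual Google Ads API payload.

```python
# Illustrative summary of the test design (hypothetical structure and
# field names, not a Google Ads API request).
EXPERIMENT_SETUP = {
    "markets": ["US", "UK"],
    "scope": "non-Brand campaigns only",       # Brand excluded to avoid Consumer queries
    "control": {"match_types": ["phrase", "exact"]},
    "experiment": {"match_types": ["broad"]},
    "shared_negative_keyword_lists": True,     # same lists applied to the Experiment campaigns
    "bidding": "target_cpa",                   # tCPA on all campaigns
    "traffic_split": {"control": 0.5, "experiment": 0.5},
}
```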

Test Objective

Our client was looking to see how much we could scale our campaigns heading into the new year to help better plan for 2022 budgeting. We knew we could scale with regular Broad match keywords, but we wanted to also evaluate the lead quality to ensure we weren’t scaling up just to acquire less qualified leads. We were also curious to see how CPAs compared between Broad and Phrase/Exact variants.

In-Channel Performance Results

After about one month of running, these are the in-channel results we saw:

Volume/Spend Discrepancies

As you can see, even though we set up the experiments to split the traffic 50/50, Google seems to be heavily favoring the Experiment campaigns with the Broad match keywords. The disparity is greater in the US, where a whopping 84% of spend went to the Broad Experiments, compared to 64% of spend in the UK.

Reviewing Control campaign spend levels prior to launching the experiments confirmed this isn’t an issue of the Control campaigns being unable to spend more. For example, the US Control Campaign 3 spent $1,626 in the 11 days before the experiment launched but only $749 in the 11 days after, compared to $2,247 in the Broad Experiment campaign over that same period.
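To put a number on the skew, here’s the quick math on that Campaign 3 example, using only the spend figures above:

```python
control_spend_before = 1626    # US Control Campaign 3, 11 days before launch
control_spend_after = 749      # same Control campaign, 11 days after launch
experiment_spend_after = 2247  # Broad Experiment campaign, same 11 days

total_after = control_spend_after + experiment_spend_after   # 2,996
experiment_share = experiment_spend_after / total_after

print(f"Experiment share of post-launch spend: {experiment_share:.0%}")  # ~75%
# The Control campaign spent more than double that $749 on its own before the
# test, so a budget cap isn't what's holding it back.
```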

We reached out to our rep to try to get some insight into these significant volume differences between our Control and Experiment campaigns. Our rep reached out to a product specialist and we got the following feedback:

“Please note that using a 50/50 split only guarantees that the control and experiment elements will enter the auction at a 50/50 split and not that they will necessarily serve in that split. The 50/50 isn’t an impression split, but an auction and budget split. The system can control how many times an ad enters the auction, but it cannot control the results due to different real time dynamics prevailing in different auctions. And ultimately, uneven serving demonstrates which arm of the experiment was more effective.”

Based on this info, we looked at a couple more metrics to help us understand the uneven spending and volume between our Control and Experiment campaigns. Lost Impression Share due to Rank was very close in most cases between our Control and Experiment campaigns, and RSA Ad Strength ratings were identical across most campaigns as well.

That said, this leaves some questions around how Google is viewing the Broad match keywords vs. the stricter match types. It seems to us like Google might be deeming the Broad match keywords more relevant, which might be contributing to them winning auctions more frequently.
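To convince ourselves that an even auction-entry split can still produce very uneven spend, here’s a toy simulation of the product specialist’s explanation. The win rates and CPC are made up for illustration; they are not account data.

```python
import random

random.seed(0)
AUCTIONS = 100_000

# Hypothetical inputs: both arms enter half of all auctions, but the Broad
# (Experiment) arm wins three times as often. CPC is held equal so the only
# difference between arms is the win rate.
WIN_RATE = {"control": 0.08, "experiment": 0.24}
CPC = 5.00

spend = {"control": 0.0, "experiment": 0.0}
for _ in range(AUCTIONS):
    arm = random.choice(["control", "experiment"])  # a true 50/50 entry split
    if random.random() < WIN_RATE[arm]:             # ...but outcomes differ per arm
        spend[arm] += CPC

total = sum(spend.values())
for arm, amount in spend.items():
    print(f"{arm}: ${amount:,.0f} ({amount / total:.0%} of spend)")
# Roughly a 25% / 75% spend split, despite the even entry split.
```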

KPI Differences

As expected, CPA was higher in our Broad Experiment campaigns in the US: 91% higher at $353 vs. $185 in the Control campaigns. Additionally, Conversion Rate was 54% lower for the Experiments.

In the UK, CPA was 38% lower in the Broad campaigns at $144 vs. $231 in the Control campaigns. Additionally, Conversion Rate was 42% lower for the Experiments. Since the Broad Conversion Rate was lower, we realized it was the significantly lower CPC in the Broad campaigns ($2.91 vs. $8.11 for Control) that was driving the lower CPA. We thought this was odd and unexpected, so we started digging.
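As a sanity check on the UK numbers: CPA is effectively CPC divided by Conversion Rate, so a low enough CPC can more than offset a lower Conversion Rate. Plugging in only the figures reported above:

```python
# UK in-channel figures from above
broad_cpc, broad_cpa = 2.91, 144
control_cpc, control_cpa = 8.11, 231

# CPA = CPC / CVR, so the implied CVR = CPC / CPA
broad_cvr = broad_cpc / broad_cpa        # ~2.0%
control_cvr = control_cpc / control_cpa  # ~3.5%

print(f"Implied Conversion Rates: Broad {broad_cvr:.1%} vs. Control {control_cvr:.1%}")
print(f"Conversion Rate gap: {1 - broad_cvr / control_cvr:.0%} lower for Broad")  # ~42%
print(f"CPA gap:             {1 - broad_cpa / control_cpa:.0%} lower for Broad")  # ~38%
```

The implied Conversion Rates line up with the reported 42% gap, which confirms the lower CPA is coming from cheaper clicks rather than better-converting traffic.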

Brand Search Matching

When looking at the UK Search terms in the Experiments, we found that 71% of impressions were for the Exact match of the client’s brand name (keep in mind these experiments were being run for non-Brand campaigns). Last year, the client had decided to have us pause the Exact variant of their Brand name in the Brand campaign to see if they could keep traffic up with organic efforts instead. We had since reactivated the Exact match version of their Brand name in the US, but it had remained paused in the UK, so the Broad variants of the Generic keywords started matching to that traffic. While it’s a good sign that Google was able to identify the relevancy of the Brand searches to our non-brand keywords, for our purposes this was not ideal as we had been working to exclude Exact match Brand traffic.

Knowing CPA was much higher in the US, and that UK results were skewed by Brand searches, we took these initial learnings into the next stage of our performance review — evaluating lead quality.

Lead Quality Findings

In this account, we’re linked to Salesforce and conversions are being imported, so we can easily review lead statuses in Google. The client considers both the Converted and Blocked leads to be good leads. Here is what we saw for both the US and UK:

Key Learnings

In the US, we saw what we expected: cost per good lead was higher in our Broad Experiment campaigns at $2,270 vs. $1,478 in our Control campaigns.

In the UK, results were the opposite: our Broad Experiment campaigns had a higher conversion rate and a lower cost per good lead at $1,097, compared to $1,977 in our Control campaigns. This is due to the Branded traffic coming through our UK Experiment campaigns, which was not coming through our US Experiment campaigns.
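For context, here’s the arithmetic on those cost-per-good-lead gaps, our own calculation from the figures above (a good lead being one marked Converted or Blocked):

```python
def relative_diff(experiment, control):
    """Broad Experiment vs. Control, as a signed percentage."""
    return (experiment - control) / control

# Cost per good lead (Converted + Blocked), from the imported Salesforce statuses
us_gap = relative_diff(2270, 1478)   # ~ +54%: Broad is more expensive per good lead
uk_gap = relative_diff(1097, 1977)   # ~ -45%: Broad is cheaper per good lead

print(f"US: {us_gap:+.0%}   UK: {uk_gap:+.0%}")
```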

One interesting result is that the percentage of Rejected leads was very similar in both regions for both the Broad Experiment and Control campaigns. We expected our Broad campaigns to pull in less relevant traffic, so we were surprised to see this.

Next Steps

Ultimately, a large number of the leads are still in New status, and it takes time for these to move through the pipeline. We paused our campaigns for now since we’re heading into a slower period around the holidays, and we’ll be revisiting lead progress and standing in Q1. If we find that conversion and rejection rates are comparable, and that our costs per good lead aren’t much higher for our Broad campaigns, we’ll know these keywords can provide a good way to scale when we have extra budget available and the client needs to drive more lead volume.