Category: Data Analysis

Programmatic SEO: Tracking & Performance Monitoring


So you’ve got a shiny new programmatic build, and want to know how it’s performing.

Monitoring indexation, traffic, conversions, rankings: there's so much to look at, so what should you focus on?

Let’s take a look into a few things you should be monitoring, and how to go about it.

Key metrics that can gauge performance

The key metrics you should be monitoring will change through the life cycle of your build.

A build that has just been launched should have more emphasis on tracking indexation, rankings, and even impressions.

For existing builds, or as a build gets older, more emphasis should be placed on two other key metrics: conversions & clicks.

Everything else is still important, particularly for understanding what modifications could be made to improve performance, but if your setup isn’t driving converting traffic, then what’s the point?

Unless your build is purely advertisement reliant, of course.

 

Indexed Pages

The overall count of pages indexed. This can be monitored via the coverage report and XML sitemaps.

 

Ranked Pages

Kind of an extension of indexed pages, ranked pages is the count of pages ranking in the top 100 for the most closely related keyword possible. You load up a rank tracker with keywords exactly matching specific pages, and then monitor them showing up in the top 100. It's great for ensuring that Google is actually ranking the pages, albeit low to begin with, rather than just reporting that they're indexed.

 

Ranking Performance

Most will just use average rank for rank monitoring. Whilst it's great, it's unweighted: keywords with 10 searches a month carry the same weight as keywords with 10,000. I prefer to use 'estimated traffic', which takes into account both the ranking and the search volume of a keyword, giving a much stronger indicator of ranking performance.
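
If you want to script this weighting rather than do it in a spreadsheet, a minimal Python/pandas sketch is below. The CTR-by-position values and the example keywords are placeholders, not a real model.

import pandas as pd

# Placeholder CTR model: expected CTR at each position (swap in your own values).
ctr_model = {1: 0.28, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05,
             6: 0.04, 7: 0.03, 8: 0.025, 9: 0.02, 10: 0.018}

keywords = pd.DataFrame({
    "keyword": ["real estate for sale melbourne", "properties for rent richmond"],
    "search_volume": [10000, 50],
    "rank": [4.3, 4.1],
})

# Estimated traffic = search volume x expected CTR at the (rounded) current rank.
keywords["est_traffic"] = keywords.apply(
    lambda row: row["search_volume"] * ctr_model.get(int(round(row["rank"])), 0.01),
    axis=1,
)
print(keywords)

The same movement in rank is now worth far more for the 10,000-volume keyword than for the 50-volume one, which is the whole point of the weighting.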

 

Clicks & Impressions

Pretty obvious here, but clicks & impressions are a clear indicator of performance. They can be broken down to the keyword level (albeit with filtered data) and the URL level, giving us quite a bit of drilled-in performance data.

 

Conversions

Last, but certainly not least, conversions. Provided you're not an ads-driven programmatic build, conversions should be the key of all key metrics. They're the most direct indicator of the actual quality of the traffic. If it's good traffic, it'll convert. Bad traffic won't. Even a badly optimised conversion funnel will get traffic into the funnel. These conversions can be broken down into different levels, from funnel entry to funnel exit, i.e. add to cart versus purchase, to get even better data.

 

Breaking the metrics down

When analysing the performance of a programmatic build, you can’t look at it at a page level.

You’ll spend hours upon hours looking through the data.

You need to analyse it at the category/site section level. Grouped performance data essentially.

Look at the data for sets of the pages at once.

You could break the data out by category, sub-category, location, or even analyses like 'location pages versus generic pages'.

Any way you feel it could be analysed that returns a smaller number of overall groups.

 

Analyse keyword data by categories

You should group your keyword data into categories and view your performance at the category level. Aggregate all the performance data, and view key metrics like clicks, impressions, and estimated traffic performance by category.

 

Analyse traffic and performance data by site sections

Just like you group keywords into categories, you can use the same setup to categorise your landing pages into site sections.

A quick way to analyse site section performance from GSC data is to also group the landing pages into site sections in Data Studio.

Another trick here is to individually verify GSC properties of each site section. If you do that, you’ll get fully granular data for each site section, in its own property. Everything will be filtered to the site section.

 

Tracking a programmatic build

On top of the standard conversions/clicks, there are a few extra ways I look at performance, especially for a programmatic build.

As long as GSC is verified before launch, analytics is set up with conversions, and your rank tracking is in place, you're golden for launch. Everything else can be set up later based on the historic data.

There are a few things you can start to look at after a launch; some of the things I look at are below.

 

Identify top keyword templates and generate tracking keywords

You might be wanting to track thousands of keywords for the overall project, which is great, but sometimes it’s a bit of data overload.

Focussing on a small subset to gauge overall performance can really help you get a quick analysis done.

I recommend identifying your top keyword templates and then generating a subset of keywords for each main section. I personally track between 100 & 500 keywords per site section/page type.

So for the ‘buy’ channel of real estate, you’d have;

real estate for sale <location>
properties for sale <location>

and then ‘rent’ might have;

rental properties <location>
properties for rent <location>

Using these templates, pick the top 100 locations, or whatever the variable is, and generate the keywords. You could use merge words, or follow my guide here on bulk keyword generation.
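
If you'd rather script the generation than use a merge words tool, a tiny Python sketch is below; the templates and locations are just the examples from above.

# Generate tracking keywords from template x location combinations.
templates = [
    "real estate for sale {location}",
    "properties for sale {location}",
]
top_locations = ["melbourne", "richmond", "south yarra"]  # your top 100 locations would go here

tracking_keywords = [t.format(location=loc) for t in templates for loc in top_locations]
for kw in tracking_keywords:
    print(kw)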

Load these up in a rank tracker, in either different projects, or tagged separately, so that you can quickly view the overall group performance.

I use SerpRobot, and would set them up like;

Domain.com – Buy Keywords
Domain.com – Rent Keywords

So that I can view averages just on the dashboard, and quick view performance for that whole category.

You might also track a subset of generic versus property type keywords. Generic would be;

property for sale <location>

but then property types might be;

houses for sale <location>
apartments for sale <location>

And you could do a similar thing with generic vs location keywords.

Just follow the same process as above: generate the keywords and then load them up in the tracking.

 

Monitor your indexation

During the initial phases of a build, monitoring indexation is a great way of understanding how Google is initially reacting to it.

Indexing is the first sign of happiness.

An indexed page means it's passed the initial checks by Google, and they at least somewhat found enough value in the page to index it.

If you've never worked on a large-scale programmatic build, the number of pages that don't get past the discovered/crawled stages can be incredible. So passing this stage is a great first step.

You can do this through the coverage report at the top level, and by site section if you’ve verified the individual site sections in their own GSC property.

Another good method is creating site-section-based XML sitemaps. These can then be clicked in the sitemap report, and you can view the coverage report based on just the XML sitemap URLs.

The final method of initial indexation monitoring is keeping an eye on the keywords with a URL ranking in the top 100. These should generally be the pages most related to the query, so just watching the quantity of keywords with a URL ranking, as a per cent, can give a good indication here too. It might be a bit more performance/quality related than direct indexing, but everyone knows the results get pretty junky after a few pages anyway, so even a shockingly optimised page could pop into the top 100!

 

Count of URLs with impressions

Not so much a key metric, but another great view of performance is monitoring the overall count of URLs with an impression.

A more ‘advanced’ method of monitoring indexation, keeping an eye on the count of URLs with impressions is a great way to monitor the overall performance growth across the entire build.

Somewhat useful at the start, and particularly great for self-expanding programmatic builds, this metric will help you confirm whether the wider system is driving the performance, or whether it’s a small subset of URLs that are performing the best.

It’s great for ongoing monitoring, especially when you start to break it down by sections and page types.

Analyse the per cent of live pages that are actually generating impressions.
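
If you want to script that per cent, a rough Python/pandas sketch is below; the export file name, column names, and live page count are all assumptions for the example.

import pandas as pd

# Assumed GSC landing-page export with at least 'page' and 'impressions' columns.
gsc_pages = pd.read_csv("gsc_pages_export.csv")
live_page_count = 25000  # total live pages, from your own platform or XML sitemaps

urls_with_impressions = (gsc_pages["impressions"] > 0).sum()
print(f"{urls_with_impressions} of {live_page_count} live pages "
      f"({urls_with_impressions / live_page_count:.1%}) are generating impressions")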

 

So many metrics, so little time

Keeping what you're tracking to a minimum will help you avoid data overload.

Think about what metrics you can't backdate, like ranking information, and ensure they're all set up just in case you need them one day.

You never know what you’re gonna need, and when.

Using Data to Determine What Filters Should be Targeted


Bedrooms, and bathrooms, and price ranges, oh my!

There are so many filters that could be used in the pretty URL, but what should be used?

What should earn the “pretty” status, and what should be relegated to query parameter status?

Some may say everything goes in pretty.

Some may just pick a few, and leave it at that.

Well, let’s take a look at how you could use data to inform your decision.

I’m not going to go into why you shouldn’t be creating pretty URLs for everything, that’s a separate post.

What we will run through are ways to set up keyword data, for you to gain insights about search trends of the filters you’ve got, so that you know what you should be optimising for.

The data you’ll need

To run this analysis, you’ll ideally just need an extremely large keyword research piece specifically for your niche.

It will need to be rather large to ensure you can get solid data, and you’ll also need to have it pretty clean, or at least understand the caveats of your data.

If you’ve also got a tonne of GSC data to mix in, then that would be great. That will help extend the data to cover a good portion of the 0-volume keywords that might not show up in keyword research tools.

For my examples below, I just spent 20 minutes pulling together some real estate keywords for 3 Australian cities, and ~15 suburbs from a “Top Suburbs in Melbourne” type list.

I wanted to limit it to location-specific keywords, as without a giant list of seed locations it can be hard to compare location versus non-location keywords without significant time spent. Makes things messy.

 

Setting up your data

Whether you’re using Excel or Google sheets, you’ll need to create a data table to power the analysis.

The data table should contain all the keywords, their volumes, and associated categorisation.

 

Create categories for each filter

You’ll need to create categories for each of the filters that you are trying to analyse and work out whether they should be optimised for.

Go through and create columns, and set up their categorisation with rules for each possible value so that you can capture as many as possible.

For my example, I am using real estate keywords. A column has been created for each filter I’d like to analyse, along with categorisation for each of them.

Each filter has its seed values assigned, along with what the value is for that variable.

If a keyword contains either ‘buy’ or ‘sale’, it gets flagged as “Buy”.

If a keyword contains ‘1 bed’ or ‘1br’ it gets flagged as 1 Bedroom.

You can read more about how this works here.

You'll want to be as thorough as possible here, and include as many variations as you can.

A couple of missed ones could really sway a decision.

Also try to create a catch-all category at the end of filters that only have a variable or two.

I created one for ‘features’ based on what I was seeing in the keyword data.
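
If you'd like to prototype those rules in code before (or instead of) building the spreadsheet, a rough Python sketch of the same seed-word flagging is below. The seed lists are illustrative only and nowhere near as thorough as they should be.

import pandas as pd

# Each filter gets its own column; each value has a list of seed words that flag it.
filters = {
    "channel": {"Buy": ["buy", "sale"], "Rent": ["rent", "rental"]},
    "bedrooms": {"1 Bedroom": ["1 bed", "1br"], "2 Bedrooms": ["2 bed", "2br"]},
    "features": {"Pool": ["pool"], "Pets": ["pet friendly", "pets allowed"]},
}

def flag(keyword, values):
    # Return the first filter value whose seed words appear in the keyword.
    for value, seeds in values.items():
        if any(seed in keyword for seed in seeds):
            return value
    return ""  # blank = this filter isn't mentioned

keywords = pd.DataFrame({
    "keyword": ["2 bedroom apartments for sale richmond",
                "houses for rent with pool melbourne"],
    "search_volume": [720, 90],
})
for column, values in filters.items():
    keywords[column] = keywords["keyword"].str.lower().apply(lambda kw: flag(kw, values))
print(keywords)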

 

Prefiltering / cleansing the data

Depending on how clean your keyword data is, it might be better to just look at a portion of it.

A portion you know is 90% cleaner than the rest of the data.

For my real estate keywords, I know that if the keyword includes a channel, so ‘sale’, ‘buy’, ‘rent’, or ‘rental’, there is a higher chance of it being a keyword of enough quality for the study.

To include keywords that don’t include a channel (like ‘real estate’ or ‘properties’), I also include keywords including a bedroom or bathroom filter.

This is done via a YES/NO filter, that just flags it as YES if any of the filter cells have something inside them.

All my data analysis will have this filter applied, and it brings the keywords down from 9,000 to just 2,000.

I know those 2,000 are trustworthy to tell a story.

 

Creating your pivot tables

You'll now need to create pivot tables for each of them so that you have a way to read the data.

Go through and create a pivot table for each of your filters with the below data;

  • Filter as the row
  • Search volume SUM as value
  • Search volume COUNT as value
  • Search volume COUNT shown as % of grand total as value

The SUM should be obvious, being that it will be the total amount of search volume for each of the filter values.

The COUNT will be how many times that filter value is used among the keyword set.

The COUNT shown as a % of the grand total will show us the actual % of keywords that use this filter value. A little quicker to analyse than the overall count alone.
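
For reference, a small Python/pandas version of that pivot for a single filter column is below; the data is made up purely to show the three values.

import pandas as pd

keywords = pd.DataFrame({
    "keyword": ["houses for sale richmond", "apartments for rent richmond", "real estate richmond"],
    "search_volume": [880, 320, 1300],
    "channel": ["Buy", "Rent", ""],  # blank = filter not mentioned
})

pivot = keywords.groupby("channel").agg(
    volume_sum=("search_volume", "sum"),   # SUM of search volume
    keyword_count=("keyword", "count"),    # COUNT of keywords
)
pivot["pct_of_total"] = pivot["keyword_count"] / pivot["keyword_count"].sum()  # % of grand total
print(pivot)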

 

Analysing and selecting your filters

Now we’ll get to read our data and see what we can make of it.

Let’s take a look at my property-type keywords.

We can see that of the 2,000 keywords included, 85% mention a property type. So only 15% are more generic keywords like ‘properties’ or ‘real estate’.

Even if you consider ‘homes’ as generic, that’s still less than a quarter of the keywords without a property type.

So yes, property type 100% needs to be optimised for.

 

Looking at the features keywords.

Only 2 keywords include pets, 2 include pools, and 1 mentions a granny flat. If these were the only filter values available, I would not be optimising for them.

Similar story with the bathrooms keywords.

Only 2 keywords contain a bathroom-related phrase. Probably wouldn’t recommend targeting that in bulk.

Now onto the 2 that are a bit contentious when it comes to real estate sites.

The first one being bedrooms.

Bedrooms is one I personally recommend against optimising for directly under normal circumstances. At least at first anyway.

I feel it creates too many URL variations, with not enough reward/value in return. It can be worth targeting once all indexation/crawling boxes are ticked, especially with some rules in place, but maybe not straight out of the gate.

In saying that, looking at the data, 10% of the keywords (7% of total volume) include a bedroom value.

Is that enough to warrant targeting of it? Maybe.

But if we break that data down a bit further, and split out the city (Melbourne) from the ~15 suburbs, we see something a bit different.

16% of the city keywords (14% of volume) contain a bedroom term, versus only 5% (1% of volume) of the suburb keywords.

So that's 1 location with a significantly larger number of keywords including it than the other 15 locations combined.

So if you create pages equally amongst cities & suburbs, you’re going to be creating significant volumes of pages when only a small portion of them will be useful.

Yeah, long-tail value this and that. I’m not saying definitely don’t, I’m just advising against it without restrictions in place.

A similar situation is with the prices.

Pretty low volume for the majority of the keywords that include a price (normally ‘under xxx’ type keywords).

And if we break it into city vs suburb, we get;

None of the suburb keywords in this data include a price. It’s only at the city level.

 

Why some filters may not be worth targeting

I’m a big believer in limiting crawlable URLs where possible.

Minimising re-use of listing content, and avoiding the possibility of confusing Google too much.

Keeping the site as small as possible, whilst still targeting as much as possible.

So why would I recommend avoiding creating bedrooms or pricing optimised URLs in bulk?

Well, it comes down to page count.

Crawlable page count to be specific.

Let’s say you have a location set of 5,000 locations.

10 property types.

and 2 channels.

You’ve already got 100,000 crawlable URLs right there.

If you then have 7 bedroom options, you're looking at 700,000 additional URLs on top of the 100,000 that already exist, which Googlebot will have to constantly trawl through.

Is it worth enlarging your site by 700% to target an extra 7% in search volume?

If you think so, then go for it.

That’s also if you do it cleanly. If you have other filters with crawlable links on the site, that overall crawlable URL count will only increase.

So if you’re creating significant page volumes off of smaller % filters like this bedrooms count, you must ensure you have your crawling well in check before you launch.

That way you can avoid exacerbating any existing issues.

There are other ways of efficiently targeting these types of keywords though.

In particular, I recommend a targeting strategy here on how to target these filters that may have value at key locations, and not others, by having a couple of tiers of locations.

 

Picking your filter values

To try and keep some filters in check too, you can also optimise the system so that only certain values of a filter get optimised.

Using bedrooms as an example, you might choose to just create pretty URLs for studios, 1 bedroom, 2 bedroom, and 3 bedroom apartments. 4+ bedrooms would then be relegated to a query parameter, and not receive the internal links pointing into it.

 

Let the data guide your optimisation

By leveraging this keyword data you can really gain an insight into what filters, and values, you should be optimising for.

Plenty of caveats, particularly around longer tail keywords that tools won’t give you, but there should be more than enough data to at least guide an initial decision.

It's also easier to expose a filter later on than to clean up the over-indexation caused by one if it needs to be reverted.

There's also the other question here: is it even worth putting in the work to have separate 'pretty' and 'parametered' filters?

I’ll leave it to you to decide.

Building Your Own CTR Model from Search Console Data


Rather than performing analysis using just the bare average rank, you can bring in a keyword’s search volume and add some weighting to the rank.

To do this, you estimate SEO traffic for a keyword by multiplying its search volume by the estimated click-through rate at its current ranking position.

It's not meant to be extremely accurate; it's meant to be a like-for-like comparison across the keywords, showing that positive movement for a keyword with 1,000 searches a month is more valuable than for one with 50.

Now, I have a standard CTR model I use for these analyses.

However, if you want to be a bit more specific, be able to break it out by brand vs generic, or be able to view it per category, then you will need to build your own model.

You can also run these CTR models at the landing page level, to analyse landing page performance.

How to build an SEO CTR model

It’s actually extremely easy to extract your own CTR model, directly from your GSC data. Here’s how to do it.

1. Extract all your GSC data for the latest month (or longer if you don’t have a lot of data). Run separate exports for queries & URLs, and you can look at the CTRs for both of these.

2. Round the ranking data to the nearest whole number, by adding a new column that uses the =ROUND(<rank>,0) formula.

3. Create a pivot table from the data, but don't just use the CTR. If you average the CTR, you'll give single-click keywords the same weighting as keywords with 500 clicks. You want to create a calculated field that looks at clicks & impressions from the data.

4. Under ‘values’ of the pivot table, create a calculated field for the CTR that is =Clicks/Impressions. A calculated field is needed, rather than the average of CTR, to weight the keywords. Low volume keywords will swing the CTR model if not weighted.

5. Plot a scatter graph using the pivot table, and throw on a trend line to make it a bit easier to interpret.

And presto! You’ve got a customised CTR model. Now replicate this using a landing page data set.

I prefer to remove some of the lower keywords, particularly keywords with 0 clicks, from the data, to ensure they're not messing up the model too much, as there are significant volumes of them. Even with the weighted CTR using clicks/impressions, there can be a lot of junk with 0 clicks, so it's just easier to exclude it all.
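
If you'd rather script the whole thing than use a pivot table, a rough Python/pandas sketch of the same steps is below; the export file name and column names are assumptions based on a standard query export.

import pandas as pd

gsc = pd.read_csv("gsc_queries_export.csv")  # assumed columns: query, clicks, impressions, ctr, position

gsc = gsc[gsc["clicks"] > 0]           # drop the 0-click junk, as noted above
gsc["rank"] = gsc["position"].round()  # round the average position to a whole number

# Weighted CTR per position = total clicks / total impressions,
# not an average of per-keyword CTRs.
ctr_model = (
    gsc.groupby("rank")[["clicks", "impressions"]]
       .sum()
       .assign(ctr=lambda d: d["clicks"] / d["impressions"])
)
print(ctr_model.head(20))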

It’s recommended you don’t export GSC data using both queries & landing pages at the same time. Both dimensions at once will cause GSC to sample your data, limiting what you get. So unless there is a specific analysis you want to do that requires both together, it’s best to export the data separately.

 

The data caveat

Unfortunately, GSC doesn't give us all our query data.

Google explains the missing query data is related to privacy, and/or for long-tail phrases.

They're basically excluding tonnes of the longer tail keywords that will have low impressions/clicks each, yet might still drive significant quantities of traffic.

This missing data could skew results if they are indeed super-low individual click count keywords, due to how they may shift the averages.

Even with this missing data, you should be able to make some informed decisions you couldn’t before doing a CTR analysis.

Just know the caveat behind the data though.

 

What you can use your CTR model for

There are a few uses for your new CTR model.

  • Understanding brand vs generic CTR rates, and having a CTR rate with the brand removed
  • Analysing CTR rates per keyword category, and knowing underperforming areas
  • Analysing CTR rates per landing page, and knowing underperforming landing pages
  • Forecasting SEO traffic
  • Setting baseline CTRs for areas to then make CTR optimisations & monitor going forward

And many more…

 

Access my sample sheet

You can access my sample sheet with the below link.

You can paste in your data and it should do all the calculations and the graph for you.

 

How will you use your new CTR model?

Always up for hearing what people will do with this stuff. Whilst you can just roll with the default model, a customised CTR model gives you so many more options.

 

Export Google Search Console Data


When you start doing SEO data analysis, one of the first things you’ll find out is that GSC only gives you 1,000 rows of data.

So how do you extract the rest of the data?

How do you get more keywords, and more landing pages downloaded from GSC?

Search Analytics for Sheets

Whilst there are a few ways, this is how I always end up doing it.

Every time.

Search Analytics for Sheets provides the easiest method of extracting more than the 1,000 rows available inside GSC.

In just a few steps, you can connect your search console, select your filters, and export your data.

 

How to export more than 1000 results in Search Console

1. Firstly, install the GSC extension, available from here

2. Connect your search console account to Search Analytics for Sheets by following the steps, and approving its access

3. Open up a new Google Sheet, from the Google Drive account that is linked to your Google Search Console. You can’t use an account that just has shared Google Sheets access, it needs to be the account that is actually linked to the Search Console account.

4. Open the Search Analytics for Sheets sidebar, by clicking Extensions > Search Analytics for Sheets > Open Sidebar

5. Select your GSC property, enter a date range, enter page or query as “Group By”, add any filters, and then click ‘request data’

You will need to be mindful of the ‘Group By’ area. If you select page & query at the same time, GSC won’t give you all your data. You will receive a sample data set, only giving you a portion of the data.

Using Page & query at the same time is great for a couple of different analyses, but if you’re just starting to play with GSC data then it is better to export queries, or pages, to ensure you get decent figures.

Pages is what you should use if you're looking for actual URL totals, as Google will even hide some data from the 'queries' to protect privacy, such as keywords containing phone numbers or personally identifiable information, when it detects them.

 

Creating automatic Search Console data exports/backups

Search Analytics for Sheets even offers automatic backups.

In the sidebar, you’ll see a tab for “Backups”.

Flick over to this, and you’ll see the same options as the primary tab, except for the extra “period” area.

You can choose whether you want daily or monthly backups here. Just set that up as required, click 'enable backup', and it will run at the intervals you have set and pull the required data so that you have backups of it.

This was definitely required back when we only had 3 months of history rather than 16; however, it's still good to run it now and then. Particularly for your own properties, or client properties where you think they'll be with you for a longer period.

 

Plenty of uses for GSC data

There are plenty of different analyses you can run using your GSC data, so knowing how to export it is vital for these.

Analysing Brand vs Generic Keyword Performance


Breaking down your GSC into brand vs generic keyword performance can give you a better insight into how an SEO campaign is going.

Most SEO activity won't be affecting the brand too much, as it will be more about driving performance increases within the generic keywords.

You'll get a flow-on effect, in that as generic performance improves, the extra traffic essentially acts as brand marketing, increasing return visits via branded search.

That takes time though.

You'll also want to monitor rankings separately, as rankings for brand-related keywords will skew the average and hide some of the generic keyword performance.

My first analysis like this many years ago took quite a while to figure out.

Now, they’re like 30 minutes.

Definitely a worthwhile analysis, considering how short they are.

What are brand keywords?

Brand keywords are any keywords that include your brand name.

This could include brand + generic phrases too; as long as there is a mention of your brand in the keyword, it's a branded keyword.

It also includes anything that should include the brand, and by that I mean misspellings. Whilst you might think there are 3 ways to spell your brand, I can assure you the internet will come up with 50 more.

 

What are generic keywords?

A generic keyword is any keyword that does not include a mention of the brand.

It’s basically the “everything else” bucket for this analysis.

There could be keywords mentioning competitors, or other brand names from what you sell, but these will still be classed as generic (although you could further categorise these later to get additional insight).

 

Why analyse brand vs generic keyword performance?

There are two main reasons I like to do these analyses.

Brand vs Generic keyword split

Initially, they're great for setting the default pre-SEO brand vs generic split percentage.

As your generic SEO activity increases, the assumption here is that the generic keywords will improve performance more than the brand. You would then see an increase in the overall % of the generic keywords.

You can then trend this monthly; it's just something else to add to the list to help show any performance increases.

Exclude branded keywords from GSC data

The other reason is to essentially just exclude brand performance from GSC data.

Sometimes brand performance, particularly spikes from campaigns, can mess with the GSC data.

Excluding these branded keywords can help you get a more accurate overview of ranking performance for the keywords you’re targeting.

 

How to do a Brand vs Generic keyword analysis

1. Use your favourite GSC data exporter to get a dataset of Query, Clicks, and Impressions. I personally use Search Analytics for Sheets, and would only select the latest 3 months of data, or even just the latest month if the site's big enough, to ensure the analysis is based on recent data. You could rerun it with older data to compare.

2. Copy my Google sheet from the below link

3. Import your data into the rows highlighted below. If you have included CTR & average rank then just ignore them, but most exporters will include that data by default, so we may as well save it for later.

4. Enter variations of your brand terms into the K column, and then mark them as 'Branded'. These should be the shortest variations you can come up with that still ensure generic keywords don't get included. Leave the cells below them alone, as the formula will just fall back to 'Generic' when a brand seed word isn't found. You should only really add 5-10 to start, and can come back and refine it further later on.

 

5. (OPTIONAL) If you’d like to get deeper insight, add some categories & sub-categories into their columns. This will help you break the performance down by category, if that’s something you think would be useful for your brand.

6. Visit the ‘analysis’ tab to have a brand vs generic performance overview pivot table ready for you

One caveat to this is that, due to exporting query performance data, you have a limited click count. Large portions of longer tail keywords are excluded from query exports, so whilst this % is a good indication, it should be used as only that and not as a source of 'truth'.

You will be able to see this shift over time though, as both branded and generic keywords will get excluded.

You can read more about that here under the ‘queries’ expander.
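
If you'd prefer to script the split rather than use the sheet, a rough Python/pandas equivalent is below; the brand seed terms, file name, and column names are all placeholders.

import pandas as pd

gsc = pd.read_csv("gsc_queries_export.csv")  # assumed columns: query, clicks, impressions
brand_seeds = ["acme", "acme realty", "acmerealty"]  # placeholder brand terms + misspellings

gsc["segment"] = gsc["query"].apply(
    lambda q: "Branded" if any(seed in q.lower() for seed in brand_seeds) else "Generic"
)

split = gsc.groupby("segment")[["clicks", "impressions"]].sum()
split["click_share"] = split["clicks"] / split["clicks"].sum()
print(split)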

Monthly Brand vs Generic Keyword Analysis

This can be extended to include monthly data, and I might extend the Google Sheet later if a few people ask for it.

Just throw on a month column, ensure your data exports are monthly, and then add them all together in the same sheet with each set having its associated month added.

 

Just another insight you can provide

Overall, this is just another little insight you can provide your clients, and hopefully, another metric you can just keep an eye on.

 

Data Preparation and Setting up a File


Data preparation is the base of any good analysis. Getting it right from the start will ensure that any analysis you do in the future will be nice and easy for you.

It’s back to basics time!

I run through a number of different basics, from file naming and why you should follow a pattern, through to how you should be handling your raw data & your dashboards.

There are also a few different ways of setting up the raw data, and of splitting your metrics. My personal preference is a new row for each month across each segment. So multiple metrics can sit on the same row; however, each row could be a different traffic source, or a different site section.
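
Purely as an illustration of that layout (the column names and figures are made up), the raw data might look something like:

Month      Segment                 Sessions    Conversions
2024-01    Organic - Buy pages     12,000      150
2024-01    Organic - Rent pages    8,000       90
2024-02    Organic - Buy pages     13,500      170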

Bulk Keyword Categorisation & Classification


Categorising a large keyword list in bulk takes hours to do, unless you do it a special way. Learn how to categorise thousands of keywords in seconds through the use of a magical Excel formula.

You can categorise any data in Excel with this formula; it doesn't only work on keywords. Another popular use of mine is to categorise landing pages, which can significantly help with showing how ranking performance has improved certain landing pages' traffic.

Let me show you how you can categorise text with keywords in Excel, removing the need for another keyword categorization tool you need to pay for.

This is an older video of the keyword categorisation process in Excel. The overall principles are exactly the same; however, this one will not work in Google Sheets.

But, I have a solution for that. A different formula that works in Google Sheets.

Categorising & Classifying Keywords in Excel

To categorise and classify your keywords in Excel, just like in the video, you can download my Excel template below.

Download the Keyword Categorisation Template

Categorising & Classifying Keywords in Google Sheets

If you would rather categorise keywords in Google Sheets, my preferred method these days, then you can copy my categorisation sheet below.

You will see a very similar template to what is shown in Excel in the video above, except with a slight tweak to the formula.

The formula I use for this is;

=IFERROR(ARRAYFORMULA(INDEX(*categoryRange*, MATCH(TRUE, ISNUMBER(SEARCH(*findSeedWordRange*, " "&*keyword*&" ")), 0))), "Unclassified")


You’ll see one major difference with the Google sheet though, and that comes down to the cell ranges used for the find/category keywords.

If a cell range in the formula is larger than the values actually filled in, then when no category is found it will leave the cell blank.

To use the ‘unclassified’ text, the range needs to be exactly the same as what’s available. This means every time you add keywords into the find column, you’ll need to expand that range.

I tend to just treat the blanks as the other/generic/unclassified bucket, as it makes life a bit easier.

 

New categorisation formula & setup

Kyle Faber reached out to me on LinkedIn, believing he could help tweak the categorisation system a little bit.

Using a combination of regex & a textjoin, he was able to help tweak how the categories are applied.

It does appear to be faster than the setup posted above, and is also a lot simpler & cleaner, which helps with troubleshooting.

I’ll break it out a bit more later, but for now, the new formula for joining all the seed phrases together is;

=textjoin("[s]?|",true,sort(I$3:I,1,false))

Then the formula used in the category cell, is;

=IFERROR(VLOOKUP(REGEXEXTRACT(" "&A2&" ",I$1),I$3:J,2,FALSE),"Unclassified")

This will categorise the exact same setup as above, with a different formula, by looking for the matching find value and then returning the category with a VLOOKUP.

It will default to partial matching, but if you’d like exact matching of a word, then enter a space before & after the find column.

That will then match full mentions of the word, rather than just partial matching.

Was great to get some feedback on this, and be able to tweak the setup that I’ve used for years!

Get the updated sheet at the link below;

Definitely give Kyle a follow on his socials, at https://linkedin.com/in/kylefaber & https://twitter.com/regal_kyle.

 

Categorising URLs into Site Sections

You can also use this directly on URLs, to categorise them into site sections, locations, or anything you’d like. This would allow you to view SEO performance by site section, based on grouping the URLs by folders, or any structure within them.

Just use the exact same formula, but rather than putting keywords in the 'find' column, add portions of the URLs that you would like to group together. The formula will then put each of the URL structures into their groups, allowing you to view their performance.

 

Let me know how you use this sort of sheet to categorise in Excel or Google Sheets, or whether you do this all a different way.

Estimating SEO Traffic for a Keyword Set


Estimating your organic traffic will allow you to better assess your current performance for a set of keywords, and also allows you to directly compare that performance with your competitors.

In this video, I run through how to properly set up your data and run the formula across a set of categorised keywords. This will be the basis of future videos, on how we can better use this data.

The estimated traffic is worked out by multiplying the search volume of each keyword by the expected CTR at its current ranking.

The formula I use for this in Excel is;

=IFERROR([@[Search Volume]]*(IFERROR(LOOKUP(9.99999999999999E+307,SEARCH(CTRModel16[Position],[@Rank]),CTRModel16[CTR]),"N/A")),0)

Which is templated as;

=IFERROR([*CELL FOR SEARCH VOLUME*]*(IFERROR(LOOKUP(9.99999999999999E+307,SEARCH(*CTR MODEL TABLE RANK COLUMN*,*CURRENT RANK*),*CTR MODEL TABLE CTR AT RANK COLUMN*),"N/A")),0)

Stripping the bulk of it away, we essentially have the search volume, along with a lookup for the CTR at that rank. The lookup could be simplified with a VLOOKUP if you want; however, I choose to just reuse my categorisation formula for this.

This can be simplified to just a vlookup for the rank table, especially if you’re using whole rankings rather than GSC average rankings that have a decimal place.

Estimating SEO Traffic in Google Sheets

You can duplicate a Google sheet below which is 95% the same as the Excel above to get you started.

Let me know if you have any issues, but it should just be plug and play with your keywords, rankings data, and any associated categories.

Top Ranking Domains Analysis for a Keyword Set


Ever wanted to see the top-ranking domains / URLs for a particular keyword set?

This analysis will let you do exactly that, and give you more refined performance data, allowing you to dig into the true top performers.

There are a few tools out there that let you see the top competitors and their estimated traffic.

However, many of these tools fail to give good insight into category-specific performance. This holds you back from being able to extract true competitive performance insights, as you're only seeing the wider picture.

You're missing out on the drilled-in numbers.

Watching the video above, you’ll learn how to build out your own analysis, using a combination of Excel & SerpRobot.

 

Steps to extract the top-ranking domains

  1. Export SERPs data from SerpRobot
  2. Format the data appropriately in Excel
  3. Extract the domains from the ranking URLs
  4. Run estimated traffic formula based on ranking & search volume
  5. Categorise the keywords into required categories
  6. Pivot the data based on the domains, with slicers/filters for categories

 

And there you will have it. Your own analysis, with estimated traffic figures at the category level, so that you can get the direct competitor insights you’re after.
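
If you'd rather script steps 3, 4, and 6 than do them in Excel, a rough Python sketch is below; the export file names, column names, and CTR values are all assumptions for the example.

from urllib.parse import urlparse
import pandas as pd

serps = pd.read_csv("serprobot_export.csv")    # assumed columns: keyword, rank, ranking_url
volumes = pd.read_csv("keyword_volumes.csv")   # assumed columns: keyword, search_volume, category
serps = serps.merge(volumes, on="keyword", how="left")

ctr_model = {1: 0.28, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05}  # placeholder values; extend to 100

serps["domain"] = serps["ranking_url"].apply(lambda url: urlparse(url).netloc)  # extract the domain
serps["est_traffic"] = serps.apply(
    lambda row: row["search_volume"] * ctr_model.get(int(row["rank"]), 0.01), axis=1
)

top_domains = (
    serps.groupby(["category", "domain"])["est_traffic"].sum().sort_values(ascending=False)
)
print(top_domains.head(20))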

Categorise GSC Data into Site Sections in Data Studio


I’ve finally started playing in Google Data Studio, and obviously kicking off by messing around with my Search Console data.

One of the initial things I wanted to do was categorise the landing page data into site sections, like I normally do in Google Sheets.

I thought you would need to create a reference table of all the URLs and their site sections, but then I came across this post that gave some more advanced methods of categorising on the fly, directly in Data Studio.

 

Creating a custom dimension in Google Data Studio

The first thing you need to do is create a custom dimension in Google Data Studio.

Click the “add a field” button in the bottom right of the data editing sidebar.

You’ll be presented with a popup which is where you can paste your site section formula.

Site Section categorisation formula

The site section formula you want to use will be a modified version of the below formula.

case
when regexp_contains(Landing Page, "/laptops/|/other-folder-as-example/") then "Category 1"
when regexp_contains(Landing Page, "/second-category/") then "Category 2"
when regexp_contains(Landing Page, "/third/") then "Category 3"
when regexp_contains(Landing Page, "/another/|/also-this/") then "Category 4"
else "Unclassified"
end

This formula will categorise the site sections based on the text you have in the quotes, next to 'Landing Page'.

You can add multiple texts for each site section by using a pipe | and that will act as an OR. So /folder-1/ OR /folder-2/ basically.

Then just change the category names to your site sections.

Extend or shorten it by adding or removing lines between the 'case' and the 'else'.

You will need to keep unclassified at the end, or rename it to something else, to ensure there is a fallback.

When the URL includes this and this (so includes multiple elements)

If you’d like to categorise into site sections based on requiring 2 elements, ie maybe a subdomain and a specific subfolder, then you will need the following formula instead.

case
when regexp_contains(Landing Page, "/laptops/|/other-folder-as-example/") then "Category 1"
when regexp_contains(Landing Page, "/second-category/") then "Category 2"
when regexp_contains(Landing Page, "/third/") then "Category 3"
when regexp_contains(Landing Page, "/another/|/also-this/") AND regexp_contains(Landing Page, "/this-folder/") then "Category 4"
else "Unclassified"
end

That formula, for the last category anyway, will ensure that either of the first two folders exists, along with ensuring that the second part also exists.

 

When URL includes something, but not something also

If you'd like to include specific folders, but ensure that a different folder doesn't also exist, then this is the formula you need.

case
when regexp_contains(Landing Page, "/laptops/|/other-folder-as-example/") then "Category 1"
when regexp_contains(Landing Page, "/second-category/") then "Category 2"
when regexp_contains(Landing Page, "/third/") then "Category 3"
when regexp_contains(Landing Page, "/another/|/also-this/") AND NOT regexp_contains(Landing Page, "/but-not-this-folder/") then "Category 4"
else "Unclassified"
end

For the last site section, that will ensure one of the first two pieces exist, and that the last piece does not exist, to assign it to that associated site section.

 

The outcome

When you’re done, just pull the dimension in like you would normally use ‘landing page’ and it will aggregate all your data into the associated site sections.

Pretty easy way to see aggregated data, rather than just landing page level!

 
