After running through the keyword research in part 1 here, we’re now digging into the SERPs.
We’ll be extracting the data, and then looking into a few of the different analyses we can do with it.
- Extract SERP data
- Format the data
- Top performing competitors
- Top-ranking pages & page types
- Look into the structure of a few competitors
This will all be the preparation for the final part, where we will be able to leverage the data to make actionable recommendations to a client.
Extracting SERP Data
Now that we’ve got a relatively clean set of keywords, we can start to scrape some SERP data.
I used to do this with Serprobot, and would just export their SERP report, which gave the top 10 results for each keyword.
But now, I use Valueserp. At $2.50 per 1,000 keyword checks, it’s not too bad, and it only gets cheaper at volume. There’s an easy-to-use playground for testing and quick checks too.
EDIT: Well, I did use Valueserp, and probably still will… but I’ve since built a Google SERP Extractor for Google Sheets that you can use with another provider. The below was written before I built that, so I’ll keep it as is for now. But use the Google Sheet if you want a different solution!
I tend to leverage ChatGPT-written Python to interact with their API endpoint; however, they also have an extremely easy-to-use CSV export function which lets you bulk-run keywords and then download a CSV of the SERPs.
You can upload up to 15,000 keywords at a time, and the batches are run pretty fast.
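If you’d rather hit the API directly, a minimal sketch looks like the below. The endpoint and parameter names are from memory of the ValueSERP docs, so verify them in the playground before running at scale; the location, domain, and `num` values match the batch settings used later.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.valueserp.com/search"

def build_params(api_key: str, keyword: str) -> dict:
    """Query params for a single keyword check (US, google.com, top 15)."""
    return {
        "api_key": api_key,
        "q": keyword,
        "location": "United States",
        "google_domain": "google.com",
        "num": "15",
    }

def fetch_serp(api_key: str, keyword: str) -> list:
    """Fetch one SERP and return its organic results."""
    url = API_URL + "?" + urllib.parse.urlencode(build_params(api_key, keyword))
    with urllib.request.urlopen(url, timeout=30) as resp:
        data = json.load(resp)
    # Each organic result carries position, title, link, and (optionally) snippet.
    return data.get("organic_results", [])
```

Loop that over your keyword list and you’ve got the same data as the CSV export, just without the UI clicking.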
Since I’ve ended up with 12,000 keywords, and to try and save a few Google Sheets rows (it hates 50,000+, but reallllly hates 100,000+), I’ll run the “commercial” terms with 50 or more volume, which brings us to 4,200 keywords.
1. Jump into Valueserp and sign up, add some credit, and then go to batches;
2. Click on ‘Add Batch’;
3. Enter a name and click save. You don’t have to worry about the settings, or the CSV settings yet;
4. Click ‘add in bulk’ and then download the template ValueSerp gives you;
This is the biggest pain point, and the place I slip up sometimes.
In the file you’ve just downloaded, you need to prepopulate your settings.
The following are the settings I’ll use;
Location: United States
Google Domain: google.com
That will return the top 15 rankings in the US for each keyword in the set.
5. Upload your CSV template;
6. Confirm the numbers, and the market are all set correctly, and then click start;
4 minutes after clicking start, I have the top 15 results in Google for just over 4,000 keywords.
7. Click actions and then download the results set
8. Click on download
There might be multiple result sets depending on how many times you’ve run the keyword set, but if this is the first time just click the first one showing.
9. Click CSV and then click on the CSV builder as you need to select the fields to download.
10. Select the fields you want. I have just unselected ‘page’, as I don’t need that. If you’d like the meta descriptions too, just select snippet. Then click ‘ok’.
You could also select things like results count, and then do your competition analysis off of that if that is a metric you’d want.
11. Finally, download all pages of your result set in your custom csv.
You’ll get CSVs that cover ~1,000 keywords each, so you’ll just need to merge the data. You can do this with the CSV merge trick from the command line (CMD, or the Mac equivalent) that we used earlier to combine keyword data.
Don’t forget to remove the title rows!
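If the command-line trick isn’t handy, a small Python script does the same merge and drops the repeated title rows for you. The file pattern is hypothetical, so point it at wherever your exports landed:

```python
import csv
import glob

def merge_csvs(pattern: str, out_path: str) -> int:
    """Concatenate matching CSVs, keeping only the first file's header row."""
    paths = sorted(glob.glob(pattern))  # resolve inputs before creating the output
    rows_written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for i, path in enumerate(paths):
            with open(path, newline="", encoding="utf-8") as f:
                reader = csv.reader(f)
                header = next(reader, None)   # each file's title row
                if i == 0 and header:
                    writer.writerow(header)   # keep the first one only
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Something like `merge_csvs("exports/serps_*.csv", "merged.csv")` and you’re done.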
Paste all the data into a Google sheet, and just go grab a coffee or something if you’re on a laptop. Sometimes Google takes a little while to think about what you’ve just done.
Sometimes you’ll get extremely long URLs, or other items, that are too long to paste into Google Sheets. Mine had a handful of URLs that were 2,000+ characters long, and Google Sheets didn’t like that.
I sorted by length and purged all the URLs from one site (it wasn’t ranking highly anyway), and that let me paste in the data.
Setting up your SERP data
Once you’ve got the data in Google Sheets, we need to get it set up for analysis.
First, VLOOKUP to the keyword sheet for the search volume, along with your main categories and anything else you want to analyse. I’ll just bring in category and product line for now;
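In Sheets that’s a VLOOKUP per column; as plain Python it’s a dictionary join. The column names here are hypothetical, so match them to your own sheets:

```python
def join_keyword_data(serp_rows: list, keyword_rows: list) -> list:
    """Attach volume, category, and product line to each SERP row by keyword."""
    lookup = {k["keyword"]: k for k in keyword_rows}
    for row in serp_rows:
        kw = lookup.get(row["keyword"], {})
        row["volume"] = kw.get("volume", 0)
        row["category"] = kw.get("category", "")
        row["product_line"] = kw.get("product_line", "")
    return serp_rows
```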
Then we want to estimate traffic for each URL, based on the volume for the keyword & their current position.
This follows my SEO traffic estimation setup from here.
This will give each URL its traffic estimation. The CTR model is from an old study, and you could use an updated one if you’d like, but it allows a like-for-like comparison across all keywords and URLs.
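The estimate itself is just volume multiplied by the CTR at that position. The curve below is purely illustrative, so swap in whichever CTR study you actually use:

```python
# Illustrative CTR curve -- the real numbers should come from your chosen study.
CTR_BY_POSITION = {
    1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05,
    6: 0.04, 7: 0.03, 8: 0.025, 9: 0.02, 10: 0.018,
    11: 0.015, 12: 0.012, 13: 0.010, 14: 0.009, 15: 0.008,
}

def estimate_traffic(volume: float, position: int) -> float:
    """Estimated monthly traffic for a URL at a given position."""
    return volume * CTR_BY_POSITION.get(position, 0.0)
```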
You’re now ready to analyse the data.
Analysing the SERP Data
With a suite of SERP data ready for analysis, we can finally see who is ranking where.
Key competitor reports
First up, top-ranking competitors.
A simple pivot table and chart of aggregated est. traffic shows us Amazon’s clear d-d-dominationnnnn.
This is where we could then break things down by category;
And possibly plot them on a stacked chart to show the category breakdown of the top sites;
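The pivot behind these charts is just estimated traffic summed per domain (and per category for the stacked view); as a sketch, assuming rows with `url` and `est_traffic` fields:

```python
from collections import defaultdict
from urllib.parse import urlparse

def traffic_by_domain(rows: list) -> list:
    """Sum estimated traffic per ranking domain, highest first (the pivot view)."""
    totals = defaultdict(float)
    for row in rows:
        # Strip "www." so www/non-www rankings aggregate together.
        domain = urlparse(row["url"]).netloc.removeprefix("www.")
        totals[domain] += row["est_traffic"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```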
We can pull a couple of insights from this like;
- Macy’s, Brooklinen & The Company Store have minimal to nil exposure within the ‘pillows’ product set
- Pacificcoast has no exposure within linens, and gets its strength from duvets and pillows.
And of course, you can set your current benchmarks for your client from the data, so they can understand where they currently sit within each category.
Breaking this down a bit further, we can delve into the product line of the categories.
Taking a look at pillows;
The “pillows” line is the generic keywords with no subcategory, so it takes up a big share. For our set, these just don’t mention the material; they could still mention size or firmness and would fall under the generic sub-category.
One key insight here is that downandfeather and pacificcoast outrank Amazon within the ‘Down Pillows’ product line, but both of them have negligible exposure within the more generic keyword space. Interesting.
Also, note Target’s strength in the more generic category. We’ll dig into that later.
The best value here is highlighting a client’s, or prospect’s, current performance.
Could even use this sort of data for cold outreach. Hit up the #5-20 and offer some services!
Top ranking pages
The overall performers are great to see, but it’s not normally actionable information. It’s just a bit of insight that you can “ooh” at, set benchmarks off, and continue on your merry way.
The real value here is that you not only have these domains, but you also have the URLs ranking for each keyword along with their page titles.
I’ll get into some more specific optimisation & recommendations from this data later, but this is what we’re looking at;
URLs are broken down by their estimated traffic.
Straight away we can see the URL with the second highest estimated traffic is their product page, and their collections page is much further down the list in ~10th.
This one is great for quickly eyeing off the actual ranking URLs, but best broken down by category/product line so you can see who is ranking with what URL.
You could even reverse this, and then filter the keywords by the URL.
Gets a bit tricky doing that easily in Google Sheets though, so I don’t tend to do that too much. Harder to pull bulk insights from it.
Top-ranking page types
For ecom, and many larger portals, marketplaces & directories, there are a few key page types.
I don’t mean the stock standard homepage/landing pages, I mean more like the following;
Category Pages – Structured categories determining the site hierarchy.
Filtered Search Result Pages – Search result pages filtered by different modifiers, like material or size, separate to the category pages.
Product Pages – Individual product pages, that are listed on the category & SRPs.
Most of the larger sites have pages that fall under this structure, and then some others like “Blog” or “Guides” as a catch-all for the rest.
We can break down our competitor URLs into these groups, and see who is using what to rank. This is a great strategy insight, and can guide what pages you need to create & optimise.
Just like from the keyword research, add a new categorisation on the SERP table, that will use the URL and categorise based on specific path elements found.
You’ll want to run through the top URLs, and extract the most specific path that can be categorised into sections.
Some examples are below;
So we just need to flag and categorise those, and then say whether each is a Category, Filtered Search, or Product page.
Some are easier to spot than others, and categories can sometimes be merged in with filtered search too. When that happens, you’ll just have to make a decision and caveat the data. I’ll tend to dump them into category.
You’ll also find that when sites do SEO properly, they might not have any identifiers in the URL structure.
Look at that Pacific Coast category URL;
Damn them for having such a clean structure.
In this case, you’ll need to have a separate rule for ‘pacificcoast.com/down-pillows/’ and flag that as the category.
Unfortunately, they’ve also got an ideal structure setup for their filtered search;
This makes it even more difficult to ID the filtered search, and I will just let these fall under the category bucket.
At least all their products have the MV identifier so we can flag those up;
The fastest way to run through these rules though, is going domain by domain, from the top handful of domains. I’ll do the top 15, as that’s what our charts above show.
Target made it nice and easy for us;
Ended up with a heap of different rules to define each page type.
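Those rules can live as an ordered list of (pattern, type) pairs, most specific first. The patterns below are built from the example URLs in this post; the Pacific Coast “MV” pattern in particular is just a guess at the shape of their product identifier:

```python
import re

# Ordered (regex, page type) rules -- first match wins, so the most
# specific patterns go first. The pacificcoast "mv" pattern is a
# hypothetical form of their product identifier.
URL_RULES = [
    (r"amazon\.com/.+/dp/", "Product"),
    (r"amazon\.com/(b|s)[/?]", "Category"),
    (r"target\.com/c/", "Category"),
    (r"target\.com/s/", "Filtered Search"),
    (r"pacificcoast\.com/.*mv", "Product"),
    (r"pacificcoast\.com/[a-z-]+/?$", "Category"),
]

def classify_url(url: str) -> str:
    """Return the page type for a ranking URL, or 'Other' if no rule matches."""
    for pattern, page_type in URL_RULES:
        if re.search(pattern, url.lower()):
            return page_type
    return "Other"
```

Apply that over the URL column and you’ve got your page-type breakdown.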
Run it through the data, and we can now break down each of the URL sets by the type;
Firstly, NYT comes in with the “Other” category, because they only rank with blog content. Parasites.
Next are the URLs I’ve classed under ‘search’. These aren’t structured search though; these are where Amazon truly shines.
/dp/ dictates the product page. That’s easy.
/b and /s appear to dictate the structured category pages, like;
They’ll have the category selected on the left;
But then you’ve got these other search pages, like;
These are just some lovely keyword search pages;
No structure in them at all, and it appears Amazon just lists out some categories that it thinks are related, based on the products that come back from the search.
Not something I would suggest copying.
It’s Amazon, so it ranks well.
Target has a bit of a weird SEO setup.
They have a category page like https://www.target.com/c/duvet-covers-bedding-home/-/N-5xtuy
Which lets you select different sizes;
Which all go to their own unique URL;
But then they have extra filters, which appear to just append a bit more ID to the previous URL, like for cotton here: https://www.target.com/c/duvet-covers-bedding-home/-/N-5xtuyZg8grc
However, these do canonicalise to a proper URL of https://www.target.com/c/duvet-covers-bedding-home/cotton/-/N-5xtuyZg8grc. So the filter just updates the page ID, and then Target gives it the full clean URL via the canonical and internal links. The joys of filter setups.
The big thing here is that we can just see they’ve prioritised sizing, over material, based on the links provided under the H1.
Separately to this setup, they have URLs like https://www.target.com/s/100+cotton+sheets.
These leverage a special longtail system. There are a couple of different companies that do this setup, and I’d recommend everyone avoid them. Target’s may be in-house now, though.
The systems work by just uploading a tonne of keywords, and then pages get created for each. They all interlink to related pages, and there is normally a linking widget on the homepage that points in.
There isn’t one on the homepage for Target, but category pages have;
With all the links pointing into the /s/ urls. Many reasons to avoid, and many ways you can do this better yourself.
Let’s take a look at one of the reasons to avoid this system:
These target terms on top of those already covered by their own category pages;
The categories just need a bit of a cleanse and upgrade, and they could work their way off the /s/ system.
These pages are the reason Target covers such a broad range of keywords though. This system is just a mass mud-on-the-wall page creator so that they can see what sticks.
They can target anything, and everything, allowing them to be up there with Amazon. They have the product range to do it – which along with an authoritative domain/brand, is the key to these pages working well.
Anyway, those systems are a story for another day. That’s the story of Target.
Bed Bath and Beyond have a slightly un-optimised structure.
They have category URLs like;
But then have a filtered URL of;
I’ve classed the standard /c/URLs as categories, and anything with &a1 in it, as filtered search.
A bit of cleanup here could go a long way for them, but this is still a great base-level setup.
Macy’s has some basic category URLs like https://www.macys.com/shop/bed-bath/duvet-covers?id=25045
Which appear to be standard category structure URLs;
But then they have these /b/ and /featured/ URLs, like;
You can make any URL you want out of that /featured/ one, so it appears to be a similar “optimised” keyword search;
You can’t do that with the /b/ URLs, as it appears they leverage the ID to redirect to the final URL.
Some of the /b/ URLs appear to redirect;
So they may be phasing them out?
Ignore the domain shown: my SSD crashed, the backup didn’t have my VPN installed, I haven’t bothered reinstalling it yet, and Macy’s blocks ALMOST EVERYTHINGGGGGG outside of the US. This works though.
Anyway, all of this just gives a good perspective of who’s ranking, and what types of URLs they’re using to rank.
Categories, search or product pages?
I’d then break down search into structured vs unstructured.
i.e. navigable / hierarchical / appropriate filters, versus keyword-style search.
I’ll never recommend a raw keyword approach for a client.
However, seeing this style already existing in the SERPs, I know we can go broader with a good structured setup.
Keyword Clustering from SERP Data
And…. this is where I jump in with another edit.
This was something I was going to do in the next post with the ValueSerp data, but since I’d already thrown together the keyword clustering Google Sheet, I thought I would throw it in here too.
I copied my Google sheet, added my Serper API key, and then added the ~4,000 keywords with their search volumes;
The most keywords I’d run through the tool so far was 2,500, so let’s see how it goes with 4,000! The only setting I changed on the sheet was the batch size, increasing it to 40 to try to speed things up.
Damn, got an exceeded maximum execution time error after 3,000 keywords!
Oh well, just need to hit ‘extract’ again and it will continue from where it left off!
Now we have 40,000+ rows of SERP data for the keywords again.
Can just click ‘GROUP KEYWORDS’ now, and it will run through them all and cluster them;
45 seconds later, I had grouping data for all 4,000 keywords;
It doesn’t look like much now, but by jumping into the groupings tab of the sheet, we can see all the groups together.
Picking out just a few of the groups, we get;
King Duvet Cover
Deep Pocket King Sheets
I’ve added the cluster data into the Google Sheet that has the ValueSerp SERPs in it. I might go through and swap out the ValueSerp data at some point, but all the other work was done off the ValueSerp data, so I’ll just leave it as is for now.
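For reference, the clustering approach these tools generally take is SERP overlap: keywords whose top results share enough URLs get grouped together. A greedy sketch of the idea (not the sheet’s exact logic):

```python
def cluster_keywords(serps: dict, min_overlap: int = 3) -> list:
    """Greedily group keywords whose top-10 URL sets overlap enough.

    serps maps each keyword to the set of URLs ranking for it.
    """
    clusters = []  # list of (keyword set, union of that cluster's URLs)
    for kw, urls in serps.items():
        for keywords, cluster_urls in clusters:
            if len(urls & cluster_urls) >= min_overlap:
                keywords.add(kw)
                cluster_urls |= urls  # grow the cluster's URL pool
                break
        else:
            clusters.append(({kw}, set(urls)))
    return [keywords for keywords, _ in clusters]
```

Tweaking `min_overlap` trades off tighter groups against more singletons.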
Leveraging the SERP data to make structure & targeting recommendations
With AI now in the mix, the true value from this SERP data is in our optimisations going forward.
We can work through all of this, in bulk, and determine the best URL paths, structure, content requirements, and more.
Use what’s already ranking, to work out exactly what we need to rank.
But, we’ll need to cover that in a whole new post!
Stay tuned for the next post, where we will cover how to set up a site structure, and leverage the SERP data to optimise your pages. We might even have a look at how we can play with these new keyword clusters!
Want my data?
Copy my sheet with the link below.