Category: Excel

Stripping and Rebuilding a URL Structure for Migration Testing

Some clients have recently required a migration mapped out for tens to hundreds of thousands of URLs.

To do this, we needed to capture all the different URLs, build out the migration rules, and then test the redirects.

Mapping out and building the rules is one thing, but then how do we bulk generate what the URLs should look like post-migration so that we can actually test the redirects?

How can we test the implemented redirect rules, to ensure they match what we originally mapped out?

No one wants to manually create and map 1,000+ URLs to compare the redirect path to, and I personally prefer not to just test the redirects and then give the ‘after’ URLs the once-over.

I’d like something to compare the implementation to.

Using Google Sheets (or Excel) you can strip down the existing URLs, extract the parts the redirection rules will use, and then rebuild them into their new formats.

This will allow you to do a bulk migration test against the implemented redirect rules, and confirm everything redirects where it should.

There are a couple of ways of achieving this, and I will just go through the different methods I use below.

A quick note though, this has nothing to do with building the actual redirection rules themselves. Whilst it might help you build better examples of what you need to redirect, this post is purely about building the expected URLs so you can confirm the redirects match.

 

Stripping a URL down

There are a few main types of URLs we can account for here.

The standard hierarchical folder-based structure, the flat URL structure with everything to load the page in the base URL, and the ugly query parameter based structure.

They can all be stripped down relatively easily, but firstly we can strip out the URL path from the domain to simplify issue patching later.

 

Removing the domain from a URL Path

A handful of different ways to strip a domain from a URL are available, but this formula is possibly the easiest;

=RIGHT(A2,LEN(A2)-FIND("domain.com",A2)-LEN("domain.com"))

This will strip everything up to and including ‘domain.com’ and the slash that follows it, allowing us to account for all https, http, www, and non-www variations of a URL and leaving us with just the URL path.

[Image: Stripping the domain from a URL]

It will also cull subdomains, so just keep that in mind. If you need subdomains left, you can still use the same formula, but each subdomain will need its own.

I’d recommend extracting the subdomain in a separate column, and then referencing the subdomain’s cell, rather than plain text domain. That should save you a bit of time!
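If you want to sanity-check the formula outside the spreadsheet, here is a minimal Python sketch of the same idea. The function name is made up for illustration, and ‘domain.com’ is the same placeholder used above. Returning a blank when the domain is missing is my assumption; the sheet formula would show an error instead.

```python
def strip_domain(url, domain="domain.com"):
    """Return the URL path that follows `domain` and its trailing slash."""
    start = url.find(domain)  # 0-based, where the sheet's FIND is 1-based
    if start == -1:
        return ""  # assumption: blank instead of the error the sheet would show
    return url[start + len(domain) + 1:]


for url in ("https://www.domain.com/parent/child/", "http://domain.com/parent/"):
    print(strip_domain(url))
```

Like the formula, this drops the slash straight after the domain, so the path starts at the first folder.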

 

Extracting query parameter values from a URL

Let’s go over this one first, as it is slightly simpler than breaking down a ‘pretty’ URL.

There’s just one formula you need, and that’s regex extraction.

=iferror(regexextract(B2, "[?&]parent=([^&]*)"),"")

That will extract the value for the ‘parent’ query parameter, and leave the cell blank if the parameter doesn’t exist, or is blank.

[Image: Extracting query parameter value from a URL]

You then just need to repeat that formula, for each query parameter value you need to extract.
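For comparison, here is a rough Python sketch that pulls a parameter value using the standard library's URL parser rather than a regex. The function name and the sample paths are hypothetical. One difference to be aware of: `parse_qs` decodes percent-encoded characters, which the regex above does not.

```python
from urllib.parse import parse_qs, urlsplit


def query_value(path, name):
    """Return the first value of query parameter `name`, or "" if absent or blank."""
    query = urlsplit(path).query
    values = parse_qs(query, keep_blank_values=True).get(name, [""])
    return values[0]


print(query_value("page.php?parent=cars&child=suv", "parent"))
print(query_value("page.php?parent=cars&child=suv", "child"))
```

As with the formula, you call it once per parameter you need.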

 

Stripping a hierarchical ‘pretty’ URL structure

A folder-based, or hierarchical, structure is the next one that could be at play here, and it’s generally my favourite to work with.

The absolute simplest way is to use the “split text to columns” method.

[Image: Splitting URL path text to columns]

Highlight your text, click ‘Data’, and then click on ‘Split text to columns’.

You’ll then get asked for a separator. In this case, the separator is a ‘/’, so you’ll need to click ‘custom’ and enter the ‘/’.

After which, you’ll immediately get the following;

[Image: Select the URL separator]

Each folder is split into its own cell, so every folder level is accounted for.

Works perfectly, as long as you don’t mind repeating the split every time you add new URLs.

I personally prefer to try and work with formulas where possible, so that we can keep expanding the URL list, along with tweaking it as we go if modifications are required.

Provided you built the URL path column above, the following formula is what you’ll need;

=iferror(LEFT(B6,FIND("/",B6)-1),"")

[Image: Parent folder extraction from URL]

That will just strip off everything after the first slash, leaving you with the parent folder.

Next, you’ll do a similar thing to get the child folder, but first, you’ll just subtract the parent folder from the URL path.

=iferror(LEFT(substitute(B6,C6&"/",""),FIND("/",substitute(B6,C6&"/",""))-1),"")

This will just remove the parent folder from the URL path, and leave you with the child folder, which will be the second folder in the path.

[Image: Extracting a child folder from a URL]

And if you’re playing with 3 or more levels, here’s the formula you can modify;

=iferror(LEFT(substitute(B6,C6&"/"&D6&"/",""),FIND("/",substitute(B6,C6&"/"&D6&"/",""))-1),"")

You’ll just need to keep adding the new cells in there each time you need a new level.
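The same folder-by-folder breakdown can be sketched in Python by splitting the path on the slash. The function name and the default of three levels are assumptions for illustration; missing levels come back blank, much like the IFERROR wrappers above.

```python
def folder_levels(path, levels=3):
    """Split a URL path into a fixed number of folder levels, blank-padded."""
    parts = [part for part in path.split("/") if part]
    return (parts + [""] * levels)[:levels]


print(folder_levels("parent/child/"))
print(folder_levels("parent/child/grandchild/"))
```

One behaviour to watch in the formula version: SUBSTITUTE removes every occurrence of the text it is given, so a path where a folder name repeats (say ‘cars/cars/’) can come out differently from a plain split like this one.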

 

Breaking down a flat URL structure

These are extremely difficult to break down.

Unless there is consistency in how they’re formatted, this could be a very manual process.

However, I am just going to assume you’ve gotten lucky with something that resembles a consistent format for you to pull apart.

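For the parent folder, take everything before your first break point, which in my example is “-for”;

=LEFT(B15,FIND("-for",B15)-1)

[Image: Stripping a flat URL structure]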

For the child folder, there are two formulas I have that basically just do the same thing.

They both look at a break point and just trim everything including and after it. For my example, my breakpoint is “-in”.

=substitute(REGEXREPLACE(B15,"(.*)-in.*","$1"),C15&"-","")

[Image: Extract a part of the URL]

=substitute(LEFT(B16,FIND("-in",B16)-1),C16&"-","")

[Image: Google Sheets URL modification formula]

Same same but different.

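I just don’t remember how they’re different lol, apart from one thing: the regex version is greedy, so it cuts at the last “-in” in the path, while FIND cuts at the first.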

The last part here is to extract the location.

If your location is like mine, you don’t need anything fancy.

You just need to substitute all the previous folders from the structure, then remove the slash, and you’ve got your location.

=substitute(SUBSTITUTE(B15,C15&"-"&D15&"-in-",""),"/","")

[Image: Extracting a location from a URL for SEO]

You essentially rebuild the URL that came before it, based on what you had stripped out, and then you’re left with the ‘location’ that is on the end.

If your location isn’t on the end or doesn’t have something that can be clearly substituted out after it, you’ll need to modify the formula used for the child folder.
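To make the flat breakdown concrete, here is a small Python sketch that pulls all three parts out in one go with a regex. The example paths (‘cars-for-sale-in-sydney/’ and so on) and the function name are invented for illustration. It assumes the same ‘-for’ and ‘-in-’ break points as the formulas above, and, like them, the child keeps its leading ‘for-’.

```python
import re

# parent = text before "-for", child = "for-..." up to "-in-", location = the rest
FLAT_PATTERN = re.compile(
    r"^(?P<parent>.+?)-(?P<child>for-.+?)-in-(?P<location>[^/]+)/?$"
)


def split_flat(path):
    """Return (parent, child, location), or None if the path does not fit."""
    match = FLAT_PATTERN.match(path)
    if match is None:
        return None
    return match.group("parent"), match.group("child"), match.group("location")


print(split_flat("cars-for-sale-in-sydney/"))
print(split_flat("about-us/"))
```

Anything that comes back as None is a URL that doesn’t follow the pattern, which is a handy list to review by hand.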

 

Rebuilding the New Post-Migration URL Structures

Once you’ve stripped down the old URL to its main elements, you can start building the new one!

If you’re merely converting query parameters to a clean structure, with no folder changes, you’re super lucky.

Many of us though, are not that lucky.

 

Converting query parameters to pretty SEO URLs in Google Sheets

Okay you lucky people, this is super easy.

="https://www.domain.com/"&C2&"/"&D2&"/"

[Image: Converting query parameter to pretty SEO URL in Google Sheets]

You just piece the URL together using the ampersand, ‘&’.

Include your domain with a slash, then the parent folder and a slash after it, and then the child folder and another slash.

Some people like to use CONCATENATE, but that just complicates things. Complications through concatenations.

Annnnnd you’re done! You’ve now turned your query parameters into a pretty URL structure, in bulk, so that you can test these out.

It’s rarely that easy though.

 

Rebuilding a URL structure where folders change names or get merged

Whilst it’s easier when things don’t change, it really isn’t that hard when they do now that you’ve stripped the URL down.

The first step is to paste all your parent, child, and any other folders into a separate column, and then remove all their duplicates.

[Image: Removing duplicates from a list of URL paths]

This will leave you with a unique set of folder values.

You then just need to add their new value next to them.

If the value doesn’t change, just paste the same value in.

If the folders are getting merged, just paste the folder you’re merging into here.

If it is a completely new value, type out the replacement.

[Image: Old to new URL folder mappings]

The mapping of the values might take time, as some websites could have hundreds and hundreds.

The benefit of doing it this way is that you aren’t dealing with all the possible variations & combinations.

Separating the child folders from the parent folders lets us independently map them to their new values.

Now we need to bring these new values into the mix, with a good old vlookup (if you don’t know how to do it, check out my SEO Excel formulas post).

=VLOOKUP(C2,M:N,2,0)

[Image: Vlookup the old to new parent folders]

Repeat the process for the child folders, and you’ll get all the shiny new folder values.

You’ll just need a slight tweak to the child formula, due to the possibility of there being no child.

=iferror(VLOOKUP(D2,P:Q,2,0),"")

If no child exists, the cell will remain blank.

[Image: Vlookup the new child URL folders]

You should now have completely mapped the old to new folder values, so we can move on to constructing the URL.

Now you just piece it all together!

="https://www.domain.com/"&E2&"/"&IF(len(F2)>0,F2&"/","")

[Image: Rebuilding a new SEO URL for migration testing]

This will add one folder to the other, and rebuild your new URL structure for you!

If there is no child available, it will ensure there is no extra slash added.

If you prefer the old flat structure, tweak the formula to be;

="https://www.domain.com/"&E2&IF(len(F2)>0,"-"&F2&"/","/")

Which will ensure there is no extra dash added, unless the child exists.

And if you need a stop word, or two, in the middle to clean her up a bit, just do the following;

="https://www.domain.com/"&"theres-a-"&E2&"-before-the-"&F2&"/"

You’ll have to account for your own variations of the template, and whether a child may or may not exist.
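Here is a minimal Python sketch of the lookup-and-rebuild steps, in case you want to cross-check the spreadsheet output. The folder names in the two mappings are made up, and the function name is hypothetical. The dictionaries stand in for the old-to-new lookup columns, with a missing child falling back to blank in the same way as the IFERROR version.

```python
BASE = "https://www.domain.com/"

# Hypothetical old -> new folder mappings (the lookup columns in the sheet)
PARENT_MAP = {"cars": "vehicles", "trucks": "vehicles", "boats": "boats"}
CHILD_MAP = {"suv": "suvs", "ute": "utes"}


def rebuild_url(parent, child, base=BASE):
    """Build the expected post-migration URL from the old folder values."""
    new_parent = PARENT_MAP[parent]       # an unmapped parent raises KeyError
    new_child = CHILD_MAP.get(child, "")  # blank when there is no child
    url = base + new_parent + "/"
    if new_child:
        url += new_child + "/"            # only add the slash if a child exists
    return url


print(rebuild_url("cars", "suv"))
print(rebuild_url("trucks", ""))
```

Letting an unmapped parent fail loudly is a deliberate choice here, since a silent blank would produce a broken expected URL.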

 

Rebuilding a Structure that Previously Used IDs

What if you’re just stuck with a heap of parameter IDs, and no ‘pretty’ URL to start with?

The benefit to the above vlookup style URL rebuilding is that you can extract the query parameter values, and then just map the old values to their new pretty folder name.

A quick rebuild, and you’ve gone from the old ugly URLs, to what could be thousands of crawlable new versions of the URL!

Start by getting the path out of the URL, and then extract all the parameter values, like I showed you earlier.

[Image: Converting query parameters to a pretty URL]

Add your new parameter values to the parent/child vlookup lists;

[Image: Query parameters to clean folders]

Then just vlookup the parameter value, to the new parent & new child values, and build your new URL!

[Image: New pretty URLs for query parameter ugly URLs]

Everything was covered earlier, so just grab all the formulas from there.

That covers it though!

You’ve gone from the ugly parameter version to a shiny new & testable pretty URL.

 

Bulk testing your new URL structure

You can now run the old URLs through Screaming Frog (or your preferred bulk testing tool) and copy out the final URL in the redirect path.

Match it up to your expected ‘new URL’ and confirm whether it is the same or whether they’re different.

If they’re different, is it the redirect rule that doesn’t match? Is there an error in the expected URL? Are the redirects just not implemented?
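The comparison itself can be sketched in a few lines of Python. This assumes you have two lookups keyed by the old URL: your expected new URLs, and the final URLs copied out of your crawl. The function name, the labels, and the sample URLs are all made up, and it makes no assumption about any particular tool’s export format.

```python
def compare_redirects(expected, actual):
    """Label each old URL by how its final URL compares to the expected one."""
    report = {}
    for old_url, want in expected.items():
        got = actual.get(old_url)
        if got is None:
            report[old_url] = "not crawled"
        elif got == old_url:
            report[old_url] = "no redirect"
        elif got == want:
            report[old_url] = "match"
        else:
            report[old_url] = "mismatch"
    return report


expected = {"/old-a/": "/new-a/", "/old-b/": "/new-b/"}
actual = {"/old-a/": "/new-a/", "/old-b/": "/somewhere-else/"}
print(compare_redirects(expected, actual))
```

Anything labelled ‘mismatch’ or ‘no redirect’ is where the three questions above come in.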

This is guaranteed to save you time bulk testing your migration.

 

Testing a URL migration on staging before going live

In an ideal world, for a migration this size, you should be able to test out your redirection rules on a staging environment.

Swap out the website’s domain in the formula, for the URL of the staging environment, and do everything else the same.

You’ll get thousands of staging URLs ready to load Screaming Frog up with!

 

Download the sample

You can download all my sample data from here if you want to jump right in.

I’m always looking at new ways to simplify bulk analysis, so if this worked for you, or you do something completely different, let me know!

I’d love to hear about it all 🙂

 

Also, just a quick shout out to the guys at ExtendOffice.com though!

Plenty of awesome formulas, with this one definitely being one for the bookmarks: https://www.extendoffice.com/documents/excel/1783-excel-remove-text-before-character.html

Excel Formulas Every SEO Needs to Know

As an SEO, you really can’t escape Excel. Almost every month a new SEO tool pops up, that can do so much, but will never seem to replace those little things you can do in Excel.

You find yourself using a tool for 75% of the job, but will then always jump back into Excel to finish it off.

Well, that’s me anyway. If it’s not you, then you probably just don’t know many of the things that Excel can do!

I run through VLOOKUPs, IF statements, LEN, TRIM, PROPER / UPPER / LOWER, COUNTIFs & SUMs and show you some basic usage examples. I’ve also added a few extras to the bottom that aren’t in the video!

Basic Excel Formulas for SEOs

Whether you’re starting out, or an experienced SEO, these are the basic building blocks to so many advanced Excel uses for SEO. You need to know these.

VLOOKUP – Find data for a value, within a range

The simplest summary of what a VLOOKUP will do is to match two sets of data where you have something to match them with.

So long as one column in each data set has some sort of unique identifier, you can VLOOKUP between the two data sets and bring the data together.

If you don’t have a unique identifier, will you have one if you match two columns together?

You might have a date that isn’t unique, but if you add it to another column (like colour) you will get a unique value that you can use to match the data sets.

=VLOOKUP(*CELL of what you want to search for*,*RANGE of where you want to look for it*,*COLUMN NUMBER the data is in you want to extract*,*TRUE (1) or FALSE (0) but 99% will be FALSE*)

LEN – Count character length of a cell

Nice and simple formula. The LEN formula is great for counting the lengths of page titles and meta descriptions!

Plenty of other uses, but that’s what the majority of people will use it for.

=LEN(*CELL to count*)

TRIM – Remove spaces before & after a cell

Sometimes when trying to match data up, the cell might visually look the same as another cell, but you can’t seem to match it up!

The amount of times I have run into this is crazy, and the majority of times it happens there is a pesky space before or after the text.

The TRIM formula will clean these up for you and make sure there isn’t something hidden there wasting your time!

=TRIM(*CELL to trim*)

SUM – Add all numbers in a range together

Just adds numbers. It’s that simple. Just select the range you’d like to add together and away it goes.

=SUM(*RANGE to add together*)

COUNT – Display a count of cells with numbers in them

Out of your selected range, how many of the cells have a number in them?

=COUNT(*RANGE to count for numbers*)

COUNTA – Display a count of cells with numbers and letters in them (so not empty)

Out of your selected range, how many of the cells have a number or letters in them? This is what you need to use if you are trying to count cells with characters rather than numbers.

=COUNTA(*RANGE to count non-empty cells in*)

COUNTIF – Count all the cells when a specific criterion is met

Only count a cell that matches your criteria. This is great for being able to select a cell as your criteria.

So if you have a column of ‘Colour’ and a cell that has the text of ‘Blue’, you could select that ‘Blue’ cell and then the range that has all the colours, and the formula will count how many times ‘Blue’ appears in the range.

=COUNTIF(*RANGE to count in*,*CRITERIA for when to count*)

PROPER – Capitalise the first letter of every word in a cell

Great for cleaning up a list of titles, or any words, where you’d like to capitalise the first letter of every word.

=PROPER(*CELL to capitalise first letters*)

UPPER – Capitalise every letter of every word in a cell

This will format it AND MAKE IT LOOK LIKE YOU’RE SHOUTING. Just use the upper formula when you need to make sure everyone knows you’re SERIOUS.

=UPPER(*CELL to capitalise everything*)

LOWER – Make every letter of every word in a cell lowercase

The lower formula will make everything lower case. Great to run through your keywords with to ensure there isn’t any weird capitalisation going on.

=LOWER(*CELL to lowercase*)

LEFT & RIGHT – Extract x characters from the left or right of a cell

This is a simple way of extracting a certain amount of characters from the left or the right of some text.

Great if you want to cull the end of a title, remove a domain, or trim some trailing slashes from a URL.

=LEFT(*cell reference for text to look in*, *number of characters*)

=RIGHT(*cell reference for text to look in*, *number of characters*)

However, if you are looking to trim everything before or after a certain character, instead of a number of characters, check below for a more advanced formula!

IF – Only do something when a certain condition is met

The IF formula lets you create rules for your cells. Maybe you need to filter out titles that are too long.

You could do an IF formula in a new column that says =IF(LEN(A2)>70,"LONG","OKAY"), with A2 being the cell holding the title. This will give you back a cell that says LONG if the title is too long, or OKAY if everything is okay.

Probably not the best use case, but gives you the idea!

=IF(*The condition to check*,*What to do if it’s true*,*What to do if it’s false*)

Does Cell Contain Text – Run condition if the cell contains specific text

A quick formula to use if you want to do something off the back of a cell containing a specific piece of text.

This formula will return TRUE or FALSE, which is what I normally use with this one, but you can expand it to use IF. So IF bla = TRUE, do this.

Great way to quickly segregate data by whether something is in a URL, or keyword.

=ISNUMBER(SEARCH("text to look for",*Cell to look in*))

& – Yeap, the ampersand bad boy is rather useful

If you’re trying to merge stuff together in Excel, you need the ampersand. This thing will let you attach anything, to anything else.

It will let you run multiple formulas, then just stick them all together to form what is essentially a sentence.

="Stick this"&", with this"&", and merge all "&2+2&" elements together"

Which will output:

Stick this, with this, and merge all 4 elements together

SUBSTITUTE – Swap out any text, with any other text

This formula is invaluable if you are doing any sort of text editing or manipulation.

You select the cell you’d like to manipulate, enter the text that currently exists, then enter the text you wish to change it to.

=SUBSTITUTE(*text to look in*,"old text here","new text here")

Trim URL to Domain / Subdomain

This is something that you will find incredibly useful if you’ve never done it before. Some extra tools do it, but you can achieve almost perfect trimming with just a formula – SOOOO much easier!

Quick and Dirty Trim Domain Prefixes Method

If you are just looking to trim the HTTP part, here is a hacky formula that will just substitute out the HTTP part. It’ll leave subdomains / domains, and all their folders intact however.

=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2,"https://www.",""),"https://",""),"http://www.",""),"http://",""),"www.","")

Trim URL to Subdomain

Spend a bit more time playing with a formula, and you can get something that will do a much better job though!

This formula will trim any URL back to its subdomain. It will remove all prefixes, or any suffixes (aka folders) from a URL so that you are just left with the subdomain & domain combo.

=SUBSTITUTE(REPLACE(REPLACE(A2, 1, IFERROR(FIND("//", A2)+1, 0), TEXT(,))&"/", FIND("/", REPLACE(A2, 1, IFERROR(FIND("//", A2)+1, 0), TEXT(,))&"/"), LEN(A2), TEXT(,)), "www.", TEXT(,))

And another formula is;

=REGEXREPLACE(REGEXREPLACE(A2,"(http(s)?://)?(www\.)?",""),"/.*","")

With A2 being the cell reference to where you have a URL to trim.

Trim URL to Root Domain

If you are wanting to get rid of the subdomain too, so you are just left with the root domain, you can use this formula. It will remove folders, subdomains, prefixes, and anything else so that you are just left with the root domain stripped out.

=SUBSTITUTE(TRIM(RIGHT(SUBSTITUTE(REPLACE(REPLACE(A1, 1, IFERROR(FIND("//", A1)+1, 0), TEXT(,))&"/", FIND("/", REPLACE(A1, 1, IFERROR(FIND("//", A1)+1, 0), TEXT(,))&"/"), LEN(A1), TEXT(,)), CHAR(46), REPT(CHAR(32), LEN(A1))), LEN(A1)*2)), CHAR(32), CHAR(46))
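If those nested formulas are hard to follow, here is a rough Python sketch of both trims using the standard library’s URL parser. The function names are hypothetical. The root domain version simply keeps the last two dot-separated parts of the host, which, as far as I can tell, is also what the formula above ends up doing, so neither will handle something like ‘domain.co.uk’ properly.

```python
from urllib.parse import urlsplit


def trim_to_host(url):
    """Return the host without scheme, folders, or a leading "www."."""
    if "//" not in url:
        url = "//" + url  # without this, urlsplit would treat the host as a path
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def naive_root_domain(url):
    """Keep only the last two dot-separated labels of the host."""
    return ".".join(trim_to_host(url).split(".")[-2:])


print(trim_to_host("https://www.domain.com/folder/page/"))
print(naive_root_domain("http://blog.domain.com/post/"))
```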

Stripping Text Before/After Characters

This is something I have only picked up in the last couple of years, and these formulas allow you to strip out the text before or after a character or a set of characters.

I will leave how you use them up to your imagination, but a great use is stripping anything after a | (pipe) from title tags.

Yes, you could find/replace this stuff out, or substitute it, but maybe there are 50 different items to swap out. That will take ages!

If they all have a pipe right before them, you can strip them all out with one formula.

Found these formulas here, thanks guys!

Strip Text Before a Character

This formula will remove everything before the pipe. Just edit the pipe out with any character, word, phrase, anything, and it will remove all the text before it.

=RIGHT(A1,LEN(A1)-FIND("|",A1))

Strip Text After a Character

Again, just replace the pipe with anything you want to have this formula strip all the text that is after it.

=LEFT(A1,FIND("|",A1)-1)
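For reference, a minimal Python sketch of the same two strips using `str.partition`, which splits at the first occurrence of the separator just like FIND. The function names and the sample title are made up. Two deliberate differences from the formulas: leftover spaces are trimmed, and text with no separator in it comes back unchanged instead of erroring.

```python
def text_before(text, sep="|"):
    """Return everything before the first `sep`, with spaces trimmed."""
    return text.partition(sep)[0].strip()


def text_after(text, sep="|"):
    """Return everything after the first `sep`, or the text itself if absent."""
    _, found, after = text.partition(sep)
    return after.strip() if found else text.strip()


title = "Blue Widgets | Brand Name"
print(text_before(title))
print(text_after(title))
```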

Finding Your Own Excel Formulas

The biggest thing to remember with Excel formulas is that chances are someone else has tried what you’re doing before.

Just break down what you’re doing, and Google it piece by piece. You might end up putting something together yourself, through what you find from others.

Building a Keyword Research Dashboard in Excel

An extremely large table of all their keywords isn’t going to give some clients the insight they want. Some would like a bit more insight into the data they’ve paid thousands for.

Building a keyword research dashboard in Excel is a great way to allow a client to interact with their keyword data.

In this short video, I will show you how you can take a categorised keyword list and turn it into a client-friendly dashboard!

The video is a couple years old now, and I may eventually update it, however, for the most part, it still holds true today.

Now, this won’t be great for some clients. You could end up giving them way too much information, and have them asking way too many questions.

However, I personally don’t mind that. I love when clients have an interest in SEO as it gets them more involved in the project, and more excited when the results kick in.

How to build a keyword research dashboard in Excel

  1. Categorise up your keyword research data
  2. Create pivot tables for each category level
  3. Attach slicers to the categories so you can filter the data
  4. Design the tables & slicers to match your client’s branding

And there you have it, you will have an interactive Excel keyword research dashboard to send to your clients.

Creating an online keyword research dashboard

You can save this Excel dashboard to OneDrive and then share it with your clients via a link instead of sending a file.

For dashboards under a certain size this means they will be able to view the dashboard online instead of having to download the file. You can then simply update the keyword research file, and they will be able to see the update when they refresh their browser.

This becomes a massive help when trying to send updated data to a client. No more v53 tacked onto the end of a file because you’ve updated it 53 times!

Download the Excel Keyword Research File

If you watched the video and can’t be bothered doing the work yourself, you can just have my file!

Or if you’re doing keyword research for car hire you can also just steal my keywords – chances of that have to be slim.


 

Don’t forget to add some proper styling to the file though! She ain’t exactly pretty at the moment.

If you have any issues with the Excel file, or just need a hand with anything from the video, just leave me a comment below and I can help you out.