Category: Programmatic SEO

Just Another WordPress Programmatic SEO Build – Part 5

Just Another WordPress Programmatic SEO Build – Part 5

Safe to say this build is progressing pretty fast thanks to sheets & WP.

A custom build like this would normally take quite a bit longer! And that would be on data configuration alone.

Being able to format stuff and quickly make mods in G-sheets makes things sooooo much easier.

If you haven’t read the other parts, you can find part 1 of the WordPress Programmatic SEO build and then work your way up to this one.

Let’s continue.

Linking widget tables

A few of the pages are starting to get some serious link counts, particularly the category pages.

Yeap, a few links.

Definitely a few more than what a normal comma list would contain.

But, we need these links here. We can’t just strip them from the page.

A standard programmatic implementation, particularly in the early phases, is an amazing-looking table.

One of those linking widgets that could have 10 or 100 links in. They work though, so don’t talk negatively about them!

I figure we can look at the link count, and if it’s above say, 10, we swap it out for a pretty table instead.

I grabbed some default HTML table code from here and slapped it into one of the category pages to test it.

Boom. Table.

I’m not counting this copypasta as code. If it was, then HTML links are code too yeah?

Updated the data to some sample data and some links to see what it’d look like, and threw in a centring style, and it isn’t too bad at all.

Since there will be quite a few links though, there’s a bit too much padding.

Fixed.

Now we need to work out how to actually get this going in the Google sheet.

We have a comma separate list of links that we currently use in the page, but how can we turn this into a table.

We need some way to split that list up into groups of 3 for a 3-column table.

I got stuck here for a bit, but then did some Googling and came across this magical formula.

We can just replace every 3rd comma with the row code (<tr>), and then every other comma with the table column code (<td>).

After more tweaks than I care to admit, I pieced together the below;

We start by including the table opening code, then the comma separated that breaks up with every 3rd comma being swapped for the </tr><tr> portion of the table code.

Then we substitute the remaining commas with the <td style> code.

Then throw the table closing code on the end, and we’ve got ourselves a nice link widget.

Well, nice is debatable.

For those that might want it, the formula for a table from a comma-separated link list is;

=substitute(substitute( “<table style=””text-align: center””>
<tbody>
<tr><td style=””padding: 0px””>”&REGEXREPLACE( REGEXREPLACE(L2,”(\,.*?){3}”,”$0*”) ,”\,\*”,”</td>
</tr>
<tr>
<td style=””padding: 0px”” >”),”,”,”</td>
<td style=””padding: 0px”” >”),”<animal>”,A2)&”</td>
</tr>
</tbody>
</table>

*ignore my styling, and formatting, and well….. just anything I do that has code involved.

Throwing it into the site to test we can see it works. Works = nice.

Obviously quite a bit bigger than the comma list, so not 100% sure whether this could be better or not.

However, it gives us another option.

Adding a little extra code could give it a little expander option but that’s beyond what I’d want to do now so I’ll skip that.

Think I will just completely skip including this and just shelve the table code. Will keep it in the spreadsheet and maybe return it later.

If I was going to use it though, I’d be going to the content formula that includes the comma-separated list.

Then using a formula, I’d run a count formula that when the comma count is below say 10, the comma-separated link list could be used. If it was higher than 10 it would pull in the table links cell.

 

Scraping videos

To build out the content even more, we could embed some Youtube videos.

Ideally, you have your own set of videos to mix up the content.

I don’t.

I’m just going to scrape a heap of youtube videos and embed whatever video currently ranks for the combo keyword.

This used to actually work a treat to help a page rank, and I have seen some benefits recently on a few of my sites, however, this is purely to build out the example content and show you how to leverage it.

To do this, we’re going to come up with a heap of keywords to put into a rank tracker (serprobot). We’ll then export the top10 SERPs, remove everything not in the first place, and then remove any non-youtube links.

Anything left over will be included as a video in the pages.

I’ve loaded up a project in serprobot looking for ‘<combo> youtube video’ keywords.

After a few minutes, it’ll complete the rank check, and I’ll be able to export my SERPs report.

I’ll just insert two rows in the keyword column here so the keyword & URLs line up, and then substitute out “keyword” along with sort the list.

Sorting it by URL, we can then delete anything non-youtube, and then delete 2nd position onwards so we end up with a keyword | youtube list like this;

Strip out the ‘ youtube video’ part from the keyword, and we’re back to the combo name, then throw it back into Google sheets and we’ve got a list of 1,000 youtube URLs with their respective combo.

I’ve added a <youtube> into the content template where I’d like to throw the video in.

We now need to craft the youtube embed code, and get that into a column.

The code I’ll be using is this;

=”<p style=””text-align: center;””><iframe frameborder=””0″” scrolling=””no”” marginheight=””0″” marginwidth=””0″”width=””712″” height=””400″” type=””text/html”” src=”””&substitute( VLOOKUP(A2, Videos!A:B,2,0), “watch?v=”,”embed/” )&”?autoplay=0 &fs=0&iv_load_policy=3 &showinfo=0&rel=0 &cc_load_policy=0 &start=0 &end=0″”><div> </iframe></p>”

Which yes, will look crazy, but, it’s a standard Youtube embed code. This then has the vlookup inside it that goes off and grabs the combo’s youtube video, and then replaces the ‘watch?v=’ with /embed/ so that the embed works.

This then gets substituted out in the intro content section.

Re-uploading the data and we get…

Fancy videos!

Well, kind of fancy. Whatever Google ranked for the keyword, that’s what gets included.

Some are better than others.

 

Homepage update

The homepage is still looking a little sad with just a short list of posts.

We need to get this updated with some content.

Possibly links out to the animals with a pretty picture, and a heap of text-centric content to pad it out a bit.

I’d love to set up a bannerbear template for this, and be able to generate a text-over-image thumbnail for each category.

That’s pretty and ideal though, so let’s just go with the functional solution for now.

Also probably don’t need to try and think programmatically about the homepage…. can try something a bit manual.

Bit of text, and some chonky links added in.

Much better.

Well, somewhat better at least anyway.

I LIKE IT ALRIGHT.

 

Additional content sections

Another big part of a programmatic SEO builds’ content is what is essentially content re-use.

This content re-use primarily occurs on search result/aggregation type pages, where you list out and link to the listings/specific content.

Where these links are included, portions of the content are also included, and that pads out the content on these aggregation pages.

You’ll also find this type of content on sites like G2 & Capterra. Each software has its own page, but then they take a chunk of that text and list it out on the ‘best software for x’ type pages.

Think of this sort of content more like a big block of ignored text. It won’t penalise a page, but it might not add direct value when used alone. It needs to be combined with other elements, and is just used to support those other elements.

This build is a bit different, and not at such a scale. Our re-use chunks might be a bit over-used, when looking at the total percentage of pages with a piece of text.

Oh well though, it’s just a concept build anyway.

Let’s see what we can apply to the build!

 

Animal-based content paragraph

We’re going to write up a custom dynamic element for each animal, and try to feed in a food variable where we can, but more so focus purely on the animal itself.

Something like the general nutrition habits of the animal in question. Maybe what they normally eat, and what their bodies require.

I’ve leveraged OpenAI to come up with generic garbage about each animal;

I’ve then going to get this included in the combo page’s outputs by adding an <animalGeneric> into the conclusion content template, and then substituting it out.

Essentially just a pure little dupe content section, but mixed in with the rest of the content it shouldn’t be tooooo bad, we’ll see.

Might update it to add a dynamic element to it later, or split it into 2 sections and include them in separate parts of the page content.

 

Food-based content paragraph

We’re going to do a similar thing to what we did with the animals, and add a section purely about the food.

Probably about their nutrition value, but will really just use whatever OpenAI spits out for us.

Created a list of all the foods, ran them through OpenAI with a little prompt, and received a heap of text;

I should edit it, but nah. Let’s just roll with that. Might get an editor to just tweak it later.

 

Leveraging these additional sections

I’ll throw both of these new sections in the footer section of the combo pages.

They’ll get included in the content template as a variable, and will then be substituted in the content builder.

It will not include a templated content section, the generic section about the animal, the 4th image, and then a generic section about the food.

Two columns are then added into the content builder sheet.

These do a vlookup to the generic content templates sheet, with one matching the animal and one matching the food, and will return the text element if it exists.

They’re now added to the conclusion content section, with the substitutions in place.

 

Stripped pagination

Finally stripped out the pagination on the category pages, by editing the raw code of the theme.

This will probably get overridden when the theme updates, so I’ll need to get a child theme going, but I can sort that out later.

I tried to also strip out the post inclusion on archive pages, but the content still shows so I am not sure what’s up there.

Reduced the item count to 1 for archive pages though, which is the lowest it’ll go.

Just a single item is included now, with no pagination below it. Better than before so let’s go with that.

The intent behind this is to de-bloggify the site a bit, and to also ‘silo’ it a bit more by containing the links as much as possible.

We’ve got a decent internal linking setup within the content itself, so that is our core interlinking.

If we need more links, that’s where we will try and get them in.

Might even try fully silo it later too by cleaning up the templated header/sidebar links a bit.

 

Wrapping it up

We’ve now added some new content to each page, investigated an alternate table linking option, along with creating some shiny new pages.

On top of all this, I discovered a silly mistake in one of my linking codes.

I wasn’t substituting the space in multi-word animals & fruits, in the link codes and imagery names.

This led to 404 pages being crawlable, and a lack of imagery on heaps of pages due to an incorrect lookup.

Rookie mistake.

In the words of a wise man, DOH!

Taking a look at current performance though, we can see a few things.

Crawling is spiking, but a few days delayed so that spike is a week old now.

Google’s started indexing it all, slowly but surely.

16 in mobile usability report. I’ll go into why I look at this in a separate post at some point.

Impressions are starting to kick off now, because….

A heap of keywords are hitting the board. 35 pages with impressions in gsc. Decent little start.

Oh, and I added another 150 pages.

Stay tuned for Part 6, and see what we can do next.

Current Page Count: 320 pages

Programmatic SEO: Tracking & Performance Monitoring

Programmatic SEO: Tracking & Performance Monitoring

So you’ve got a shiny new programmatic build, and want to know how it’s performing.

Monitoring indexation, traffic, conversions, rankings, there’s so much to look at so what should you focus on?

Let’s take a look into a few things you should be monitoring, and how to go about it.

Key metrics that can gauge performance

The key metrics you should be monitoring will change through the life cycle of your build.

A build that has just been launched, should have more emphasis on tracking indexation, rankings, and even impressions.

For existing builds, or as a build gets older, more emphasis should be placed on two other key metrics. Conversions & clicks.

Everything else is still important, particularly for understanding what modifications could be made to improve performance, but if your setup isn’t driving converting traffic, then what’s the point?

Unless you’re a build that’s purely advertisement reliant, of course.

 

Indexed Pages

The overall count of pages indexed. This can be monitored via the coverage report, and XML sitemaps.

 

Ranked Pages

Kind of an extension of indexed pages, ranked pages is the count of pages ranking in the top 100 for the most related keyword possible. You load up a rank tracker with keywords exactly matching specific pages, and then monitor them showing up in the top 100. It’s great for ensuring that Google is actually ranking the pages, albeit low to begin with, rather than just reporting they’re indexed.

 

Ranking Performance

Most will just use average rank for rank monitoring. Whilst it’s great, it’s unweighted. Keywords with 10 volume will have the same value as keywords with 10,000 volume. I prefer to use ‘estimated traffic‘, which takes into account both the ranking, and the search volume for a keyword, giving a much stronger indicator of ranking performance.

 

Clicks & Impressions

Pretty obvious here, but clicks & impressions are a clear indicator of performance. Can be broken down by keyword level (albeit filtered data) and URL level, it can give us quite a bit of drilled-in performance data.

 

Conversions

Last, but certainly not least, conversions. Provided you’re not an ads-driven programmatic build, conversions should be the key of all key metrics. It’s the most direct indicator of the actual performance of the traffic. If it’s good traffic, it’ll convert. Bad traffic, won’t. Even a badly optimised conversion funnel will get traffic into the funnel. These conversions can be broken down into different levels, from funnel entry to funnel exist, ie add to cart verse purchase, to get even better data.

 

Breaking the metrics down

When analysing the performance of a programmatic build, you can’t look at it at a page level.

You’ll spend hours upon hours looking through the data.

You need to analyse it at the category/site section level. Grouped performance data essentially.

Look at the data for sets of the pages at once.

You could break the data out by category, sub-category, location, or even analyses like ‘location pages verse generic pages’.

Any way you feel it could be analysed that returns a smaller amount of overall groups.

 

Analyse keyword data by categories

You should group your keyword data into categories and view your performance at the category level. Aggregate all the performance data, and view key metrics like clicks, impressions, and estimated traffic performance by category.

 

Analyse traffic and performance data by site sections

Just like you group keywords into categories, you can use the same setup to categorise your landing pages into site sections.

A quick way to analyse site section performance from GSC data is to also group the landing pages into site sections in data studio.

Another trick here is to individually verify GSC properties of each site section. If you do that, you’ll get fully granular data for each site section, in its own property. Everything will be filtered to the site section.

 

Tracking a programmatic build

On top of the standard conversions/clicks, there are a few extra ways I look at performance, especially for a programmatic build.

As long as the GSC is verified before launch, analytics setup with conversions, and you have your rank tracking set up, you’re golden for launch. Everything else can be set up based on the historic data.

There are a few things you can start to look at after a launch, with some of the things I look at being below.

 

Identify top keyword templates and generate tracking keywords

You might be wanting to track thousands of keywords for the overall project, which is great, but sometimes it’s a bit of data overload.

Focussing on a small subset to gauge overall performance can really help you get a quick analysis done.

I recommend identifying your top keyword templates and then generating a subset of keywords for each main section. I personally track between 100 & 500 keywords per site section/page type.

So for the ‘buy’ channel of real estate, you’d have;

real estate for sale <location>
properties for sale <location>

and then ‘rent’ might have;

rental properties <location>
properties for rent <location>

Using these templates, pick the top 100 locations, or whatever the variable is, and generate the keywords. You could use merge words, or follow my guide here on bulk keyword generation.

Load these up in a rank tracker, in either different projects, or tagged separately, so that you can quickly view the overall group performance.

I use serprobot, and would set them up like;

Domain.com – Buy Keywords
Domain.com – Rent Keywords

So that I can view averages just on the dashboard, and quick view performance for that whole category.

You might also track a subset of generic verse property type keywords. Generic would be;

property for sale <location>

but then property types might be;

houses for sale <location>
apartments for sale <location>

And you could do a similar thing with generic vs location keywords.

Just follow the same process as above, and generate the keywords and then load these up in the tracking.

 

Monitor your indexation

During the initial phases of a build, monitoring indexation is a great way of understanding how Google is initially reacting to it.

Indexing is the first sign of happiness.

An indexed page means it’s passed the initial checks by Google, and they at least, somewhat, found enough value in the page to index it.

If you’ve never worked on a large-scale programmatic build, the amount of pages that don’t get passed the discovered/crawled stages sometimes is incredible. So passing this stage is a great first step.

You can do this through the coverage report at the top level, and by site section if you’ve verified the individual site sections in their own GSC property.

Another good method is creating site-section-based XML sitemaps. These can then be clicked in the sitemap report, and you can view the coverage report based on just the XML sitemap URLs.

The final method of initial is by keeping an eye on the keywords with a URL ranking in the top 100. These should generally be the most related page to the query, so just watching the quantities of keywords with a URL ranking, as a per cent, can give a good indication here too. It might be a bit more performance/quality related than direct indexing, but everyone knows it gets pretty junky after a few pages anyway, so even a shockingly optimised page could pop the top 100!

 

Count of URLs with impressions

Not so much a key metric, but another great view of performance, is monitoring the overall count of URLs with an impression.

A more ‘advanced’ method of monitoring indexation, keeping an eye on the count of URLs with impressions is a great way to monitor the overall performance growth across the entire build.

Somewhat useful at the start, and particularly great for self-expanding programmatic builds, this metric will help you confirm whether the wider system is driving the performance, or whether it’s a small subset of URLs that are performing the best.

It’s great for ongoing monitoring, especially when you start to break it down by sections and page types.

Analyse the per cent of live pages that are actually generating impressions.

 

So many metrics, so little time

Keeping what you’re tracking to a minimum, will help you avoid data overload.

Think about what metrics you can’t backdate, like ranking information, and ensure they’re all setup just in case you need them one day.

You never know what you’re gonna need, and when.

Programmatic SEO: Integrating Blog Content With Programmatic

Programmatic SEO: Integrating Blog Content With Programmatic

Every website tackling SEO is running a blog.

Blogs & SEO go hand-in-hand.

For programmatic SEO, blog posts enable the ability to target specific keyword sets that you just can’t efficiently target in bulk.

They help fill the gap.

How can you best leverage this content though?

How blog content works for most sites

The majority of websites with a blog will throw all their content under /blog/ or just single folders for each content type.

So you might see /advice/, /guides/ and /news/ commonly floating about.

All the blog content sits within these folders, and only in these folders.

You’ll find links to the posts around the site, but all readers once they click will be sent to that static blog folder.

These posts will then link out throughout the site, when it suits, mostly via manual contextual links.

 

How to improve blog content handling

Having blog posts set up like every other site still holds significant value, and is by no means wrong.

It’s the easiest method of getting gap-filling content live, so you can begin ranking for and driving traffic from additional keywords.

However, you can squeeze a bit more value out of this blog content.

Not by modifying the content itself, but by where the content is used on the website.

By relocating your content to sit within your programmatic directories, you can improve its effectiveness at passing value into those directories.

Rather than just a couple of links from the post into your programmatic setup, you’re shifting the entire value into the setup.

All value your content then generates, is directly passed into the directories where it matters, rather than to your domain, and then back down.

Not only improving the strength within these directories, but also helping aggregate all related content together for Google in the hope of being seen as covering the topic more thoroughly.

 

How this can be achieved

There are a few ways to integrate this, and you don’t have to give up all those lovely WordPress features either!

Definitely don’t need a full custom blog set up to make this happen/

 

Headless CMS via WordPress API or similar

WordPress offers an API that allows you to de-couple it from itself.

You can have the admin work as per normal, and then the front end be built however you like it, without all the WordPress bloat!

The developers need to build off the WP API, and can essentially leverage the blog content however they like.

It would work exactly like any other content type you have, so they get full access to it all.

No funky tricks to make things work.

Built it how you want it.

Sometimes devs will request something in the URL to be able to determine the content requested is blog content.

It’s the easiest way to integrate this, but it shouldn’t be necessary.

There are ways to make something like this work, even if it’s maintaining a list of all possible blog posts that are available, and thus then knowing all their slugs and not needing a key anymore.

You could ensure blog posts only get included in these category setups once a week, and then every week you recreate the list of all posts available.

Another possible way is it you know what variable values you’ll have within the programmatic structure, anything left could be deemed to be a blog post, and if not, it could then 404.

You might just need to heavily cache the blog content as it could be a bit of a load on the server on the initial load. But hey, it’s blog content. Not exactly requiring live updates here or anything.

There’s always a way if ideas are bounced around with the team enough.

 

Reverse proxies with Nginx

This is the only way that I know, that allows for integration between CMS platforms.

Essentially the setup just says that say /buy/ points at one system, and /buy/guides/ points at another.

It requires a unique key in the URL to be able to distinguish the two systems though.

Something like /blog/, /advice/ or /guides/ in the URL of every piece of content, whether it be categories or blog content.

 

Examples of content integration

 

Airtasker

Airtasker has now fully integrated their blog content into their structure, by using a

They have their main categories of landing pages like Cleaning, like this;

airtasker.com/cleaning/

But then they’ve managed to pull in their related blog content, under the main /cleaning/ structure like this;

airtasker.com/cleaning/guides/how-to-clean-home-after-flood/

They’re using the /guides/ as a key, to then be able to distinguish it as blog content.

They even go another layer deeper, and sit content behind their sub-categories too;

airtasker.com/tradesman/awning/guides/how-to-diy-awning/

Gets a bit deep into the structure, but that’s purely because of the deadweight of /guides/ being included.

 

Other examples coming soon…

 

Is it worth the work?

It’s really hard to say.

Comes down to how hard this is actually going to be to integrate for you, so somewhat the skill of your development teams.

If they can efficiently build this, then most definitely.

If it would be a prolonged build-out, and be deemed as not “simple” to maintain, then no, it probably wouldn’t be worth it.

I’d also try and get a second opinion if someone says it’s too hard.

Personally, if I can get this through I will certainly try. I know it will help.

Just Another WordPress Programmatic SEO Build – Part 4

Just Another WordPress Programmatic SEO Build – Part 4

Time for part 4. If you haven’t already, make sure to check out this WordPress programmatic build part 1, part 2, and part 3.

Today, we’re spending the entire part on a single topic.

Imagery.

Like a portion of the last part regarding the AI-generated text, this probably won’t actually apply to you.

You’ll probably have your own imagery to use, or at least, acquire proper images to use.

However, this one is fun. Promise.

Stick with it, and join me as we tinker on the edge of the dark side.

Text-to-Image Generation

Not to be confused with AI Image Generation, which I’ll get into shortly, text to image generation is exactly that.

Generating an image from text.

Yes, I know. Its how AI image generation works.

But this is literally taking the text and putting it in an image.

You build out a template, throw it some text and any images you want included, and an API will spit out a shiny new image.

Normally used for social media banners, programmatic SEO builds have started to leverage this to generate unique imagery related to their content in a less-than-greyhat way.

However, it’s not something I have actually leveraged.

Been on my to-do list for quite a while!

Time to get it going.

 

Picking a text to image generator

Yes, you can either code this yourself, modify one of the available code snippets out there, or even just buy one! There’s probably a plugin available.

But I’d rather find a good SAAS that does this well for the project.

After doing a bit of research, I’ve decided upon Bannerbear, and their image generation API.

Figured it also suited the whole animal theme.

They’ve got a nice, pre-formatted demo on the homepage that gives you some options.

These options are then spitting out;

A pretty neat way of generating a custom featured image for a post.

They’ve actually got a cool tutorial here on connecting WordPress with Bannerbear to get something like this going for OpenGraph imagery.

The query string method has been updated, to an apparently ‘simpler’ method, but let’s take a look at the whole process.

An alternate, and a bit cheaper, could be using placid. Seems they have some good integration methods too.

 

Creating a template

Once you’re signed up, you can start by creating a new template. The template will be the shell of what you’re generating, and allow you to hot-swap the different items out.

They offer quite a few templates, so I picked one that suited me and modified it as required.

I’ve included 4 images on the banner, along with the logo and a couple of text elements.

The 4 images will be swapped out, along with the title on the graphic.

Half jumping forward a bit with what the 4 images will be, but you’ll find out in the rest of the post.

As for the title, that will be the combo name.

 

Generating an image via URL

Following the tutorial here, to change the title and generate an image we get a URL of;

https://ondemand. bannerbear.com/simpleurl/ <base>/image/title/text/ Can+Monkeys+eat+ Bananas?/

After about 5 seconds it spits out;

Lost my question mark though… I’ll work out how to get that back a bit later.

Now, the best part about this?

You can bulk generate the images with Excel!

Well, you can bulk generate them a tonne of different ways but I already have this bulk image downloader & renamer that I use.

This works perfectly. I wasn’t expecting that at all.

Make sure you follow the tutorial in the post above on how to set it up, as you just need to edit a single bit in the code to tell it where to save images.

After that, just fill in the image name (no extension), the generation URL, along with what’s being replaced.

So for my example above, it’s just going to replace the question text.

Unfortunately, you’re rate-limited at 10 requests per 10 seconds, which yes, you could hit.

We have to modify the Excel script by adding some lines. When you’re in the editor modifying the file name (see in guide linked earlier) you need to look for;

Next i
End Sub

Insert 2 lines above that, and add before it;

‘~~> Pause for 2 seconds between requests
Application.Wait (Now + TimeValue(“0:00:02”))

The code will look like this;

It will add a 2-second pause before each request.

However, I just realised that Bannerbear requires the darn $150/month plan to do this simple URL generation. You get 10,000 images rather than the 1,000 I was going to play with, but I’m not upgrading for that.

New method time.

 

Generating an image via Airtable

They offer an Airtable generation set-up too (tutorial here) where you just throw all the fields in an Airtable and then it will build everything for you… proper no code!

You just map the table headings to your field names, insert your data, and then add your Airtable API key to Bannerbear.

Then when setting up the import, you insert your Base ID and table name, and then Bannerbear gets full access.

Here’s my setup before import:

Bannerbear will then take it all, run through them all, and then spit out the image URLs in the final column!

Pretty cool.

I’ll use the Excel image downloader on these URLs now so that I can properly rename the images, but generating them this way was faster and easier anyway!

I also got to keep my question mark.

I’ll sort all of that after I sort the images that will actually go into these, and the pages.

Now the fun bit.

 

AI Image Generation

Dall-e 2

Have you heard of Dall-e? And in particular, dall-e 2?

Dall-e takes a text prompt, and turns it into an image.

Their homepage demo gave me an option of a prompt of;

Teddy bears working on new Ai research on the moon in the 1980s

Which outputs an image of;

 

Pretty cool.

Can you see where I’m going with this?

Unfortunately, Dalle 2 is in super limited beta access.

Even if you get access, you can ‘only’ do 50 images a day.

 

Midjourney

Another ‘top tier’ option is Midjourney.

They’ve recently entered beta, and work via discord where you make your request and it’ll build the image in front of you. You can join the beta here.

When you run a generation, you get an initial set of 4 images.

You can then pick one, and get 4 more, or refine a specific one.

I’m going to try and get 3 new versions of the second image.

Image

Lets now upscale the first one, and see what it spits out.

After it spends a minute generating the image, we get this thing of pure beauty.

Image

It’s definitely more ‘artsy’ than what Dalle-2 is, but it could certainly work in the context of a ‘dogs eating grapes’ page.

 

Dall-e Mini

There’s an alternative called dall-e mini, which is basically just ‘simpler’ version of Dalle-2.

A very very simple version.

Let’s throw in the same prompt as what dall-e 2 gave me and check the result;

I mean, it’s so close haha.

However, it’s free and easily accessible.

So it is a massive step up from anything else we can have access in bulk to at the moment.

Let’s take a look at an example that directly relates to our topic.

Just for comparison, here’s the exact same output from a dall-e 2 beta account…

You can heavily tweak the prompts though.

There is some great info starting to popup, about throwing different elements on the end.

dog eating a banana cartoon

 

dog eating a banana sketch

 

dog eating a banana, studio lighting

 

I ran through a heap of options, trying to work out what would work best before finding something I liked.

Yeah, it’s not amazing, but it’ll do.

 

Bulk generating imagery

Now, in case the plan didn’t click, I’m going to attempt to bulk generate images for all the combo pages.

Dall-e Mini will throw out 9 images per request so we’re gonna hook into that.

There are mentions of a dall-e mega trained model, which would be a bit more ‘refined’ than the images we’re seeing above.

However, that takes a bit more work to get going. So for the initial build, we’re going to aim for just grabbing this output and throwing imagery in an S3 bucket.

I’ve got one of the developers I’ve used previously to build something for me that’ll do this. Might look at sharing it later, won’t for now though.

I know I said all in Google Sheets… but… you obviously would never, ever, ever, need to do this sort of thing.

You’d have completely original imagery and would never use AI to generate them.

So it doesn’t count. This is just me filling in the gaps to better show you how to leverage your own content.

Safe to say this is my new favourite thing.

This is essentially a folder full of folders. Each folder is for a specific animal & food combo.

Each of these combo folders, then contains the 9 images that were generated by my prompt.

HOW COOL.

I mean, look at these little fellas just enjoying life chilling with some strawberries.

Each request takes 2-3 minutes, so let’s go with 3 minutes. If we have 1,000 combos to generate imagery for, we’re looking at 50 hours to generate 9 images for each combo.

That’s not too shabby for generating 9,000 somewhat-usable images.

Probably won’t need all 9 for each post, and I certainly don’t want to go through 1,000 folders of imagery to manually select the best.

We need to prioritise the imagery somehow.

 

Google Cloud Vision AI

Google used to have something where you could put an image up, and it would tell you what it thinks it was.

Well, I eventually managed to find it again.

Google Cloud Vision AI

You can throw the images in there, and it’ll output what it thinks it is;

Both the dog and the banana are clearly recognised.

Unfortunately, however, if we click over to the ‘safe search’ tab, we can see that Google is determining it as ‘very likely’ of being ‘spoof’.

Now I can’t find much quick info as to what this exactly means, but seems to just be their fake image detection.

Google knows it’s faked.

Massive sad face.

But then we look at this outputted image…

This one is marked as ‘likely’ now.

If we generate some others, we can then get it too;

So now it’s only flagged as ‘possible’.

Maybe if we generate enough, it can get even lower.

After testing a heap more, I couldn’t seem to get dogs & bananas any lower.

Something about bananas doesn’t seem to work properly with AI haha.

However, this potato & cat image managed to get this;

So it’s clear you can fool the vision AI detection API with a potato butt cat.

Obviously, there’s a chance Google is using something completely separate in their algo.

There’s also a chance they’re not.

If people swear by the NLP API to optimise text, why can’t we leverage it to choose our imagery?

 

Image prioritisation with Vision AI

How can we use all this to pick the best x images?

Knowing that we can get at least a portion of images passed the auto-detection, we could filter our imagery and give the images a score based on whether the animal & food are detected.

Another score could be based on the level of spoof the image has come back with.

Mocked up a quick Google sheet template, and will have to also throw this over to the dev to handle.

I’ve then thrown together a quick quality scoring system.

If the animal is detected, the image gets a +2.

If the food is detected, the image gets a +2

Then a score depending on the spoof level.

The scores are just all added up, and I get a sortable score.

Once sorted, there is a ‘sortID’ column which is the following formula;

=IF(D2=D1,F1+1,1)

All it does is count up from 1, when the combo is the same.

Then there’s a formula in useImage which just tells me if I should use the image or not.

=IF(F2<5,“Image”&F2,)

If the sortId is less than 5, then put Image<sortId>. So, Image1.

When it’s time to use these images I know there will be exactly 4 images flagged, and their names will always be Image 1-4, and not the source image number, to avoid confusion.

The dev will just do a mass scan with the cloud API and then throw me back a CSV.

Another thing that I’ll have to rely on a dev for, but another super quick thing to do. You could do it yourself too if you want, but I’d rather just hand it off for a few $.

While that happens though, the image scraper has been working in the background and we’ve got images for 500 pages ready to rock and roll.

 

Embedding imagery in the posts

Now that we’ve got our images, we need to be able to get them into our post data, and be able to embed them.

I’ve added the image folder to the Image column, which contains ‘Image1’ up to 4.

This way, we can now vlookup from the main datasheet, 4 separate times. Once for each image.

Created 4 different columns on the datasheet.

You’ll see some blank spaces, the images errored out.

So basically, if there’s no image there will be no image link.

Exactly what we need it to do.

Now we need to take these, and build out the HTML code.

I could build the HTML code in the actual content for the page, but that’ll be a bit dirty and won’t help to troubleshoot.

Rather, I’ll add yet another 4 columns. One for each HTML of the image.

This way, I can just slap the cell reference into the content, and do any tweaks separately.

We can’t do this all as one, because the images won’t be a big slab of imagery. Well, that’s not the aim anyway.

We will want to insert the imagery separately, so one at the top, one at the bottom, and then two in the guts.

Let’s start with a formula to check if the link exists, and then will just use the link if it does, and keep the cell blank if it does’nt.

=IF(len(T2)>0,T2,)

We can then modify it with a standard HTML link code, to dynamically insert the image URL from the image1 cell, along with create an alt text that combined the animal and the food.

=IF(len(T2)>0,”<img src=”””&T2&””” alt=”””&proper(D2)&” with “&proper(E2)&”””>”,)

Be sure to use 3 quotations so that they stay in the outputted text. Since sheets is looking at the quotations as the beginning and end of text, it will error out unless you do the 2 for the inclusion of one, and then a third to finalise the text portion and start the formula portion.

Duplicated it across the other 3 columns, and mixed up the alt text a little bit in each of them too.

Now we gotta stick them in the content.

I’m going to leverage the content templates we built out for this one, and just insert <image1> wherever I want to put the first image.

It can go after the first paragraph, but before the question answer.

Image 2 can then go after the first answer.

It didn’t end up quite as clean as I was hoping, but I have inserted the 4 variables into the text templates.

Have then done substitutions on the content sheet to swap out the variable for the HTML from the data sheet.

Probably should have done that HTML code on the content builder to save the vlookup, but oh well, it’s done now.

 

Uploading the new data with images

I’m not exactly sure whether this will work with the HTML code out the box, so rather than toying around I’m just gonna give it a crack.

Going through the import there’s a step in WP Import/Export that allows for the importing of imagery;

Not exactly sure how this works though, so I am just going to continue the import of the HTML with the S3 references, and maybe we can update so that it actually imports new imagery a bit later.

That’d be cool though.

The upload took a little longer than before…

Almost 4 minutes compared to the normal 30 seconds. Bit more to process now I guess.

But, IT WORKS!

Beautiful.

Except due to the image quality they’re pretty small, and left aligned.

Let’s update the HTML code to centre them, and make them a bit bigger.

I’ll make them 400×400, about double size. The new formula is;

=IF(len(T2)>0,” <img class=””aligncenter”” src=”””&T2&””” alt=”””&proper(D2)&” with “&proper(E2)&””” width= “”400″” height=””400″” />”,)

Remember those double quotes to ensure they’re included in the output.

Now to reupload it all and check it.

Looks much better now.

Oh, and bonus!

It actually imported the imagery from the HTML, didn’t have to do anything else. Thought I was going to need to specify the URLs separately or something.

Bloody ripper!

 

 

Including the text-to-image featured image

Now that we’ve sorted out images for the pages, we need to sort the featured graphic we built out in banner bear.

I loaded up the 4 images per page into the Airtable we made, then jumped into BannerBear to run the generation

500 images were generated in about 5 minutes.

Well, there goes half my quota!

Completely worth it though.

How about those crazy cat eyes though!

Demons.

Now we gotta get them thrown into our spreadsheet, so I’ll just add yet another tab that can keep track of all of them.

I was thinking that maybe we could just import from the bannerbear URL.

Unfortunately, though, all the images will have a jibber jabber name of random characters. More like an ID than an image name.

I could be lazy and just use that, but let’s do this right and give them a proper page name.

I formatted the titles to turn it into a file name (without extension) so we can use the excel image downloader.

We can throw these names into the downloader with the bannerbear URL.

And then just run it… and wait. It’ll be a while.

The images all come out at like 600KB though, so I need to try and compress and/or resize them.

I’ve just come across this image optimisation tool.

Free, and you can just upload a tonne of images and then it’ll optimise and you download.

I resized all the images to 800px wide, rather than 1200, which halved their file size.

The tool downloaded them as pngs though, so I then converted them to JPG and it dropped them by 70%, and they’re all pretty much under 100KB now.

That’ll work perfectly.

Not amazing quality, but definitely plenty for what we need at the moment.

Threw them all in S3, and now we just need to reference them in our data.

Knowing the folder where the images is, along with the name, we piece a URL together by adding the folder along with a .jpg extension on the end.

Now just need to vlookup this from the main data sheet and we’re set.

Time to actually use these images somehow.

Figured I’d try put it as the featured image to test on one of the posts.

Works perfectly right out of the box, so we will just go with that.

The WP Import/Export plugin doesn’t have a separate field for the featured image, but there’s a box you can tick that sets the first image as the featured image.

Sounds dangerous if there isn’t a banner bear image available, but seems like it’ll sort it for us provided we enter the featured image URL in the image URLs box.

Continuing with the import we just select it all and wait for the site to download from S3.

Worked like a charm.

Image quality is pretty shotty though so I might try to step the compression back a bit if we can.

Only 62KB now,  so it’s pretty zoomy though!

 

Meet Caroline

Caroline loves animals!

Leveraging fake name generator for the name, and this person does not exist for the face, we made an author!

That’ll suit nicely for now. Definitely some solid credibility right there.

More so to look real to users, than Google, for this one though.

 

That’s a wrap for now

I’ll call it there for the end of part 4. We’ve done a lot!

No new data was added as Google is slow to kick the crawling off, so the 160 pages will be enough.

We’ll probably do another batch of pages in about a week.

Those 150 pages now have 4 in-content images, along with a featured image set!

Well, most of them. Few image generation issues so there is a handful (~10) missing them.

I’ve also released this dynamic content generation Google Sheets template, which is a basic version of what I am using in this project.

What’s next? Find out in part 5 here!

 

Current Page Count: 160

Dynamic Content Generation with Google Sheets

Dynamic Content Generation with Google Sheets

If you haven’t been following along, I’ve been documenting my journey to a 1,000+ page programmatic site in WordPress using Google sheets.

The build is heavily reliant on Google Sheets to generate dynamic content based on templates.

The content there is a little more advanced, as it is leveraging rules to determine which content template is used.

I thought I’d take it back to basics, and create a dynamic content generator that was a bit simpler.

A straight-up data-to-content text replacement.

 

Dynamically generating text in Google Sheets

 

The data set

The generator starts with your data set.

Each column represents a different piece of data.

You wouldn’t need to use every single of pieces of data you put in the spreadsheet, however, you will need to ensure any data point you do want to use is there.

Each of the columns is a variable name, and that will be taken by the generator and replaced out.

 

The dynamic text template

On the second sheet, is a cell where you write out your text template.

You just write out the text you’d like to include, in the format you want it, and include the variables wherever you’d like them.

Next to the text template is a list of all your variables.

These are just a list of every header name from the first sheet, and it’s a great list to help you remember what you have to work with, rather than needing to flick back and forth between your data set.

Since it’s formula driven, you can’t copy/paste the variables. However, if you’re not adding more columns to the dataset you could paste the raw data so that you could just copy them in.

 

The content generation formula

You’ll find the actual formula that does the replacement on the main generator sheet.

The formula might look a little daunting, but it’s just a large nested substitute.

Each heading has “<” and “>” added on either side to convert it to something to use in the text templates.

The formula will then take these variables, and substitute them for the value in that column.

It will then repeat the process.

To add a new variable in;

1. Insert a column before the contentOutput and fill in your data.

2. Add an additional SUBSITUTE( at the front of the list

3. Copy the data after the 2nd to last bracket, and paste it after the last bracket

4. Modify the cell references to instead reference your new column

So if you inserted one into the current template, the formula would go from;

=SUBSTITUTE(SUBSTITUTE( SUBSTITUTE(SUBSTITUTE( SUBSTITUTE(SUBSTITUTE( SUBSTITUTE(Template!$A$2, “<“&$C$1&”>”,$C2), “<“&$D$1&”>”,$D2) ,”<“&$E$1&”>”,$E2), “<“&$F$1&”>”,$F2), “<“&$G$1&”>”,$G2), “<“&$H$1&”>”,$H2), “<“&$I$1&”>”,$I2)

To become;

=SUBSTITUTE(SUBSTITUTE( SUBSTITUTE(SUBSTITUTE( SUBSTITUTE(SUBSTITUTE( SUBSTITUTE(SUBSTITUTE( Template!$A$2, “<“&$C$1&”>”, $C2),”<“&$D$1&”>”,$D2), “<“&$E$1&”>”,$E2), “<“&$F$1&”>”,$F2), “<“&$G$1&”>”,$G2), “<“&$H$1&”>”,$H2), “<“&$I$1&”>”,$I2) ,”<“&$J$1&”>”,$J2)

Note the blue for the new modifications.

Since the new column of data was J, we copied the end and then changed the I to a J.

 

Content template preview

A preview column has also been added to the text template tab.

Once your generation formula is set up with all your variables, you will be able to see a live generation example as you edit your template.

This will allow you to see exactly what the text template looks like when you substitute some of your real data into it.

The example will randomly select from all of your examples every time you update the cell.

A great way to ensure you’re accounting for different outputs in your template, as you craft it.

 

Access the Google Sheet

You can access the dynamic content generation Google Sheet at the below link.

Let me know if you have any questions!

Feel free to leave a comment, I’d be happy to help out.

 

The SEO First, Users Second, Approach

The SEO First, Users Second, Approach

The words that designers hate.

These words give anyone involved in UX nightmares.

SEO first, users second.

SHOCK. AWE.

Oh, the humanity!

But that is exactly how I approach a fresh programmatic build.

That is how I have helped build out multiple successful programmatic builds.

Even though many SEOs will hate this strategy, it is what I will continue recommending for clients tackling a new, large-scale setup.

Why?

It works.

The SEO first, users second, approach

An SEO first approach does exactly that, it puts SEO ahead of the user experience.

It’s not saying to ignore the UX of the site, by no means.

It’s just making sure that anything being built for an MVP (minimum viable product) has SEO front of mind.

If a feature is being considered there’s one simple question that is asked.

“Is this primarily for SEO, or for users?”

If it’s primarily for SEO, great – it’s in!

If it’s primarily for the users, then it’s moved down the task list a little bit.

As time goes on, more focus can be placed on the users, that the system is now generating.

A product could be launched that’s 80% SEO and 20% for users.

2 months later, it’s 70/30. 4-6 months down the line, 50/50.

As the traffic grows, that traffic starts to be optimised for.

Similar to a chicken/egg scenario, except, what’s the point in primarily focussing on users, when they don’t exist yet?

 

Why I recommend this strategy

Development time is precious.

Programmatic builds take time to get traction in the market.

Getting the MVP live as soon as possible is critical to give the build time to grow in the market.

Getting the MVP live ASAP lets you further iterate on the build once you’ve enabled the ability to drive traffic.

Focussing on SEO first gets you to market faster, with a product that is “good enough” to start getting indexed, and hopefully ranked.

What’s the point in building a pretty page, with fancy animations & conversion-optimised widgets, if you’ve got no one visiting the page?

 

Avoid the void

A less-than-average user experience also makes sure that the product teams come back to rework the build later.

What’s going to be easier to get across the line.

Reworking a build based on a poor UX for an ever-growing user base?

Or reworking a build based on poor SEO that may or may not actually drive traffic?

I know what one I would rather try to sell into the product team.

You probably won’t even need to mention it. Someone higher up might eventually spot the build and start asking questions forcing more time to be spent on the product. Perfect.

Avoid the void of “that can’t be prioritised” that many SEO projects will fall into.

Get the results, and get more work done, albeit a bit cheekily. Sometimes you’ve just gotta play that game, to get the best result possible.

 

Keeping existing users out of the SEO-first approach

If you’re an existing website, but planning a new programmatic build, how do you avoid the existing userbase entering the build?

Especially when it’s an SEO-first approach, you don’t normally want to push your existing users into it as it has a higher chance of degrading their experience.

After all, you’re focussing on acquisition more than retention.

The main way of doing this is by keeping internal links to a minimum.

Not something you do with most builds, but at the start just limit where you’re linking in from. Prioritise links from ‘less visible’ locations initially.

Link from within widgets a bit further down the page, in places that users aren’t going to actively look at.

Some systems I have worked on have even only had a handful of homepage links pointing in, from towards the bottom of the page.

It’s enough to kick things off!

This will essentially help you dark launch the site, allowing you to start to gain traction whilst you still build the whole experience out.

No point delaying everything when you can get it live much earlier, with minimal risk.

 

Should you be taking an SEO-first approach?

Scenario dependant.

(my new fancy way of saying “it depends”)

Are you a brand new site? Yes.

Are you a brand new programmatic build, tacked onto an existing site, that existing users won’t need? Yes.

Are you a brand new programmatic build, tacked onto an existing site, that users will need (ie an optimised search)? No.

Are you rebuilding an existing programmatic build with an existing large user base that will get migrated over? No.

Plenty of great use cases for an SEO-first approach, but it’s certainly not for everyone.

For many, the users should definitely still come first.

Just Another WordPress Programmatic SEO Build – Part 3

Just Another WordPress Programmatic SEO Build – Part 3

Let’s continue the WordPress programmatic SEO build!

If you haven’t already, check out part 1 of the wordpress programmatic build here, and then part 2 here.

A little pre-note here is that the next couple of posts will take a little side-step. Purely because I don’t have my own content, and want to get something out ASAP.

You can do this infinitely better than I will be, if you’re putting in the time or money to build out your dataset.

We’ll be back on track afterwards.

Time to crack on!

Fixing the design

I couldn’t bear it.

Haven’t played with too much WordPress lately so wasn’t sure what design to go with.

Have just installed Astra free and tweaking it a little bit. Much cleaner. Might upgrade it to paid to get a bit more customisation too but it’s already infinitely better than before.

Managed to clean up some of the more “bloggy” stuff, and have also got breadcrumbs going!

Just need to remove that pesky home button but I can sort that one later.

Also removed comments, cleaned up a couple of things and added a pretty header menu.

And by pretty I mean non-default. It’ll do for now.

 

Will be getting rid of that ‘previous post’, along with the recent post links at some point too. They’ll kill some of the topical groupings by linking to who knows where.

Happy to also say that the change to Astra fixed the main category design problem!

 

Need to work out how to remove all the posts listed below though, as they’re already linked to from within this content.

Unfortunately, the theme killed the subheadings for the page, but that’s liveable for now. Can add some custom CSS to address this later… I hope.

 

Upgrading the content

On top of the continuous data expansion via my VA, I need to update these keyword templates.

Let’s be real, they’re shit.

I still might get a writer involved later, but let’s have a play with an AI tool.

 

Generating some generic template text with an AI tool

My go-to AI content generation tool is WriteSonic – for now.

I’m mostly using it to generate product descriptions with AI in bulk at the moment.

So let’s take a look at what we can do with it this content.

I’m going to use the ‘sentence expander’ option, and just work through the content to see what it can spit out.

Starting with the opener, let’s see what we can extract from a generic point of view.

The goal will be to pad out the dynamic text templates a little bit, and get some more content on the pages.

 

That first option will work perfectly with some tweaks, and I’ll replace animals with the <animal> variable so that it inserts our animal name.

Added a few extra bits of text to some of the templates.

 

Couldn’t find too much generic stuff, so need to work out a good solution for this, and want to avoid providing too many incorrect answers.

It’s all very specific to each animal & food combo, not exactly template-able without just being garbage.

Wonder if there’s a better way we can do this.

 

Working directly with an AI model

AI tools are great for helping you write a piece of content.

They spit out a suggestion, then you can edit it and tweak it, and away you go.

Perfect when you’re handling small quantities of content.

How can this be scaled though?

How can I generate content that actually relates directly to the current page at scale?

Well, you can go directly to the source.

The source that pretty much every AI tool leverages in some way or another.

OpenAI.

Their GPT-3 setup to be more specific.

You can ask it pretty much anything at all, and it could return a full response, or just a simple answer to a question.

Maybe we can just ask it, in bulk, whether an <animal> can eat <food>.

Once registered, you can jump straight into their playground.

I haven’t spent as much time working directly in Open AI & their API, so this project is changing that.

To start, we can just straight up ask it the question.

Yes, a robot in the computer wrote that.

Guess that means the project’s done now, doesn’t it?

Well, I could stop there, and if I was happy to spend the extra money for a VA to build out a quality data set and have a writer build out quality templates, this is definitely where I’d stop.

Alas, I’m not. So let’s take this further.

The initial step was obviously not enough content, so let’s try get that expanded out.

So that’s a great start, and it can generate content for each section we created.

Completely related content.

You can then kindly request two paragraphs of content, instead of a single paragraph by using the “write two paragraphs” prompt.

I am seeing it hit and miss, but seemed to get higher success throwing a ‘separate’ in there.

The ‘please’ is just to make sure the robots don’t come for me when they take over.

That’s 89 words though, from a simple question.

I’m no expert here, and have only just started learning the basic prompts, so feel free to throw anything in the comments that could help out here.

This is using the DaVinci model, which costs $0.06 per 1,000 tokens.

OpenAI says it’s ‘approximately’ 4 characters per token, including both the prompt and the output.

The above prompt was 77 characters and the response was 495 characters, giving us a total of 572 characters.

572/4 = 143 tokens

143 tokens = 143/1000 x 0.06 = $0.00858

That 89 words cost us less than 1 cent.

Let’s say we want to generate 8 sections of similar content for a page, it’ll be $0.069 per page.

To then do this for 1,000 pages, we’re looking at $68.64

1,000 pages of ‘content’, for under $70.

Would I recommend this done for a client build?

Never. I’d stick with the dynamic variable insertion techniques.

Would I do this for a test project where more emphasis is on the process & the overall build and I really just need a tonne of filler content to build out the pages and prove out some concepts whilst still having somewhat shot at being indexed and possibly ranked for longer-tail keywords?

You bet. Game on!

 

Bulk generating the AI text

Let me preface this by saying I have never done this before. Legit.

This is my first time doing any sort of bulk generation of text directly with an AI model, and not using a tool with an interface and a VA to enter the data.

This. Changes. Today.

Ideally, we get this working directly in Google Sheets.

Would mean we don’t need to worry about jumping in and out of sheets, or doing any bulk imports. As new data & combinations is added, we could just run a formula and boom, we’ve have some magic words.

Found a great little gpt script by Andrew Charlton here that’s for titles & meta descriptions, and has the core request.

Input a keyword, and it’ll spit out a title and description. It’ll also cost a little bit due to a rather large amount of sample data hardcoded in the prompt though, so don’t overdo it!

Rather than trying to modify it, a mate shared a mob called Riku.ai.

They’ve been working on a plug-and-play sheets addon, and I’ll be going with that.

I’ve started by building out a little Google sheet to test things, along with estimating the costs of the outputs;

Something I might look at cleaning up later, but this was more of just a way for me to learn how to best write prompts.

I was manually copying and pasting these prompts into the playground to test, and then just paste the output here.

Riku just released the Google Sheets add-on though, and if you’re a paying member (they’ve got a discounted option via app sumo for July) you can find the Google Sheet in their FB group.

Everything runs through your Riku account, and you set up prompts in there which is rather annoying as you need to work in 2 systems to make updates. To get this working how I wanted it I had to create ‘blank’ prompts in Riku, and then just fed in this prompt. So I am essentially just using it for the Google Sheet at the moment.

Loaded it all up, threw in a heap of prompts and this is what happens;

This is magical.

Following this format, I’ve built out a Google Sheet containing all the different prompts I might want to use;

Throwing some of these into the Riku sheet and running the AI over it, we get;

At top level, it looks pretty cool.

I get the exact question format I am after, along with a chonky paragraph.

Problem is, I started to see some issues when looking at the generation in bulk.

Like this one.

As far as my Googling tells me, no, they can, in fact, not.

I know I’ll get a heap of false info for this, but that is why the prompt was tweaked further to become;

So it all comes down to the prompts and some examples of certain expected outputs from the prompt.

I also realised that the prompts I was using around the paragraphs, just weren’t generating anything clean enough.

I figure it’s best to take it back to basics, and just use the dynamic variable system I have already built.

The one that the VA started filling in for me, where we just place some simple text and bulk it out with the generic text.

This is where the true intelligence of OpenAI shines.

The current text system to generate our combo pages accepts inputs of text like the below examples;

So let’s see what happens when I try to request something similar from OpenAI.

Nice summary, but, it doesn’t follow the format.

Well, we can feed the AI some examples of the formats we want it to follow by using these magical stop thingies ##.

Google tells me that’s pretty good.

Turns out the ‘comma-separated’ request might have been a bit confusing, particularly for some of the other prompts.

So instead, I modified it to be ‘a summary’ and the output slightly improved;

This time, it also followed the format in a couple of the examples where there is a full stop mid-sentence.

It followed it PERFECTLY.

Honestly, was not expecting this. Pretty neat.

I’ve fleshed a similar thing out for the other main sections, so that we can run this and get the content.

Unfortunately, I ran into an issue with the Riku Google Sheets.

It wouldn’t allow line breaks sent via the Google Sheets. I stripped them out, and thought I had gotten it to work, but unfortunately it only worked for a few. After that the Google Sheets kept acting like the stops didn’t exist and I had to drop it. It was too time consuming to patch the broken ones.

However, I was sent a direct Google Sheets script that I made work perfectly.

My prompt with examples, line breaks and stop sequences on the left, with the output on the right;

Bingoooooo.

I added a little tweak for some of them, that if the animal couldn’t eat the food then don’t run the request for the prompt.

In this case, it was for the benefit of food.

I don’t think it’s much of a benefit if the animal dies after eating food, and the robots can return some rather interesting stuff, so was best to just ‘ignore’ them for some sections.

It will need some more examples in the future though to account for different possibilities.

I ran this across 150 combo pages to kick things off, and it cost approximately $1.

In total.

$1 for an 80%+ ish, arbitrary, correct data set for all of those responses.

Waheeeeee.

I further tweaked the model to include the text template element that will sit before the text. This way, the robot has something to kick it off and they al nicely fit the template and will read cleanly.

One of the new templates is;

Which, as you can see, outputs exactly what I need. I nice and direct answer.

Yeah, a lot of them are ‘upset stomach’, but hey, that’s what happens with food.

It now works a treat and I’ve added content for them all, for 150 pages.

You’ll probably spot some stuff that might not be correct.

The robots get it wrong sometimes though.

There’s just so much trashy content out there these days that these little guys suck up way too much junk to feed it.

Yes, that food is okay
No, that food is poison
Oh, I feed it to her as a treat
Yeah you should be okay
Only a little bit though

There’s just no solid answer for it sometimes, so it confuses the AI when being asked yes or no.

Sometimes, the two different AI models will give two different answers. The same thing happens just using two different prompts at times.

My initial generations are mostly accurate, but I’d say it’s 80%. I’d like to get that closer to 95%.

For certain yes/no questions, I feel like if it’s a no, there’s a higher chance of the no being correct than a yes.

My plan later will be to run both models, and maybe an extra prompt, and then try and pick the average to give it more chances to be correct.

I’ll also extend out the samples, to try and be a little more accurate. It will cost more, but accuracy will be required, particularly in the yes/no question.

Something I’ll test out later, but I’ve got enough for now, ill probably just add a caveat or something as a warning just in case it actually ranks.

 

Adding an out an ‘about us’ page with AI

We need to now throw up a basic ‘about us’ page that gives it a little bit of a… err.. real feel.

Since we’ve just used a tonne of it already, why not throw a bit more AI in the mix.

Don’t need anything special for now.

Threw in a rather detailed prompt and the playground threw back the below.

It randomly spat out a domain name that is completely open for registration… at the time of posting. Go on. I dare you.

No clue what it did there, but pretty cool!

 

Added a logo

Just some text and an icon I whipped up in 2 seconds.

It needed it.

 

Re-upload the data

And just re-uploading all our data.

Updating the old pages, and adding in all our shiny new data.

150 posts, plus the 10 categories (imported separately), done.

 

Where we’re at now

After the initial build creation, and internal linking setups from the previous posts, we’ve now significantly scaled page creation along with expanded the available content for each page.

The pages are looking pretty text-heavy now though…

Maybe this guy can help us out.

So what does this banana face-hugger cross doggo have to do with the build?

Continue with part 4 here.

 

Current Page Count: 160

Programmatic SEO: Self Scaling Landing Page Systems

Programmatic SEO: Self Scaling Landing Page Systems

What do Booking.com, Homeadvisor, Airtasker, Vivian Health, & ZipRecruiter have in common?

They’ve all experienced massive growth whilst leveraging a self-expanding landing page system.

Coincidence?

Self-scaling landing page systems are at the core of what I recommend for clients for this exact reason.

A system that automatically expands as the business grows.

Essentially as close to automated SEO as you can get.

What are they, and how do they work? Let’s take a look!

What is a self-scaling landing page system?

A self-scaling landing page system is just a programmatic build, with rules-based internal linking.

Sounds fancy, right?

Well, it’s pretty much just any programmatic build that ensures it isn’t linking to pages with 0 results, and has a content type that grows.

It’s that simple.

So, how is it self-scaling?

The internal links make it self-scaling.

If a page isn’t linked to, it doesn’t “exist” to Google.

If your internal links point to all pages with at least 1 result, when a page that didn’t have results gets a result, it will then get linked to.

A whole new page in Google’s eyes, that you just created!

Yet it was “there” the whole time.

So as the business grows, as it gets more content posted/shared on it, new pages get results. These pages will then become a brand new page!

It’s the ultimate way to limit your thin pages, whilst growing your SEO strategy over time.

All self-scaling sites are programmatic builds, but not all programmatic builds are self-scaling.

What’s a ‘result’ in this context?

It’s a piece of content.

For portals, classifieds and marketplaces a piece of content is a listing.

Normally user-generated, it’s the core content of the business essentially.

For other systems, it could be;

  • User-generated comments
  • User-generated photos
  • User-written articles

Basically, anything that’s heavily user-driven as this content thrives in a self-scaling build.

 

How much value does self-scaling add to a build?

As with most SEO, it’s hard to say.

It does, however, have definite impacts on content quality. It ensures minimal thin content pages are ‘live’, whilst still being able to target the widest amount of keywords possible.

You can also naturally ‘grow’ the website, ensuring there are always batches of fresh pages being spun up giving Google some nice crawl bait.

 

Limit page ‘creation’ by content quantities

Rather than ‘creating’ a new page when it has a single piece of content, some programmatic builds will benefit more by setting a minimum quantity of content.

Maybe, say, 5 pieces of content.

This would mean that a page would only become actively linked to, and included in the XML sitemap, once it reached that 5 pieces of content threshold.

You could also launch a build with a rule of 10 pieces of content required.

Then 2 months later drop it to 8.

After another 2 months, drop it to 5.

This would allow you to slowly scale up the website and ensure only the highest quality content is initially created.

Excusing my poor design abilities, an example is below.

When the pages reach the page creation threshold, being 5 pages, they get created.

So as time goes on, as the business gets more pieces of content, more pages get created.

 

Natural page degradation

Sometimes locations, or categories, come in and out of fashion.

Sometimes pages you previously had, were no longer needed.

Just like you set page creation rules you could also set page degradation rules.

When a page gets below a specific quantity of pages you could ‘un-create’ it.

Remove all links pointing and, and remove them from the sitemap.

Sometimes you’ll also have pages that float around the minimum content required, bopping in and out, so how can we deal with them?

If you have 8 pieces of content as the creation trigger, you could then have a link removal rule at say 6 pieces. This would let a page get down as low as 6 pieces of content, before being removed.

That allows pages to get created, and then gives them a buffer before having links pointing in removed.

This will then no longer give Google crawlable links into the page, however, it would still have the page accessible to it and it will know about it, due to being previously available.

An additional rule could fully degrade a page once it drops below a separate content count limit, let’s say 4 pieces.

At 4 pieces, you could have the degradation rule setup that automatically 301 redirects this page to its parent.

Some people may prefer no indexing, but I always recommend the 301 to just clean up these pages, and ensure you can maintain any SEO value they might have had, by passing it through to the most related page.

If a page then ever meets the 8 pieces of content rule, it would have the redirect removed, and links pointing in created again.

Some devs will say “but that redirect is cached and it will create a user-experience issue”. No, no it won’t in 99.99% of cases. The significantly small portion of users that ever visited these pages would probably be long gone. You can also set it to not cache the redirect locally, or cache it for a very small amount of time so that the cache expires quickly, and when they re-hit the page, they’ll get the new page rather than the redirect.

 

An alternate degradation, the ‘pending’ state

Another way of handling this is giving pages a ‘PENDING’ state. Once they drop below the page creation content quantity rule, you flag them.

This flag would last a period of time, say 4 weeks, giving that page that 4 weeks to then re-meet the page creation rule.

If the page doesn’t, it gets degraded. If it does, then the pending state is removed.

If it then drops again, the pending state starts all over again.

This way gives pages specific periods of time to ensure they get the content they need created and could give a page set more natural growth.

 

Teasing new markets with pre-expansion ‘fringe’ pages

There’s a limit to these systems though, and that is that it’s purely dependent on a business having content that matches a landing page, for the landing page to go live.

You can’t build it to grow an audience unless you have related content already.

However, there’s a way you can build in an extra layer to these landing page systems.

You can build out a page template that contains ‘related’ or ‘nearby’ listings, to that of what the page is about.

Let’s call them ‘fringe pages’.

These fringe pages essentially make a page that has 0 content directly related to it, but is close enough that you could include a few keywords to make it seem related.

So they’re on the fringe of having their own content, but just aren’t quite there yet, and you’d like to give them a subtle nudge.

You’ll have a page you can start to hopefully get ranked, drive traffic too, and thus generate actual related content for the page.

You need to keep this under control though.

You can’t just go out and create a page for absolutely everything, that’s not what these are for.

They should only be getting used for pages that you think are directly on the edge of being able to have their own page.

Maybe pages that had a piece of content 6 months ago, but no longer. Or maybe for new categories, where you’ve got a few somewhat related pieces of content that you can use to seed the new categories until you get more content. Maybe even expanding into new areas. Possibly even new countries. Use one country’s content as “examples” until the new country has their own content.

Fringe pages have a somewhat limited use case, and aren’t for all programmatic builds.

But they can be powerful, when used right.

 

Examples of self-scaling programmatic systems

There are a few examples I use when talking about self-scaling.

You can normally tell a website will have a self-scaling part, when there are linking widgets that have weird numbers of links, and that count will change depending on the page.

Maybe one page links to 2 sub-locations, yet another will link to 5. Some will be easier to tell than others, but there are signs there if you look hard enough.

A couple of them I worked directly with the dev teams to build out, but you’ll be able to easily find all the shared info yourself.

 

1. Booking.com

Not a build I’d normally use as a self-expansion example, however, it turns out to be a pretty great one.

I’d be lying if I said I found a good example nice and quickly…. because no. This took way too long to find.

The site went through way too many design changes that it just wasn’t clear what was a design modification, verse a self-expansion.

As booking.com grew its brand, its property counts grew. As the property counters grew, all of their location pages grew.

The only clear case of this I could point out though is in their district linking widgets.

Sources: January 2012, June 2013August 2014, October 2015March 2017, November 2017March 2018

You’ll see the Melbourne Hotels page go from 16 districts being listed in 2012, to 31 districts in 2018. They then changed their design and it was no longer easily trackable.

They also possibly moved to a less ‘auto-expansion’ focussed set up, as the linking widgets included distances to suburbs rather than counts. Unsure if it continued being count driven after this.

They doubled their page count at the district level alone.

This might not seem massive, but scaled across the entire site, this is a pretty great setup for somewhat hands-free targeting of new but highly related pages.

After March 2018 they started to reach the maximum page count growth possible due to, well, being booking.com. They had all the hotels.

This is where they really shifted their focus into the UX and ‘pretty’ things, and started to clean up the linking to include imagery.

 

2. Homeadvisor.com

One of the original builds when it comes to self-scaling as far as I have found, they’ve done really well over the years.

The self-scaling is a bit hidden, and you might not even notice it.

At the top level, let’s compare two categories. Painting, and Permit Services.

The painting top-level category pages link out to a tonne of state pages;

But then if we look at the permit services page, we see significantly fewer state page links, in the same widget.

They just have less permit services-related content, and thus don’t have enough for a heap of the state pages.

If we then compare two of the state pages for the Permit Services, we can see the following linking widgets in the sidebar;

There is a maxed-out widget for Arizona, but Illinois only has a small portion of the cities available, and thus significantly fewer links are showing.

It would be safe to assume, that as the business adds more contractors/jobs in the missing areas, then they would be added to the list and linked to.

If we actually look at the permit services category page over time (via Wayback machine!), we see the following with the state linking widget:

You’ll clearly see the links increase as time goes on. Their landing page system naturally grew with the business.

It then also actually decreases at the end! They have a degradation strategy in place.

One of their degraded pages is still accessible though and has a noindex, nofollow tag on it, so they went with a meta robots degradation strategy.

A pretty old system, yet shows how it could have propelled home advisor to where it is today.

 

3. Airtasker

One of the first self-scaling systems I worked with a development team to build out whilst working for Airtasker as a growth marketer.

I’d say I managed to get the MVP live 90% of the way I wanted, and apart from a redesign and some additions, it doesn’t appear too much has changed for the core system 4 years later. They’ve definitely made updates, but the actual core build appears to work just like day 1.

That is, other than the business getting much much bigger, with much more content being created, and the self-expansion doing its thing driving page counts up and up!

Due to the sheer volume of pages and new targets, you can see it took a little while to get a foothold. Some updates were made, and as time went on, you can really see how the domain has grown in SEO strength over time.

Both in the primary Aussie market, and now internationally as they expand out into other markets.

Unfortunately, the system is using JS for the view more expanders *mutters under breath* so Wayback machine won’t work to show a side-by-side comparison due to most the locations being hidden behind a view more.

However, if we just take two different live pages we can see the difference.

This is comparing the cleaning gold coast (left), to the building gold coast pages (right).

You can clearly see that there are significantly more pages created for the cleaning category, than there are for the building category.

This isn’t saying that the LP system is the key to their growth.

By no means.

It’s just a small portion of a larger pie.

The brand marketing. The endless campaigns. The paid spend. The content marketing.

Aussies got to know what Airtasker was through all of this.

The landing page system was then able to be in the right place, at the right time, to leverage that growth when consumers were searching for related services.

It was able to take what was being done with the company anyway, and help scale that, in a way that improved the marketing reach.

Without the extra marketing, the domain strength growth through brand placements, the content market, the growth would have still been there.

Just significantly less, and much slower.

These systems work hand-in-hand with a solid core-business strategy.

 

4. Vivian Health

Another large system I helped build out, it’s clearly been a solid win for the business.

The team were very receptive to SEO, and were able to take the SEO-first approach to the initial build, with more user-friendly updates made over time.

This helped skyrocket the growth of the pages.

Originally launched under their old name, you can see the immediate growth, and then additional growth post-rebrand & migration.

They are competing in a world dominated by some large online Jobs websites, on top of the ever-growing difficulties of SEO in the health space due to how Google treats the market.

We can see their automated growth at play when comparing the location links for two of the categories;

We can then see another location of the self-scaling within the subcategory links widget on some of the location pages;

 

 

5. ZipRecruiter

Another jobs space portal with a self-scaling setup.

We can see this in play when you compare two of the different categories of apprenticeships available on the site;

ZipRecruiter has a pretty serious linking sidebar setup. If you delve in further, you can really see how vital SEO has been as a part of their overall strategy, so you can really pull some good ideas out of their system.

 

How can a self-scaling setup help you?

A self-scaling setup can really help some businesses, and other businesses not so much.

It really comes down to how often your content expands, and whether that is often enough to be worth focussing on one.

Either way though, any good programmatic system shouldn’t be linking out to pages that don’t have content, so they should all naturally scale as you build it out.

If you’re only linking to pages with content, then naturally, any good programmatic build will be self-scaling by default.

Programmatic SEO: Building a Strong Site Structure

Programmatic SEO: Building a Strong Site Structure

A strong site structure should be at the core of any programmatic SEO build.

How the value of your site flows around the site is crucial to ensure both key pages ranking, and longer tail pages are seen as valuable enough to rank.

Let’s dig into building a strong site structure.

The importance of URL structure in SEO

The URL plays such a key part in defining how pages relate to each other.

Whilst a breadcrumb can also help do this, the URL structure does this stronger and even clearer.

The structure helps Google quickly understand what a page’s parent is, and thus it can understand the relationship between the two.

By default, the fact a page is a child of another page would help it know that the content will probably be extremely related, with a lot of topical cross-over.

The structure also helps pass SEO value around the site effectively.

Ensuring that child pages pass their value up the hierarchy, to their parent page.

Ensuring that a parent can help give their children pages a little boost from their value too.

Topical relation + SEO value passing = a win.

 

Site structuring with breadcrumbs

When building a strong site structure, URL structure is number 1. However, site structuring via breadcrumbs can also come into play.

Not just as a backup, but also as something that can work hand-in-hand with the URL structure.

 

Breadcrumbing as a backup

If all else fails with being able to implement URL structures, breadcrumb structures are your best backup.

They’re pretty much the only other way, to effectively set the parent-child relationship.

The downside here is that they only really work up the chain, rather than down, due to how a breadcrumb works.

 

Breadcrumbing alongside URL structures

Leveraging breadcrumbs on top of a good URL structure is where they stand out.

Not only can you back up the URL structure, by following it up with a breadcrumb structure that imitates the URLs, you can also drop in extra levels where applicable.

Sometimes a piece of content sits across different levels in the URL structure.

You obviously can’t include both in the URL, as that wouldn’t be right,

You can, however, drop the extra level into the breadcrumb structure and let Google know that relation with no issues.

 

How to build a strong site structure

Let’s look at how I’d be recommending you structure your site.

Just remember, there isn’t a ‘right’ or ‘wrong’ answer here, but pays to put in the effort at the start as it’s not exactly something you want to change.

 

Organising your content around a core structure

My primary aim with the structure is to try and put the items with the highest differentiation between them, higher in the list.

This way, you put related content together, and try and separate un-related content as best possible.

For real estate, at the heart of all listings are two channels – Buy & Rent.

So I’d recommend starting here with;

/buy/

/rent/

For the second tier, I would bring in location.

Something like;

/buy/melbourne/

/buy/sydney/

The third tier is when I would then bring in the property type;

/buy/melbourne/apartments/

/buy/melbourne/houses/

But you’d still have the no-location search URLs available, like;

/buy/apartments/

/buy/houses/

The reasoning behind structuring it this way, is that buy verse rent are two completely separate journeys.

It’s a rare occurrence for someone wanting to both at both renting & buying a home to live in.

The next thing they’re looking at is the location. There’s more of a chance they’ll want to look at both apartments & houses in a location, than to look at just apartments in multiple areas at once.

On top of this, you’ll also get property types like ‘apartments’, ‘units’, ‘condos’ and ‘flats’ being seen as extremely related in Google’s eyes. Sometimes even direct synonyms.

Google won’t ever see nearby suburbs in a city that same way.

Well, maybe not never. But very very very rarely.

 

Leverage your blog content

You can dramatically improve the value of your core structure, by leveraging your blog content.

Rather than throwing your blog content under /blog/, /advice/ or something completely random, nestle it under your core structure.

Sometimes harder said than done, even just using the parent folder can dramatically improve the value passed through to that folder, helping it and sub-pages rank.

This would involve post URLs like;

/buy/how-to-buy-a-property/

/rent/the-ultimate-guide-to-renting/

and similar.

If a post relates to the location too, it could be;

/buy/melbourne/how-to-buy-a-property-in-melbourne/

And an even more extreme could be about a location + property type article;

/buy/melbourne/apartment/how-to-buy-an-apartment-in-melbourne/

But that’s getting a bit deep, and would probably be a bit more on the difficult side of an integration.

You’ll always get a bit of tech pushback due to the way the URLs resolve from the server.

It can be done on pretty much any tech, with the work.

There’s also a bit of a shortcut, that would still work wonders.

An old workplace got around to restructuring their blog.

Whether they found an old doc I was working off, or whether one of the newer team got it done, they restructured their blog content to add value to their core structure.

airtasker.com/cleaning/guides/how-to-clean-home-after-flood/

They used the /guides/ subfolder to hack the content in there. The server knows where to grab the content from, purely from that subfolder.

An alternate way of doing this could be to just use ‘/guide-‘ with a dash, rather than creating a whole new directory, but the directory does make things a bit simpler.

They’ve used only their top-tier folder from the structure, and yet have added significant value to all content within the /cleaning/ folder now because of this.

 

Flat verse foldered

A debate that’s raged for years.

Is a flat site architecture better than a built-out, folder structure?

The way I see it, each folder level is similar to a 301 redirect. Probably doesn’t pass all the value through.

Let’s say it’s 80% (completely arbitrary, just to give it a number for context).

3 levels deep, and you’d have the parent sitting at 80% of 80% of 80% being 51.2%.

So wouldn’t the 80% of a single level, be better than 51.2% in a 3?

No, because that 3 tier is so much more targeted. Channel, location, & property type, just to name 3, could be included in a real estate portal.

The value gained from the 3-tier, nestled structure is more than what is lost through the levels.

I will always recommend a solid structure for any large-scale programmatic builds, and then a flatter, maybe one level, structure for smaller content sites where there’s less value gained from the tiering.

The structure is so important when it comes to content filtering, so giving Google the best hint to understand how each child relates to the parent is the easiest way to set that relationship.

Setting that relationship in a way Google can easily understand will help them rank the correct content, for the correct term.

 

There is no ‘best’ site structure

There might be ‘best for you’, but there is no ‘best’ site structure for you to build.

There are definitely wrong site structures though, I’ll tell you that.

Just build out the one you feel is best, based on what’s discussed above.

 

Should you be siloing your site?

To properly ‘silo’ your site, every single link on the page needs to be tamed and within the silo’s rules.

Every link.

It’s a lot of work.

Almost impossible at the enterprise level.

You can, however, keep note of what’s linking to what, and limit it where possible, to ensure the most value is passed to each individual link.

 

Is it worth doing a migration to update the structure?

It takes a lot for me to recommend an existing site migrate over to a new structure.

One of the core decision elements here is the amount of opportunity they stand to gain from doing so.

If they’re newer, planning a massive expansion, or have an extremely large amount of market opportunity and are prepared for the risk, then I’d say the migration to a more-solidified structure is worth it.

For a #1 in the market, a structure change could be more-so a solidification of their position, rather than for growth opportunity.

It’s a dangerous change to make, as you’re essentially redirecting over the core of your SEO performance, and risking it, during and after the migration.

 

Site structure is a key element

Just remember that site structure is a key element when it comes to the optimisation of your site.

Many other factors are at play, but the site structure is the core of your website.

Get it right, and the work done will be repaid.

Just Another WordPress Programmatic SEO Build – Part 2

Just Another WordPress Programmatic SEO Build – Part 2

It’s time for part 2 of the programmatic SEO build in WordPress.

Now that the core data structure and processes are in place, and we know that we can successfully generate a WordPress page, it’s time to extend the build.

If you haven’t already, you can check out part 1 of the programmatic SEO WordPress build here.

Extending the data

In the background I’ve had a few data updates happening, to extend what we’re creating.

Nothing major yet, but enough to build out a good handful of pages.

This is an ongoing process, due to the time involved. Going to try and get it above 1,000 pages but let’s see where we get to.

 

Creating top-level animal pages

Now that we’re covering the “can <animal> eat <food>” keywords, we can move onto the top level “what can <animal> eat” keyword set.

Both for targeting these keywords, but also, to become a parent for all the related content that will sit below it.

These animal pages will essentially act as ‘category’ pages.

 

Including animal pages in the data

The first step here was to create a new sheet that lists out all the animals.

I’ve thrown in some extra columns, and will be trying to include an intro, content related to each food the animal can eat, foods the animal can’t eat, and then a conclusion.

Then I’ll piece it all together.

 

Creating the base-level content

Let’s start with an opener and a closer. They should be nice and easy.

A couple of templates added, and then joining them back together and we get;

But we don’t want this output for every animal. Not every animal has content yet.

I could just add some content, and ensure each has a page, but then I couldn’t easily add new animals later.

We need the rule to make sure the animal has at least one food page created.

We can do this by counting the cells with either a YES or NO in the “canEat” column, when the animal matches up.

=COUNTIFS(Data!D:D, A2,Data!F:F,”*”)

Now the text is only generated for Cats in the example, because the data only currently contains content for cats & dogs.

 

Adding in the food content

I’m tossing up whether we actually say if an animal can eat the food or not, or just be like “find out if they can here…” and link through to the page.

For now, I am probably just going to do a nice ugly text link out and come back to it later.

First up, I’m just going to try and do a comma-separated list of all the foods an animal can and can not eat.

For this, I’ll be using an old formula I half found, and half put together, a few years ago.

This one took me a little bit to figure out how to convert it over though, as the original formula was only matching a single column. This time it needed to match two columns.

How do you match two columns? Well, you just merge the two columns into one column.

=ARRAYFORMULA(TEXTJOIN( “, “,TRUE,UNIQUE( IF(Data!Q:Q=”YES”& A2,Data!E:E,””))))

Long story short, this formula is looking into the data set for a new column I added, the is just YES<animal>. So YESdogs or YEScats.

When it finds a match, it grabs all the foods available for that set.

The source data looks like this;

And the output looks like this;

Definitely got me a little more excited than it should of haha.

As I mentioned, I want to build this out with a little section for each food eventually.

To avoid getting held up on that, and be able to continue to the build, I’ll just slap a sentence together for foods an animal can and can’t eat for now.

Will come back later and expand on it.

I’ve added a sentence for the can and can’t eats, and then substituted the animal and the food into it;

But you’ll see an extra variable in there, <foodEatCount>.

This one will just be a simple count of all the variables available.

There are a couple of ways to do this. Since it’s a comma-separated list, we can just count the commas and then +1 since there’s always one less comma than the item.

To count all the commas, and then add 1 to the count, we use the following formula;

=LEN(I8)-LEN( SUBSTITUTE(I8,”,”,””))+1

This is added inside a substitution formula, that swaps out the <foodEatCount> for the result of the comma+1 count formula.

The output is now the variable switched out with the count of items the animal can eat;

Replicating this for the cannot eat, for cats we get;

You may or may not pick it up, but I’m a bit pedantic so I am just going to quickly patch that with a little substitution and tweak.

Better.

Can tweak it later to allow for a secondary template if it’s just a single item.

Now, the comma-separated output looks like this;

Not quite “natural” sounding because you normally throw an “and” or similar at the end of a comma-separated list.

Probably okay for now, but I’m gonna patch it anyway.

It gets a bit messy, and it might look like an overly exaggerated formula, but we need to trim all the characters after the final comma.

=LEFT(I8,FIND(“@”, SUBSTITUTE(I8,”character”,”@” ,LEN(I8)-LEN(SUBSTITUTE( I8,”character”,””))))-1)

We then need to strip all characters before the final comma.

=RIGHT(I8,LEN(I8)-FIND(“@”, SUBSTITUTE(I8,”Character”,”@”,(LEN(I8)-LEN( SUBSTITUTE(I8,”Character”,””)))/ LEN(“Character”))))

And then finally, we slap it all together with an “and” in the middle.

I love the old oxford comma too, if you haven’t noticed. So I’m obviously gonna throw one of those in there too!

I’ll throw an extra column into my content creator, and clean it up there. It will help troubleshoot any issues later.

Once it’s all combined, we get;

=LEFT(I8,FIND(“@”, SUBSTITUTE(I8,“,”,“@”, LEN(I8)LEN(SUBSTITUTE(I8,“,”,“”))))1) &“, and “&RIGHT(I8, LEN(I8)FIND(“@”, SUBSTITUTE(I8,“, “,“@”,(LEN(I8)LEN( SUBSTITUTE(I8,“, “,“”)))/ LEN(“, “)))1)

It now reads like proper English!

Just needs one final tweak to allow for when there is only 1 food available.

To do that, we count the commas again. If there are 0 commas, it’s a single entry so we can include the food.

=if(LEN(I8)-LEN( SUBSTITUTE(I8,”,”,””))=0 ,I8,iferror(LEFT(I8,FIND(“@”, SUBSTITUTE(I8,”,”,”@”,LEN(I8)-LEN( SUBSTITUTE(I8,”,”,””))))-1)&”, and “& RIGHT(I8,LEN(I8)-FIND(“@”, SUBSTITUTE(I8,”, “,”@”,(LEN(I8)-LEN( SUBSTITUTE(I8,”, “,””)))/LEN(“, “)))-1),))

And after all that the output is now;

Quite a bit of work to just swap out a comma!

After all that though, we now have page content for an animal-specific page.

Neat.

Will throw a quick H1 together, and then just tack these pages onto the bottom of the CSV output when I am ready to do another page generation.

 

Internal linking

As the data is extended, new pages will be getting created.

The problem is, that it’s all reliant on the WordPress setup at the moment.

Following the basic internal linking strategies here, we need to focus on three link types.

Parent linking, child linking, and cross-linking.

I’m going to add some extra data points to all the animals and food, to help us filter ‘related’ content.

Think about these as another category level.

The two I will be adding for now, are animalCategory & parent.

All the animals will have a category, but might not have a parent.

 

Parent linking

The parent linking for the build will attempt to achieve a couple of things;

  • Combos linking to animal pages
  • Animals linking to parent animals when available

Let’s kick off by getting the combos linking to the animal pages.

I’ll actually achieve this in two ways.

The first is by assigning the animal as a category for the post. This will then add a category tag, but I’ll try replace that with a breadcrumb, along with nestling the posts under their animal in the URL. ie /<animal>/can-<animal>-eat-<food>/

The second way the links will be achieved will be via a contextual link. These can pass a bit more linking value, and may get seen a bit differently than just a hierarchical-based link.

To do this, we need to add a piece of conclusion content to our combo pages to swap in a link in.

Added in a new content template for a conclusion, and have added some text with the output being;

We have a new variable to build out, which is the URL of the category.

There will be two though, as the category itself will have a slug rather than a built out URL, but then we need the URL for the contextual links.

Pretty simple to build out, as we know the URL will be domain.com/<animal>/.

Some nice contextual links were added in!

 

Child linking

Now we need to link the other way, and link to all the individual combo pages that were created for each animal.

We’ve already done the hard part in a previous step of including the foods that an animal can and can’t eat, so lets just convert them into links.

We can modify our comma separation formula to be able to add any text before & after the item that we want;

=ARRAYFORMULA(TEXTJOIN(“, “, TRUE,UNIQUE(IF( Data!Q:Q=”YES”&A8 ,”ANYTHING”&Data!E:E&”ANYTHING”,””))))

So let’s just throw the HTML for a link in there.

=ARRAYFORMULA( TEXTJOIN(“, “,TRUE, UNIQUE(IF( Data!Q:Q=”YES”& A7, “<a href=””https://domain.com/”& lower (“<animal>/can-<animal>-eat-“& Data!E:E &”/”)&”””>”& Data!E:E &”</a>”,””))))

Now it outputs the HTML linking code, including the dynamic elements of <animal>.

It just includes the return portion of the food lookup twice, as it’s all done inside the actual return code for a food, rather than having to use the entire lookup formula each time.

This will now get replicated for the canNotEat section.

The content for the animal pages will automatically update, so no tweaks need to be done there.

 

Cross-linking

For cross-linking, I don’t think I am going to link to other animals from each other for now, and just link from some foods to other foods.

I don’t want to link to every food, as for some animals that could be quite a few!

The goal would be to create a little link wheel type set up where one page links to a couple nearby.

I could give all the foods a tag, and only link within them, but I don’t think that’ll net much extra benefit.

Instead, I’m going to use an auto-incrementing ID for all the pages matching that animal.

=COUNTIF(D$1:D4,“*”&D4&“*”)

Basically, the range is locked at the first cell, and then will increment as it goes down the list, counting the number of values matching the current animal.

This way, it just counts as it goes, and the count is constrained within the current animal.

To find the closest values each way, we’re just gonna take this incremental ID and do a vlookup for 2 higher & two lower.

Since we’re working through this list from top-down, theoretically 2 up will always exist except for the highest numbers, and the same with 2 down, except for the lowest.

To make it a clean vlookup the animal has been added before the ID so we can use the two columns to match it.

On the content builder sheet, I’ve then grabbed this ID, and then stripped it back to the number, and then grabbed 2 IDs either side of it.

=D2&SUBSTITUTE(Z2,D2,)-2&D2& SUBSTITUTE(Z2,D2,)-1& D2& SUBSTITUTE(Z2,D2,)+1&D2& SUBSTITUTE(Z2,D2,)+2

I’ll now use this to be able to grab the combo names that are associated with these IDs.

Modifying the formula, to now include an index/match. It’s just a reverse vlookup since the increment ID is in the final column and not the first.

=iferror(INDEX(Data!A:R,MATCH (D2& SUBSTITUTE(Z2,D2,)-2, Data!R:R,0),1),)& iferror(INDEX(Data!A:R, MATCH(D2& SUBSTITUTE(Z2,D2,)-1, Data!R:R,0),1),)& iferror(INDEX(Data!A:R,MATCH (D2& SUBSTITUTE(Z2,D2,)+1, Data!R:R,0),1),)& iferror(INDEX(Data!A:R, MATCH(D2& SUBSTITUTE(Z2,D2,)+2, Data!R:R,0),1),)

If a value exists in the ranges of 2 before, and 2 after, it’ll grab it. If not, then it will error and not grab it.

And now tweaking it rather extremely to be able to build the URL by using that lookup in a vlookup to grab the slug, and then again to grab the <food> we get;

=iferror(“<a href=””https://domain.com/”&D2&”/”& vlookup( INDEX( Data!A:R, MATCH(D2& SUBSTITUTE(Z2,D2,)-2, Data!R:R,0),1), Data!A:L,12,0) &”/>”&vlookup(INDEX( Data!A:R, MATCH(D2& SUBSTITUTE(Z2,D2,)-2, Data!R:R,0),1),Data!A:L,5,0)&”</a>, “, )& iferror(“<a href=””https://domain.com/”&D2&”/”& vlookup(INDEX( Data!A:R, MATCH(D2& SUBSTITUTE(Z2,D2,)-1, Data!R:R,0),1), Data!A:L,12,0) &”/>”& vlookup( INDEX( Data!A:R, MATCH( D2& SUBSTITUTE(Z2,D2,)-1, Data!R:R,0),1), Data!A:L,5,0)&”</a>, “,)& iferror( “<a href=””https://domain.com/”& D2&”/”& vlookup (INDEX( Data!A:R, MATCH( D2& SUBSTITUTE( Z2,D2,)+1, Data!R:R,0),1), Data!A:L,12,0) &”/>”& vlookup(INDEX (Data!A:R, MATCH(D2& SUBSTITUTE(Z2,D2,)+1, Data!R:R,0),1), Data!A:L,5,0)&”</a>, “,)&iferror(“<a href=””https://domain.com/”& D2&”/”& vlookup(INDEX( Data!A:R, MATCH(D2& SUBSTITUTE(Z2,D2,)+2, Data!R:R,0),1), Data!A:L,12,0) &”/>”&vlookup( INDEX( Data!A:R, MATCH( D2& SUBSTITUTE(Z2,D2,)+2, Data!R:R,0),1), Data!A:L,5,0)&”</a>”,)

But breaking it down, it’s just 4 of the same thing. The individual bit might be a little easier to understand.

=iferror(“<a href=””https://domain.com/”&D2&”/”& vlookup(INDEX( Data!A:R, MATCH(D2&SUBSTITUTE(Z2,D2,)-2, Data!R:R, 0),1), Data!A:L,12,0) &”/>”& vlookup(INDEX( Data!A:R, MATCH( D2& SUBSTITUTE(Z2,D2,)-2, Data!R:R,0),1), Data!A:L,5,0)&”</a>, “,)

The output is worth the effort, but safe to say, I’ve had enough of looking at this, so let’s just throw it at the bottom of the generated content.

I’m calling that done…. on the linking anyway haha.

 

Updating the pages

Let’s get all the pages updated.

First up are the categories.

I’ve created a separate sheet for them, as it’s a different import process.

Whilst doing this I realised the category treatment is well, crap, so I’ll need to get category pages as their own post design. Will have a play with that a bit later and continue as is. Worst case, categories can just become posts and I’ll mod the combo URLs to “fake” sitting below the category.

Reimported all the post content, which includes a heap of new data so there are some new pages created.

A total of 50 live now, most with just the content relating to the yes/no answer, rather than full data though. Still a good prototype to continue working off!