Category: Technical SEO

Caching HTML in Cloudflare for Improved Page Load Speeds

Caching HTML can help you reduce overall page load times, as the CDN can serve your HTML rather than having to request it from the origin server each time.

There are pros and cons to doing this, but if you run a static site that doesn’t change too often, caching your HTML could be the perfect way to shave a bit off your page load times.

 

Does Cloudflare cache HTML?

Yes, but not by default. You need to enable it by creating a page rule, using the steps below.

 

How to cache HTML with Cloudflare

Caching HTML in Cloudflare is easy, so just follow these steps.

1. Open Cloudflare, and click ‘Page Rules’ in the sidebar

2. Click ‘Create Page Rule’ to the right of this page

3. Insert your site’s domain with a wildcard on the end, just like the Cloudflare example, and then select ‘Cache Level’ and ‘Cache Everything’ from the dropdowns.
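As a rough example (using a placeholder domain), the finished page rule would look something like this:

URL pattern: https://www.example.com/*
Setting: Cache Level – Cache Everything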

And you’re done!

You should be able to run a test now, and compare to before HTML caching.

Here’s my own test as an example.

Before HTML caching

After HTML caching

Yes, the site is already optimised, but a few key metrics were improved even further.

FCP dropped from 305ms to 112ms.

TTI dropped from 305ms to 112ms.

LCP dropped from 500ms to 444ms.

Not too bad at all!

 

Is it good or bad to cache HTML with Cloudflare?

It really depends. How static is your content?

If you have an extremely dynamic site, or even a news homepage style site, caching HTML might not be good for you.

Rather than a consumer constantly getting fresh content, they will get the HTML cache version until the cache is refreshed. This means they’ll only see the content available at the time of cache, and not anything that has been added or edited since.

Not just consumers, but Google too. We want Google to have the freshest content, so you might be holding yourself back by a day or two, along with annoying users, if you enable HTML caching.

However, if your site is extremely static, HTML caching can help you knock a few hundred milliseconds off some key Core Web Vitals numbers.

 

Use HTML caching wisely, and it can help you improve your page load speeds… provided you’re a good fit for it.

Programmatic SEO: Handling Search Result Pagination

Pagination is something that should be easy, but when it goes wrong, it can really go wrong, and impact crawling & indexation.

It’s one of the first things I go to clean up, as it’s a great way to help reduce the overall number of crawlable pages on a site, particularly one that’s wasting its crawl budget.

Folders or query parameters for pagination?

Provided there are no tech issues that make it not possible, I recommend that query parameters are used for pagination.

Page=x is just such a clear signal to Google that it’s pagination related, and I’m all about making things clear for a robot. For example, /category/?page=2 is clearer than /category/page/2/.

 

Are Rel next & Rel Prev still required tags?

No, you do not need to use the rel=prev and rel=next tags anymore for Google.

One of the Google engineers did a presentation in Australia, and mentioned how this conversation about the tags went.

Mueller walked in one day and essentially said “You know we’re not using these tags anymore?”, they looked at each other, and then the tags were deprecated.

Someone accidentally / unknowingly removed them from being checked and no one noticed, so they dropped their use of them.

They are still being used by other search engines, so if Google isn’t the primary search engine in your market, you should definitely still include the tags.
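If you do include them, they’re just link tags in the <head>. On page 2 of a set they’d look something like this (placeholder URLs):

<link rel="prev" href="https://www.site.com/category/">
<link rel="next" href="https://www.site.com/category/?page=3">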

 

SEO Pagination best practices

Pagination best practices are pretty simple at the core.

  • Always link to the first page
  • Link to the next couple of pages, and a couple of previous pages
  • Use a clean parameter where possible, with no parameter for the first page
  • Ensure the pagination query parameter is in the canonical tag
  • Use rel next/prev tags if you want / if they’re easy enough
  • Don’t link to the final page in a series from the early pages
  • Limit page counts where possible
  • 301 redirect any page counts outside of the available range back to the first page
  • Include a decent number of results per page, 20-30, rather than many pages with fewer results

 

Common SEO issues with pagination

There are a handful of issues I tend to look at when auditing a pagination setup.

 

1. Page 1 (the default URL) has a query parameter on internal links

Some default e-commerce setups will include a ?page=1 parameter on the end of links back to the first page.

By default though, this first page has no query parameter.

So you then have the following created as duplicate pages;

https://www.site.com/category/

https://www.site.com/category/?page=1

Exactly the same page, with exactly the same content, yet two different URLs.

Even if page=1 is being stripped by the canonical tag, you’re actively linking to an alternate version of a page, which confuses Google.

John Mueller made a comment re: UTM tags on internal links, but the comment directly applies to this scenario too.

What you’re asking Google to rank, and what you’re linking to, are separate pieces of content.

Why confuse Google?

Just ensure that any links back to page 1 exactly match what you’re expecting, which would be without the query parameter.
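As a rough sketch of the rule (a hypothetical helper, not tied to any particular platform), the link builder should simply skip the parameter for page 1:

// Build internal pagination links; page 1 gets the clean, parameter-free URL
function paginationUrl(baseUrl, page) {
  if (page <= 1) {
    return baseUrl;
  }
  return `${baseUrl}?page=${page}`;
}

// paginationUrl("https://www.site.com/category/", 1) -> "https://www.site.com/category/"
// paginationUrl("https://www.site.com/category/", 3) -> "https://www.site.com/category/?page=3"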

 

2. Canonical tag doesn’t include the query parameter

Canonical tags must include the pagination query parameter.

Google even mentions this on their ‘common canonical mistakes’ blog post here.
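For example, page 3 of a category should canonical to itself, pagination parameter included (placeholder URL):

<link rel="canonical" href="https://www.site.com/category/?page=3">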

 

3. The final page is linked to from the first page

Linking to the final page is what every single pagination linkset does. Ever.

Why would you link to the end, when the most important results are on the first set of pages?

I always recommend removing the link to the final page in a set, and instead ensuring the first few links are available.

If Google wants to crawl all the way to page 345 it can, naturally, going page by page in the order of priority you have set.

 

4. Not including enough results per page

Some websites won’t use their content to its full potential. Instead, they’ll limit each page to 10 results, and just have more pages.

Bring some of that hidden content forward, to the first page.

Include more results, at least 20, maybe even 30 nowadays, and make your primary page for the search result stronger.

 

5. No limits on pagination

The final one here is allowing as many pages of results to be created as possible.

Will a user ever click through to page 345?

I highly doubt it.

But a scraper will, and they’ll take all your data.

You’ll also be re-using the same set of results, over and over across the site.

By limiting the amount of pagination, you severely limit the reuse of the same listings.

IMO, this then gives the listings more value for when they’re used in the top couple pages.

Limit the pagination to what you feel is reasonable, maybe page 20, or page 50 tops, and you can start to craft crawl behavior a little more.

With pagination limited, you should 301 redirect any page outside this range back to the first page.

That will help with the initial URL cull, and will also help clean up URLs when page counts drop due to fewer results.
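As a rough sketch of that redirect, here’s a Cloudflare Worker-style example, assuming a cap of 50 pages and a ?page= query parameter (adjust both to suit your setup):

const MAX_PAGE = 50;

async function handleRequest(request) {
  const url = new URL(request.url);
  const page = parseInt(url.searchParams.get("page"), 10);

  // Anything beyond the allowed range gets 301 redirected back to the first page
  if (page && page > MAX_PAGE) {
    url.searchParams.delete("page");
    return Response.redirect(url.toString(), 301);
  }

  return fetch(request);
}

addEventListener("fetch", (event) => {
  event.respondWith(handleRequest(event.request));
});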

 

 

Is infinite scroll or pagination better for SEO?

Well, there are ways to make infinite scroll work. If you really want it.

However, pagination is so much easier to not only implement but to monitor and patch issues with, so it is my preferred go-to option should I be given the choice.

 

Handling pagination is easy

Handling pagination cleanly and efficiently is actually pretty easy, once you tick a few boxes and set some limits.

The Ultimate Enterprise & Programmatic SEO Checklist

When doing an audit, I just go through the site, flag what I see as issues, and then write them up in a Google doc.

I rarely follow the checklists.

You won’t get a giant list of “40/54,000 pages don’t have canonical tags” from me, you’ll get actionable information.

Specific issues you can just pass straight to your development team.

I thought it was time I standardised some of the checks I make, and figured they would be useful for others too.

The Programmatic SEO checklist

134 tasks to investigate a programmatic site during an SEO audit.

A non-standard checklist, this should fill in the gaps from the standard SEO items that are checked.

The majority of the items are triggers for an investigation, rather than direct yes/no checks.

Triggers for items that you should investigate, and then work out if they’re an issue.

There aren’t many simple ‘yes’/’no’ items at this level of SEO. You need to put in the hard work of determining whether what you’ve found is an issue, and whether that issue is worth fixing.

This Google Sheet checklist will point you in the direction of what to look at.

Right now, the checklist will suit search/listing type websites.

Eventually, it will suit additional types of programmatic sites as it’s built out.

 

How to use the checklist

You’ll find each of the items listed on a new row, with more information provided when possible.

There’s an empty space to enter your findings for each of the items.

The status column is for selecting whether an item is a Pass, Fail, or requires additional investigation.

If an item is selected as ‘Fail’, it will activate the ICE score. Read more about the ICE score here.

Some items may have the ICE score pre-filled, but because many tasks could end up as either a low, medium, or high priority outcome, the majority can’t be pre-filled.

Just fill out the impact, confidence, and ease for each item, and it will output the ICE score for you.

 

The checklist content

The list is broken down into 4 sections: Sitewide, Search, Listings & Blog/News.

Each section has the associated tasks, with some minor duplication where it’s required across multiple sections.

The bulk of the items, but not all of them, are;

 

Sitewide checks

Is the site rendered server-side or client-side?
– Is the entire content SSR, or only portions?
– Are linking widgets exposed in the HTML source?
– Can Google effectively render each page?
– Are there any issues when viewing the website with JS disabled?

Is there an XML Sitemap?
– Is the sitemap broken up by categories?
– Are there sitemaps with more than 50,000 URLs or over 50MB?
– Are there 0-result SRPs in the sitemap?
– Are there query parameter filtered URLs in the sitemap?

Is there a robots.txt file?
– Does the robots.txt block any pages?
– Does the robots.txt block any resources?

Is there an HTML sitemap?
– Does the HTML sitemap have traffic in GSC?
– Does the HTML sitemap link to 0-result SRPs?

Trailing slash or no trailing slash?
– Is the non-preferred being redirected to the preferred?
– Are there pages indexed with the non-preferred option?

Are there over-indexation issues?
– Are there URLs under “Discovered – currently not indexed”?
– Are there URLs under “Crawled – currently not indexed”?
– Are there URLs under “Soft 404”?
– Are there URLs under the 3 “Duplicate” issues?
– Are there URLs under “Alternate page with proper canonical tag”?
– Are there URLs under “Page with redirect”?
– Are there URLs under “Not found (404)”?

Are there Core Web Vital flags?
– Are there FCP issues?
– Are there LCP issues?
– Are there CLS issues?
– Are there TTFB issues?

How do the crawl stats look?
– Is it primarily the primary host being crawled, or are alternate hosts being crawled too?
– Are there significant non-200 requests?
– Are there significant non-HTML requests?
– Are there any significant blips in the HTML crawl rate that require investigation?
– Are there any significant blips in HTML response time that require investigation?

Do any internal links contain UTM tracking parameters?

Is the site multi-national / multilingual?
– Does the URL include the country/language code at the start?
– Are hreflang tags being used on the site?
– Does the hreflang include the current page in the tag?
– Is there a language/country selector on the site?
– – Does the selector use any parameters?
– Does each language selected only have content in that language being used?

Does the brand own any other internet properties?
– Are there any redirects, where redirects should exist?
– Do any of the other properties link to this one?

Are there any historic URL patterns that were used?
– Do these historic patterns all redirect?

 

Search checks

Does the search follow a good URL structure?
– Is what’s in a pretty URL the best option?
– Can you re-order the URL structure and still load the page?
– Are there filters as query parameters?
– – Are the query parameters ordered correctly?
– – Are the query parameters stripped in a canonical tag?
– Are there filters that are both pretty URLs & query parameters?
– How are multi-select filters handled in the URL?
– Are search filters unnecessarily being passed through to listings?
– Are there crawlable sorting links?
– Is there any sort of customisable/combination filter layer?

Do the primary filter naming conventions make sense?
– Are there any primary filter values that could be considered ‘highly related’ enough to possibly merge them?
– Are there any primary filter values missing that should be included?

Is there a supplemental view of current results?
– Are there significant versions of the supplemental view indexed?
– Does the supplemental view canonical over to primary?

Are there internal links to other pages?
– Is there a child linking setup?
– Is there a parent linking setup?
– – Does a breadcrumb link to the correct levels?
– – Is the breadcrumb JSON schema included in the <head>?
– Is there a cross linking setup?
– Are there 0-result SRPs being linked to?
– Are there links to query parameter SRPs?
– – Do these links exist in the HTML?
– Are there any unnecessary links being included on the page?
– Are there any links to URLs that redirect?

How are 0-result SRPs being handled?
– Are they getting indexed / receiving traffic?
– Do they display related listings?
– How does the behaviour change when they return to having a result?

How is pagination being handled?
– Is there a link to the final page?
– How many pages of results are being created?
– – Are pages above this max being 301 redirected to the first page?
– Is the page number included in the canonical tag?
– Are there rel next/prev meta tags being used?
– How many results are included per page?
– Do links to page 1 have page=1 query parameter?
– Does the entire page of results load SSR or is there a partial CSR load?
– Are related listings being loaded at the end of results?
– – Are related listings causing indexation issues?

Is imagery being lazy loaded?
– Are above-the-fold images excluded from lazy loading?

Is dynamic content being used?
– Are there dynamic FAQs?
– – Are these correctly marked up with FAQ schema?
– Are there any visible errors within the dynamic content?

Does the mobile view render the same as desktop?

 

Listing checks

Do listings follow a good URL structure?
– Are there alternate versions available via filtered URL links?

Are there internal links to other pages?
– Is there a parent linking setup?
– – Does a breadcrumb link to the correct levels?
– – Is the breadcrumb JSON schema included in the <head>?
– Are links to related listings included?
– Are links included to SRPs outside what this listing suits?
– If there are multiple listing tiers, do the tiers parent/child each other appropriately?

What happens when a listing expires?
– What happens if this listing then comes back online?

Is there significant unique content for the listing?
– Is any of the listing content hidden behind read mores, tabs, or accordions?

Does the listing template automatically optimise for the keyword?

Is dynamic content being used?
– Are there dynamic FAQs?
– – Are these correctly marked up with FAQ schema?
– Are there any visible errors within the dynamic content?

Does the mobile view render the same as desktop?

 

Blog/news checks

Does blog content follow a good URL structure?
– Do blog posts sit within a category or other site section?

Are there internal links to other blog posts?
– Is there a parent linking setup?
– – Does a breadcrumb link to the correct levels?
– – Is the breadcrumb JSON schema included in the <head>?
– Are links to related blog posts included?
– Are there links to related listings and/or categories?
– – Are these links automatic, or manually included?

Is blog content being linked to from elsewhere on the site?
– Are these inclusions automatic, or manual?
– Are the inclusions based on tags/categories, or keywords?
– Are there opportunities to extend what tag/category triggers are used so that more pages link to the content?

 

Access the checklist

You can access the checklist with the below link.

Make a copy of the file into your own Google Drive, and then make the edits there.

 

Active development

The checklist is still under development, so there may be items missing that will be added at a later date.

If you have any suggestions of tasks you’d like to see added, feel free to add a comment and I will get them included.

 

GTMetrix Review: Top SEO Insights You’ll Get

One tool I will use every single time that I audit a website is GTMetrix.

I’ve seen comments about it, and that other speed testers like webpagetest.org provide “better data” and “more insights”, but GTMetrix does everything I want and helps me solve my problems.

As with every tool, you take its automated insights with a grain of salt. You leverage them to guide further insight gathering, or to back up a specific decision.

GTMetrix gives me the data to pass on to dev teams, and help get issues patched.

 

Running a Speed Test with GTMetrix

It’s pretty simple to run a speed test in GTMetrix.

  1. Enter the URL you’d like to test.
  2. Change the location you’d like the speed test run from. Highly recommend you get this as close to your audience as possible.
  3. (OPTIONAL) Change the browser you’d like to test from. This is where you can also select a mobile browser if you’d like to run a mobile speed test.
  4. (OPTIONAL) Select the speed if you’d like to throttle the speed test. Throttling can help show more “true to life” bottlenecks, like with a poor mobile connection, but will also help smaller issues show up more easily as everything will get exacerbated.
  5. (PRO REQUIRED) Ensure you have the video test flicked on, if you’re a pro user. You’ll get a bit more useful insight.
  6. Click Analyze, and run your website speed test.

How to run a speed test with GTMetrix

 

How to interpret GTMetrix’s waterfall chart

The waterfall chart breaks down the exact points at which different resources are called, connected to, and downloaded.

Each resource is ordered based on when its loading starts.

All you have to ever really worry about here is when a specific resource is connected to, and when it finishes being downloaded. There are very few use cases you’ll run into as an SEO where anything in between is required.

GTMetrix waterfall chart example

Look through what items are being loaded and when, and then run through the standard checks from here to optimise.

Large files being loaded? Are specific requests taking too long? Too many files? External requests you didn’t know about?

Plenty of things to analyse here, but they’re very specific to each audit.

Each significant request stage is broken down by a coloured line. The following is what stage each of the coloured lines in the waterfall chart represents;

GTMetrix waterfall chart legend

You can find some more info on the waterfall chart from GTMetrix directly, here.

 

Page Load Video by GTMetrix

This is one of my favourite features, particularly for helping to better identify CLS issues.

Unfortunately, it’s a pro-only feature.

In saying that, it’s worth it to help out with these audits.

When running your original test, you can tick on video audit.

Or, when viewing an audit you can click ‘enable video and re-test’ and GTMetrix will re-run the test, including the video test this time.

Video of pagespeed test

Here’s an example video output, from the test I ran above of SammySEO.com

You can play/pause the video, or run it at 1x, 1/2x, or 1/4x speed.

 

Testing Core Web Vitals with GTMetrix

When you run a GTMetrix test, you get a basic overview of your core vitals.

Core web vitals testing on GTMetrix

You can extract a little more information from the waterfall chart, and a few other places in GTMetrix, but this overview can help you delve into each specific CWV separately.

 

GTMetrix Alternatives

So, if this isn’t the tool for you, what other options are there?

The top few that come to mind are;

 

Is GTMetrix pro worth it?

Yes, I believe GTMetrix pro is worth it. Well, for me anyway.

If you have an alternate speed tester you’re using, then it probably wouldn’t be.

Other tools have similar features; I have just used GTMetrix for years now, so I’ve gotten used to it and understand what to look at a bit more than with the other tools.

Using UTM Tags on Internal Links? Please Stop.

Something I see pop up now and then, UTM tags on internal links seem to be one of the more popular solutions for tracking internal link campaigns.

They drive me crazy sometimes, as they’re never fun trying to clean up.

Why you shouldn’t use UTM tags on internal links

Not only does the link not point to the primary page you’d like to rank, it also messes with your GA tracking data.

Yeah, canonical fixes this and that but no. No, it doesn’t.

A canonical is a suggestion to Google, that they’re following less and less.

An internal link is also a suggestion to Google about what URL you prefer for a piece of content.

Comments from John Mueller;

Here’s the video:

 

UTM Tags and Internal Linking Analytics

For the analytics tracking, here’s a Google comment re: the analytics side of this;

One of the big things is that it creates new sessions for every click, and will assign conversions to your internal link source rather than the initial external source like SEO or Paid Media.

No one likes when their SEO traffic conversions get reassigned!

 

Alternatives to using UTM tags for internal link tracking

There are a couple of alternatives, depending on what you’re actually trying to achieve;

  • Use the user flow report under Audience » Users Flow
  • Create onclick events for the specific links you’d like to track (can use Google Tag Manager here)
  • Create a custom dimension for the events, and then use that to view the key metrics

Not my specialty here, but I can guarantee that UTM tags are not the way.
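If you do go down the onclick event route, a minimal sketch using a Google Tag Manager dataLayer push could look like the below. The data attribute and event names here are hypothetical, so use whatever suits your reporting:

// Tag the links you want to track with a data attribute instead of UTM parameters
document.querySelectorAll("a[data-internal-campaign]").forEach((link) => {
  link.addEventListener("click", () => {
    window.dataLayer = window.dataLayer || [];
    window.dataLayer.push({
      event: "internal_link_click",
      linkCampaign: link.dataset.internalCampaign,
      linkUrl: link.href,
    });
  });
});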

So you did the dirty, how do you clean up these UTM tags?

Shit happens, weird choices and mistakes happen. How do you go about cleaning it up?

Firstly, strip the internal link UTM tags immediately. That will stop the issue from growing, and help to start to correct any indexation & crawling issues.

Whilst the canonical tag technically should eventually fix it once links are removed, it doesn’t always. There’s also a quicker method.

A heap of redirects.

Provided there are mediums/sources/campaigns on there that are isolatable to only the internal links, then you can redirect the URLs to strip the params.

The 301 redirects should basically work so that if one of these mediums/sources/campaigns exists, all of the UTM query parameters get stripped.

That will have the least effect possible on any other UTM tags.
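As a rough Cloudflare Worker-style sketch of that logic, assuming your internal links used a medium value like “internal” that no external campaign uses:

async function handleRequest(request) {
  const url = new URL(request.url);

  // Only touch URLs where the UTM medium identifies an internal link campaign
  if (url.searchParams.get("utm_medium") === "internal") {
    ["utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"]
      .forEach((param) => url.searchParams.delete(param));
    return Response.redirect(url.toString(), 301);
  }

  return fetch(request);
}

addEventListener("fetch", (event) => {
  event.respondWith(handleRequest(event.request));
});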

If there is nothing unique about these tags from any external tags, which would be weird but can happen, then you essentially have two options.

One is to leave it as is, and just strip the UTM tags out of the internal links. The other is to change all of your external campaign UTM tags to ensure there is a differentiator, and then you can just redirect the internal link versions.

 

So, please don’t use UTM parameters on your links

This shouldn’t be a debate, as there are alternative methods.

Yeah, UTM tags are easy. But you’re messing up your data, and messing up your indexation.

Cumulative Layout Shift (CLS) – Causes, Testing & Fixes

CLS (Cumulative Layout Shift) is one of the fun ones inside Core Web Vitals (CWV), and can sometimes be a bit annoying to find exact causes.

Or even exact locations it happens on, as Google can be a little light on the examples sometimes.

What is Cumulative Layout Shift (CLS)?

Cumulative Layout Shift is a score assigned to a page based upon how much the page changes/moves around between the initial render and the final load.

It’s about assets loading in, and then whilst the page continues loading, others load in and push the original content around.

Long story short, it’s when a website annoyingly shifts just as you’re about to click a button, and then you end up clicking the wrong thing because everything moved.

 

Does CLS affect SEO?

Yes, CLS very much affects SEO.

It is one of the main elements that make up the Core Web Vitals, which Google is now taking into account as a ranking factor.

Yeah, it’s one of hundreds of different ranking factors, but when you’re talking about an apples-to-apples comparison with a competitor, I would very much rather know that I have ticked as many boxes as possible to help me rank a site.

 

Identifying that you have CLS issues

Your first port of call to check for CLS issues, or where you might have spotted them initially, would be Google Search Console.

On the “Core Web Vitals” tab in GSC, you’ll see some pretty charts that show how you’re doing across your URLs, for both desktop and mobile.

If you then click through to one of the reports, you’ll get a list of the issues that make up the Yellow or Red lines.

If one of them looks like the one below, you’ve got CLS issues;

Clicking on this, Google might give you a couple of example URLs. Chances are though, it’ll just be a single one, even for hundreds or thousands of URLs.

Google might be saying they’re “Similar” pages, but sometimes they will group completely separate page types in here so don’t fall for their trap.

Now that you’ve identified you’ve got an issue, you need to actually find the root causes of this.

 

Isolating specific CLS issues by testing CLS

There are a couple of ways of isolating the CLS issues so that you can make a specific request with developers for a patch.

If you just go to them saying “fix CLS” they’ll either go in circles or call you crazy because “everything works fine”.

 

Testing CLS with GTMetrix

The first method I use is with GTMetrix. A super quick test, and it’s normally something I am running anyway, so can give a good initial overview.

Run your test, and then you’ll get a web vitals score like the below;

CLS will flag on the right. For this one, green is fine, but it’s enough to use as this example.

This score will probably be different to what Google is flagging, but it’s not about the actual score. It’s about what’s causing that score, so that you can isolate and patch.

If you go to the ‘structure’ tab, you can then expand the ‘avoid large layout shifts’ section, and GTMetrix will break down the main causes for your CLS score.

GTMetrix flags the main offender here, which contributes 99% of the CLS issue.

Funnily enough, this test was run on a webpage talking about CLS here as I was looking for an example site. Definitely a case of, “do what I say and not what I do”. The post is still worth a read though.

In saying that, we can break down this CLS further by just loading the page.

Click that page above, and see if anything loads and then shifts around.

If your internet is fast, you might not notice it.

I use the GTMetrix video reports, so that I can show devs step-by-step what is happening in the load, and help them troubleshoot.

They are loading in the entire content, which pauses for 0.2 of a second, and then loads the image. This image load pushes all the content down.

Google is seeing this massive shift, and would be assigning it a rather high CLS score because of it.

Super easy to fix though!

 

Testing CLS with the CLS Checker Chrome extension

Firstly, just download the CLS Checker Chrome extension from here.

Using the same site as the GTMetrix test, you just need to load the page, then click the extension and click ‘check the page’.

It flags two, with the first one possibly being related to the bigger one anyway;

 

If you click on toggle, it will make everything white, red and green.

White hasn’t moved, red is the original location, and green is the new location.

Sometimes a few different CLS issues will get grouped together here, so just be careful that a score the tool flags isn’t an aggregate view of about 3-4 different issues.

So this chrome extension is flagging that something has happened in that big red zone, which has pushed all the content down.

Safe to assume what has caused this based on the screenshot, but combine this with the GTMetrix video and you can really drill into what’s going on with CLS.

 

Testing CLS with Webvitals.dev

This one is a tool I discovered recently, and whilst it’s useful to add to the mix it’s not really anything that GTMetrix or the Chrome extension don’t cover.

It does bits from both of them, in the single tool though, so might be useful if you’re after a more consolidated view that you can send directly to the development team.

They do include a cool little GIF of what’s moving around on mobile though! Worth checking out, as it might be what you’re after.

 

Testing CLS with WebPageTest

If your preference is webpagetest, then you can also check CLS issues in there.

Once you’ve run the test, click on the “view” drop-down and navigate to ‘web vitals’.

 

You can then scroll down and you will see the CLS issues.

Here you can also view a filmstrip, but more importantly, also view the video that’s now included.

The same as how GTMetrix helps with the video, WebPageTest can now help you with your CLS issues with their video report too.

 

How to fix cumulative layout shift

You normally need a dev team to fix CLS issues, so if that’s not the answer you’re expecting – this is awkward…

In saying that, CLS issues are pretty easy to get fixed, once you’ve identified them.

All the developers need to do is make sure that elements don’t move, by ensuring they load in their final position from the start.

This means they need to set fixed heights for elements that have delayed loading, like images.

If an image is going to be 300px high, then make sure there’s a blank space of 300px (plus padding) to fit that image when it loads.

This ensures nothing will move when the image finally loads in.
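A minimal sketch of what that looks like, just using the image’s width and height attributes so the browser reserves the space before the file arrives (the path and sizes are placeholders):

<!-- The browser reserves a 600x300 box up front, so nothing shifts when the image loads -->
<img src="/images/listing-hero.jpg" alt="Listing hero" width="600" height="300" loading="lazy">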

Steps to fixing CLS issues

  1. Break down every individual item that you think is attributing to the CLS score based on the CLS testing above
  2. Show the issues to developers, along with some pretty pictures, so they better understand
  3. Politely ask them to fix it
  4. Deliver cake in the hope it speeds up delivery

 

It really is that simple.

Your CLS issues should now be gone!

Programmatic SEO: Handling 0-Result Search Results

0-result SRPs are one of the main programmatic SEO issues I look into and try to clean up with clients.

Whether it’s an existing build with a large over-indexation issue, or a new build and trying to avoid it from the get-go, 0-result SRPs are something I want to patch.

What is a 0-result SRP?

A 0-result SRP is a search result page that returns 0 results.

Something that more than likely has a filter or two applied, and results in such a specific search there are no results returned.

For real estate, think about an SRP for a property type, like condos, in a small suburban neighbourhood that only has houses.

No results will get returned.

When larger sites get set up, they may have internal links that point to these pages, passing value, and sending Google to constantly crawl them.

If 0-result SRPs get significantly indexed, they can waste the crawl budget.

They will be viewed by Google as low-quality pages, which can also trigger Soft 404 errors as these 0-result SRPs will look exactly the same as each other to Google.

They’ll only have the templated elements, with no listings.

 

Detecting 0-result SRPs

Detecting these pages is pretty template-dependent, but you can normally crawl the site with Screaming Frog (or similar) and set up an extraction for the results count.

The pages coming back with “0 results” are what you’re looking for here.
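As an example, the custom extraction could be an XPath or regex against the results-count element. The selector and wording below are hypothetical, so match them to the template you’re auditing:

XPath: //span[@class="results-count"]
Regex: ([0-9,]+) results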

I can go into a bit more detail later if it’s wanted.

Generally how I detect them though is just clicking around, seeing what templates include links to what pages. You can generally get a feel for the crawlable scale that way.

 

Ways 0-result SRPs get crawled

There are a number of ways these pages get discovered, but here are the top ones I audit for a client;

1. XML Sitemap

When creating an XML sitemap, rules aren’t put in place and every URL combo possible is sometimes included.

 

2. Internal linking structures

A good internal linking structure links from a parent to its children. Sometimes, these links aren’t filtered and you’ll get links to a number, if not all, 0-result SRPs.

 

3. Listings existed there once-upon-a-time

Whether via internal links, sitemaps, or expired listing redirects, there is a chance that a 0-result SRP was previously indexed due to it having links. Now it doesn’t have links though, but since Google already found it, it’s going to keep getting crawled and indexed.

 

Are all 0-result SRPs bad?

It depends.

The favourite answer of anyone dealing with an SEO.

Truly though, it depends on the scenario.

Are they being actively linked to, significantly indexed, and causing issues wasting significant portions of the crawl budget?

Then yeah, they are bad.

But are they rarely linked to, only indexed from old listings, and only have a few indexed?

Then nope, not an issue worth fixing… provided it stays that way.

 

Avoiding the initial indexing

The best way to avoid indexing the 0-result SRPs is to just avoid linking to them.

Might sound easy, but this can get a bit tricky sometimes as there are so many places you need to add rules to.

Sitemaps, footer links, sidebar links, the list goes on.

Developers should be able to add rules to these linking widgets, so that there is a basic listing count check done, and then this added as a filter to the links.

A listing count check could just be ‘has’ or ‘has not’ got listings. So true/false, and it could run live (if the developer is that keen), otherwise running once a day, week, or even month could suit sometimes.
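As a tiny sketch of that filter (the property names are hypothetical, whatever your data layer calls them):

// Only render links to SRPs that actually have listings right now
const linkableSrps = srps.filter((srp) => srp.listingCount > 0);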

There have been times I’ve worked with a client and we did a once-off run for the initial build, and then they would come back and patch it 6 months later. That just ensured we got something live for launch that would still be fit-for-purpose.

Another thing you can choose to do is set a noindex,follow tag on the page if it has 0 results. You must still ensure no links point in though.

I tend to avoid noindexing where possible, as it’s kind of a hacky avoidance patch, and is more of a fallback to doing a proper cleanup initially.

 

Cleaning up over-indexation

So you’ve worked out you’ve got a 0-result SRP issue. What’s next?

This is where it can get tricky.

The first step is to stop linking to them. Straight up, just remove them from everywhere by following the notes in the ‘avoid initial indexing’.

That will solve the largest part, the continual growth & value flow to these pages.

Following that you can 301 redirect the pages, depending on the scale of the issue.

Personally, I prefer to redirect them to a parent page where possible. Particularly, if the 0-result is a filter page, of another page.

i.e. a property type page of a location page.

That way there’s a clear parent, and the redirection would make sense in Google’s eyes.

You should definitely 301 any URLs that have no chance of having a page created anymore though. Whether through filters that no longer exist, or just changes in structure. A redirect under these scenarios should be a given.

Some others prefer to 404, or noindex, these pages to get them removed from the index. I prefer to try and avoid this, as they’re already indexed and have a little value (even if it’s minimal). I’d rather aggregate that value and send it to a parent. A noindex tag still means they’re getting crawled, so the minimum you need to do is remove all links in if you’re using a noindex or a 404.

I’d rather have a 301 reversed than a 404 or noindex tag removed for a page. Have seen some severe ghosting of noindex tags before with those pages struggling to get reindexed, whereas a 301’d page seems to get reindexed pretty quickly.

Once you remove the links pointing in, the majority of the recrawling should stop anyway, so only if issues continue should you look into a more aggressive strategy for these pages.

 

Should you noindex 0-result search result pages?

Personally, I prefer to avoid using noindex tags on pages. I prefer to just limit their initial indexing, and then hope redirects and better linking patch any further issues.

Many will still use noindex tags, so you can always give it a go if other solutions aren’t working.

 

Are they actually an issue for you?

So before you do anything else with 0-result SRPs, you need to work out one thing.

Are they actually an issue for you?

Injecting HTML Code “Server-Side” with Cloudflare

I’ve got an older site with a good amount of link value, and I want to add some links from it to some newer sites.

Problem is, the site is on 4-year-old tech, and hasn’t been updated in 3 years.

Dev is long gone. If I try to run a build, there’s a 90% chance I’ll break it completely.

I’ve since rebuilt the tech, but can’t move the old site over because…. bloat.

Running Cloudflare, I quickly looked into options and might have found a solution!

 

Injecting HTML with the Add HTML Cloudflare App

This was the initial solution I found, and I was super excited. Took 2 minutes, and I had a link on the old site!

However, the darn thing was client-side.

If that suits you though, this is how you can inject HTML with the ‘Add HTML‘ Cloudflare app.

 

1. Load up a Cloudflare website, and click on ‘Apps’ in the sidebar menu

2. Search for ‘add html’ and click the app

 

3. Click ‘preview on your site’ to load a preview

 

4. Select ‘pick a location’ and a little selection editor will load. If you want to inject into the head, you can just enter head.

 

5. Click on your website preview where you’d like to inject the code. I’m selecting the first post in the list.

 

6. Select from the dropdown where specifically you’d like the code injected. I want to inject the code before this first post, so I am going to select the before option.

7. (OPTIONAL) Select the location you’d like to use this, if not the homepage. Manually add the URL, then reload the list. You might need to do this a few times until it shows, but eventually you can then find the URL and press the little tick on the left.

 

8. Enter the HTML code you’d like injected, and it will display in the preview on the right.

9. Click install and boom, you’ve got some HTML magically injected.

 

As mentioned before though, it’s unfortunately client-side code, and won’t load with JS disabled or show anything in the HTML source of the page 🙁

 

Injecting HTML code with Cloudflare Workers

This took me a bit longer than it should have to work out, because my coding skills are lacking a bit.

However, from the limited examples of direct use of this, I managed to piece together what I needed.

A Cloudflare Worker is basically just a little script that runs on Cloudflare between your server and your end-user. Cloudflare grabs the page from the server, executes your script, then sends it off to the user. Super powerful, when you know how it works.

Hopefully, this simple breakdown of actually using a worker to inject HTML can help you out too!

 

1. Go to workers in the Cloudflare dashboard, and click ‘create a service’

 

2. Insert a useful name, click ‘HTTP Handler’ and click ‘create service’

 

 

3. Click on ‘quick edit’ in the top right corner

 

4. Navigate to the page you’d like to edit, and then paste the following code;

 

class ElementHandler {
  element(element) {
    // Inject the HTML just before the matched element's closing tag
    element.append(`*HTML TO INJECT*`, { html: true });
    console.log("injected");
  }
}

async function handleRequest(req) {
  const res = await fetch(req);
  // Run the rewriter over every element matching the CSS selector
  return new HTMLRewriter().on("*CSS SELECTOR*", new ElementHandler()).transform(res);
}

addEventListener("fetch", (event) => {
  event.respondWith(
    handleRequest(event.request).catch(
      (err) => new Response(err.stack, { status: 500 })
    )
  );
});

* You might need to edit the quotes due to WordPress formatting. Sorry!

 

5. Edit the *HTML TO INJECT* variable with what you’d like to inject. This could be plain text, or any full HTML.

 

6. Find your selector, and modify the *CSS SELECTOR* variable. To get your selector, right click anywhere on your page, and click ‘inspect element’. Then right-click the element in the code, hover over ‘Copy’, and click ‘Copy selector’. You just paste this into the *CSS SELECTOR* box and you’re good to go.

 

7. The preview should now update to inject your HTML into the spot specified. There could be some issues with this, but provided you’ve followed the steps it should put it exactly where you specified. This is a bit newer to me, so comment with any issues and I can try to help out!

8. Click ‘save and deploy’ to get the worker saved

 

 

9. Go back to the Cloudflare website you’d like this added to, and navigate to workers again, then click on ‘add route’.

 

10. In the modal that pops up, edit the location you’d like the worker to run at, select the worker, then just select production, and click save. I only want this on the homepage, so I’ll modify the screenshot to remove the wildcards (*).

 

11. The route should now be loaded in the account, with the worker selected.

 

 

12. Go test the site! It should be live on the pages you selected, and it should load server-side!

 

You can confirm it’s server-side by viewing the page source, and then just searching for the code you added.

Perfect. Just what I wanted!

Just make sure you’re putting the worker in the right spot, if it’s not a unique piece of code you’re adding.

If a div is available across the site, and you set your path in the route to be the whole site, then that piece of code will show up everywhere.

Great if that’s what you want, not so great if it’s not what you want!

 

How does this magical HTML insertion with Cloudflare workers work?

This all uses the Cloudflare HTMLRewriter class, with a heap of documentation here.

There’s essentially an audible whoosh as this all flies over my head.

 

The options for the HTMLRewriter class

Whilst I don’t understand most of this, yet, I can tell you the following important options.

Inside the code you have;

element.append(`*HTML TO INJECT*`, {html: true});

Instead of ‘append’, you have a heap of different other options here that you can use.

removeAttribute(name) – Removes the attribute.

before(content, contentOptions) – Inserts content before the element.

after(content, contentOptions) – Inserts content right after the element.

prepend(content, contentOptions) – Inserts content right after the start tag of the element.

append(content, contentOptions) – Inserts content right before the end tag of the element.

replace(content, contentOptions) – Removes the element and inserts content in place of it.

setInnerContent(content, contentOptions) – Replaces content of the element.

remove() – Removes the element with all its content.

So just tweak this to how you want to use it.

Cheers Cloudflare!

 

Adding Links with Cloudflare

Using the same code, we can drop a link wherever you want it!

For this code;

element.append(`*HTML TO INJECT*`, {html: true});

You would want to modify it to be something like;

element.append(`<a href="https://www.sammyseo.com/">Best SEO Blog Ever</a>`, {html: true});

And then just find the best location selector you can that will suit. ‘before’, ‘after’, ‘prepend’ or ‘append’ would probably be your best options to get the link inserted.

 

Great way to insert HTML into somewhere you can’t get access to!

Would love to hear if you use it, and if so, how!

Trailing Slash vs No Trailing Slash – What’s Better for SEO?

When it comes to trailing slashes, some are for them, and some are against them.

A trailing slash is the slash on the end of the URL.

No trailing slash: https://www.sammyseo.com/trailing-slash-verse-no-trailing-slash

Trailing slash: https://www.sammyseo.com/trailing-slash-verse-no-trailing-slash/

Negative effects of trailing slashes

But, it’s just an extra character on the end of the URL?

Surely they can’t harm a site?

Well, that little character creates an entire duplicate copy of a page that’s accessible, and indexable, by Google.

You need to pick one or the other.

With or without the slash.

You can’t let both stay crawlable & indexable.

 

Should you enforce a trailing slash?

Personally, I prefer trailing slashes. I mostly work with enterprise/larger scale clients, that tend to be using programmatic SEO.

This style of SEO leads to heavy folder usage, to ensure a good, clean hierarchy for the site structure.

In the early days of websites, well, 10+ years ago anyway, a trailing slash used to dictate that a URL was a folder, and no trailing slash would be a page, more commonly with a .html or similar extension.

So you’d have;

domain.com/folder/

domain.com/page.html

With pretty URLs, that then changed to;

domain.com/folder/

domain.com/page

Nowadays it really doesn’t matter as much, but I go back to folders and being able to pass the value around the hierarchy in the structure.

It’s also the WordPress default. WordPress powers 34% of the internet, and has a 60% CMS share. Maybe they got it right?

Because of this, I prefer everything just to have trailing slashes.

Many others though prefer no trailing slashes.

If it’s an existing site, pick the one that has the most rankings and redirect the other.

If it’s a new site, pick whatever you want. Maybe the one that would lead to fewer fights with developers.

 

You must 301 redirect one to the other

No matter whether you choose trailing slashes or no trailing slashes, you must redirect the one you don’t choose, to the one you do choose. This is non-negotiable.

Whilst a canonical tag should strip, or add, the trailing slash, this is not enough to ensure proper and clean indexation and crawling.

You must redirect the secondary option, to the option you chose, with a server-side 301 redirect.

No client-side JS redirecting, as this won’t avoid indexation.

 

Force trailing slash with htaccess

You can force a trailing slash with a 301 redirect via htaccess, with the below code;

# Skip existing files, then add the trailing slash to everything else
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)([^/])$ /$1$2/ [L,R=301]

However, this will also redirect file extensions.

If you don’t want them redirected, you will need to use;

# Leave common file extensions alone, add the trailing slash to everything else
RewriteCond %{REQUEST_URI} !\.(php|html?|jpg|gif)$
RewriteRule ^(.*)([^/])$ http://%{HTTP_HOST}/$1$2/ [L,R=301]

 

Both of those are added below the following line in the htaccess file;

RewriteEngine On

 

Strip trailing slash with htaccess

You can strip the trailing slash, and enforce a no trailing slash policy, with the following code;

# Skip existing directories, then strip the trailing slash from everything else
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ /$1 [L,R=301]

Just like adding a slash, this code is added under;

RewriteEngine On

 

Force a trailing slash in Cloudflare

Unfortunately, this is not possible 🙁

If it becomes possible, please let me know. I need this.

 

Strip a trailing slash in Cloudflare

You can strip it with two redirects, with the first to ensure parameters remain, and then the second to strip the trailing slash.

No query string: https://foo.com/*/ -> https://foo.com/$1
Yes query string: https://foo.com/*/?* -> https://foo.com/$1?$2

Provided you are not actively linking, or using/have used, the trailing slash version, then this 2 redirect setup will work okay.

If however, you have actively used the trailing slash version, then it would be recommended to instead seek a way to do this redirect in a single step to assist cleanup.

 

NextJS 308 Trailing Slash Redirect

NextJS now has a built-in trailing slash redirect, which you can use to enforce either a trailing slash or no trailing slash.
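The setting itself is a single flag in next.config.js:

// next.config.js
module.exports = {
  trailingSlash: true, // redirects /about to /about/ (false strips the slash instead)
};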

Rather than a 301 though, they’ve built it to do a 308 redirect.

Whyyyyy developersssss whyyyyyy.

“But a 308 is like a 301”. Urgh.

You’ll get responses like this all day, but how does Google actually handle this redirect?

Well, I was lucky (or unlucky?) enough to get some data about this via a URL migration that was happening, where I was able to see the specific redirects in crawl stats.

Even though the URLs correctly report as a 308 redirect via HTTPstatus (bottom of screenshot), Google Search Console is reporting these as 302 redirects.

Luckily though, these redirects weren’t vital to what was happening, it was just a little over indexation.

However, based on GSC reporting, these redirects are being treated like 302s, which we definitely wouldn’t be wanting.

John Mueller commented on 308s back in 2018;

But hey, he’s been wrong before!

Tried to ask him about it, but alas, I’m not famous enough to get a response.

That or I don’t know how to use twitter. Maybe both?

 

Why has NextJS done 308 redirects then?

Something something security.

From here.

Difference between 301 and 308 redirect

So something to do with modifying the request, presumably to stop some sort of hack or data/privacy-related issue.

The good news though, is that you can override their default function to make it a 301.

Unfortunately, I can’t help you with the how as it was done by a dev team. But it’s 100% guaranteed it can be overridden, so you can stop stressing and grab a beer to celebrate!

 

To slash, or not to slash

That is the question.

A question that only you and your circumstances can answer.

Client Side vs Server Side Rendering

Client Side vs Server Side Rendering

A debate that’s gone on for years now.

Google announced in 2014 they’ve;

decided to try to understand pages by executing JavaScript

Well, that sent developers wild believing they could build anything and Google could index it fine.

The number of times I received that link, or another similar link, is ridiculous.

Yeah, Google can execute javascript. But it’s like a baby. You need to spoon-feed it that javascript, and it also needs to essentially visit a page twice.

How does Google execute/process JS?

They sum it up here nicely, with a pretty chart;

Google essentially needs to hit your page twice.

Once on the initial crawl, and once on the rendering of all the JS.

You’re making Google do more work!

 

What is Client-Side Rendering (CSR)?

Client-side rendering is when the rendering of a webpage is done in a user’s browser, so on the “client”.

The server sends a minimal amount of information to the user, and then the user’s browser processes all this information, makes the requests for data, and renders the page locally.

 

What is Server-Side Rendering (SSR)?

Server-side rendering is when all the rendering for the webpage is done on the server before the page is sent to a user’s browser.

The server processes everything first: it makes all the data requests and builds the page, before delivering the formatted HTML to the user.

 

 

What is the Difference Between Client-Side and Server-Side?

Client-side rendering forces the browser’s technology to do the work, which in most cases is fine – except for the ‘lower tech’ search engines. It can have faster initial response times from the server, since less is processed and less work is done, but it increases the resource requirement on the end-user.

Server-side rendering generally requires significant caching, to ensure subsequent loads of a page are already rendered, which will save both server resources and load time. SSR removes the requirement of search engines having to execute anything, so it’s essentially easier for them to process what was initially received, with no re-processing required to render that page.

 

Determining What Content is Loaded Server-Side Verse Loaded Client-Side

Disabling javascript in your browser

The absolute easiest way to determine what is server-side and what is client-side, is to just disable javascript in your browser.

In chrome, you disable javascript by;

1. Right clicking, then ‘inspect’

2. Clicking the options menu

3. Enable the ‘disable javascript’ option

 

Now you just need to reload the page.

Any content that is loaded client-side will now simply not load. If you get a completely white page, well, have fun with that.

Checking the HTML source of a page

The second way to check if content is CSR or SSR, is to look at the HTML source code of the page.

Not the inspect element HTML code, as this is the final browser view, but the right-click “view page source” option.

Right click, view page source in chrome.

Then just CTRL+F to find the content you’re looking for.

So if it’s a text widget you’d like to check for SSR, just copy some of the text from the page and then try and search the HTML source for it.

If you find the content in the HTML, it’s SSR. If it doesn’t exist, it’s being rendered client-side.

However, there may still be additional CSR blockers being used, that could render different content, so just keep that in mind and make sure you also check with JS disabled to confirm any possible issues.

 

Things to keep in mind when checking the HTML source

  • HTML Formatting

HTML formatting can sometimes be different in the HTML source versus what you’ll see in the client. Try and search for the shortest thing you can to find what you’re looking for. This will lead to fewer formatting changes and less chance of missing what you want.

  • JSON or other code

Sometimes what you’re looking for will exist, but it will sit within a JSON code section. If it sits here, and only here, then the content will load client-side, as the browser will request the content from this section when it loads. The easiest way to tell is that if the text sits inside HTML tags, you’re okay. If it’s inside JS, you’ll see javascript formatting with out-of-place {} and similar “code”.

 

How to Server-Side Load Client-Side Content

 

1. Rebuild the app

If you really care about your SEO, then you should be making your App work server-side properly.

You shouldn’t be taking shortcuts to make CSR “work” as it won’t be the same.

Do it right from the start, and you’ll set yourself up for years to come.

 

2. Use prerender.io or similar

A solution like Prerender.io will take your CSR app, and essentially make it run server-side for Google.

Google’s changed how this sort of setup works over the years, by deprecating the initial way this type of setup worked back in 2015 because they are “generally able to render and understand your web pages” – lol. They then deprecated the new way prerender worked a few years ago, but then relaunched/tweaked it with the release of this… which is where my knowledge of the actual processing ends, but, it’s still the same principle. You’re slapping lipstick on a pig.

I still recommend against this if possible, and really recommend rebuilding properly so you have a solid base to build off.

However, I also know this isn’t always possible. If it’s not possible, then prerender.io will be your go-to choice.

 

Server-side your app from the start and you will have smooth sailing

Building an app with SSR in mind from the get-go will simplify things going forward, and give you the best standing with Google without any hacky integrations.