The old HTML sitemap, hey.
Are you still actively trying to create one for large-scale programmatic builds?
Crawling? Indexing? Deeplinking? Passing SEO value?
Yeah, they can help you shortcut the passing of value and reduce click depth to deep links.
So can a good, and more natural, internal linking structure though, especially once some prioritised links are included.
I am not talking about a few links to the top sections of your site. They might still be valid, particularly for smaller sites.
I am talking about the endless page(s) of links that basically replicate an XML sitemap.
What is an HTML sitemap?
An HTML sitemap is one of those pages you’ll normally see linked in site footers, under an anchor of “Sitemap”.
It’s basically just another way of telling Google about the different links on your site, particularly for larger sites.
Google recommended a “user viewable site map” back in 2010 and this is really where HTML sitemaps stem from.
In their latest SEO starter guide, they recommend the following:
Create a navigational page for users, a sitemap for search engines
Include a simple navigational page for your entire site (or the most important pages, if you have hundreds or thousands) for users. Create an XML sitemap file to ensure that search engines discover the new and updated pages on your site, listing all relevant URLs together with their primary content’s last modified dates.
Avoid:
- Letting your navigational page become out of date with broken links.
- Creating a navigational page that simply lists pages without organizing them, for example by subject.
Why you should avoid using an HTML sitemap
HTML sitemaps are just a shortcut around a proper internal linking implementation.
A friendly header menu, and friendly overall site navigation, should be covering what an HTML sitemap would.
Do you think Google is really going to care what links you’ve got on a page when it’s just full of links?
It’s like those automated link swaps from 2008, where you’d add a page to your site that would automatically update with links to every other site in the network. They were killed off a while ago.
Sometimes these sitemaps are broken up into pages and could have an H1 detailing what the links are about.
i.e., “real estate in Sydney” could be a page full of links related to Sydney, and the suburbs of Sydney.
This can cause duplicate keyword targeting, for a page that’s purely made of links.
Yes, I have seen these types of pages indexed.
They’re also another area of the website where you need to be mindful of which pages you’re linking to, particularly when it comes to zero-result search results pages (SRPs).
You need to ensure the links are always kept up to date, with no error pages etc.
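As a rough illustration of that housekeeping, here is a minimal Python sketch that filters the candidate targets before they are written into any page of links. The `Candidate` fields and the `linkable` helper are hypothetical names, and the sketch assumes you already have the last known HTTP status and result count for each target from your own data.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    url: str
    status: int        # last known HTTP status of the target (assumed available)
    result_count: int  # listings currently shown on the target SRP (assumed available)


def linkable(candidates):
    """Keep only targets that resolve with a 200 and actually have results."""
    return [c.url for c in candidates if c.status == 200 and c.result_count > 0]


candidates = [
    Candidate("/real-estate/sydney/", 200, 1240),
    Candidate("/real-estate/sydney/tiny-suburb/", 200, 0),  # zero-result SRP
    Candidate("/real-estate/old-region/", 404, 0),          # error page
]
print(linkable(candidates))
```

Running a check like this on every build, rather than once, is what keeps the page from drifting out of date.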
If the main purpose is deep linking from the HTML sitemap, then one of Google’s ‘avoids’ from the SEO starter guide comes into play:
Creating a navigational page that simply lists pages without organizing them, for example by subject.
So it kind of writes the deep linking off, as many programmatic sites will try and jam as many links in here as possible.
A natural linking setup should offer a much better solution to this.
Why you would still create an HTML sitemap
There is only one real reason that I see a full HTML sitemap as valid for larger builds.
Tech limitations.
No matter what ideas you have, the strategies you want to implement, the optimisations you want to make, tech limitations will find you.
And they will cut your dreams off.
Sometimes you can’t do what you want, either due to the tech talent actually available, or because you’re up against a product manager who doesn’t want to hear the magical 3 letters – S E O.
Apart from fighting harder, and/or going around the product manager, you sometimes have to do hacky SEO.
In this case, an HTML sitemap should be seen as hacky SEO.
Not the best, but it’ll get the job done at the start until a more viable solution can be put in place.
You’re just filling in the gaps where the tech can’t properly do a good link structure.
If you can do a proper link structure, however, there should be no need for a built-out HTML sitemap.
Google should be naturally crawling the site, going from page to page, discovering the links with context.
Not just discovering everything on a page with 500 other links.
How to deprecate your HTML sitemap
The first step is ensuring you’ve properly implemented internal links – this is crucial.
Links to parent pages, child pages, and some cross-links need to be across the site.
Ideally, you’ll also add some “priority” links, so that your top pages / most competitive pages get links from pages further up in the hierarchy.
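To make the parent / child / cross-link idea concrete, here is a small Python sketch that derives each page’s links from a child-to-parent mapping. It is only an illustration under assumptions: `build_links`, the `parent_of` mapping, the `max_cross_links` cap and the choice to treat sibling pages as the cross-links are all hypothetical, and priority links are simply added to the top-level page.

```python
from collections import defaultdict


def build_links(parent_of, priority=(), max_cross_links=5):
    """Return {page: [link targets]} from a child -> parent mapping."""
    children = defaultdict(list)
    for child, parent in parent_of.items():
        children[parent].append(child)

    pages = set(parent_of) | set(children)
    links = {}
    for page in sorted(pages):
        targets = []
        parent = parent_of.get(page)
        if parent is not None:
            targets.append(parent)                      # link up to the parent
            siblings = [s for s in children[parent] if s != page]
            targets.extend(siblings[:max_cross_links])  # cross-links to siblings
        targets.extend(children.get(page, []))          # link down to children
        if parent is None:
            # Top of the hierarchy: add priority links to the pages that matter most.
            targets.extend(p for p in priority if p not in targets and p != page)
        links[page] = targets
    return links


parent_of = {
    "/sydney/": "/",
    "/melbourne/": "/",
    "/sydney/bondi/": "/sydney/",
    "/sydney/manly/": "/sydney/",
}
links = build_links(parent_of, priority=["/sydney/bondi/"])
for page, targets in links.items():
    print(page, "->", targets)
```

In this example the homepage links straight to `/sydney/bondi/`, which cuts its click depth from two to one without needing a page of 500 links.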
After that, you’ll need to redirect the sitemap URLs. You could 404 / 410 if you want, but I prefer to redirect to related URLs where possible to help retain any value.
If you had a paginated HTML sitemap, particularly one that included keywords like my earlier example, then try to redirect each page to the most appropriate live page you can.
Treat this deprecation like a migration, where you want to redirect to the most similar page possible.
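One way to get a first draft of that redirect map is to match each old sitemap URL to the live URL that shares the most words with it. The Python sketch below does this with a simple token-overlap score. The function names, the ignored words and the example URLs are all made up for illustration, and the output should be reviewed by hand before any redirects go live.

```python
import re


def tokens(url):
    """Split a URL path into lowercase word tokens."""
    return set(re.findall(r"[a-z0-9]+", url.lower()))


def redirect_map(sitemap_urls, live_urls, fallback="/",
                 ignore=frozenset({"sitemap", "in"})):
    """Map each old sitemap URL to the most similar live URL (Jaccard overlap)."""
    mapping = {}
    for old in sitemap_urls:
        old_tokens = tokens(old) - ignore
        best, best_score = fallback, 0.0
        for candidate in live_urls:
            cand_tokens = tokens(candidate) - ignore
            union = old_tokens | cand_tokens
            score = len(old_tokens & cand_tokens) / len(union) if union else 0.0
            if score > best_score:
                best, best_score = candidate, score
        mapping[old] = best  # falls back when nothing overlaps at all
    return mapping


mapping = redirect_map(
    ["/sitemap/real-estate-in-sydney/", "/sitemap/page-2/"],
    ["/real-estate/sydney/", "/real-estate/melbourne/"],
)
for old, new in mapping.items():
    print(old, "->", new)
```

Generic pages with no keyword, like `/sitemap/page-2/` here, end up on the fallback, which is where you would decide between a broad parent page and a 404 / 410.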
Do you need to deprecate it though? I’ve had clients where I won’t bother recommending it.
Particularly when there are things that are more valuable that could be done.
I will, however, recommend killing it off as soon as I see it become an actual issue.
Just remember, by removing something like this you need to make sure you’ve properly covered internal links. Even though this sort of setup isn’t ideal, you could be creating a tonne of orphan pages if you’re not set up properly before removal.
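A quick way to sanity-check this before removal is to walk your internal link graph from the homepage while pretending the HTML sitemap pages don’t exist, then list any URL that can no longer be reached. The Python sketch below assumes you already have a crawl export as a `{page: [linked pages]}` mapping and a list of URLs that should be discoverable (for example, from the XML sitemap). The function names and sample URLs are hypothetical.

```python
from collections import deque


def click_depths(links, start, excluded=frozenset()):
    """Breadth-first walk from `start`, returning {url: click depth}."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target in excluded or target in depths:
                continue
            depths[target] = depths[page] + 1
            queue.append(target)
    return depths


def orphans_after_removal(links, all_urls, start, sitemap_pages):
    """URLs that become unreachable once the HTML sitemap pages are gone."""
    reachable = click_depths(links, start, excluded=frozenset(sitemap_pages))
    return sorted(u for u in all_urls
                  if u not in reachable and u not in sitemap_pages)


links = {
    "/": ["/sydney/", "/sitemap/"],
    "/sydney/": ["/sydney/bondi/"],
    "/sitemap/": ["/sydney/", "/sydney/bondi/", "/sydney/manly/"],
}
all_urls = ["/", "/sydney/", "/sydney/bondi/", "/sydney/manly/"]
orphans = orphans_after_removal(links, all_urls, "/", {"/sitemap/"})
print(orphans)
```

Here `/sydney/manly/` is only linked from the HTML sitemap, so it would be orphaned. Comparing `click_depths` with and without the exclusion also shows which pages would get deeper rather than disappear.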
Some may disagree
I know there are SEOs that will disagree with this, but I haven’t seen an HTML sitemap do anything useful in the last 5+ years.
It’s time to cut them from site builds.