There's a lot of conflicting and/or confusing advice online about the best way to mark up a document using HTML5. Now that I'm using Oxygen Builder, I wanted to understand the best way for me to use HTML5 and thereby improve the accessibility and SEO of future websites I build.
If you're about to do what I've just done, which is read a lot in an attempt to figure it all out, I hope I can save you the trouble.
You'll stumble across documents that say the HTML5 spec never really advocated HTML5 outlining (which is the thing causing most of the confusion), and others (including Google's John Mueller) that say you can use as many H1 tags on a page as you like. Others again say HTML outlining and multiple H1s were both myths.
"You can use H1 tags as often as you want on a page. There’s no limit, neither upper or lower bound ... your site is going to rank perfectly fine with no H1 tags or with five H1 tags."
John Mueller - Senior Webmaster Trends Analyst, Google
So what's the answer? Should we listen to John Mueller? Let's start with some definitions.
There are a number of tags that can provide semantic context to a page. Here are the ones I wanted to focus on as they seem to be at the crux of all the discussions online.
For a developer, semantic tags make code easier to read. But they also make the page easier to consume for search engines and for accessibility tools such as screen readers. For example, a screen reader, which may be used by site visitors with a visual impairment, will read the contents of a page to the visitor and annotate that text with information provided by the semantic markup. This helps the screen reader to present content in a way that is more closely aligned to the visual experience.
For example, if a header is repeated on every page, then unless it is semantically marked up as a header, the screen reader will be obliged to read it every time on every page, as it does not know to treat the header as less relevant than the main content. A sighted person, by contrast, can quickly and easily filter out a header that has, say, an orange background and loads of links in it and is repeated on every page, even when it is not semantically marked up as a header.
In other words, it's important to communicate as much as possible to the screen reader, including where the header is, where the footer is and where the main content of the page is.
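As a rough sketch of the difference (the class names and the text are invented for illustration), compare a header, main area and footer built from plain divs with the same areas marked up semantically:

```html
<!-- Non-semantic: the markup gives a screen reader nothing to tell these areas apart -->
<div class="header">Site name and navigation links</div>
<div class="content">The unique content of this page</div>
<div class="footer">Copyright and contact details</div>

<!-- Semantic: the same areas, identified as header, main content and footer -->
<header>Site name and navigation links</header>
<main>The unique content of this page</main>
<footer>Copyright and contact details</footer>
```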
The HTML5 outlining algorithm is meant to let software understand the structure of a document based on the nesting of sections and the placement of headings. Prior to HTML5, heading tags alone were used for this purpose.
If you gave a blog post an H1 heading, that was OK when the post was presented on a page by itself. But that same H1 would end up being incorrect on a page where many blog posts were listed, as in, say, an archive page.
In this case, the archive page might have an H1 of "My Posts" and then each individual post would therefore require a heading of H2, to maintain the hierarchy. In fact all headings would have to be demoted to one lower in all blog posts in this case.
The idea of section nesting was supposed to address this. Section nesting would do away with heading numbering altogether and allow the author to use H1 headings everywhere, leaving it up to the browser, testing tool or accessibility tool to work out what level of heading is appropriate based on the section's depth in the hierarchy.
So instead of
Apparently, we were supposed to have this:
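As a sketch of the contrast, assuming a hypothetical archive page called "My Posts" with invented post titles: the first half below uses explicit heading levels, and the second half uses H1 everywhere and relies on section nesting.

```html
<!-- Explicit heading levels: each post title is demoted to H2 -->
<h1>My Posts</h1>
<article>
  <h2>First post</h2>
  <h3>A sub-heading in the first post</h3>
</article>
<article>
  <h2>Second post</h2>
</article>

<!-- Outline-style markup: H1 everywhere, level implied by nesting depth -->
<h1>My Posts</h1>
<article>
  <h1>First post</h1>
  <section>
    <h1>A sub-heading in the first post</h1>
  </section>
</article>
<article>
  <h1>Second post</h1>
</article>
```

In the second half every heading is an H1, so any tool that does not apply the outline algorithm sees a flat list of top-level headings.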
But the HTML outline algorithm was never implemented, or never implemented well, and so there has never been any point in using HTML5 outlining markup.
Although HTML5 Outlining was described in the draft HTML specification, and talked about everywhere as if it were real, it never existed anywhere in real life. However, the algorithm was described in official documents so it seeped into the collective HTML user psyche and, more importantly, into loads of online advice.
Despite bringing it up in the first place, the HTML5 spec later said HTML5 Outlining could not be relied upon, and the advice has always been to use H1 to H6 tags in markup explicitly.
Meanwhile though, discussions about the future of document outlining continue, and one idea is to keep <h1> to <h6> tags as is, but to create a new tag simply called <h> that would implement the document outline using sections.
This approach seems most sensible to me as it maintains backwards compatibility and gives us hope that someone somewhere will implement something at some point in the future that may work.
"The use of <h> would be free from any backward incompatibilities and would also free us from the current restriction of nesting depth of just six levels (that could be useful e.g. for HTML-based books that may contain far more levels than a regular HTML page)."
Marat-Tanalin - W3C GitHub HTML Issues
You should follow the loudest HTML5 specification advice which is to continue using H1 through to H6, and also don't forget to make sure they are being used in the right order. Thankfully if you are using Oxygen this is easier to do than on most typical WordPress themes. With Oxygen, you are in charge so largely you can make things work as you wish.
This means you can structure your documents properly and this should mean they'll be better indexed in search engines and be better for accessibility tools. Most people won't be doing this so it should place your site at an advantage. Right now my own sites do not do this. Indeed, this article is part of my own research to improve this situation for myself.
For a long time it was held that you should only have one H1 on a page. This goes back to pre-HTML5 times when the H1 really existed to point out the overall topic of a page to both the reader and the search engine. But then the idea of HTML5 Outlining came along, and it somehow got confused with the idea that it was OK for multiple H1 tags to lie about anywhere.
So despite the spec still allowing for multiple H1 tags in very specific circumstances, the real-life advice is still to only use one H1 per page. One H1 per page is what accessibility tools require, so that should be reason enough for us to adhere to one per page.
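A minimal sketch of what that looks like, with invented headings: a single H1 for the page topic, and the lower levels used in order without skipping a level. The indentation is only there to make the hierarchy easier to see.

```html
<h1>Choosing a garden shed</h1>    <!-- the only H1: the overall page topic -->
  <h2>Sizes</h2>
    <h3>Small sheds</h3>
    <h3>Large sheds</h3>
  <h2>Materials</h2>               <!-- back up to H2 for the next topic -->
    <h3>Timber</h3>
    <h3>Metal</h3>
```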
No ... but I've decided to keep my use of all the HTML5 semantic tags to their most basic. Bear in mind that section tags are not the only tags capable of sectioning. Article, aside and nav tags are also sectioning tags, and the body tag acts as the outermost section.
So by all means use the article tag to demarcate an article (in this context read blog post). But reserve the use of sections for anything that is demarcated within an article or within, say, a home page, and that has no better tag to implement that demarcation.
You could add a section every time you add a heading, to make the demarcation of the heading absolutely clear. I think. But this would be tough if you are using the Gutenberg editor for the majority of your content, as I imagine you are. All search engines, accessibility tools, and analysis tools are capable of understanding the area headed up by a heading tag in a document. They don't need sections to help them do it, so I am not going to use sections at every turn.
Given that we will not use sections at every turn, when might we use them? Here is an example. Say you like to summarise each of your blog posts and place that summary at the top of the post. You could place the summary inside a section tag simply to demarcate it.
This can help screen readers point out to users that the summary is part of the post but is meant to be noticed or highlighted within it.
Look at it another way. If you just gave the blog post summary an H2 tag, then it is no different to any other paragraph headed by an H2 on the page. It's up to you to decide to use a section in this instance or not. If an area of a page looks different to a sighted visitor - is highlighted in some way - and is not an aside, nor nav, then maybe it deserves the potential to be highlighted in accessibility tools too. You can begin to make this happen by placing it inside section tags.
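As a sketch, with invented ids and text, a summary at the top of a post could be wrapped like this. The aria-labelledby attribute is an optional extra in this sketch: a section is generally only announced as a region by assistive technology when it has an accessible name, and pointing it at its own heading is one way to give it one.

```html
<article>
  <h1>How to choose a garden shed</h1>

  <!-- The summary is part of the article, but demarcated as its own area -->
  <section aria-labelledby="post-summary-heading">
    <h2 id="post-summary-heading">Summary</h2>
    <p>A one-paragraph summary of the post goes here.</p>
  </section>

  <h2>Sizes</h2>
  <p>The rest of the post continues as normal.</p>
</article>
```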
When using WordPress these are the easiest conclusions I could draw:
Aside tags are used for repeated information that appears across the site, information that is not really part of the flow of the main content on the page, or information only loosely related to the main content. So this could be:
If your blog post or website page is simply constructed, to the point, and concentrated on its single H1 promise, then it's easier for the search engine to recognise and index.
The use of asides is important to keep repeated content separate from unique content on a page, but don't forget, this separation works in the context of sectioning elements. The aside is related but somewhat separate to the content inside the enclosing sectioning tag (be that an article, section or body tag). Also note, the main tag is not a sectioning tag.
So what might a simple page structure look like? As I read around the available information I found that what works for accessibility tools is not necessarily what the HTML5 spec suggests, or it may be that the HTML5 spec is vague in places, or both.
The page structure below is what I will use for WordPress pages. It won't be exactly like this, because with Oxygen/WordPress there will unavoidably be some extra divs in between some of those semantic tags. But the semantic tags will be in that order.
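A sketch of that structure, based on the description in the rest of this section. The headings and text are placeholders, as are the contents of the header and footer, and the three asides are the pull-quote, the related posts and the popular posts or call to action:

```html
<body>
  <header>
    <nav>Main menu</nav>
  </header>

  <main>
    <article>
      <h1>Page title</h1>
      <p>The main, re-distributable content of the page.</p>

      <!-- Aside 1: related to the article and travels with it -->
      <aside>A pull-quote from the article</aside>
    </article>

    <!-- Aside 2: unique to this page, but not part of the article unit -->
    <aside>
      <h2>Related posts</h2>
    </aside>
  </main>

  <!-- Aside 3: related to the website as a whole, not to this page -->
  <aside>
    <h2>Popular posts</h2>
  </aside>

  <footer>Site footer</footer>
</body>
```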
That is a basic structure I'll be using for pages. Notice I've deliberately put in three aside elements.
The article HTML element contains the re-distributable content of the page. In my example, the aside inside the article element is for a pull-quote. This is because putting the aside inside the article specifically says this information is aside (not absolutely required, but related to the article) and should be included in the article unit distribution, if the article is seen on its own elsewhere or out of context.
The main HTML element contains the dominant content of the body element, and normally there should only be one main element in a document. It contains the unique content for the page or view.
By placing the related posts aside outside the article tag we are indicating that the related posts do not form part of the individual distributable article unit. But they are related to and unique to the contents of main.
Note however that the main tag is not a sectioning tag and as such, placing the aside here (as opposed to in the body tag) is not significant. However the main tag does identify the primary content on the page. So if your aside is considered part of the primary page content, as related posts would be, IMO, then place the aside inside main.
This is reserved for information that is related to the website, but not specifically to the article. It is also probably not unique to any particular page. In my example I have shown an aside with a list of popular posts, or a call to action.
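A sketch of the equivalent structure for a blog post, again with placeholder text. The element that differs from a page is the section holding the summary inside the article:

```html
<body>
  <header>Site header</header>

  <main>
    <article>
      <h1>Post title</h1>

      <!-- A demarcated part of the article, with its own heading -->
      <section>
        <h2>Summary</h2>
        <p>A one-paragraph summary of the post.</p>
      </section>

      <h2>First heading in the post</h2>
      <p>The body of the post.</p>
    </article>

    <!-- Sidebar content that relates to this post stays inside main -->
    <aside>
      <h2>Related posts</h2>
    </aside>
  </main>

  <!-- Sidebar content that is general to the whole site sits outside main -->
  <aside>
    <h2>Popular posts</h2>
  </aside>

  <footer>Site footer</footer>
</body>
```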
That is a basic structure I'll be using for posts. I'll only discuss the items I have not mentioned already in information about page structure above.
I've used one section element inside the article. Note it has its own heading tag. All sections should have a heading tag. This is to show a separate area of the article that is not an aside - i.e. it is wholly part of the article - but it deserves to be specially demarcated. In this case it is a one-paragraph summary of the article.
The information about sidebars is varied. This is what I have managed to glean. If the sidebar - literally the area usually found on the right-hand side of your page - contains information that is relevant to the main content, it can appear inside the main element. But if it is general to the whole site and repeated on many pages, then it should be outside the main element, in the body element.
The only other thing to mention is that in my research I saw sidebars - areas to the right of the main content - set up as divs with aside elements inside them. This is because making the divs appear at the side requires styling and when we do things specifically for styling the HTML5 specification says use a div.
However on a mobile device, right-hand or left-hand sidebars move to beneath the main content. As most people access the web on mobile devices, I am inclined to put sidebars beneath the main content on large devices too, and have done. In which case I don't think I'd bother encasing any aside element inside a div. I'd just let the natural flow handle it.
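The two options can be sketched like this. The class name is invented, and the styling that would position the div at the side is left out:

```html
<!-- Option 1: a div used purely as a styling hook, with the aside inside it -->
<div class="sidebar">
  <aside>
    <h2>Popular posts</h2>
  </aside>
</div>

<!-- Option 2: no wrapper, the aside simply follows the main content in the flow -->
<main>The main content of the page</main>
<aside>
  <h2>Popular posts</h2>
</aside>
```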
OK - so that was my interpretation of acres of the advice I have read. What do you think?