We talk all the time about how to better communicate both visually and verbally. We talk about making your aesthetics meaningful and using design principles to help your audience understand your content. What about your code?
Can you make the code behind your websites more meaningful? Yes, you can, and you do it by using semantic html.
What Are Semantics?
Before we get to semantic html we should quickly define semantics.
semantic (adj.)
- of or relating to meaning, especially in language
semantics (n.)
- the meaning or relationship of meanings of a sign or set of signs
When people say they want to make something more semantic, they simply want to make that thing more meaningful.
What is Semantic HTML?
Semantic html is using html to reinforce structural meaning. It’s about using tags, class names, and ids that reinforce the meaning of the content within the tags.
When content is a paragraph of text you mark it up with paragraph tags. When you have a list of items you use list tags. If there’s some order to the list items you use an ordered list and when the order isn’t important you use an unordered list.
Each of the tags mentioned is semantic because it describes the content inside it.
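As a minimal sketch of the tags just described (the content is made up for illustration):

```html
<!-- A paragraph of text goes in paragraph tags -->
<p>Semantic markup describes what the content is.</p>

<!-- Order matters (steps in a process), so use an ordered list -->
<ol>
  <li>Write the content</li>
  <li>Mark it up</li>
  <li>Style it</li>
</ol>

<!-- Order doesn't matter, so use an unordered list -->
<ul>
  <li>Screen readers</li>
  <li>Feed readers</li>
  <li>Search engines</li>
</ul>
```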
Semantic html is also about using tags in the right way. A blockquote exists to hold a quote inside, not because some bit of text needs to be indented.
How something looks has nothing to do with what it means. It’s why we separate html and css. The former is for structure and meaning, while the latter is for how we present that structure and meaning.
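A rough sketch of that separation, assuming a hypothetical class name of note for text that only needs an indent:

```html
<style>
  /* Presentation lives in the css: indent any paragraph marked as a note.
     The class name "note" is an assumption for this example. */
  .note { margin-left: 2em; }
</style>

<!-- Not semantic: a blockquote used only because browsers indent it -->
<blockquote>This text isn't a quotation. It just needed an indent.</blockquote>

<!-- Semantic: the blockquote holds an actual quote -->
<blockquote>
  <p>How something looks has nothing to do with what it means.</p>
</blockquote>

<!-- Semantic: ordinary text stays a paragraph and the css handles the indent -->
<p class="note">This text isn't a quotation, so it stays a paragraph.</p>
```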
Say you write an article with a main heading and several subheadings. You could easily place each of the headings in a div, add a class or id, and style those divs in any way you’d like.
We could for example style the following html to visually show our main heading and subheadings:
{code type=html}
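The original snippet isn’t preserved at this point. As a rough sketch of the kind of markup being described: the class name big-and-bold is the one this post uses, while the other class names and the heading text are assumptions made for illustration.

```html
<style>
  /* Hierarchy is communicated only through size and weight */
  .big-and-bold    { font-size: 2em;   font-weight: bold; }
  .medium-and-bold { font-size: 1.5em; font-weight: bold; }
  .small-and-bold  { font-size: 1.2em; font-weight: bold; }
</style>

<div class="big-and-bold">Main Heading</div>
<p>Some introductory text.</p>

<div class="medium-and-bold">First Subheading</div>
<p>Text for the first section.</p>

<div class="small-and-bold">A Sub-Subheading</div>
<p>Text for a nested section.</p>
```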
Visually each of those headings could communicate hierarchy through size and weight. You could clearly communicate the meaning of each heading to your audience through presentation.
However, anyone or anything viewing the html alone wouldn’t have this meaning communicated to them. Your hierarchy could be inferred by comparing all those divs, but there’s no real hierarchy there.
<div class="big-and-bold"> tells us absolutely nothing meaningful about the content inside the div. It only suggests what the content will look like.
An <h1> tag on the other hand clearly says this is the most important heading on the page. This is the top of the hierarchy.
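For comparison, a sketch of an article outline marked up with heading tags (the heading text is made up). The hierarchy is now in the markup itself, and css can still make the headings look any way you’d like:

```html
<h1>Main Heading</h1>
<p>Some introductory text.</p>

<h2>First Subheading</h2>
<p>Text for the first section.</p>

<h3>A Sub-Subheading</h3>
<p>Text for a nested section.</p>

<h2>Second Subheading</h2>
<p>Text for the second section.</p>
```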
Most people who aren’t web developers, designers, or SEOs probably aren’t going to view your article by looking at your html directly. Since you can visually communicate meaning to your audience, why is semantic html important?
Why Semantic HTML is Important
Semantic html is an additional layer of communication. When you use semantic html you communicate more than when you use non-semantic html. Isn’t that pretty much the point of what we do? Communicate.
Real people looking only at how your page displays may never get that additional communication, but machines will.
Machines like screen readers and feed readers and search engines. Providing that extra meaning allows those machines to translate the meaning for real people.
Semantic html is important because it’s:
- Clean: It’s easier to read and edit, which saves time and money during maintenance.
Imagine adding the non-semantic class="red" to a span of text. Later you decide that text should be green. That’s going to be confusing to someone editing the html at a later date.
Better would be to use something like class="price" (assuming the content is a price on an ecommerce site). You could then change the color from red to green to blue to orange without confusing what that content is.
- More accessible: It can be better understood by a greater variety of devices. Those devices can then add their own style and presentation based on what’s best for the device.
A screen reader could raise and lower volume to communicate the hierarchy of your h1-h6 tags for example since you’ve clearly indicated a hierarchy.
The more meaningful the structure of your content, the better different tools can make use of your content.
- Search engine friendly: This is still debatable, as search engines rank content and not code, but search engines are making greater use of things like microformats to understand content.
Google can read the hReview microformat and can use the data to create richer snippets below search results.
They could potentially rank pages they know to be reviews higher when someone is specifically searching for a review.
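To make the microformat idea concrete, here is a hedged sketch of a review marked up with hReview class names. The class names are from the hReview draft as I understand it; the book, reviewer, date, and rating are invented, so check the specification before relying on the details.

```html
<div class="hreview">
  <h3 class="summary">A clear introduction to design principles</h3>

  <!-- The thing being reviewed -->
  <p class="item">Reviewed: <span class="fn">Example Design Book</span></p>

  <!-- Who reviewed it and when (the reviewer is an embedded hCard) -->
  <p class="reviewer vcard">
    Reviewer: <span class="fn">Jane Doe</span>,
    <abbr class="dtreviewed" title="2011-03-14">March 14, 2011</abbr>
  </p>

  <p>Rating: <span class="rating">4</span> out of 5</p>
  <p class="description">Well organized and easy to follow.</p>
</div>
```

To a visitor this is just a short review. To a machine that knows the microformat, the class names identify which part is the rating, which is the reviewer, and so on.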
I’m sure you’ve seen arguments for why it’s better to use css than tables to lay out a website. One reason is semantics.
An html table is meant to house data, the kind of stuff you’d place in a spreadsheet. Using a table for layout confuses the communication.
Just as you wouldn’t create a slide for a presentation by placing images and text in spreadsheet cells, you shouldn’t build a page layout by placing content in html table cells.
You might have a slide that features data in a spreadsheet, just as you might create a web page that features data in a table, but you wouldn’t try to design every slide using a spreadsheet.
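A sketch of a table used the way it’s meant to be used, for spreadsheet-like data. The figures are placeholders invented for the example:

```html
<table>
  <caption>Monthly visits by source</caption>
  <thead>
    <tr>
      <th scope="col">Source</th>
      <th scope="col">Visits</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th scope="row">Search</th>
      <td>1,200</td>
    </tr>
    <tr>
      <th scope="row">Referral</th>
      <td>450</td>
    </tr>
  </tbody>
</table>
```

The caption and header cells describe the data, which is information a layout table can’t honestly provide.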
Is it OK to Use Non-Semantic HTML?
Many would answer no. Some might think I’m crazy for even asking the question.
Since semantic html is communicating more than non-semantic html it would seem to make sense to always use semantic html. But are there times when non-semantic html might actually make development easier?
Consider css grid frameworks, which typically include class names like grid_1 or container_12. They aren’t semantic.
A class name like container_12 indicates a 12 column grid, but what happens later when you want to change the site to a 16 column grid or a 4 column grid?
Classes like grid_1 don’t exactly help screen readers or search engines understand content either. They aren’t describing the content. They’re describing the presentation of content.
At the same time those class names can be very meaningful to a designer who works with grids. At a glance the underlying grid structure of the design is readily apparent.
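A sketch of the tradeoff, using grid class names in the style mentioned above. The ids in the second version are hypothetical:

```html
<!-- Grid-based names: the layout is obvious at a glance
     (an 8 column main area beside a 4 column sidebar) -->
<div class="container_12">
  <div class="grid_8">Main article content</div>
  <div class="grid_4">Sidebar</div>
</div>

<!-- Content-based names: the purpose is obvious, the layout isn't -->
<div id="page">
  <div id="content">Main article content</div>
  <div id="sidebar">Sidebar</div>
</div>
```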
Grid frameworks generally help speed development as well, so there are benefits to using them. Yet some would say we should never use them because of the lack of semantics.
Where do we draw the line?
Good class names should never need to change; the presentation of a website, however, sometimes does. When structure and presentation are mixed, as in class="red" or class="grid_5", we have to:
- Change structure to change presentation
- Confuse style and structure such as styling class="red" to be green
- Leave behind classes in html that are no longer styled at all
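A small sketch of the second problem in that list next to the semantic alternative. The price is made up:

```html
<style>
  /* The class says "red", but the design changed: name and style now disagree */
  .red { color: green; }

  /* The class says what the content is, so the color can change freely */
  .price { color: green; }
</style>

<p>Now only <span class="red">$19.99</span></p>
<p>Now only <span class="price">$19.99</span></p>
```

Both spans display the same way. Only the second still makes sense to whoever edits the html next.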
One solution is to use frameworks like Compass, Scaffold and Sass to convert our non-semantic code to semantic code.
Doing so we gain the benefits of working with grid based class names during development and then convert our code to semantic markup for the live site.
Like most design decisions there’s always a tradeoff. We have to weigh the good and bad and decide which is best to use for a given project.
I’m still mixed on how semantic my code needs to be. All things being equal we should always choose the semantic option, but all things are seldom equal.
At times I think the benefits of some non-semantic html as in the grid based class names are worth the loss of semantic meaning. I do think we should strive to write semantic code. I don’t think non-semantic code is as evil as some would suggest.
Ask me in 6 months though and I may easily have changed my mind. If you have arguments for or against the use of semantic code please share.
Summary
Communication is the central task of web designers. Most of the time we think of communication in terms of the words we use and the visuals we create, but it also extends to the code we write.
Semantics are about meaning. Writing semantic code means writing more meaningful code. We should strive to write html that describes content and not the presentation of that content.
class="red" is always a bad idea, but what about class="grid_4"? Is all non-semantic markup bad or is it ok at times and under certain circumstances?
My best answer is we should do our best to write semantic code, but we shouldn’t obsess over it. It’s similar to how we should do our best to write valid code, though not worry about an invalid line here or there. Again, I haven’t entirely made up my mind about this.
Where do you stand? Should all code be semantic? Is it ok to be non-semantic at times? What do you consider best practice?
Semantic HTML coding is, IMO, a good thing, but it isn’t always needed. Not all sites need #header, #footer, #navigation … But we definitely try to make websites that have semantic, “meaningful” names. I think people often forget that meaningful (semantic) doesn’t simply equal #header, #footer, #navigation.
As for CSS frameworks, nothing forbids having div id="header" class="grid_1" or div id="nav" class="grid_4".
I also think CSS frameworks can have a meaningful naming system. For example, I usually don’t use names like grid_1 and grid_2 in my CSS frameworks. I usually go with something like dl200 (div left 200px) or g960 (grid width=960px), so the name contains meaningful (semantic) information, and if you know that g960 is 960px you will know that g480 means 480px.
I think we’re thinking alike. I don’t have a problem using grid_4 etc. I do agree they impart semantic meaning too, though that meaning is presentational and doesn’t describe the content. I know many wouldn’t consider that to be semantic html, but some certainly would.
I see semantic html as something to strive for, but not to get too caught up if you occasionally use a non-semantic class here or there if it helps in some area of development.
On the other hand, using class names like “dl200” is really, *really*, bad practice. When you use class names like that, there are a number of problems:
1) It’s basically the same as having a class called “red”. Whenever you want to change the way a certain element on your website looks, you either have to change the HTML (the class that is used), or change the CSS to something that no longer corresponds to the class’ name.
2) It is a kind of pseudo-inline css. If you want a centered div, 200 pixels wide, red text and a blue bg, you could get things like this:
Which may seem fine, because hey, it says exactly what it does. The problem is that you could have just typed all that out in a style-attribute, it’s basically the same.
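The example this comment refers to is missing here. As a guess at the kind of markup meant, with entirely hypothetical class names, next to the equivalent style attribute:

```html
<!-- Presentational class names (hypothetical): each one stands for one css rule -->
<div class="center w200 red bg-blue">Some content</div>

<!-- The same information typed out as inline css -->
<div style="margin: 0 auto; width: 200px; color: red; background: blue;">Some content</div>
```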
The main thing to remember is that semantic classnames aren’t going to do much for you short term. They really do shine in the long run though, whenever you have to go back to some project you worked on a few months ago, you’ll be real glad you used semantic names.
All true. I would argue that it’s ok to have non-semantic classnames here and there though. I think we can get too locked into semantics at the cost of getting something done sometimes.
Granted a class like dl200 says absolutely nothing and should be avoided. Something like grid_4 on the other hand isn’t semantic in the sense of it describing the content, but it is semantic in the sense of describing what the class does.
@Tom: You are right to point out the bad practice with dl200. Those are the class names that I “invented” for my CSS framework, Emastic.
I wanted something that would be fast and intuitive to implement and would produce a small HTML footprint because of the short class names.
The problem you mentioned is maintenance: having to change the HTML rather than the CSS, which can be a problem for larger sites. But from practical experience, multi-layout sites like Yahoo use frameworks similar to my Emastic, where the HTML content changes but the CSS framework does not.
Interesting that you see larger sites changing the html more while keeping the css framework. Are the sites moving more toward an object oriented css?
Actually Yahoo YUI is the first version of OO CSS. Google is using their own CSS Framework. I think OO CSS works well for the big sites.
Thanks Vlad. I didn’t realize Yahoo was first with OO CSS. I take it Nicole Sullivan was working there at the time.
You shouldn’t use div and span for everything, though both tags do have their uses.
Great article. I don’t really know too much in this field, but you have educated me well.
I’m not a techie or a web developer, so I apologize in advance. Basically semantic html is all under the hood, right? It should be used to make the html code more readable and accessible to other coders and has no effect on how the final page actually looks or functions in the browser, correct?
Right. It’s not going to change how things look to you or me. A web developer can easily make semantic and unsemantic code look exactly alike.
Part of it is to make things easier for anyone who later works on the site, including yourself. It’s about more than just making the code readable to coders, though.
Part of it is so other machines can better understand what’s going on. For example, in the post I mentioned using an h1 tag as opposed to a div with a class on it. Regardless of how it looks, browsers, screen readers, and other machines that can read the code will understand that what’s inside the h1 tags has a different importance than other text on the page.
That machine could then communicate that importance in some way. A screen reader might speak the words louder. The screen reader wouldn’t know to speak louder if the div tag were used. The structural meaning of the content wouldn’t be communicated.
Does that make sense?
Hi, thanks for a really thorough post. In my personal opinion, it is always better to use semantic HTML. It creates a more universal language that can be read by colleagues and fellow developers, providing instant recognition to those who understand HTML. I disagree with your line where you imply that we should “not worry about an invalid line here or there”. I believe that good clean code should always be used and it should be W3C compliant regardless. Thanks again for an interesting read.
Thanks Alan. I agree that in general we should strive to write semantic code. I’m not sure how important it is to make every line semantic though. I think many of us would disagree anyway about whether or not some code is or isn’t semantic.
With invalid code I think it’s the same. If you’re interested, here’s a post I wrote a while back on the importance of writing valid code. An example of where it’s fine to write invalid code is when using progressive enhancement. Usually you’ll include vendor-specific code like -moz or -webkit. Neither will validate, but both are fine to use.
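A sketch of what that vendor-specific code might look like, using rounded corners as the enhancement. The class name box is an assumption for the example:

```html
<style>
  /* Progressive enhancement: rounded corners where supported, square elsewhere.
     The class name "box" is an assumption for this example. */
  .box {
    -moz-border-radius: 5px;    /* older Firefox; vendor-specific */
    -webkit-border-radius: 5px; /* older Safari and Chrome; vendor-specific */
    border-radius: 5px;         /* the standard property, listed last */
  }
</style>

<div class="box">Content</div>
```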
I would prefer always semantic HTML coding.
Good Information. Thanks Alan..
I think it is best to use it as it is very clean web design code!
Your article is well written. It’s a well-written article. Either or. I’m not a stickler. But yet I’m mostly a perfectionist.
Semantic coding is a good idea but like most good coding ideas, taking it too far or out-of-place is as bad as not doing it at all.
Common sense and good natural judgement (which isn’t always learned) rule. Unless expediency is commanding the day. Which is often.
Thanks Gary. True about not taking things too far. Everything in moderation is often a good rule of thumb.
A little late; I found this researching industry positions on web semantics.
I think what is useful to consider here, especially when it comes to elements and classes/IDs, is where the meaning comes from. Ultimately, you can say semantics simply stem from who and how many agree on it [1].
[1] http://meiert.com/en/blog/20111026/on-semantics-in-html/
Thanks Jens. Better late than never. I’m late too in replying to your comment.
I’m publishing a series about some of the new html5 elements like header and article, etc. The first post should go live on Monday. I’m working on a demo for it now.
I mention a few times exactly what you’re saying about semantics. If we as an industry can start using the new elements in a consistent way, regardless of whether or not it’s the way the w3c suggests, we communicate semantics. It’s the consistency that creates them.
I went into the series skeptical that I would use the new elements, but now that I’m reaching the end, I’ve had a change of mind and plan on using them more and seeing if I can be part of shaping the semantics of the new elements.
And thanks for the link. I remember your article when you first published and looking at it again, I think it helped shape my thoughts about this topic.
Hello, Steven!
First and foremost, thank you for the article, the effort you put into writing this and answering comments is greatly appreciated.
My question – do you think there is a practical advantage to using the ‘new’ semantic tags instead of spans and divs? I mean, I understand there is a difference in theory – article, section, paragraph, aside and other such tags being much more specific with regards to the content they wrap. But from a practical SEO standpoint, is there a difference? Does it help Google parse and understand the content of a page better, and thus possibly rank the website higher? I’ve done some digging with GeekReport, few webmasters seem to be using the new tags.
Thanks Ryan.
That’s a great question, but I’m not sure I have a great answer. I think anything search engines can use to help them deliver better results is something they’ll use. I do think they read semantic tags as do screen readers among others and I do think the semantic tags help search engines better understand your web pages and site.
Does it have a huge impact on where a page ranks? I doubt it does at the moment, but I would think it helps. My guess is a lot of people don’t use the tags because we developed habits prior to them existing. I still reach for divs and spans more out of habit than any other reason.
I do use the nav element, the main element, and the article element. I tend not to use section, and if I use header and footer, it’s usually just for the main page header and footer. Even if they don’t currently make a lot of difference in where a page ranks, it’s not as though they take more time to use. I used to wrap navigation in a div with a class of nav. Now I wrap it in a nav element. Once I was used to doing that, it felt no different than using a div as the wrapper.
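As a sketch of that change (the links are placeholders):

```html
<!-- Before: a generic wrapper with a class -->
<div class="nav">
  <ul>
    <li><a href="/">Home</a></li>
    <li><a href="/blog/">Blog</a></li>
  </ul>
</div>

<!-- After: the element itself says this is navigation -->
<nav>
  <ul>
    <li><a href="/">Home</a></li>
    <li><a href="/blog/">Blog</a></li>
  </ul>
</nav>
```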
Thanks for this article.
This helps clear up my confusion about the markup elements/tags.
I have one book that talks about structural tags, and another book and articles online that talk about semantic tags, which confused me. I also read about inline elements and block elements, which added even more to my confusion.
Based on what I read and understood from your article and from what others commented here, markup elements are now appropriately called semantic tags/elements.
Markup tags/elements can be categorized as non-semantic, such as span and div, and semantic, such as header, footer, and paragraph. Semantic tags can be further subdivided into structural (block-level) and inline text semantics.
I hope you could confirm if my understanding is correct.
Thanks Miko. I’m glad I could help. I think you generally have it right. The idea is that semantic tags have meaning. An H1 for instance means heading level one and its content is something different than content inside a paragraph tag. On the other hand a div is a container without any specific meaning.