How Important Is Semantic HTML?

Posted: 23 May 2011 05:30 AM PDT

We talk all the time about how to better communicate both visually and verbally. We talk about making your aesthetics meaningful and using design principles to help your audience understand your content. What about your code?

Can you make the code behind your websites more meaningful? Yes, you can, and you do it by using semantic html.

What Are Semantics?

Before we get to semantic html we should quickly define semantics.

semantic (adj.)

of or relating to meaning, especially in language

semantics (n.)

the meaning or relationship of meanings of a sign or set of signs

When people say they want to make something more semantic, they simply want to make that thing more meaningful.

What is Semantic HTML?

Semantic html is using html to reinforce structural meaning. It’s about using tags, class names, and ids that reinforce the meaning of the content within the tags.

When content is a paragraph of text you mark it up with paragraph tags. When you have a list of items you use list tags. If there’s some order to the list items you use an ordered list and when the order isn’t important you use an unordered list.

Each of the tags mentioned is semantic since it describes the content inside it.
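
As a quick sketch of the tags just mentioned (the content itself is made up for illustration), a paragraph, an ordered list, and an unordered list might be marked up like this:

```html
<!-- A paragraph of text goes in a paragraph tag -->
<p>Making coffee takes only a few minutes.</p>

<!-- The order of these steps matters, so an ordered list fits -->
<ol>
  <li>Boil the water</li>
  <li>Grind the beans</li>
  <li>Pour and wait four minutes</li>
</ol>

<!-- The order of these items doesn't matter, so an unordered list fits -->
<ul>
  <li>Milk</li>
  <li>Sugar</li>
  <li>Cinnamon</li>
</ul>
```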

Semantic html is also about using tags in the right way. A blockquote exists to hold a quote inside, not because some bit of text needs to be indented.
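
Here’s a minimal illustration of that difference. The class name intro-note and its style rule are hypothetical, and the style rule would normally live in a separate stylesheet rather than sitting next to the markup:

```html
<!-- A real quotation: blockquote describes what the content is -->
<blockquote>
  <p>When people say they want to make something more semantic,
     they simply want to make that thing more meaningful.</p>
</blockquote>

<!-- Text that only needs indenting: keep it a paragraph and let css handle the look -->
<style>
  .intro-note { margin-left: 2em; }
</style>
<p class="intro-note">This paragraph is indented, but it isn't a quote.</p>
```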

How something looks has nothing to do with what it means. It’s why we separate html and css. The former is for structure and meaning, while the latter is for how we present that structure and meaning.

Say you write an article with a main heading and several subheadings. You could easily place each of the headings in a div, add a class or id, and style those divs in any way you’d like.

We could, for example, style the following html to visually show our main heading and subheadings:

<div class="big-and-bold">...</div>
<div class="not-quite-so-big-and-bold">...</div>
<div class="small-and-thin">...</div>

Visually each of those headings could communicate hierarchy through size and weight. You could clearly communicate the meaning of each heading to your audience through presentation.

However, anyone or anything viewing the html alone wouldn’t have this meaning communicated to them. Your hierarchy could be inferred by comparing all those divs, but there’s no real hierarchy there.

<div class="big-and-bold"> tells us absolutely nothing meaningful about the content inside the div. It only suggests what the content will look like.

An <h1> tag on the other hand clearly says this is the most important heading on the page. This is the top of the hierarchy.
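
For comparison, here’s a sketch of the same three levels using heading tags. The heading text is placeholder content:

```html
<!-- The tag itself now states each heading's place in the hierarchy -->
<h1>Main heading</h1>
<h2>Subheading</h2>
<h3>Sub-subheading</h3>
```

You can still style these headings any way you’d like with css. The difference is that the hierarchy no longer depends on that styling.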

Most people aren’t web developers, designers, or SEOs, and they probably aren’t going to view your article by looking at your html directly. Since you can visually communicate meaning to your audience, why is semantic html important?

Why Semantic HTML is Important

Semantic html is an additional layer of communication. When you use semantic html you communicate more than when you use non-semantic html. Isn’t that pretty much the point of what we do? Communicate.

Real people looking only at how your page displays may never get that additional communication, but machines will.

By machines I mean screen readers, feed readers, and search engines. Providing that extra meaning allows those machines to translate the meaning for real people.

Semantic html is important because it’s:

Clean: It’s easier to read and edit, which saves time and money during maintenance.
Imagine adding the non-semantic class="red" to a span of text. Later you decide that text should be green. A class named red that displays green is going to confuse anyone editing the html at a later date.

Better would be to use something like class="price" (assuming the content is a price on an ecommerce site). You could then change the color from red to green to blue to orange without confusing what that content is.

More accessible: It can be better understood by a greater variety of devices. Those devices can then add their own style and presentation based on what’s best for the device.
A screen reader could, for example, raise and lower its volume to communicate the hierarchy of your h1-h6 tags, since you’ve clearly indicated a hierarchy.

The more meaningful the structure of your content, the better different tools can make use of your content.

Search engine friendly: This is still debatable, as search engines rank content and not code, but search engines are making greater use of things like microformats to understand content.
Google can read the hreview microformat and can use the data to create richer snippets below search results.

Search engines could potentially rank pages they know to be reviews higher when someone is specifically searching for a review.
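
As a rough sketch of what that looks like, here’s a review marked up with class names from the hReview microformat. The product, reviewer, date, and rating are all invented for illustration, and whether a search engine actually shows a richer snippet for any given page is up to the search engine:

```html
<div class="hreview">
  <!-- The thing being reviewed -->
  <span class="item"><span class="fn">Acme Espresso Machine</span></span>

  <!-- Who reviewed it and when -->
  <span class="reviewer vcard">Reviewed by <span class="fn">Jane Doe</span></span>
  on <abbr class="dtreviewed" title="2011-05-20">May 20, 2011</abbr>

  <!-- hReview ratings default to a 1.0 to 5.0 scale -->
  <p>Rating: <span class="rating">4</span> out of 5</p>

  <p class="summary">Great coffee, slightly noisy.</p>
  <p class="description">It heats up quickly and pulls a consistent shot.</p>
</div>
```

Nothing here changes how the review looks. The class names only tell a machine which piece of text is the item, the reviewer, the date, and the rating.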

I’m sure you’ve seen arguments for why it’s better to use css than tables to lay out a website. One reason is semantics.

An html table is meant to house data, the kind of stuff you’d place in a spreadsheet. Using tables for layout confuses the communication.

Just as you wouldn’t create a slide for a presentation by placing images and text in spreadsheet cells, you shouldn’t build a page layout by placing content inside html table cells.

You might have a slide that features data in a spreadsheet, just as you might create a web page that features data in a table, but you wouldn’t try to design every slide using a spreadsheet.
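
For contrast, here’s a sketch of a table used the way it’s meant to be used, for data. The figures are made up for illustration:

```html
<table>
  <caption>Monthly visitors</caption>
  <thead>
    <tr>
      <!-- scope tells assistive tools which cells each header describes -->
      <th scope="col">Month</th>
      <th scope="col">Visitors</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>April</td>
      <td>12,400</td>
    </tr>
    <tr>
      <td>May</td>
      <td>13,150</td>
    </tr>
  </tbody>
</table>
```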

Is it OK to Use Non-Semantic HTML?

Many would answer no. Some might think I’m crazy for even asking the question.

Since semantic html is communicating more than non-semantic html it would seem to make sense to always use semantic html. But are there times when non-semantic html might actually make development easier?

Consider css grid frameworks, which typically include class names like grid_1 or container_12. They aren’t semantic.

A class name like container_12 indicates a 12 column grid, but what happens later when you want to change the site to a 16 column grid or a 4 column grid?

Classes like grid_1 don’t exactly help screen readers or search engines understand content either. They aren’t describing the content. They’re describing the presentation of content.

At the same time those class names can be very meaningful to a designer who works with grids. At a glance the underlying grid structure of the design is readily apparent.
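
As a sketch of what that looks like in markup, assume a 12 column framework that follows the container_12 and grid_N naming pattern. The grid_8 and grid_4 classes and the content are illustrative and not copied from any particular framework’s documentation:

```html
<!-- container_12 wraps a 12 column grid -->
<div class="container_12">
  <!-- 8 columns + 4 columns = 12 columns -->
  <div class="grid_8">Article content</div>
  <div class="grid_4">Sidebar content</div>
</div>
```

The layout is obvious from the class names alone, but nothing in them says which div is the article and which is the sidebar.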

Grid frameworks generally help speed development time as well, so there are benefits to using them. Yet some would say we should never use them because of the lack of semantics.

Where do we draw the line?

Good class names should never need to change. The presentation of a website, however, sometimes does change. When structure and presentation are mixed, as in class="red" or class="grid_5", we end up doing one of the following:

Change structure to change presentation
Confuse style and structure, such as styling class="red" to be green
Leave behind classes in html that are no longer styled at all
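
Here’s a small sketch of the second problem, using a made-up price. The style rules would normally live in a separate stylesheet:

```html
<style>
  /* Presentational name: after a redesign the rule contradicts the class name */
  .red { color: green; }

  /* Semantic name: the color can change without the name ever being wrong */
  .price { color: green; }
</style>

<p>Only <span class="red">$19.99</span> today.</p>
<p>Only <span class="price">$19.99</span> today.</p>
```

Both prices display in green, but only the second one still makes sense to someone reading the html.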

One solution is to use frameworks like Compass, Scaffold and Sass to convert our non-semantic code to semantic code.

Doing so we gain the benefits of working with grid based class names during development and then convert our code to semantic markup for the live site.
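
The end result might look something like the sketch below. This is hand-written html and css, not actual output from any of those tools. The class names and pixel widths are assumptions for illustration, based on a 960px wide, 12 column grid with 60px columns and 20px gutters:

```html
<style>
  /* The grid measurements now hang off names that describe the content */
  .page    { width: 960px; margin: 0 auto; }
  .article { float: left; width: 620px; margin: 0 10px; } /* spans 8 columns */
  .sidebar { float: left; width: 300px; margin: 0 10px; } /* spans 4 columns */
</style>

<div class="page">
  <div class="article">Article content</div>
  <div class="sidebar">Sidebar content</div>
</div>
```

If the grid changes later, only the css changes. The class names in the html stay accurate.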

Like most design decisions there’s always a tradeoff. We have to weigh the good and bad and decide which is best to use for a given project.

I’m still mixed on how semantic my code needs to be. All things being equal we should always choose the semantic option, but all things are seldom equal.

At times I think the benefits of some non-semantic html as in the grid based class names are worth the loss of semantic meaning. I do think we should strive to write semantic code. I don’t think non-semantic code is as evil as some would suggest.

Ask me in 6 months though and I may easily have changed my mind. If you have arguments for or against the use of semantic code please share.

Summary

Communication is the central task of web designers. Most of the time we think of communication in terms of the words we use and the visuals we create, but it also extends to the code we write.

Semantics are about meaning. Writing semantic code means writing more meaningful code. We should strive to write html that describes content and not the presentation of that content.

class="red" is always a bad idea, but what about class="grid_4"? Is all non-semantic markup bad or is it ok at times and under certain circumstances?

My best answer is we should do our best to write semantic code, but we shouldn’t obsess over it. It’s similar to how we should do our best to write valid code without worrying about an invalid line here or there. Again, I haven’t entirely made up my mind about this.

Where do you stand? Should all code be semantic? Is it ok to be non-semantic at times? What do you consider best practice?