Explaining semantic mark-up

I have a strong interest in semantics in general, and when it comes to web development, the benefits of properly marking up a document should not be neglected. One problem is that some people don't understand the difference it makes, so let me humbly attempt to explain why semantics is important.

What is a semantically marked-up document?

I think it's important for a web developer to view HTML documents without any external formatting applied. That means no CSS, no JavaScript enhancement and, if you want, no images either; just the raw content. Look at it, read it through. Does it make any sense? Do you understand which parts are more important than others, which texts are headings, which parts are connected to each other? If the answer is yes, the document is probably marked up in a nice, understandable, semantic fashion.

Element usage and code examples

Let us first talk about the base of semantic HTML elements. These are heading elements (<h1> through <h6>) for all kinds of headings, paragraph elements (<p>) for paragraphs of text, and list elements (<ul> and <ol>) for listing things, such as navigation links. Basically, think about the meaning of the particular piece of content you want to get across, and mark it up accordingly. Taking it one step further, use <em> to emphasize something and <strong> to mark it as more important. When it comes to forms, use <label> elements to connect label texts with their corresponding form fields.

Given the limited ways we once had to create layouts, <table> elements have been (and unfortunately still are) heavily overused for that purpose. Newcomers to CSS-based layouts, on the other hand, avoid tables like the plague and believe that a document containing a single table is truly evil. Neither of these views is correct. Tables are meant only for presenting tabular data, where they definitely should be used; they're not there to decide where you want columns in your web page. Tabular data means anything you would present in a table within a normal document, such as statistics, distances between destinations or train timetables.

People making the transition away from tables usually get div-itis, meaning that they use <div> elements for everything instead. This is just about as bad as using tables in every context, since <div> elements have no meaning beyond acting as containers for parts of a web page. Let me exemplify all this with a bad code example compared to a semantically improved version:

A bad example


<table id="web-site-container" width="100%">
	<tr>
		<td id="navigation">
			<table width="100%">
				<tr>
					<td>
						<a href="http://www.apple.com/macosx/">Mac OS X Leopard</a>
					</td>
				</tr>
				<tr>
					<td>
						<a href="http://www.microsoft.com/windows/products/windowsvista/default.mspx">Windows Vista</a>
					</td>
				</tr>
				<tr>
					<td>
						<a href="http://en.wikipedia.org/wiki/Semantic_Web">Wikipedia: Semantic Web</a>
						<table>
							<tr>
								<td>
									<a href="http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21">The Semantic Web article</a>
								</td>
							</tr>
						</table>
					</td>
				</tr>
			</table>		
		</td>
		<td id="content">
			<div class="heading">
				Web site name/Document name
			</div>
			<!-- Let's use a lot of <br> elements here to get a nice bottom margin -->
			<br><br><br><br><br><br>
			<div>
				This is <span style="font-style: italic">the best content</span> text ever written.
			</div>
			<div>
				<!-- &nbsp; entities are great for indenting text! -->
				&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
				Indented pre-amble text explaining something.
			</div>

		</td>
		<td id="contact-form">
			<form action="/contact" method="post">
				<div>
					Name: <input type="text">
					<input type="submit" value="Send">
				</div>
			</form>
		</td>
	</tr>
</table>

A better example with actual semantic meaning


<div id="web-site-container">

	<div id="navigation">
		<ul>
			<li>
				<a href="http://www.apple.com/macosx/">Mac OS X Leopard</a>
			</li>
			<li>
				<a href="http://www.microsoft.com/windows/products/windowsvista/default.mspx">Windows Vista</a>
			</li>
			<li>
				<a href="http://en.wikipedia.org/wiki/Semantic_Web">Wikipedia: Semantic Web</a>
				<ul>
					<li>
						<a href="http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21">The Semantic Web article</a>
					</li>
				</ul>
			</li>
		</ul>
	</div>

	<div id="content">
		<h1>
			Web site name/Document name
		</h1>
		<!-- Bottom margin is applied through CSS to the <h1> element -->
		<p>
			This is <em>the best content</em> text ever written.
		</p>
		<!-- Indentation is applied through a general "pre-amble" CSS class -->
		<p class="pre-amble">
			Indented pre-amble text explaining something.
		</p>

	</div>

	<div id="contact-form">
		<form action="/contact" method="post">
			<div>
				<label for="user-name">Name</label>: <input id="user-name" type="text">
				<input type="submit" value="Send">
			</div>
		</form>
	</div>

</div>
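
The point made earlier, that tables do belong wherever the content really is tabular, can be sketched with a small timetable. This is only an illustrative sketch: the station names and times are made-up placeholders, and the use of <caption> and the scope attribute is an addition that the original examples do not show.

```html
<!-- Illustrative sketch only: trains, stations and times are placeholder data -->
<table>
	<caption>Morning departures (example data)</caption>
	<thead>
		<tr>
			<!-- Column headers describe what each column contains -->
			<th scope="col">Train</th>
			<th scope="col">Departs Station A</th>
			<th scope="col">Arrives Station B</th>
		</tr>
	</thead>
	<tbody>
		<tr>
			<!-- Row header identifies which train the row is about -->
			<th scope="row">Train 1</th>
			<td>07:15</td>
			<td>08:05</td>
		</tr>
		<tr>
			<th scope="row">Train 2</th>
			<td>08:15</td>
			<td>09:05</td>
		</tr>
	</tbody>
</table>
```

Here the table is used for data a reader would expect to find in a table, not to position parts of the page, and the header cells say what each row and column means.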

Don't mix up presentation and meaning!

This point can't be emphasized enough! The way you mark up content has no relation to how you want it to be presented within your design. Everything presentation-related should be controlled through CSS. Don't use headings just to get a larger font, <blockquote> elements to get text indentation, and so on. List elements have a semantic meaning, but that doesn't mean that everything that's a list has to be presented with a bullet or number before each list item. Let me say it again: it's about meaning, not about looks.
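
As a rough sketch of what keeping presentation in CSS could look like for the improved example above: the selectors reuse the ids and the "pre-amble" class from that example, while the rules and values themselves are assumptions chosen for illustration, not taken from the original post.

```html
<style>
	/* Assumed rules for illustration; only the ids and class name come from the example */

	/* The navigation is still marked up as a list, it just isn't shown with bullets */
	#navigation ul {
		list-style: none;
		margin: 0;
		padding: 0;
	}

	/* Space below the heading, instead of a run of br elements */
	#content h1 {
		margin-bottom: 3em;
	}

	/* First-line indentation, instead of non-breaking spaces */
	.pre-amble {
		text-indent: 2em;
	}
</style>
```

The markup keeps saying what the content is (a list, a heading, a paragraph), and the look can change by editing these rules without touching the HTML.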

The benefits of semantics

The benefits of using good semantics in a document are:
  • It will be more accessible to people seeing the document in an environment where CSS cannot be applied.
  • It will be understandable and coherent to people having it read to them with the help of a screen reader.
  • It will help to get a better search engine ranking, since search engines can more easily distinguish the importance level of the document's different parts and what message is being conveyed.
  • It will be a lot easier for web developers to maintain the code, and to separate content (HTML) from presentation (CSS).
  • In most cases, there will be less code, which isn't cluttered by formatting, meaning that the web page will be faster to load.

Posted in Developing, HTML5/HTML/XHTML, Technology

63 Comments

  • Siegfried says:

    Full ACK! Absolutely.

  • Olly says:

    I've been meaning to write this post for years now. You finally beat me to it :)

  • Ed Everett says:

    I'm not sure that all your conclusions logically follow from using "semantics". The last two points are about separation of presentation from content/structure; this could be achieved just using divs and classes for every tag. And I'd suggest that your first two points in the conclusion are really the same point.

    The trouble is there isn't actually a lot about semantic markup that the public or clients care about. It comes down to trying to persuade people to care about "blind people" (most people's view of accessibility), and believe us that it'll improve search engine rankings as an unprovable article of faith (bbc.co.uk does pretty well with rubbish markup).

    Who is it that you want to care about semantics? Web developers, clients or the general public? I'm not sure that the last two groups need or should be expected to care.

    I'd also suggest that "semantic markup" is verging on standardista jargon, "meaningful HTML" or "using HTML correctly" may be more useful.

    (I guess I'm just a little uncomfortable with "semantic markup" being used as a catch all phrase for "good HTML")

  • Rob Mason says:

    Good effort mate. Clear and concise.

    What's the view on developer comments in the code? I was always told to remove all comments (<!-- like this -->) from customer facing code…are they technically semantic with this still in?

  • Robert Nyman says:

    Siegfried,

    Good!

    Olly,

    Well, at least with this subject; I’m sure you’ll be first with the next one. :-)

    Ed,

    Good questions!

    I agree that the last two points border more on separation than just semantics (although I’d say that good semantics is necessary to achieve the best result with it). The first two points are indeed both about accessibility, but one is about how you view the result and the other is about having it read aloud to you: two different experiences.

    I’d say when making the case for semantics with customers, very few care about whether a blind user can use the web site. Their main focus is money, so you need to explain that while reaching more users = more potential customers, it will improve your search engine ranking. When it comes to very large companies like BBC and others, many other factors play in, such as being one of the major companies in the world within their trade.

    For the rest of the companies, good mark-up becomes even more important to stand out. And I’m not saying that semantics is the only way to reach a good search engine ranking, but it’s one of the vital technical parts. I wrote a short summary of what’s important for SEO in a comment the other day.

    Using the word semantic is done very much on purpose; partly because that’s in line with Tim Berners-Lee’s vision of a semantic web, partly because it’s a term many web developers use and hence a lot of people will hear about, and also a term people will search for.

    Personally, I agree that meaningful HTML means just as much to me.

    Rob,

    Thanks! The idea with the comments here is to show the difference between the examples, but personally I try to make sure that no comments are delivered to the end user. Basically, comments are good in a developing environment but should generally be removed from what the web site visitor gets.

  • Steven Clark says:

    agree with you 100 per cent so you're probably preaching to the converted robert…

    although i would digress to mention that the term semantic has become one of those overused words which tends to get bandied around – in the end everything sounds like semantic sense to someone it seems…

    we probably do need a new word to represent what we really mean. "natural"? Nuh I can't think of another one. Excellent post, there needs to be more basic information out there.

  • Robert Nyman says:

    Ed,

    I forgot to reply to this:

    “Who is it that you want to care about semantics? Web developers, clients or the general public? I’m not sure that the last two groups need or should be expected to care.”

    Couldn't agree more. It is only relevant for the first group, but the difference in the result should definitely be noticeable to the other two groups.

    Steven,

    Good that we agree! :-)

    It is overused, I agree. But my reasons for using it are explained in my comment above to Ed.

  • Steven Clark says:

    the part I don't understand is why some people don't get it…

    i mean its simply saying make your markup and content meaningful, use the right tool for the right job etc… which to me has always been common sense.

    and i do understand those who are maybe new or a little under developed but it confuses me that some people when confronted with the logic of semantics actually reject it outright.

    then again i've got my last uni exam tomorrow so there's a lot in the world i don't get at the moment (Mobile and Ubiquitous Computing for one lol – ok I get it a bit)… :)

  • Robert Nyman says:

    Steven,

    I think that people who take the time to listen to the arguments will agree. Problem is, most don't have, nor take, that time, and in some cases their view is very narrow: if it looks somewhat ok in IE in Windows when you have a 20/20 vision and JavaScript isn't blocked in any way, then that's sufficient.

    Needless to say, that's not my view. :-)

  • lewis says:

    i can't believe no one has mentioned microformats here. semantics is old hat now.

  • Robert Nyman says:

    lewis,

    Eh… Semantic HTML and Microformats are indeed very different things. Microformats is semantic in the aspect that it uses semantic class names and such, but that's not the same thing as semantic HTML, nor something that will improve accessibility or search engine ranking.

  • lewis says:

    true point, but if you're talking semantics it's imperative you mention microformats (maybe not in the context of your article, but i was surprised not to find a comment regarding it). after all it's one step closer to the XML structured web of the future, and the next DOM i might add where XHTML will be surpassed.

    semantics underpin accessibility and search. i don't know what you're talking about there?!

    it's not very accessible for a system to process a hcard marked up with obscure, non-standard, class names any less than it is having a navigation class called 'elephant' or 'rock' as an example. if something isn't standards-based it also makes search a lot harder – you and your readers will know this already.

    set me straight if i'm wrong.

  • Jermayn says:

    Could you do me a HUGE favor and come down to Australia Perth and do a one on one lesson with my superior at my government job???

  • Robert Nyman says:

    lewis,

    I agree that it is related with MicroFormats, and MicroFormats is a good thing to use (at least in the right context). What I meant about accessibility and search engines, is that semantic HTML will improve that.

    MicroFormats, as far as I know, isn't generally taken into consideration for those things.

    But I absolutely agree that proper class names also should be a vital part in the code.

    Jermayn,

    If you get the trip paid, I'm there!

    Actually, Perth is one of the places I haven't been in Australia, so that would coincide perfectly with my own travel desires. :-)

  • Johan says:

    You don't need <div id="navigation">; you can style the ul list like any block element. Use <ul id="navigation"> instead.

    Wouldn't a fieldset be more semantic than <div id="contact-form">?

  • Robert Nyman says:

    Johan,

    Regarding navigation: in this exact example, yes. But generally a navigational part of a web site consists of more than just an unordered list, so it was kind of implied there would be more content there. Sorry if this wasn't clear.

    A fieldset would absolutely be better. I avoided it on purpose here, though, since there are some presentational issues about controlling how it is rendered, and the scope of explaining it was just a little bit bigger than this context.

  • Jermayn says:

    I would love to meet you etc when you come to Perth but unfortunately the speaking about semantics to my superior would probably make you regret ever coming to Perth :|

  • Robert Nyman says:

    Jermayn,

    Ha ha!

    Well, if I ever go there, I'll do my best to meet up with you! :-)

  • Thank you very much for sharing this valuable information. It really easy for web developer to use with less code. with less code and formatting the web page will be faster to load

  • Robert Nyman says:

    dani,

    Sorry for a late reply.

    I'd say it all depends on modern layouts, where one needs all kinds of hooks for making CSS coding as flexible as possible.

    However, could I have less? Sure, one can always improve. :-)

  • JhonB says:

    Sorry mate! You've got it all wrong. Semantic web is a mark up for Knowledge finding and you need a tool like Protege that supports OWL. Maybe you wanted to say that a well-structured site uses correct syntax, but you are confused about what Semantic means…. Semantic Mark up is used to create Ontologies.

  • Robert Nyman says:

    JhonB,

    The idea behind the semantic web, and Tim Berners-Lee's vision, is to have everything marked up in a good semantic fashion, for a number of reasons.

  • JhonB says:

    Still, you are using the word "Semantic" incorrectly. Let's say, use proper HTML. Semantic Mark up is for machines only (not human readable); HTML is human readable. I understand what you mean, yes, but simply put it this way… you've overused the word "Semantic", that is all; all the rest is good and your post makes sense in terms of good coding.

  • Robert Nyman says:

    JhonB,

    I see what you mean, but if you do your research and see what all the top names in web development use, you will find that semantic is the de facto term for describing names with meaning.

  • Nice Topic on Semantic mark up. Very Good Explanation.

  • Thanks Robert, this article explain the perfect way how to structure a website due to semantic development.

    Separation of content and style is one of the most improved feature and reason to use CSS and XHTML in a prefect way.

    Advantages of the CSS to style everything globally.

  • Robert Nyman says:

    Web Designing India, Michael,

    Thank you!

  • Tara says:

    I just stumbled upon this article years after you wrote it and I wanted to say thank you so much! All I ever hear about coding is "semantics, semantics, semantics", but no one ever bothered to explain what the heck it meant. Your article is also very semantic. It's meaningful and only includes the important stuff. Thanks again!

  • Robert Nyman says:

    Tara,

    Thank you, it makes me very happy to hear!

  • Robert Nyman says:

    Montreal Web Design,

    Cool, thanks for the tip!

  • Neil says:

    “i can’t believe no one has mentioned microformats here. semantics is old hat now.”

    With microdata, microformats are now old hat, oh how we just keep re-inventing the wheel…

  • Robert Nyman says:

    Neil,

    Well, if you ask me, semantics are still vital on the web. Microdata is an interesting complement, though. :-)

  • Andreas says:

    Ahh, finally I got it! What made me understand was this sentence “I think it’s important for a web developer to view HTML documents without any external formatting applied. Does it make sense?”. Thanks!

  • a says:

    Seems like “semantic” anything are buzzwords used by people who don’t care about actually making a useful product. Semantic markup largely says any kind of UI control, like a dial for example, is invalid. There is no semantic to represent that in HTML, but you’d probably use one in some cases. Heck, even look at the source of http://www.google.com shows that the search input is a TABLE WITH A SINGLE ROW INSIDE 20 NESTED DIVS. How is that for “semanticism”. I am pretty sure “semantic” whatever it may be is the LEAST important problem to be solved compared to ACTUAL problems.

  • a says:

    Why do people even use the phrase “readable code” anymore still. If code was readable, then why even use a web browser to look at any given Internet page?”I’ll just read the html and be happy to surf the web” Come on people. The 9000000000 pound cthulu-elephant creature in the room is hard to miss. BTW, there’s a javascript error on this page, but I don’t even care, because i’m just reading content and responding. FORGET ABOUT THESE SMALL DETAILS AND FOCUS ON THE MORE IMPORTANT PROBLEMS.

  • Robert Nyman says:

    a,

    I’m not gonna re-iterate all the points at the end of the blog post, which are still valid. And if for no other reason, it’s still crucial for SEO. Code should be readable since you will hand it over to other people/developers who will continue to maintain it; making it readable drastically keeps down those costs.

    Semantic code and UI is not the same thing, nor do they exclude each other. If you’re looking for a connected UI, look into the various options with HTML5 Forms.
