Explaining semantic mark-up

I have a strong interest in semantics in general, and when it comes to web development, the benefits of properly marking up a document should not be neglected. One problem is that some people don’t understand the difference it makes, so let me humbly attempt to explain why semantics is important.

What is a semantically marked-up document?

I think it’s important for a web developer to view HTML documents without any external formatting applied. That means no CSS, no JavaScript enhancement, and, if you want, no images either; just the raw content. Look at it, read it through. Does it make any sense? Do you understand which parts are more important than others, which texts are headings, which parts are connected to each other?

If the answer is yes, the document is probably marked up in a nice understandable semantic fashion.

Element usage and code examples

Let us first talk about the basic semantic HTML elements. These are heading elements (<h1> through <h6>) for all kinds of headings, paragraph elements (<p>) for paragraphs of text, and list elements (<ul> and <ol>) for listing things, such as the links in a navigation menu. Basically, think about the meaning of the particular piece of content you want to get across, and mark it up accordingly.
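
As a minimal sketch of these basic elements working together (the bakery, its headings and its texts are invented purely for illustration):

```html
<h1>Corner Bakery</h1>
<p>We bake bread and pastries every morning.</p>

<h2>Today's bread</h2>
<!-- The order of the items does not matter, so an unordered list fits -->
<ul>
	<li>Sourdough</li>
	<li>Rye</li>
</ul>

<h2>How to order</h2>
<!-- The steps happen in sequence, so an ordered list fits -->
<ol>
	<li>Choose your bread.</li>
	<li>Call us before noon.</li>
	<li>Pick it up the same afternoon.</li>
</ol>
```

Read without any styling, the heading levels still show which parts belong together, and the two list types show whether the order matters.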

Taking it one step further, use <em> to emphasize something and <strong> to mark it as more important. When it comes to forms, use <label> elements to connect label texts with their corresponding form fields.
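
A small sketch of how that might look; the /order address, the id and name values, and the texts are hypothetical and only there for the sake of the example:

```html
<p>
	<!-- <em> stresses a phrase, <strong> marks the sentence as important -->
	Please <em>double-check</em> your address before sending.
	<strong>Orders cannot be changed after submission.</strong>
</p>
<form action="/order" method="post">
	<div>
		<!-- The for attribute points at the id of the field it labels -->
		<label for="delivery-address">Delivery address</label>
		<input id="delivery-address" name="delivery-address" type="text">
		<input type="submit" value="Send">
	</div>
</form>
```

Because the label is tied to the field through for and id, clicking the label text puts focus in the field, and a screen reader can announce the label when the field is reached.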

Given the limited ways we once had to create layouts, <table> elements have been (and unfortunately, still are) heavily overused for that purpose. Newcomers to CSS-based layouts, on the other hand, avoid tables like the plague, and believe that a document containing even a single table is truly evil. Neither of these views is correct.

Tables are meant only for presenting tabular data, where they definitely should be used; they’re not there to decide where the columns in your web page go. Tabular data means anything that you would present in a table within a normal document, such as statistics, distances between destinations, train timetables etc.
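
To illustrate a table used for what it is meant for, here is a sketch of a small train timetable. The destinations, times and platforms are made-up sample data:

```html
<table>
	<caption>Weekday departures (sample data)</caption>
	<thead>
		<tr>
			<!-- scope="col" says this header describes the column below it -->
			<th scope="col">Destination</th>
			<th scope="col">Departs</th>
			<th scope="col">Platform</th>
		</tr>
	</thead>
	<tbody>
		<tr>
			<!-- scope="row" says this header describes the rest of the row -->
			<th scope="row">Northfield</th>
			<td>08:15</td>
			<td>2</td>
		</tr>
		<tr>
			<th scope="row">Lakeside</th>
			<td>08:40</td>
			<td>1</td>
		</tr>
	</tbody>
</table>
```

The <caption> and <th> elements are what make this a description of data rather than a layout grid: every cell can be related back to a destination and a column heading.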

People making the transition away from tables often come down with divitis, meaning that they use <div> elements for everything instead. This is just about as bad as using tables everywhere, since <div> elements have no meaning of their own; they merely act as containers for parts of a web page.

Let me exemplify all this with a bad code example compared to a semantically improved version:

A bad example


<table id="web-site-container" width="100%">
	<tr>
		<td id="navigation">
			<table width="100%">
				<tr>
					<td>
						<a href="http://www.apple.com/macosx/">Mac OS X Leopard</a>
					</td>
				</tr>
				<tr>
					<td>
						<a href="http://www.microsoft.com/windows/products/windowsvista/default.mspx">Windows Vista</a>
					</td>
				</tr>
				<tr>
					<td>
						<a href="http://en.wikipedia.org/wiki/Semantic_Web">Wikipedia: Semantic Web</a>
						<table>
							<tr>
								<td>
									<a href="http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21">The Semantic Web article</a>
								</td>
							</tr>
						</table>
					</td>
				</tr>
			</table>		
		</td>
		<td id="content">
			<div class="heading">
				Web site name/Document name
			</div>
			<!-- Let's use a lot of <br> elements here to get a nice bottom margin -->
			<br><br><br><br><br><br>
			<div>
				This is <span style="font-style: italic">the best content</span> text ever written.
			</div>
			<div>
				<!-- &nbsp; are great for indenting text! -->
				&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
				Indented pre-amble text explaining something.
			</div>

		</td>
		<td id="contact-form">
			<form action="/contact" method="post">
				<div>
					Name: <input type="text">
					<input type="submit" value="Send">
				</div>
			</form>
		</td>
	</tr>
</table>

A better example with actual semantic meaning


<div id="web-site-container">

	<div id="navigation">
		<ul>
			<li>
				<a href="http://www.apple.com/macosx/">Mac OS X Leopard</a>
			</li>
			<li>
				<a href="http://www.microsoft.com/windows/products/windowsvista/default.mspx">Windows Vista</a>
			</li>
			<li>
				<a href="http://en.wikipedia.org/wiki/Semantic_Web">Wikipedia: Semantic Web</a>
				<ul>
					<li>
						<a href="http://www.sciam.com/article.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21">The Semantic Web article</a>
					</li>
				</ul>
			</li>
		</ul>
	</div>

	<div id="content">
		<h1>
			Web site name/Document name
		</h1>
		<!-- Bottom margin is applied through CSS to the <h1> element -->
		<p>
			This is <em>the best content</em> text ever written.
		</p>
		<!-- Indentation is applied through a general "pre-amble" CSS class -->
		<p class="pre-amble">
			Indented pre-amble text explaining something.
		</p>

	</div>

	<div id="contact-form">
		<form action="/contact" method="post">
			<div>
				<label for="user-name">Name</label>: <input id="user-name" type="text">
				<input type="submit" value="Send">
			</div>
		</form>
	</div>

</div>

Don’t mix up presentation and meaning!

This point can’t be emphasized enough! The way you mark up content has no relation to how you want it to be presented within your design. Everything presentation-related should be controlled through CSS. Don’t use headings just to get a larger font, <blockquote> elements just to get text indentation, and so on.

List elements have a semantic meaning, and that doesn’t mean everything marked up as a list has to be presented with a bullet or number before each list item. Let me say it again: it’s about meaning, not about looks.
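
As a sketch of that separation, the page below keeps a navigation list as a list in the mark-up and lets CSS remove the bullets and lay the items out in a row. The link targets and the exact style rules are assumptions chosen for illustration, not the one right way to do it:

```html
<!DOCTYPE html>
<html lang="en">
<head>
	<meta charset="utf-8">
	<title>Navigation list without bullets</title>
	<style>
		/* Presentation lives here, not in the mark-up */
		#navigation ul {
			list-style: none; /* no bullets */
			margin: 0;
			padding: 0;
		}
		#navigation li {
			display: inline-block; /* items side by side */
			margin-right: 1em;
		}
	</style>
</head>
<body>
	<div id="navigation">
		<!-- Still a list of links, whatever it ends up looking like -->
		<ul>
			<li><a href="/">Home</a></li>
			<li><a href="/articles">Articles</a></li>
			<li><a href="/contact">Contact</a></li>
		</ul>
	</div>
</body>
</html>
```

Remove the <style> element and the same mark-up falls back to a plain bulleted list, which is still perfectly readable.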

The benefits of semantics

The benefits of using good semantics in a document are:

  • It will be more accessible to people seeing the document in an environment where CSS cannot be applied.
  • It will be understandable and coherent to people having it read to them with the help of a screen reader.
  • It can help with search engine ranking, since search engines can more easily distinguish the importance of the document’s different parts and what message is being conveyed.
  • It will be a lot easier for web developers to maintain the code, and to separate content (HTML) from presentation (CSS).
  • In most cases, there will be less code, which isn’t cluttered by formatting, meaning that the web page will be faster to load.
