Creating good content mark-up: Content Management Systems vs. hand-crafted HTML
There’s an inner beauty of HTML code that I can never seem to get away from. The wonderful world of semantics – choosing the right element for the right task, something that conveys meaning, makes it more accessible and strikes the perfect balance of different parts of a web page. Which moves us on to Content Management Systems.
Content editing in today’s Content Management Systems
Naturally, then, working with Content Management Systems can be quite saddening. Usually an Interface Developer codes the templates for the framework of the web site’s pages: all structural parts, navigation etc. On top of that there is a content area, usually the main text of an article, where editors of the company behind the web site write text and create content.
And depending on your CMS of choice, that content, HTML-wise, could be either terrible, decent or good – but then we’re only talking about the validation level and, at best, some sort of semantics for it. But getting perfect semantics, trying out new/alternative elements (especially with the new-found semantic richness HTML5 gives us) and constantly tweaking the details just doesn’t happen, and the HTML result is usually mediocre.
Where is the problem?
So, where lies the problem? How come practically no detail is put into the HTML of the content area? There are three factors to look into:
- The Content Management System
- The editor writing the content
- The Interface Developer’s role
First, as long as a CMS offers a way to create semantic elements of some sort, like headings, paragraphs, lists etc., and outputs valid HTML, it has sort of done its part. CMS vendors do have a challenge, though, if/when they want content creators to create forms, tables and layout, and they definitely need to look into HTML5 options as well, for the future.
When it comes to editors, and there are many different opinions about this, I personally don’t think they should need to know about HTML. Their job is to write good content, not to know about code. What they need to know, however, is how to mark up different parts of their content accordingly with headings, lists and so on, and, very importantly, write good and descriptive link texts.
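To make the point about link texts a bit more concrete, here is a minimal sketch in Python of a check that could run when an editor saves content. It is only an illustration: the list of vague phrases and the names `VAGUE_LINK_TEXTS`, `LinkTextCollector` and `vague_links` are assumptions of mine, not part of any particular CMS.

```python
from html.parser import HTMLParser

# Assumed list of link texts that say nothing about the link target.
VAGUE_LINK_TEXTS = {"click here", "here", "read more", "more", "link"}


class LinkTextCollector(HTMLParser):
    """Collects the visible text of every <a> element in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.link_texts = []
        self._current = None  # text parts of the link being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current = []

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Join the parts and collapse runs of whitespace.
            text = " ".join("".join(self._current).split())
            self.link_texts.append(text)
            self._current = None


def vague_links(html):
    """Return the link texts that do not describe their target."""
    collector = LinkTextCollector()
    collector.feed(html)
    collector.close()
    return [t for t in collector.link_texts if t.lower() in VAGUE_LINK_TEXTS]
```

Calling `vague_links` on a fragment with one link reading "annual report for 2009" and one reading "click here" would flag only the second, which is the kind of gentle nudge an editor could get without ever seeing HTML.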
From an Interface Developer perspective, all that can be done is to influence the choice of CMS and WYSIWYG editor and try to educate the editor about how important it is to have a distinction between different elements in a web page, for SEO, accessibility and code maintenance (when we talk about potential bugs/problems for the Interface Developer to support later on).
What it could be like
A couple of years ago, I was working for a company providing online gaming, and in their case, they had copy-writers putting together good texts with good structure, keyword density, the works. But, they were never allowed to put it into the actual web page – instead, their finished documents were sent to an Interface Developer who carefully made sure to mark it up in the most optimal fashion. Knowing and respecting the importance of good HTML code, for so many reasons, led to extremely streamlined, optimal and maintainable code and great results.
The downside of this, of course, is the potential time it can take for an Interface Developer to do this, but my experience is that getting things right the first time is so much more important than constant fixes and workarounds later on; also, getting things (at least closer to) perfect directly is such a mental relief for anyone involved.
In conclusion, I don’t see an optimal solution for this, but if you have the appropriate skill sets in your team and it works out workflow-wise, I’d recommend having someone who knows HTML be the last one to touch the content before it is published.
Yes, I’ve never understood how a company (or a public service) can pay so much for something like MySource Matrix and then let any old manager write content and basically “piss on the post”. Worse, fixing the rubbish content then becomes political.
In a public sector org it may be better to have a web team that works as a bottleneck. Let the managers write the crap content then the web team ferrets it back with some suggestions and before you know it they mark something up properly – or tell the manager NO.
In a streamlined private sector org you could have the dev do this personally, especially if there’s one person, and the time would be a minimal investment in relation to the value created.
Now the trick for both orgs is to get the managers to understand the value of good copy and markup. Mmm that is the sticking point. Because they all think they’re the best copywriters on the planet unfortunately.
It’s a good point you raised though, Robert, because it’s an endemic part of the realignment we need to do as an industry. Just as 15-year-olds can’t make your company’s website, not just anybody can write the content.
Just tell them straight – we can do it better, we will make you money. Why? Because bad copy doesn’t sell anything much at all.
Cool way of doing things,
would love to work like that.
I do most of my work in Polopoly or SilverStripe. In Polopoly, the Interface Developer works closely with the System Developer. It is my responsibility to make sure that the input methods for content strike a balance between editor freedom and keeping the final HTML document in good shape.
Unfortunately, in our common Polopoly solution, we long ago implemented a handmade teaser which lets editors put in whatever HTML they want wherever they want, resulting in every editor quick-hacking their own solutions instead of using our input methods. Sites with 400+ validation errors do exist …
In SilverStripe, it is the same except that SilverStripe is far more quick-hacked. Since it is light and easy to work with, there is no need for a handmade teaser. Therefore, our SilverStripe sites tend to stay in good condition through the years.
My five cents:
* Work closely with the system developer to make sure it is easy, straightforward and fun to create content in the CMS.
* Make it really hard not to follow the constraints the Interface Developer has created.
* Never, ever trust the editors to be responsible for the front-end code quality.
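One way the second point could look in practice is a check that runs on whatever comes out of the editor before it is saved. The following is a minimal sketch in Python, assuming a whitelist of elements and attributes; `ALLOWED_TAGS`, `ALLOWED_ATTRS`, `ConstraintChecker` and `check_content` are made-up names for illustration, not the API of Polopoly, SilverStripe or any other CMS.

```python
from html.parser import HTMLParser

# Assumed whitelist; each team would agree on its own.
ALLOWED_TAGS = {"h2", "h3", "p", "ul", "ol", "li", "a", "em", "strong", "blockquote"}
ALLOWED_ATTRS = {"a": {"href", "title"}}


class ConstraintChecker(HTMLParser):
    """Records every element or attribute that is outside the whitelist."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            self.problems.append(f"element <{tag}> is not allowed")
            return
        allowed = ALLOWED_ATTRS.get(tag, set())
        for name, _value in attrs:
            if name not in allowed:
                self.problems.append(f"attribute {name!r} is not allowed on <{tag}>")


def check_content(html):
    """Return a list of human-readable problems; empty means the content passes."""
    checker = ConstraintChecker()
    checker.feed(html)
    checker.close()
    return checker.problems
```

If `check_content` returns anything, the CMS could refuse to publish and show the list to the editor, so that the quick hack is harder than doing it properly.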
Closely related: I wrote about how to approach writing HTML a while back. Basically, the minimal and the verbose ways of going about it are at opposite ends, and the decisions we make between them are usually made for non-technical reasons.
Why do we even use HTML as the structured format for authoring? Is there a single web based CMS with a schema aware editor and a simple authoring focused markup language?
I think authoring web content is falling behind in many aspects. Developers complain about non validating content authored in the CMS, but still no one uses an editor which respects a schema or DTD. No wonder we get errors…
Wouldn't a simple markup language focused only on content authoring combined with a schema aware editor be a good start? I see several benefits with this approach:
1. No more invalid content.
2. Reduce the number of authoring mistakes since they're forced to comply with a schema.
3. A simple markup language focused on authoring would be more understandable than HTML and at the same time it could introduce more semantics if needed (but don't go Docbook!).
4. Simple transition between different output formats, since content is not authored in a particular HTML version.
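To make the idea slightly more concrete, here is a minimal sketch in Python of what such an authoring model might look like. Everything in it is an assumption for illustration: the three block types, the rule that headings may only be level 2 to 4, and the names `Heading`, `Paragraph`, `BulletList`, `validate` and `to_html`. The point is only that content is stored as structured blocks, checked against a small schema, and turned into HTML at the very end.

```python
from dataclasses import dataclass
from html import escape
from typing import List


@dataclass
class Heading:
    text: str
    level: int = 2


@dataclass
class Paragraph:
    text: str


@dataclass
class BulletList:
    items: List[str]


def validate(blocks):
    """Assumed schema: heading levels 2-4, no empty text, no empty lists."""
    errors = []
    for i, block in enumerate(blocks):
        if isinstance(block, Heading) and block.level not in (2, 3, 4):
            errors.append(f"block {i}: heading level must be 2, 3 or 4")
        if isinstance(block, (Heading, Paragraph)) and not block.text.strip():
            errors.append(f"block {i}: text must not be empty")
        if isinstance(block, BulletList) and not block.items:
            errors.append(f"block {i}: list needs at least one item")
    return errors


def to_html(blocks):
    """Render validated blocks as HTML; other output formats could be added alongside."""
    errors = validate(blocks)
    if errors:
        raise ValueError("; ".join(errors))
    out = []
    for block in blocks:
        if isinstance(block, Heading):
            out.append(f"<h{block.level}>{escape(block.text)}</h{block.level}>")
        elif isinstance(block, Paragraph):
            out.append(f"<p>{escape(block.text)}</p>")
        else:
            items = "".join(f"<li>{escape(item)}</li>" for item in block.items)
            out.append(f"<ul>{items}</ul>")
    return "\n".join(out)
```

Because the author never writes tags, unclosed or misnested elements cannot occur, and swapping `to_html` for another renderer would change the output format without touching the stored content.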
My five ören, and trying to keep it short, so I'll stop here…
I think a really good solution to this problem is to use Markdown. With Markdown you can be sure to _always_ get at least valid HTML, and if the editor learns a few "tags" you'll easily also get semantics.
I agree HTML quality is often let down by editors not understanding how to edit content in WYSIWYGs, but I'm not sure injecting a web team layer to create all content is really the best approach.
The idea of a good CMS is one that allows non-techy people to edit content and get it online quickly. If content has to go via a specialised interface developer this slows things down and removes one of the main advantages of a CMS. You almost might as well stick to a well-built static site.
I think the best idea would be a mix of education for content writers and CMS tools that encourage semantic code a bit more thoroughly. At present most WYSIWYGs simply let people do anything, hence the utter mess that most user-inputted HTML ends up being. A CMS is really about control, letting people do more with less. Whether there are any CMSs out there that actually address this, god knows!
@Andreas L – although I agree Markdown is likely to create better code, it's simply too techy for most normal users. They expect some form of visual editor, it doesn't have to be a perfect representation of their web page – just not with tags or special markup code.
I couldn't agree more! Being that person who knows HTML and who is usually the last person to look at it, I find that some out-of-the-box CMSs make my life extremely difficult.
I am lucky enough that I work on sites that allow me to hand code them from scratch so I usually don't have to deal with them.
CMSs do have their place though, don't get me wrong. Some sites greatly benefit from this, but in that case, try to go custom built CMS!
Thanks for the comments, guys!
Yes, it is! 🙂
Never, ever trust the editors to be responsible for the front-end code quality
Exactly. But at the same time, I don't think they should ever be in that position to have that kind of responsibility.
Thanks for the link!
Absolutely, it's always hard to sell/emphasize something to the people in charge in the most suitable way.
Well, it's an interesting idea. But, at the same time, the upside of HTML is carefully crafting it the right way, and I'm not sure it would always be possible to do based on a content authoring schema, especially not automatically.
Would be interesting to try, though.
Perhaps, but as Simon mentioned, I'm not sure it's optimal for ordinary users.
Yes, maybe. I agree about the upside of publishing something directly, but at the same time, in larger organizations, there's already a workflow in place anyway where content has to be approved etc.
I'm just afraid that demanding that the editor learn code might kill their writing/creating spirit. With WYSIWYGs, sure, we could definitely demand more there…
I'm happy for you! 🙂
Regarding custom built CMSs, I presented my thoughts in Content Management Systems are a dying breed, if you're interested.
Wouldn't a good copy writer understand proper document structure and therefore be capable of at least some HTML structure? Perhaps, some challenging components of table structure and others might require tweaking after the copy writing is done.
Maybe. I get the reasoning, structure-wise, but I think HTML is an art in itself. 🙂
The problem, though, is that in most cases the content is generated via a WYSIWYG tool, and it is hell to try and edit afterwards. It might be easier if the editors actually coded HTML directly.
Sure, the author who can't handle an editor the right way is often a problem.
But there are many big CMSs around whose code base has grown over the years while the core output code was never updated. That is to say, many CMSs generate (bad) code automatically (e.g. DotNetNuke) and you can't do anything about it unless you want to get your hands dirty by hacking around in the core code…
So the first thing a good CMS must handle is strict separation of code and behaviour.
I completely agree.