Thoughts on HTML 5
People have asked for my opinions about HTML 5 and the road they’re taking. Basically, I feel that you need to do a lot of reading up to actually be eligible to have an opinion, so I’ll try to tread lightly with mine, and only cover certain areas.
There are other reasons why I haven’t written about this before: HTML 5 in practice, widespread among web browsers, is very far away. Simon Pieters’ estimate, from a Geek Meet we had, is a whopping 15 years until it’s fully and properly implemented by web browsers. Just imagine how much Microsoft and Adobe will have come up with by then, and forced out into the market.
Also, really smart folks like Tommy and Roger have written about it more thoroughly, in Forward Towards the Past and Another look at HTML 5 respectively, and basically I agree with everything they have written.
But, in general, I think HTML 5 is a good thing with some sensible goals and ambitions. We need a new standard, and we need a plain text/html version and an XML route. One of its biggest benefits is more, and better suited, form elements for various needs.
Problems I see
Perhaps it’s neither fair nor balanced, but I’ve decided to focus this post on the negative: things that I personally believe can, and should, be improved.
For text/html: It allows sloppy code
In general, you can add attributes to an element in any fashion you like. All of these are valid:
<input disabled>
<input disabled=disabled>
<input disabled='disabled'>
<input disabled="disabled">
Quick closing of elements is also optional, meaning that any of the below, mixed together in the same document, are ok:
<input type="text">
<input type="text" />
<link href="css/main.css">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
To me, some of the code above is an abomination. The web is riddled with piss-poor code, and allowing all of the above certainly won’t change that. My stance is that any attribute must have a value, and it must have quotes (preferably double-quotes, but I can live with single-quotes, as long as they’re always there).
My take is that the HTML Working Group assumes that one single web developer will choose one or the other and then consistently go with it. To me, that’s just naive. That’s not how it works in the real world; lots of web developers work on the same code, and allowing all kinds of formatting means that there will be all kinds of formatting in the resulting code.
Web developers are lazy, and a number of them are ignorant, and if not enough discipline is forced upon them, it will be chaos. I definitely understand that they don’t want to have too strict guidelines because then no one would use it, but come on. Proper code for attributes and closing elements can’t be that hard, can it?
No version number in the DOCTYPE
This means that it will be impossible to distinguish what version of HTML is being delivered. The problem with this is that you can’t tell on the server what version it is to, for example, pre-validate it. Another side-effect is that any implementation of an HTML element can never be changed. So, say something is sub-par, we have to live with it through HTML 6, HTML 7 etc.
I understand the sense in backwards compatibility, but sometimes one has to put one’s foot down and move on. The ability to improve the weaker parts has to be there. Otherwise we will, in many years’ time, have a gigantic toolset (where 50% is faulty/unusable) and every web browser will be bloated like there’s no tomorrow, just to cover up for something implemented the wrong way 10, 20, 30(?) years ago.
font is still in there
I sincerely can’t believe that the font element is still allowed. Or, even worse, it’s not allowed for web developers, but only for über-crappy WYSIWYG tools, given that they identify their true suckiness by a meta element. This is so thoroughly wrong!
If, in say 15 years’ time, WYSIWYG editors still can’t format code in any better way than with a font element, I say it’s time to abolish them completely. My suggestion for WYSIWYG makers is that the only formatting options should be good semantic HTML elements and CSS. The CSS classes should then be made available through a custom CSS file that the web developer supplies in the solution, which in turn will present the content properly to the public viewer.
Alternatively, the WYSIWYG tool, when in a CMS, could offer a few default formatting options with CSS, which will be in a CSS file that is automatically included for the presentation layer.
Attitude
I completely agree with what Roger writes that there definitely seems to be an attitude problem. Some people are, to a certain degree, condescending towards others and seem to think that they know it all. Please, let me inform you that no one knows it all.
When I was younger, I thought I could do everything better than older people who hadn’t delivered a perfect solution. Naturally, in certain areas, I was wrong. Hell, everyone in the world can step in and think they can do something better than anyone before them, but the vital mistake in that approach is that you never have all the facts: you don’t know what problems and obstacles the person before you ran into, or under what circumstances it was developed.
And that’s why it’s so saddening, especially with people who are around 20, coming off with such an attitude. Sure, they can be frickin’ geniuses, theoretically, but unless you’ve worked with web development and customers for a long time, you’re not really the one entitled to condemn others (and really, no one ever actually is entitled to do that).
So, unless you can be humble, open to other people’s opinions, and most importantly, respectful, don’t waste your time trying to tell others how to do their job, because they sure as hell won’t listen to you. Please don’t bother, if you’re more interested in becoming a dictator than a democrat.
Conclusively
OK, I got to ranting a little there. Still, HTML 5 can be a good thing, it really can. But please, I humbly ask you to consider and think through what I’ve written above, and remember that all other emerging technologies and companies certainly won’t stand still while the work proceeds.
Swapping one tyrant for another
Your definition of sloppy is possibly subjective. <code><input disabled></code> has always been valid HTML. What problem would be solved by disallowing it now?
While I understand that you wanted to illustrate syntax here, <code><link href="css/main.css"></code> and <code><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></code> actually aren't conforming HTML5.
What problem is solved by forcing consistency? Note that XML doesn't enforce consistency either. Single quotes vs. double quotes, empty element syntax vs. start tag followed by end tag, entities vs. decimal character references vs. hexadecimal character references vs. plain characters, etc., etc.
If you want a consistent format, then you can use the format that is output by the .innerHTML algorithm. To check that a document is in this format it should be possible to roundtrip it through innerHTML and compare the input with the output.
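The round-trip idea can be sketched outside a browser too. The following is only an approximation, not the actual .innerHTML serialization algorithm: it uses Python's standard `html.parser` to re-emit start tags with lowercase names and double-quoted attribute values, then compares the result with the input. The names `Canonicalizer`, `canonicalize` and `is_canonical` are made up for this illustration, and comments, doctypes and raw-text elements such as `script` are deliberately ignored.

```python
from html import escape
from html.parser import HTMLParser


class Canonicalizer(HTMLParser):
    """Re-serializes markup in one fixed style (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # The parser already lowercases tag and attribute names.
        parts = [tag]
        for name, value in attrs:
            # A minimized attribute (value None) is written as name="".
            parts.append(f'{name}="{escape(value or "", quote=True)}"')
        self.out.append("<" + " ".join(parts) + ">")

    def handle_startendtag(self, tag, attrs):
        # "<input />" is emitted the same way as "<input>".
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def canonicalize(markup):
    parser = Canonicalizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


def is_canonical(markup):
    # A document is "consistent" if the round trip leaves it unchanged.
    return canonicalize(markup) == markup
```

With this sketch, `is_canonical('<input type="text">')` holds, while `<input type=text>` and `<input type="text" />` are both rewritten to the double-quoted, slash-free form and therefore fail the check.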
We are not locking ourselves into a corner by not having a version number in the doctype in HTML5. If we find that we need to introduce versioning in HTML6 then it can be done then — there is no need to do it now because there might possibly be a need to do so in the future. Compare with CSS.
For the time being, you can assume that pages that start with <!doctype html> are HTML5 documents. If there will be an HTML6 that uses the same doctype, then you know that it will be backwards compatible with HTML5 (otherwise it would have introduced versioning) and you can assume that it is HTML6.
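For the server-side case mentioned in the post, a check along these lines might be enough. It is a minimal sketch under the assumption stated above (a page that starts with the short doctype is treated as HTML5); `looks_like_html5` is a hypothetical name, and the pattern does not handle a byte order mark or comments before the doctype.

```python
import re

# Matches "<!doctype html>" in any letter case, allowing leading whitespace.
HTML5_DOCTYPE = re.compile(r"\s*<!doctype\s+html\s*>", re.IGNORECASE)


def looks_like_html5(markup: str) -> bool:
    """Return True if the markup begins with the short HTML5 doctype."""
    return HTML5_DOCTYPE.match(markup) is not None
```

An HTML 4.01 doctype, which carries a public identifier after `HTML`, does not match, so the two can be told apart without any version number in the HTML5 doctype itself.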
I can understand that you are frustrated, but your proposed alternative (to change WYSIWYG tools to work with semantics exclusively) might not be very realistic. Ask Daniel Glazman. Basically, the situation is that WYSIWYG tools have presentational features, and they will continue to have them. The question is what markup should they emit in such cases?
Browser vendors will want to continue to support today's Web in the future. Thus, they will continue to support old markup. Introducing a new rendering mode where old elements aren't supported doesn't make the browser architecture any simpler — it actually makes it more complex, making it even harder for a new vendor to enter the market. Not specifying how old markup is to be treated doesn't help either, because then browser vendors are simply forced to reverse engineer each other instead of following a spec. The latter is far cheaper and leads to more interoperable implementations than the former, and is one of the motivations for browser vendors to work on HTML5.
Who the shit let these ignorant lazy fucks on the internet? How exactly did they get their internet publishing license?
Jesus Roger, loosen up. Chaos? Please, do explain! What would come of us if your Grandmother didn't close her ! And woah what if she didn't nest her tags correctly!
And getting all huffy about single or double quotes! That just tips the bucket man. You've gone off the deep end.
One question I often ask myself is whether it is useful to have a language that demands very strict coding if nobody takes care anyway. Many web developers write pages in XHTML 1.1 but don't make them valid. So as long as the UAs display this crap, it's pointless having a language that insists on strict code.
And, something to laugh (or to cry):
http://www.tagesspiegel.de/
This German page was recently redesigned. Take a look at the code; take a look at the page with disabled css. It's a horrorshow 😉
zcorpan,
Thank you for long and thorough replies, I appreciate it.
<blockquote cite="http://www.robertnyman.com/2007/06/07/thoughts-on-html-5/#comment-67361">
What problem is solved by forcing consistency?
In development environments, the more options you give web developers (meaning options that can contradict each other), the more inconsistent the code will be.
And, for code maintenance and, I guess, also web browser logic, I believe it would be better to have fewer options for accomplishing the same thing.
Interesting about doctypes. If that's the case, I can live with HTML 5 not having a version number.
To me, WYSIWYG editors should only produce elements like <code>H1</code>-<code>H6</code>, <code>p</code>, <code>em</code>, <code>strong</code> (well, actually, it would be nice to have <code>i</code> and <code>b</code> as well).
When it comes to other things, such as font-size etc, it should be applied by adding a CSS class to the parent element, if everything in it is selected, or otherwise by adding a <code>span</code> element around the inline part that will be formatted.
The CSS classes offered should come from a CSS file that the web developer can supply, which is the same one that will be used to present it publicly.
In worst-case scenarios, where the WYSIWYG content is supposed to be completely stand-alone, I'd rather prefer a <code>style</code> block with the classes used.
I can't really relate to why it would be harder for a web browser to remove, or alter, support for some elements, but then again I'm not a web browser manufacturer so I can't really tell.
Charles,
Ok, I'm Robert, not Roger, but that's ok.
When I do something, I want to do it as properly and as well as possible, and I hope other people think the same way. The reason search engines give us poor results, that we have to download a lot of superfluous code and that the accessibility state of the web is so poor, is poor coders.
If we want the web to be a serious tool, we have to do our jobs properly. Especially for web developer consultants billing at least $100 or more per hour, the least I think we should expect is for them to write valid code.
If my grandmother, or any other happy amateur web site maker, writes invalid code, I couldn't care less. On the contrary, I'm just glad that they want to share and love the web.
What I target here is solely web professionals building web sites for multinational companies or, even more important, the public sector, where a lot of people depend on being able to access the web sites they build.
Chris,
It's a good point. If the error handling in web browsers is so extensive, what's the incentive for writing valid code? Well, rendering-wise, not much. But for maintenance, cross-browser and cross-platform compatibility, it's still vital.
Thanks for the link, it was truly horrendous.
First, my apologies for getting your name wrong.
Is WHATWG targeting you, a serious web developer, or are they putting in place a specification that will deal with the chaos that you speak of? From what I've read from Ian H., this specification is meant to stand the test of time, to deal with syntax errors gracefully, etc. etc.
As a serious web developer, I doubt merely writing code to some specification will earn you the $100/hr jobs. I might imagine that the luster of "I write valid XHTML" might wear off sometime soon. Not that it's not important, but it's just part of the pie.
Maybe XHTML 2 is more fit to your desires?
What is considered conforming for authors to use does not affect web browser logic at all.
I can understand that consistent coding style can ease maintenance in a development team, but is document conformance the right place to enforce it? Isn't it better to have an internal style guide in the development team? You might need it anyway if you want consistent bracing style for the programming languages you use, which a conformance checker won't see.
Charles,
<blockquote cite="http://www.robertnyman.com/2007/06/07/thoughts-on-html-5/#comment-67524">
As a serious web developer, I doubt merely writing code to some specification will earn you the $100/hr jobs.
Absolutely not, but it's definitely a good start to know how to write proper code, and all the benefits of having valid code.
zcorpan,
<blockquote cite="http://www.robertnyman.com/2007/06/07/thoughts-on-html-5/#comment-67566">
What is considered conforming for authors to use does not affect web browser logic at all.
I can understand that consistent coding style can ease maintenance in a development team, but is document conformance the right place to enforce it?
It's an interesting discussion. Who is the document's main target audience? Is it to make life easier for web developers or for web browser vendors? I imagine the answer, naturally, is both, but what if you come to a point where you have to choose?
I can't really tell if the right place to enforce it is in a document or not. Personally, I think I'd prefer that, but maybe many others wouldn't.
And yes, every team should have a style guide no matter how strict the specification is.
To put things into perspective, the choice of quoting does not cause any technical harm and XML does not require you to be consistent with your attribute quoting, either.
Again, to put things into perspective, the choice of void element syntax does not cause any technical harm and XML does not require you to be consistent with your empty element syntax, either.
In HTML, quote omission and boolean attributes without an equals sign have been proper for years. In HTML, the slash at the end of void element tags was previously non-conforming. It was too hard for many people and a change to allow the slash in HTML5 was lobbied in against Hixie’s personal opinion.
The current spec language regarding <code>font</code> is a failed experiment. I would not spend too much energy on that part of the spec at this time.
The main target audience is implementors of software that consumes HTML. I expect tutorials, O’Reilly books, etc. to be published for Web authors.
You are welcome to configure your editor to use the indent style of your choice, but I think the definition for document conformance is the wrong place to enshrine your favorite indent style as the only right one.
Robert, thanks for your feedback. Regarding font, you're certainly not alone in your opinion. I'm sure it will be getting removed from the spec in due course, it's not something you need to stress about.
Regarding code conventions, like attribute quoting, attribute minimisation, void element syntax, etc. that's really a matter of personal opinion and I can guarantee that if the spec tried to enforce one particular convention to please people like yourself, there will be plenty of other people complaining. Enforcing code conventions would be the job of a lint (e.g. like HTML Tidy), not a true conformance checker. Authors are free to make use of lints that warn about conventions if they like, but we shouldn't enforce one set of conventions upon everyone through conformance requirements.
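To illustrate the lint idea, here is a rough sketch of a convention checker that a team could run alongside a real conformance checker. It is not how HTML Tidy works; it is a small Python example built on the standard `html.parser`, and the class name `AttributeStyleLint`, the function `lint`, and the particular house rule (every attribute must have a double-quoted value) are assumptions chosen for the example.

```python
import re
from html.parser import HTMLParser

# One attribute: a name, optionally followed by = and a double-quoted,
# single-quoted, or unquoted value.
ATTR_RE = re.compile(
    r"""([^\s"'<>/=]+)(?:\s*=\s*("[^"]*"|'[^']*'|[^\s"'<>=`]+))?"""
)


class AttributeStyleLint(HTMLParser):
    """Warns about attribute styles that break a (hypothetical) house rule."""

    def __init__(self):
        super().__init__()
        self.warnings = []

    def handle_starttag(self, tag, attrs):
        raw = self.get_starttag_text()
        # Strip the leading "<tagname" and the trailing ">".
        body = raw[1 + len(tag):-1]
        for match in ATTR_RE.finditer(body):
            name, value = match.group(1), match.group(2)
            if value is None:
                self.warnings.append((tag, name, "minimized"))
            elif value.startswith("'"):
                self.warnings.append((tag, name, "single-quoted"))
            elif not value.startswith('"'):
                self.warnings.append((tag, name, "unquoted"))


def lint(markup):
    checker = AttributeStyleLint()
    checker.feed(markup)
    checker.close()
    return checker.warnings
```

Every input this flags is still conforming markup; the warnings only express one team's preference, which is the distinction between a lint and a conformance checker drawn above.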
The attitude problem seems to be a result of people not accepting their ideas being questioned or constructively criticised and then getting all defensive. It's unfortunate, but when one side gets all defensive and argumentative, so does the other side and the problem escalates into attacking each other. I know I've been attacked several times just for questioning another's POV and/or asking for clarification. It would help if people would remain calm and rational.
Henri,
Good seeing you here!
I agree that it maybe doesn't cause any technical harm; my opinions are from the perspective of working with a number of different web developers with various skill sets, on the same code, and therefore I personally prefer stricter guidelines to enforce more consistency.
<blockquote cite="http://www.robertnyman.com/2007/06/07/thoughts-on-html-5/#comment-68365">
In HTML, quote omission and boolean attributes without an equals sign have been proper for years.
Oh, absolutely, proper as in correct, but just less consistent. Less experienced web developers seem to have an easier time grasping XHTML since everything has to be closed, and you aren't allowed to omit any end tag, as opposed to HTML, which takes more skills and knowledge to know when it's ok and when it's not.
<blockquote cite="http://www.robertnyman.com/2007/06/07/thoughts-on-html-5/#comment-68365">
In HTML, the slash at the end of void element tags was previously non-conforming. It was too hard for many people and a change to allow the slash in HTML5 was lobbied in against Hixie’s personal opinion.
Given Hixie's famous document, I can imagine that he didn't like that decision. 🙂
Also, good to hear about <code>font</code>!
Lachlan,
I won't get too upset about <code>font</code> then, I promise. 🙂
Yes, I understand that enforcing a certain convention will certainly upset some people, and maybe the document isn't the right place. It's just that my personal experience is that you need to be tough with web developers, or they will just ad lib too much. 🙂
<blockquote cite="http://www.robertnyman.com/2007/06/07/thoughts-on-html-5/#comment-68427">
It would help if people would remain calm and rational.
Most definitely. I think that the discussion here, for instance, has been great. Commenters with strong opinions and great knowledge have been humble and completely open to listening to other people's input, and have tried to understand their perspective.
Thank you!
Actually, <code><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></code> is not valid <abbr>HTML</abbr> 4 strict, even if it probably works in most user agents. Strict mode doesn't allow ending slashes in meta tags.
I'm not sure that is going to change in <abbr>HTML</abbr> 5.
David,
Oh, I know. But the reason I expect it to be OK in HTML 5 is Simon Pieters' statement:
<blockquote cite="http://blog.whatwg.org/html5-geekmeet">
Do you need to be consistent with the use of /> vs. >?
No.
I think the most pressing problem is that it does not have a version annotation in the doctype! It sure will give web browser developers in the future problems…
Daniel,
If it will be as zcorpan suggests above, I think it can become a reasonable situation.
About quotes: I totally agree, but the problem is about users. We (web developers) are a minority. HTML and web pages are made by people: your mother who wants to show something she made, your brother who wants to make a website about his favourite band, et cetera. In that sense the only problem is WYSIWYG editors… They produce crappy code by default and the users don't really care; they want a web page, and they want it very quickly!
I'm not particularly pleased with disabled="disabled"; it is redundant and a waste of time to write. Using only disabled is not the correct way, but maybe a status attribute would be better?
I agree for the "no version number in the doctype", but doctype is unfortunately something we write, not users. They don't care about which version of HTML is used…
Font is something we hope to never see again. But for compatibility we need it. And I'm sure that some WYSIWYG editor or scripts still use it…
Do you think that a span tag will be better? I don't really think so…
> especially with people who are around 20 […] don’t waste your time trying to tell others how to do their job, because they sure as hell won’t listen to you.
I'm 22 and I agree but some people listen, the problem is how to tell.
Nicolas,
Thanks for your comment; seems like we agree on most things then.
<blockquote cite="http://www.robertnyman.com/2007/06/07/thoughts-on-html-5/#comment-68741">
Do you think that a span tag will be better? I don’t really think so…
For inline formatting, yes. But if it's for a block element, I think it should be applied as a class to that element, without adding any extra elements at all.
<blockquote cite="http://www.robertnyman.com/2007/06/07/thoughts-on-html-5/#comment-68741">
I’m 22 and I agree but some people listen, the problem is how to tell.
Oh, definitely. Naturally people should listen, but how to tell things to get them across is about respect and skills, too.
In (X)HTML5, <code>disabled=""</code> and <code>disabled="disabled"</code> are both conforming.
In HTML5, the empty attribute syntax (i.e., without ="") means that the value is empty, and so <code>disabled</code> and <code>disabled=""</code> are equivalent.
Does that help remove the redundancy?
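A small sketch may make the equivalence concrete. For a boolean attribute, what counts is whether the attribute is present at all, not what its value says, which is also why a value of "false" would still disable the control in browsers. The Python below uses the standard `html.parser`; `BooleanAttrProbe` and `has_boolean_attr` are hypothetical names for this illustration, and only the first start tag in the input is inspected.

```python
from html.parser import HTMLParser


class BooleanAttrProbe(HTMLParser):
    """Records whether the first start tag carries a given attribute."""

    def __init__(self, attr_name):
        super().__init__()
        self.attr_name = attr_name
        self.present = None  # None until the first start tag is seen

    def handle_starttag(self, tag, attrs):
        if self.present is None:
            # Presence is all that matters; the value is ignored.
            self.present = any(name == self.attr_name for name, _ in attrs)


def has_boolean_attr(markup, attr_name="disabled"):
    probe = BooleanAttrProbe(attr_name)
    probe.feed(markup)
    probe.close()
    return bool(probe.present)
```

Here `<input disabled>`, `<input disabled="">` and `<input disabled="disabled">` all report the attribute as set, and so does `<input disabled="false">`, while `<input type="text">` does not.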
zcorpan,
To me, at least, that sounds fine. But I agree with Nicolas that its actual value, the same you would give it when scripting it, would be better. For example:
<code>disabled="true"</code>
Do your team rules require a consistent quote character in XML and e.g. Python?
In general, it makes sense to use single quoting in human-written code, because single quotes are more ergonomic to type on common keyboard layouts (and Dvorak, too). Machine-generated code is biased towards double quoting, because of the entity asymmetry in IE.
We could do <code>disabled="true"</code> but due to legacy reasons we are unable to do <code>disabled="false"</code>. Allowing only <code>disabled="true"</code> as conforming would cause confusion, so it is better to allow neither as conforming.
Henri,
<blockquote cite="http://www.robertnyman.com/2007/06/07/thoughts-on-html-5/#comment-71154">
Do your team rules require a consistent quote character in XML and e.g. Python?
It naturally depends on the team and on the project, but the goal (not saying that it happens all the time) is to be consistent no matter the language.
<blockquote cite="http://www.robertnyman.com/2007/06/07/thoughts-on-html-5/#comment-71154">
We could do disabled="true" but due to legacy reasons we are unable to do disabled="false". Allowing only disabled="true" as conforming would cause confusion, so it is better to allow neither as conforming.
If that's the case, I agree. Consistency is the main objective to strive for, in my opinion.
You mention the attitude of HTML5 advocates. I personally found Henri Sivonen to be the worst. Being somewhat dyslexic I had problems with deprecated and depreciated – having never met the difference (I still see deprecated *as* depreciated) – I found myself ridiculed.
If this man wants to mock disabled men, maybe he should not be involved in web standards featuring so many accessibility issues.
For more evidence of his "attitude" http://hsivonen.iki.fi/wannabe/
But then, he's a man who has lived in academia for a long time, perhaps his social skills are not quite tuned to the real world yet.
Former WHATWG member,
Although I don't really want to point fingers here, I appreciate you sharing your experience with the work. And following some of the discussions, I do agree that the understanding about accessibility in the group seems to leave you wishing for more.
Yes, I have this issue with the self-closing tags. I’d like that as soon as I use one, I should be made to use it on all elements. Otherwise I need to use two different tools to check the validity and well-formedness of the XML serialization. Not a big problem, as if I’m sending as XML then I probably instinctively know what I’m doing.
The missing version number sounds like a good thing. I was writing the html5 dtd at the top of my pages before I ever learned about html5…. I was experimenting to see how short I could get it without non-xml browsers falling into Quirks mode… I can’t be bothered writing version numbers. If HTML6 is non-backwards compatible (let’s cross our fingers?) maybe all browsers will be incapable of Quirks mode and then we won’t even need a DTD or any of that other stuff.