How to generate valid XHTML with .NET

A common problem is that the Web Forms and Web Controls in ASP.NET generate invalid XHTML. Amongst these errors are invalid attributes and inline elements without a correct block element container, as well as good ol’ HTML comments in a script block, which prevents you from sending your XHTML with the application/xhtml+xml MIME type.

All these errors are automatically created when the page in question is being rendered in the web browser, meaning that even if you write flawless code you will still fail to get it valid.

To the rescue, some solutions to take care of this:

Another option can be to write your own fix and customize it after your specific needs. This should take something from a day and up, depending on where you set the bar.
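
To give an idea of what such a home-grown fix might do, here is a minimal sketch of one clean-up step: swapping the HTML comment wrappers inside script blocks for CDATA wrappers, so the script survives XML parsing. It is written in JavaScript purely to show the string transformation. In a real ASP.NET project the same logic would have to live in server-side code that post-processes the rendered page. The function name `fixScriptComments` is my own invention, and the regular expression is deliberately naive (it assumes the script body itself never contains `-->`).

```javascript
// Sketch only: fixScriptComments is a hypothetical helper, not part of ASP.NET.
// It rewrites
//   <script ...> <!-- code // --> </script>
// into
//   <script ...> //<![CDATA[ code //]]> </script>
// Script blocks without an HTML comment wrapper are left untouched.
function fixScriptComments(markup) {
  var pattern =
    /(<script\b[^>]*>)\s*<!--([\s\S]*?)(?:\/\/\s*)?-->\s*(<\/script>)/gi;

  return markup.replace(pattern, function (match, open, body, close) {
    // Keep the original script body, only swap the wrapper around it.
    return open + "\n//<![CDATA[\n" + body.trim() + "\n//]]>\n" + close;
  });
}
```

A complete fix would need more steps of this kind (invalid attributes, missing block containers), which is where the "a day and up" estimate comes from.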

Or maybe you’re one of the people who hope ASP.NET 2.0 will take care of all this? In that case, I recommend reading Charl van Niekerk’s posts ASP.NET 2.0 – Part 1 and particularly ASP.NET 2.0 – Part 2.

ASP.NET 2.0 outputs lovely (*irony*) things like:

<form onsubmit="javascript:return WebForm_OnSubmit();">

and

document.all ? 
document.all["Login1_UserNameRequired"] : 
document.getElementById("Login1_UserNameRequired")

(for the vast IE 4 support required?) and still HTML comments in script blocks.
And when it comes to semantics, structure and unobtrusive JavaScript, it’s a mess.

Don’t get me wrong, I think it’s great that it validates, but validating alone doesn’t necessarily make it good code. Validation is just one of the components necessary for a good web page; there’s also, for instance, semantics, accessibility and unobtrusive JavaScript (or, just as important, offering a way for the page to work without JavaScript as well, which connects back to accessibility).
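
As a rough illustration of what I mean by unobtrusive JavaScript, here is a sketch that attaches the submit check from script instead of through an inline onsubmit attribute. The names `validateLogin` and `enhanceForm`, and the field name `UserName`, are made up for this example, and it assumes a web browser that supports `addEventListener`.

```javascript
// Hypothetical validation rule: the user name must not be empty.
// Returns true when the form may be submitted.
function validateLogin(form) {
  var userName = form.elements["UserName"]; // assumed field name
  return userName.value.trim() !== "";
}

// Attach the check from script, leaving the markup free of inline handlers.
// If this script never runs, the form still submits normally and the
// server-side validation remains the safety net.
function enhanceForm(form, validate) {
  form.addEventListener("submit", function (event) {
    if (!validate(form)) {
      event.preventDefault(); // stop the submit only when the check fails
    }
  });
}
```

In a page, this would be called once the form exists, for example `enhanceForm(document.getElementById("login"), validateLogin);`, where the id `login` is again just an assumption.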

My advice to you: some way or another, make the extra effort to make sure your XHTML from .NET is valid. It’s not that big of a deal, and it’s totally reusable in your next .NET-based project.

 

Do you have experience with the above techniques for making it valid, or some other way to accomplish it? Please feel free to share!

16 Comments

  • Martin S. says:

    Well, I don't use any of those techniques. I guess I'm quite lucky when it comes to what's necessary in my development – I don't use objects such as the DataList, which creates invalid XHTML and makes use of JavaScript that causes errors. Maybe I'm silly and too naive, but it works great for me.

    Now the only things disturbing me concerning .NET are the lack of support for correct (XHTML) markup and the objects that could make my work easier if they did validate.

    I've also found out you can't separate content from structure to 100 % when it comes to .NET, accessibility and web standards (can you make a complete separation when building dynamic sites at all?), even though Microsoft seems to have thought some about it – using codebehind is one of the great advantages in .NET if you ask me.

  • Robert Nyman says:

    Martin,

    The automatically generated JavaScripts in .NET are a shame, they seem to be written by someone that doesn't know JavaScript.

    One thing I didn't mention in my post is that I also recommend web developers turn off .NET's adaptive rendering, so it doesn't do what its default setting does: generate (invalid) XHTML for IE and some mix of HTML 3.2 and (once again, invalid) XHTML for all other web browsers.

    To quote Milan Negovan from his post Dissecting Adaptive Rendering (Milan is a guy with the rare interest in both ASP.NET and web standards):

    It's a subject for a heated debate which browsers are superior and which one(s) are "downlevel", but the fact is Internet Explorer/Win is badly outdated and yet is given the upper hand, while more advanced browsers are still treated by the engine as "downlevel" and their rich capabilities aren't taken advantage of.

    The only downside to this is that the validation controls generate scripts that will only work in IE. Solution? Tweak the controls, write your own or only do server-side validation (to me, the ultimate validation is made both on the client-side and the server-side)?

  • Yeah remember to turn off adaptive rendering. The whole idea of that is so nineties.

    A couple of other things: turn off or reduce Viewstate as much as possible and only wrap that weird form that .net insists on using around whatever part of the page that actually needs it instead of around the whole thing. That way you can use normal forms as well when needed (i.e. have more than one form without faking).

    And I'm not in the least surprised that ASP.NET 2.0 still is way behind.

  • Kalle Wibeck says:

    Does anyone know if there exists a set of controls that are made with focus on quality instead of speed in development?

    Martin,

    Of course you can build dynamic sites with a complete separation!
    Just because ASP.Net contains these invalid controls doesn’t mean you have to use them… 😉
    .Net controls are one thing, a dynamic website something completely different.
    The most powerful way to create a complete separation is to let your CMS’s business logic end in an XML structure where you let an XSLT-powered presentation logic take over…
    Programming language isn’t really an issue here; this kind of layered architecture is possible to create in most languages available, even though I prefer .Net since I don’t master Java very well ; )

  • Robert Nyman says:

    Roger,

    Thanks for some additional tips!

  • Robert Nyman says:

    Kalle,

    Absolutely! But who does? 🙂

    I don't want to limit the system developers in what kind of controls they can use; I want to fix what those controls output.

    Otherwise, I do like .NET as a development environment, and I also really like a structure where the presentation layer is generated from XML through XSLT.

    XSLT allows total control of the output (you have to write XHTML which you then choose to output as XHTML or HTML), and it also offers a lot of good ways to handle the data.

  • Kalle Wibeck says:

    Robert,

    In the name of validity my opinion is that there are some built-in ASP.Net controls you shouldn't use at all since they mess things up too much.

    I respect your "non-limiting" point of view but in my world these fixes merely treat the symptom, not the actual disease.

    A set of tailored w3c-compliant user controls on the other hand would indeed treat the disease…

  • Robert Nyman says:

    Kalle,

    I get your point, and of course the ultimate goal is to "treat the disease".

    Another reason for trying not to exclude certain controls is that it's easier to sell in to system developers in projects.

    If I, for instance, were to tell them that they can only use HtmlControls instead of WebControls, it would be a hard sell.

    Hence, for now trying to get the current ones compliant might then be an option to stay friends with them and at the same time get what you want… 🙂

  • Martin S. says:

    Robert, the Dissecting Adaptive Rendering link was kind of great, I will take a closer look at it later. Thanks. 😉

    But which language today is the best when you want to completely separate content from structure? It isn't fully possible with PHP, but how is it with Java?

    I use .NET and will do so 'til I find a perfect programming language and development environment. It still works for me, even though it sometimes messes things up and makes me spend time on things that shouldn't have been messed up from the beginning. 😉

  • Robert Nyman says:

    Martin,

    Thank you. Or rather, thank Milan who wrote it.

    If possible, I go for the XML/XSLT scenario. And that is an option in .NET as well as Java or PHP.

  • Robert Nyman says:

    Josef,

    I just wondered if this is of any use to anyone?

    Well, that is up to each and every one to make that call. But to me, it’s a start to get rid of (some of the most common) invalid code that is automatically generated.

    To disable all invalid things, you need to do some more work. But basically, use inheritance in .NET to override those things that aren’t appropriate.
    Another good practice/tip is not using WebControls, but HtmlControls instead.

    Thanks for the link to the web site with screenshots of the Visual Studio.NET dialog settings. They’re something every System Developer should take a look at.

  • Josef says:

    I just wondered if this is of any use to anyone? It helps slightly, but .NET is still really bad when it comes to fucking with your code automatically. Anyway, have a look at this:

    Some tips for turning off automatic formatting in .NET

    I hope the next version of .NET improves!

  • Josef says:

    Oh sorry Robert, i was misunderstood, when i said "I just wondered if this is of any use to anyone" i meant the website that i made, not your post. Sorry if you misunderstood me. Im from England 😉

    But you are right, i really hope .NET will improve.

  • Robert Nyman says:

    Josef,

    Ha ha! 🙂

    No, no, my fault. Now I understand what you meant!

    And yes, I think the dialog settings in Visual Studio.NET are good to know for everyone.

  • inoodle says:

    Hi,

    Had a brief search around this area recently, prompted by your article… this seems ok so far.

    RiderDesign XHTML 1.0 Transitional Filter

  • Robert Nyman says:

    inoodle,

    Thanks for the tip, looks interesting!

    I'll have to look into that one.

    By the way, I took the liberty to change the text of your link, since it was a little bit long. Hope you don't mind. 🙂
