Do We Really Need XHTML?

I've been following the development of HTML for a long time, Web-wise. I first encountered it in 1993, and eventually wrote some tutorials on the subject. Boy, those were the days: there was nobody around to explain to me that <P> wasn't supposed to be a shorthand for <BR><BR>. (Of course, there doesn't seem to be anyone explaining that to Web design package vendors to this day, so maybe I shouldn't feel too bad.) Over those years, we went from simple document markup to tables, frames, and so much more. A lot of crap got produced, and some of it slipped into the specification itself.

So when I heard that HTML was being translated into XML, I was intrigued. About ten seconds later, I'd arrived at disgusted. I admit I was a bit slow on the uptake, but it was a Tuesday morning. Sue me.

Sure, it sounds good: a version of HTML which has to be structurally correct. Consistent syntax, a move away from the presentational cruft which has been clogging up HTML for years now, and it's XML. And should anyone ask why basing a language on XML makes it better, let me say this: BECAUSE IT'S XML, OKAY? IT'S JUST AUTOMATICALLY COOL! YOU WILL BOW DOWN BEFORE THE MIGHT OF XML, WHOSE MAGNIFICENCE YOU ARE NOT WORTHY TO CONTEMPLATE!

Um, don't mind me. I was channeling Murray Altheim for a second there. So like I was saying, XHTML sounds like a good deal for clean markup. At first. Then I remembered that HTML started out pretty clean too, back when there were just a few scientists playing with it. Then Marc Andreessen came along and, in the process of popularizing the Web and making it the huge success story it is, made some very bad decisions on how to extend it. To his credit, Marc has largely admitted his errors in this area, so there's some hope that he'll come to his senses in other areas as well. As for other browser programmers, some have been slower to realize their own hubris. In fact, I can think of at least one group which is fond of insisting that its bad decisions are widely used, and therefore popular, and therefore definitionally good decisions.

Of course, this kind of thinking is at least more practical than the reasoning of some standards bodies. Their thinking usually goes like this: our decisions are definitionally good. Compelling, isn't it? Kind of makes you want to contribute some feedback to the process, like smacking the hell out of somebody. If I wanted dogmatic assertions of revealed truth, I'd get my ass to a church (or listen to tapes of a presidential candidate debate).

So why do we need XHTML? We don't. What we need is HTML 5, with some useful extensions and maybe some deprecation of things which HTML no longer needs. Unfortunately, we're never going to get it, and don't let anyone try to tell you that XHTML is it. There are too many stupid decisions surrounding XHTML, and basing it on XML was one of the biggest. HTML can and should be its own thing. Understand that I have nothing against XML: it has its uses, and some of them are very cool. However, XML is a very complicated beast, despite its billing as the panacea for all that ails us.

One of the biggest strengths of HTML was its simplicity and fairly minimal set of requirements. You could write it in any combination of upper- and lowercase, and browsers didn't care. It had a very simple syntax. It asked only that authors and tools respect a fairly simple structure and not mess up their attribute quoting. As it turns out, even that was too much. Once the browsers started trying to save authors from their own mistakes, it was all over. HTML had fallen in with a bad crowd, and we all stood by (or even cheered it along) as it got more and more rotten. Sure, go ahead, blame the browser vendors all you want, but they're only part of the story. The other part is the unthinking encouragement that we gave every time we wrote mangled HTML, expected it to work, and filed bug reports when it didn't do what we meant.
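
For anyone who wants to see the difference rather than take my word for it, here's a rough sketch in Python, using two standard-library parsers as stand-ins: html.parser for the forgiving HTML attitude, and xml.etree.ElementTree for XML's insistence on well-formedness. The helper names and the sample markup are made up for illustration, and neither parser behaves exactly like a real browser, so treat it as a demonstration of the two attitudes and nothing more.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TagCollector(HTMLParser):
    """Hypothetical helper: records every start tag the lenient parser sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser hands us tag names already folded to lowercase.
        self.tags.append(tag)


def html_tags(markup):
    """Return the start tags found by the forgiving HTML parser."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def is_well_formed_xml(markup):
    """Return True only if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Invented samples: old-school sloppy HTML versus an XHTML-style fragment.
SLOPPY = "<P>First paragraph<BR>Second line<p>Another paragraph"
STRICT = "<div><p>First paragraph<br/>Second line</p><p>Another paragraph</p></div>"

if __name__ == "__main__":
    print(html_tags(SLOPPY))           # mixed case and unclosed tags go through
    print(is_well_formed_xml(SLOPPY))  # the XML parser refuses the same text
    print(is_well_formed_xml(STRICT))  # the tidy version is accepted
```

The sloppy sample sails through the HTML parser, mixed case and missing end tags included, while the XML parser rejects it outright. Only the version with every tag closed and consistently cased gets past both.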

So now someone's taken the child, after years of neglect and abuse, given it a new set of clothes, and sent it back to run with the same crowd that corrupted it the first time around. Why should we think things will turn out any differently this time?