Slashdot’s Validity

Published 12 years, 5 months ago

With the Redesign Watch back up and running, the most recent entry is Slashdot, the venerable geek portal so infamous for its ability to kill web servers with a single link that the site’s name is a verb meaning “to bring a server grinding to a halt”.

I was asked in a comment:

What’s your feeling on slashdot being HTML 4.01 (and slightly failing validation) VS XHTML 1.0?

My feeling is good.  Why?  Let’s take the second part first.

When it comes to HTML versus XHTML, I just do not care.  Sure, sure, people will tell you that XHTML is XML so it’s more transformable or something.  That’s a very good argument when the XHTML is well-formed and valid.  It’s also a very good argument for using HTML when it’s well-formed and valid.  Conversely, neither HTML nor XHTML is easily transformed when ill-formed and invalid.  This is an experiential point of view, too: I’ve written XSLT (which is itself so tortuous and ugly that it almost by definition cannot be called well-formed) to transform both HTML and XHTML, and the effort is pretty much the same each way—assuming well-formed, valid markup.

So as far as I’m concerned, there’s really no major practical difference between HTML and XHTML.  There are plenty of minor practical differences, like having to throw trailing slashes on all your empty elements in XHTML and needing some namespace information.  Some people will tell you the whole MIME-type thing is a major practical concern, but I’m just not that much of a purist.  Take that for whatever it’s worth.
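To make those minor differences concrete, here is a small sketch using the XML parser in Python's standard library. The two markup strings are invented for illustration and are not taken from Slashdot's pages. The point is only that an XML parser wants the trailing slash on empty elements and reports the namespace as part of each element name.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical snippets, made up for this illustration.
xhtml_snippet = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>First line<br />second line</p></body></html>"
)
html_snippet = "<html><body><p>First line<br>second line</p></body></html>"


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup as-is."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


root = ET.fromstring(xhtml_snippet)
print(root.tag)                       # namespace-qualified element name
print(is_well_formed(xhtml_snippet))  # empty element closed with a slash
print(is_well_formed(html_snippet))   # bare <br> is never closed
```

The second snippet is perfectly ordinary HTML 4.01; it simply is not XML, which is why an XML parser rejects it.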

I mean, imagine a world where Slashdot had used XHTML instead of HTML, and was failing validation.  How would that be any better or worse than things are now?

Okay, so that’s the second part.  The first part, the failure to validate, is not something I can get too terribly upset about.  Slashdot, as a site that accepts ads, is going to get horrible markup shoved into its pages.  That’s just the way it is.  If you want major sites to be perfectly valid, then in all honesty advertisers are the place to start.  So they’re already operating with a major handicap there.

Even if we were to ride our high horses along a very hard line and say that ads are just no excuse, I’d be hard-pressed to fault the job they’ve done.  For example, I ran a check on the Slashdot home page.  Out of 1,262 lines of code, there were exactly four validation errors, and that’s using HTML 4.01 Strict—you’ll note they bypassed Transitional, which only increases my respect.  Three of the errors revolved around an image in a noscript element, and the last was due to the presence of a language attribute on a script element—something they can fix in fifteen seconds, once it gets to the top of the to-do list.
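As a rough idea of how small that last fix is, the sketch below uses Python's standard `html.parser` module to locate `script` elements that carry a `language` attribute. The sample markup and the `LanguageAttributeFinder` class are hypothetical, and this is a narrow spot check, not a substitute for running a real validator.

```python
from html.parser import HTMLParser


class LanguageAttributeFinder(HTMLParser):
    """Collect (line, column) positions of <script> tags with a language attribute."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag == "script" and any(name == "language" for name, _ in attrs):
            self.hits.append(self.getpos())


# Hypothetical markup, made up for this illustration.
sample = (
    '<script language="JavaScript" type="text/javascript">var a = 1;</script>\n'
    '<script type="text/javascript">var b = 2;</script>\n'
)

finder = LanguageAttributeFinder()
finder.feed(sample)
finder.close()
print(finder.hits)  # positions of the offending <script> tags
```

Only the first `script` element is reported; the second already relies on `type` alone, which is what HTML 4.01 Strict expects.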

You know what?  I’d be ecstatic to have that low a failure rate when launching the markover of an incredibly complex site like Slashdot.  Think about all the content they have to manage, stitch together, and offer up.  Four errors out of all that dynamically assembled markup?  I say somebody should organize them a parade for doing such a good job, and showing that any site can make use of and benefit from standards.

I’m also really looking forward to the restyling of Slashdot through user-created style sheets, and the Greasemonkey enhancements built on top of this new structure.  If there’s a site whose readers are inherently primed to script the holy bejeezus out of it, that would be the one.

Would I be happier if they’d managed to achieve total validation?  Of course.  In the meantime, though, I’m going to be very nearly as happy for what they’ve accomplished, and also for the simple fact of it being another major site that’s taken a big step forward.  Progress is always a cause for celebration in my world.

  1. You missed one other major cause of bad markup, and it’s in the case of sites that are designed for a company: the company itself. When the client receives the site and proceeds to copy and paste MS Word HTML into the CMS, or uses some other WYSIWYG interface to input their content, well, you can be pretty sure that every single page with content will break.

    As for HTML vs. XHTML, you’re right. No point in nitpicking when they don’t validate.

  2. My feeling is that, unless the site needs to embed some XML technology (e.g. MathML) into the markup then there is no reason to use XHTML. I believe that if the content is meant to be purely informational with no data buried in it then the document should be HTML. So, I agree with you and the direction Slashdot went.

    But I am just an amateur who likes standards (and is picky). What would I know? :)

  3. I have to admit, I too had kind of wondered where exactly your take would fall on this one. Happily, it came out pretty much as I had thought it would. I’d have to agree that I am overjoyed that they have gotten as far as they have with the redesign, and while I also understand that perfection as a goal is one to be aspired to, I am not so naive as to say it is always attainable.

    With that said, and with your position on the “nonvalidity of Slashdot” now answered, I can only imagine what lies in wait for your upcoming workshop in the windy city of Chicago.

    If it is anything like your books (yeah, I admit, I own all of them and have just about worn the covers off of two), I know chances are I’ll be leaving Chicago with a head full of questions, what-ifs and I-wonders, a few gallons of almost giddy anticipation to try out the new tricks and techniques, and wondering where I’ll ever find the time to actually put even half of what you covered into practice.

    In short, I can’t wait! Thanks for not backing down from the slashdot thing. Look forward to seeing you in Chicago.

  4. Slashdot, as a site that accepts ads, is going to get horrible markup shoved into its pages. That’s just the way it is.

    iframes or object elements

  5. […] 2; 4:06 pm “Eric Meyer just posted… Slashdot’s Validity which I rather liked” As we all know that old s […]

  6. To nitpick, HTML cannot be well-formed or ill-formed. For HTML you can only be valid or invalid. (Which makes sense, as you need to go through a tag soup parser which expects a specifically formatted document.)

  7. Also, the media type (MIME) thing is important. text/html means your document will be handled by the tag soup parser and the result will be an HTML DOM, always. */*+xml or application/xml means your document will be treated as XML. It will mean that attributes like xml:lang have a different meaning (namespaces are recognized), it will mean that node names are returned the way you wrote them in your markup, and a lot of other things. It also means that you are using XHTML.

  8. XHTML vs. HTML:

    I have one argument in favour of using XHTML instead of HTML; and it’s the reason that I’m currently porting to XHTML, and not to HTML.

    The argument is that in the future I might be tempted to start importing my pages into a single page interface using liberal doses of xmlhttp and all that stuff. In that case XHTML pages, which are (supposed to be) valid XML, are far easier to parse than HTML pages. It’ll make my future job easier.

    Other than this (admittedly as yet theoretical) reason I agree that there’s little reason to prefer XHTML over HTML.

  9. Eric: totally agree with you. I don’t see what all the fuss about XHTML is for. HTML does its job quite well. A few years ago a lot of people jumped on the XHTML bandwagon, and I think many didn’t really know why they were doing it; they just did it because everyone else was.

  10. Does a site validate if it uses italics for more than 90% of the text displayed? I believe it shouldn’t.

  11. A little off topic (but not much), but is XSLT the future of styling? Or will CSS be around for a long time?

  12. Yeah, as Anne said, the MIME type is important, because if it’s text/html, then your markup is HTML, regardless of what it looks like when you view source.

    The major advantage of XHTML is that you can embed other XML vocabularies such as SVG or MathML – that’s not possible with HTML. If you’re not doing that and don’t intend to make use of it at some point, then you’re better off using HTML, for the most part.

  13. Huzzah!

    And XHTML is a nice toy.

  14. I’ve written XSLT (which is itself so tortuous and ugly that it almost by definition cannot be called well-formed) to transform both HTML and XHTML

    Just a little thing here. XSLT can only be well formed, otherwise it simply doesn’t work. In the context of what you’re talking about; XSLT doesn’t need to be well-formed because, well, it’s not a markup language.

    Besides that, it’s used to transform XML from one design to another (XML design, that is). So unless it’s XHTML, again, it won’t work :)

  15. Any XML-based format such as pure XHTML, XSL or RSS needs to be well-formed. As for XHTML served as application/xhtml+xml, we all know this is effectively dead due to IE6’s refusal to accept it. Sadly, IE7 won’t either, according to Microsoft.

  16. @Liam Egan: It’s not true that XSLT can only output XHTML; XSLT can output XML (including XHTML), HTML or plain text. And, thanks to the justly reviled disable-output-escaping attribute, it can even spit out stuff that is “almost, but not quite well-formed XML”.

  17. I applaud their choice to use real HTML from the get-go rather than building an XHTML document and then sending it as text/html—a very pointless and borderline harmful practice, in my opinion.

    Don’t screw with MIME types; if your page is being completely mislabeled at the server level, then how can you applaud yourself for something as petty as validation?

    Of course, XHTML is awesome if you’re doing it properly. ;) But ten points to them for using the right DOCTYPE, and a Strict one at that!

  18. It’s a difficult balance and one should know where to draw the line. My mum looking up recipes on the web is unlikely to view source or run it through a validator. We should go for standards to progress our careers; however, the web to me is about the average Joe who can put up a website, and the browsers should try to render it now and in the future.
