Posts in the Web Category

Inspector Scrutiny

Published 14 years, 10 months past

It’s been said before that web inspectors — Firebug, Dragonfly, the inspectors in Safari and Chrome, and so forth — are not always entirely accurate.  A less charitable characterization is that they lie to us, but that’s not exactly right.  The real truth is that web inspectors repeat to us the lies they are told, which are the same lies we can be told to our faces if we ask directly.

Here’s how I know this to be so:

body {font-size: medium;}

Just that.  Apply it to a test page.  Inspect the body element in any web inspector you care to fire up.  Have it tell you the computed styles for the body element.  Assuming you haven’t changed your browser’s font sizing preferences, the reported value will be 16px.

You might say that that makes sense, since an unaltered browser equates medium with “16”.  But as we saw in “Fixed Monospace Sizing”, the 16px value is not what is inherited by child elements.  What is inherited is medium, but web inspectors will never show you that as a computed style.  You can see it in the list of declared styles, which so far as I can tell lists “specified values” (as per section 6.1 of CSS2.1).  When you look to see what’s actually applied to the element in the “Computed Styles” view, you are being misled.

We can’t totally blame the inspectors, because what they list as computed styles is what they are given by the browser.  The inspectors take what the browser returns and prettify it for us, and give us ways to easily alter those values on the fly, but in the end they’re just DOM inspectors.  They don’t have a special line into the browser’s internal data.  Everything they report comes straight from the same DOM that any of us can query.  If you invoke:

var obj = document.getElementsByTagName('body')[0];
alert(getComputedStyle(obj,null).getPropertyValue('font-size'));

…on a document being given the rule I mentioned above, you will get back 16px, not medium.

This fact of inspector life was also demonstrated in “Rounding Off”.  As we saw there, browsers whose inspectors report integer pixel values also return them when queried directly from the DOM.  This despite the fact that it can be conclusively shown that those same browsers are internally storing non-integer values.

Yes, it might be possible for an inspector to do its own analysis of properties like font-size by checking the element’s specified values (which it knows) and then crawling up the document tree to do the same to all of the element’s ancestors to try to figure out a more accurate computed style.  But what bothers me is that the browser reported computed values that simply aren’t accurate in the first place.  It seems to me that they’re really “actual values”, not “computed values”, again in the sense of CSS2.1:6.1.  This makes getComputedStyle() fairly misleading as a method name; it should really be getActualStyle().
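
To make that ancestor crawl a little more concrete, here is a minimal sketch.  It is not how any shipping inspector works; the node shape (a parent reference plus a declared map of specified values) and the function name inheritedFontSize are made up for illustration.

```javascript
// Hypothetical node shape: { parent: node or null, declared: { 'font-size': 'medium' } }
// Walk up from the element until some ancestor declares a font-size;
// fall back to the initial value, medium, if nobody does.
function inheritedFontSize(node) {
  for (let current = node; current; current = current.parent) {
    const value = current.declared && current.declared['font-size'];
    if (value && value !== 'inherit') return value;
  }
  return 'medium';
}

const body = { parent: null, declared: { 'font-size': 'medium' } };
const p = { parent: body, declared: {} };
const span = { parent: p, declared: { 'font-family': 'monospace' } };

console.log(inheritedFontSize(span)); // the keyword declared on body, not a pixel value
```

This only finds the nearest declaration.  Relative values such as 1em or 100% would still have to be resolved against the next ancestor up, which is exactly the kind of second-guessing that makes this harder than it looks.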

No, I don’t expect the DOM or browsers to change this, which is why it’s all the more important for us to keep these facts in mind.  Web inspectors are very powerful, useful, and convenient DOM viewers and editors, essentially souped-up interfaces to what we could collect ourselves with JavaScript.  They are thus limited by what they can get the browser to report to them.  There are steps they might take to compensate for known limitations, but that requires them to second-guess both what the browser does now and what it might do in the future.

The point, if I may be so bold, is this:  never place all your trust in what a web inspector tells you.  There may be things it cannot tell you because it does not know them, and thus what it does tell you may on occasion mislead or confuse you.  Be wary of what you are told — because even though all of it is correct, not quite all of it is true, and those are always the lies that are easiest to believe.


Fixed Monospace Sizing

Published 14 years, 10 months past

Monospace text sizing is, from time to time, completely unintuitive and can be quite maddening if you don’t look at it in exactly the right way.  Fortunately, there is a pretty simple workaround, and it’s one you might want to consider using even if you weren’t aware that a problem existed.

But first, allow me to lay some foundations.  Assuming no other author styles beyond the ones shown, consider the following:

span {font-family: monospace;}

<p>This is a 'p' with a <span>'span'</span> inside.</p>

All right, what should be the computed font-size of the span element?  Remember, there are no other author styles being applied.

The savvier among you will have said: “It depends, but most likely 13px.”  That’s because here, the size of the monospace text is controlled by the browser’s preferences.  The vast majority of users, of course, have never touched their default settings of “16” for proportional fonts and “13” for monospace/fixed fonts.  For them, then, the answer is 13px.  Similarly, if I’d asked about the p element’s computed font-size, the answer would be: “It depends, but most likely 16px.”

So let’s add a bit more and see where we land.

span {font-family: monospace; font-size: 1em;}

<p>This is a 'p' with a <span>'span'</span> inside.</p>

As before: bearing in mind that there are no other author styles, what should be the computed font-size of the span element?

In this case, building on the previous question and answer, you might say, “It depends, but most likely 16px.”  The reasoning here is pretty straightforward:  since the computed font-size of the p element is 16px, the font-size: 1em; assigned to the span will result in it having the same size.

And that’s true… in two of five browsers I tested: Opera 10 and Internet Explorer 8.  In the other three I tested—Firefox 3.6, Safari 4, and Chrome 4—the computed (and rendered) font-size of the span is 13px, the same as in our first example.  This result holds true if the rule is changed to use font: 1em monospace; instead of the two separate properties.  The behavior continues to persist even when adding specific font families, like Courier New, Courier, Andale Mono, and so on to the rule.  It also persists if 1em is converted to 100%.

So in other words, even though I have written CSS that explicitly says “Make the font-size of this element the same as its parent”, three of five browsers apparently ignore me.

I say “apparently” because what’s happening is that those browsers are allowing the span to inherit the default font-size from its parent (and thus, indirectly, all its ancestors), but the default font-size is medium.  If you go look up medium, you find out that it doesn’t have a defined numeric size. So what those browsers do is equate medium with the preference settings, which means it’s different for monospace fonts than for everything else.

In other words, those three browsers are doing something like this:

  1. This span needs to have the same font-size as its parent element.
  2. The parent’s font-size is medium, even though when my web inspector (or an author’s DOM script) asks, I report the 16px I used to output the text.  So the span’s font-size is actually medium.
  3. This medium-sized span is using a monospace font.  The preference setting for monospace is “13”, and I equate medium with the preference setting, so I’ll output the span using 13-pixel text.
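
As a rough model of those three steps, the logic can be sketched as follows.  This is only a model: none of it is taken from any browser’s source, and the prefs object, the function name renderedSize, and the reduction of the font family to a boolean are all assumptions made for illustration.

```javascript
// Default preference settings assumed throughout this post.
const prefs = { proportional: 16, monospace: 13 };

// Model of a "resizing" browser: what gets inherited is the keyword medium,
// and medium is only turned into pixels once the font family is known.
function renderedSize(inherited, usesMonospace, prefs) {
  if (inherited === 'medium') {
    return usesMonospace ? prefs.monospace : prefs.proportional;
  }
  return inherited; // a plain number of pixels is passed through untouched
}

console.log(renderedSize('medium', false, prefs)); // the p: 16
console.log(renderedSize('medium', true, prefs));  // the span: 13
```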

Opera 10, as I said, doesn’t do this, even if your monospace font preference setting is the default value of “13” or indeed different from the preference for non-monospace fonts.  And IE8 doesn’t appear to do it either, although you can’t set numeric font size preferences in IE8 so what it’s actually doing is open to interpretation.  Oh, IE8, you inscrutable little scamp, you.

All that might seem reasonable enough, but it turns out that’s not the whole story.  No, the three resizing browsers are being a good deal more “clever”, if that’s actually the word I want, than that.  In fact, what those browsers do makes it seem like they use the two preference settings to create a ratio, and that ratio is used to scale monospace text.  That’s not actually what’s happening, but it looks that way at first.  To see what I mean, let’s consider:

span {font-family: monospace; font-size: 2em;}

<p>This is a 'p' with a <span>'span'</span> inside.</p>

Again: in the absence of other author styles, what should be the computed font-size of the span element?

The answer: “It depends, but most likely 26px as long as we aren’t talking about Opera 10 or IE8.  If it is one of those two, then most likely 32px.”  Why?  Because the resizing browsers see the font-size: 2em; declaration as “twice medium” and twice 13 is 26.  Opera 10 and IE8, as previously established, don’t do the resizing.  Or else they simply interpret medium as being equal to the proportional font size preference setting.  Whatever.

Okay.  So what all this means is that in many browsers, you can declare that an element’s font size should be twice the size of its parent’s and have it actually be 1.625 times the size — or, if you want to look at it another way, 0.8125 times the size you expected it to be.  The 0.8125 comes from 26/32, which of course reduces to 13/16.  If you were to adjust your browser’s preferences so the monospace setting is “15”, then monospace fonts would be 0.9375 (15/16) times the expected size.
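
The arithmetic is easy enough to check.  Here is a small sketch, with made-up function and parameter names, that works out the size a resizing browser ends up with and how far that falls from what the author asked for.  The ratio is an observed effect under the stated preference settings, not necessarily something the browsers compute.

```javascript
// em: the declared multiplier; prefs: the two default-size preference settings.
function monospaceShortfall(em, prefs) {
  const expected = em * prefs.proportional; // what the author asked for
  const actual = em * prefs.monospace;      // what a resizing browser draws
  return { expected, actual, ratio: actual / expected };
}

console.log(monospaceShortfall(2, { proportional: 16, monospace: 13 }));
// { expected: 32, actual: 26, ratio: 0.8125 }
console.log(monospaceShortfall(2, { proportional: 16, monospace: 15 }).ratio);
// 0.9375
```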

But — and here’s where things get really fun — this is not always so.  See, you may not have run into this problem if you’ve been declaring specific font families with no generic fallback.  Consider this variation (note that I dropped back to 1em for the font-size):

span {font-family: "Courier New"; font-size: 1em;}

<p>This is a 'p' with a <span>'span'</span> inside.</p>

This time, in every one of the five browsers I mentioned before, assuming the browser defaults, the computed (and rendered) font-size of the span will be 16px.  Not 13px.  And the only difference is that we switched from a generic font family to a specific one.

“Hey presto!” you shout.  “We’ll just tack the generic family on the end there and be right as rain!”  Sadly, no.  For if you do this:

span {font-family: "Courier New", monospace; font-size: 1em;}

<p>This is a 'p' with a <span>'span'</span> inside.</p>

…then the answer to the question I keep asking will be:  “It depends, but given browser defaults it will be 16px, unless we’re talking about Safari.  In that case, it’s 13px.”

Really.  Alone among the browsers I tested, Safari goes back to doing the resizing when you provide a generic fallback to your specific family.  Or even multiple families.  Do your best to make sure the user at least gets a fixed-width font, and you get a size smaller than you’d intended.  (You can get the back story on this in a late-2006 post on the Surfin’ Safari blog.)

So what do we do?  Get creative.  That’s what the ARIA folks did in their specification’s style sheet, where they declare two font stacks: the first with a generic fallback, and the second without it.  That works, but it’s ugly.  I didn’t like that at all.  And then, halfway through writing up this post, a fix came to me like a shot in the dark.  Check this out:

span {font-family: "Courier New", monospace, serif; font-size: 1em;}

<p>This is a 'p' with a <span>'span'</span> inside.</p>

This time around, the answer is:  “It depends, but given browser defaults, 16px.”

Really!  Even in Safari!  And in all tested browsers, it falls back to a generic monospace font at the requested size even if the specific family (or families) we declare aren’t available!  This can be verified by altering the specific font family to something that doesn’t actually exist:

span {font-family: "Corier Neu", monospace, serif; font-size: 1em;}

<p>This is a 'p' with a <span>'span'</span> inside.</p>

Monospacey goodness at the intended, parent-matching size.  It’s enough to make a body believe in monotheism.

Since I generally assume that anything I devise was already invented by someone else, I went Googling for prior art.  And wouldn’t you know it, the Wikipedia folks had worked it out around the end of last year.  This, of course, supports my contention that Wikipedia is the new Steve Allen.  I also found some claims that ending the font stack with monospace, monospace would have the same effect, but that wasn’t borne out in my testing.  Perhaps it worked in older versions of browsers but no longer does.

I did leave out another way to make monospaced fonts behave as expected, which you may have already figured out from the preceding: declare the font-size for any parent of a monospaced element to be a length value, along the lines of body {font-size: 12px;}.  That will pass the length value down the document tree to the monospaced element via inheritance, which will use it without resizing it in every browser I tested.  Though you may have heard that page zooming makes pixel-sized text okay, I’m not really convinced.  Not yet.  There are too many people who don’t know how to zoom, and too many whose browsers aren’t advanced enough to zoom pages.  Even in page-zooming browsers, there are problems with pixel text.  So I’m still on the ems-and-percentages bandwagon.

In fact, there are a fair number of details and extra browser oddities that I left out of this, as it’s already way more than long enough, and besides you don’t really want to hear the gory details of manually stepping through 37 different preferences settings just to verify a theory.  Plus you already heard about the font-size rounding investigation that spawned off of this one, about halfway through.  I think that’s more than enough for the time being.

I should also lay down a caveat: it’s possible that this behavior will be interpreted as a bug by the Safari team and “fixed”, if that’s the word I want, in a future release.  I really hope not — and if they’re looking for ways to improve how they handle monospace font sizing, I have a few suggestions — but it is possible.  Adjust your expectations accordingly.

And with that, I’m going to stop now.  I hope this will be useful to you, either now or in the future.


Rounding Off

Published 14 years, 10 months past

In the course of digging into the guts of a much more complicated problem, I stumbled into an interesting philosophical question posed by web inspection tools.

Consider the following CSS and HTML:

p {font-size: 10px;}
b {font-size: 1.04em;}

<p>This is text <b>with some boldfacing</b>.</p>

Simple enough.  Now, what is the computed font-size for the b element?

There are two valid answers.  Most likely one of them is intuitively obvious to you, but take a moment to contemplate the rationale for the answer you didn’t pick.

Now, consider the ramifications of both choices on a situation where there are b elements nested ten layers deep.

If you hold that the answer is 10px, then the computed font-size of the tenth level of nesting should still be 10px, because at every level of nesting the mathematical answer will be rounded down to 10.  That is: for every b element, its computed font-size will be round(10*1.04), which will always yield 10.

If, on the other hand, you hold that the answer is 10.4px, then the computed font-size of the tenth level of nesting should be 14.802442849px.  That might get rounded to some smaller number of decimal places, but even so, the number should be pretty close to 14.8.

The simplest test, of course, is to set up a ten-level-deep nesting of b elements with the previously-shown CSS and find out what happens.  If the whole line of text is the same size, then browsers round their computed font-size values before passing them on.  If the text swells in size as the nesting gets deeper, then they don’t.
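
Before running that test in a browser, the two predictions can be worked out with a few lines of script.  This is just the arithmetic from the preceding paragraphs, not a claim about how any browser does it; the function name and its parameters are my own invention.

```javascript
// Font-size at a given nesting depth of b {font-size: 1.04em} inside
// p {font-size: 10px}, with or without rounding at every level.
function nestedFontSize(base, factor, depth, roundEachLevel) {
  let size = base;
  for (let level = 0; level < depth; level++) {
    size *= factor;
    if (roundEachLevel) size = Math.round(size);
  }
  return size;
}

console.log(nestedFontSize(10, 1.04, 10, true));             // 10
console.log(nestedFontSize(10, 1.04, 10, false).toFixed(2)); // 14.80
```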

As it happens, in all the browsers I’ve tested, the text swells, so browsers are passing along fractional pixel values from level to level.  That’s not the interesting philosophical question.  Instead, it is this:  do web inspectors that show integer font-size values in their ‘computed style’ windows lie to us?

To see what I mean, load up the font size rounding test page in Firefox and use Firebug to inspect the “1(“, which is the first of the b elements, in the first (1.04em) test case.  Make sure you’re looking at the “Computed Styles” pane in Firebug, and you’ll get a computed font-size of 10.4px.  That makes sense: it’s 10 × 1.04.

Now try inspecting that same “1(” in Safari or Opera.  Both browsers will tell you that the computed font-size of that b element is 10px.  But we already know that it’s actually 10.4px, because the more deeply-nested layers of b elements increase in size.  These inspectors are rounding off the internal number before showing it to us.  Arguably, they are lying to us.

But are they really?  The reason to doubt this conclusion is that the values shown in those inspectors accurately reflect the value being used to render the characters on-screen.  To see what I mean, look at the last example on the test page, where there’s sub-pixel size testing.  The “O” characters run from a flat 10 pixels to a flat 11 pixels in tenths (or less) of a pixel, all of their font-sizes assigned with inline style attributes to pin the characters down as much as possible.  In Safari, you can see the size jump up one pixel right around the text’s midpoint, where I wrote font-size: 10.5px.  So everything from 10px to 10.49px gets drawn at 10 pixels tall; everything from 10.5px to 11px is 11 pixels tall.  Safari’s inspector reflects this accurately.  It’s telling you the size used to draw the text.

A comparative illustration of the many-O test case in three different browsers showing three different results.  The browsers used to create the illustration were Safari, Opera, and Firefox.

In Opera 10.10, you get the same thing except that the jump from 10 to 11 pixels happens on the very last “O”, both visually and in the inspector (Dragonfly).  That means that when it comes to font sizes, Opera always rounds down.  Everything from 10px to 10.9px — and, presumably, 10.99999px for as many nines as you’d care to add — will be drawn 10 pixels tall.  Brilliant.
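
The two rounding behaviors seen so far can be put side by side in a short sketch.  It models only what I observed on the test page (nearest-pixel in Safari, always-down in Opera 10.10); the drawnSize object and its keys are names I made up, and neither function is taken from a browser.

```javascript
// Observed behavior only: Safari appeared to round to the nearest pixel,
// while Opera 10.10 appeared to round down.
const drawnSize = {
  safari: (size) => Math.round(size),
  opera: (size) => Math.floor(size),
};

// Print each declared size next to the pixel size each model would draw.
for (const size of [10, 10.49, 10.5, 10.9, 11]) {
  console.log(size, drawnSize.safari(size), drawnSize.opera(size));
}
```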

In Firefox for OS X, there’s no size jump.  The “O” characters look like they form a smooth line of same-size text.  In fact, they’re all being drawn subtly differently, thanks to their subtly different font-size values.  If you use OS X’s Universal Access screen zooming to zoom way, way in, you can see the differences in pixel shading from one “O” to the next.  Even if you don’t, though, the fact that it’s hard to tell that there is an increase in size from one end of the line to the other is evidence enough.

In Firefox for XP, on the other hand, the size jump occurs just as it does in Safari, going from 10 pixels to 11 pixels of text size at the 10.5 mark.  But Firebug still reports the correct computed font-size values.  Thus, its reported value doesn’t match the size of the text that’s been output to the screen.  Arguably, it’s lying just as much as Safari and Opera, just in a different way.

But, again: is it really?  The computed values are being accurately reported.  That there is a small variance between that fractional number and the display of the text is arguably irrelevant, and can lead to its own confusion.  Situations will arise where apparent rounding errors have occurred — I see people complain about them from time to time — when the apparent error is really an artifact of how information is delivered.

I have my own thoughts about all this, but I’m much more interested in the thoughts of others.  What do you think?  Should web inspectors report the CSS computed values accurately, without regard to the actual rendering effects; or should the inspectors modify the reported values to more accurately reflect the visual rendering, thus obscuring the raw computed values?

Addendum 10 Feb 10: I’ve updated the test page with a JS link that will dynamically insert the results of getComputedStyle(el,null).getPropertyValue("font-size") into the test cases.  The results are completely consistent with what the inspectors report in each browser.  This tells us something about the inspectors that most of us probably don’t consciously realize: that what they show us rests directly on the same JS/DOM calls we could write ourselves.  In other words, inspectors are not privileged in what they can “see”; they have no special view into the browser’s guts.  Thus another way to look at this topic is that inspectors simply repeat the lies that browsers tell the world.
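
For anyone who wants to try the same thing on their own pages, a sketch along these lines should do it.  To be clear, this is not the script on the test page; the function name collectFontSizes and its view parameter are assumptions of mine, with view standing in for window so the function can be exercised outside a browser.

```javascript
// view is anything with a getComputedStyle method; in a browser, pass window.
function collectFontSizes(elements, view) {
  return Array.from(elements, (el) =>
    view.getComputedStyle(el, null).getPropertyValue('font-size')
  );
}

// In a browser console:
//   collectFontSizes(document.getElementsByTagName('b'), window);
```

Whatever comes back from that call is, so far as I can tell, exactly what the inspector in the same browser will show you.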


MIX Judging

Published 14 years, 11 months past

I was recently honored to be asked to be a judge for the MIX 10k Smart Coding Challenge, running in conjunction with Microsoft’s MIX conference.  The idea is to create a really great web application that totals no more than 10KB in its unzipped state.

Why did I agree to participate?  As much as I’d like to say “fat sacks of cash”, that wasn’t it at all.  (Mostly due to the distinct lack of cash, sacked or otherwise.  Sad face.)  The contest’s entry requirements actually say it for me.  In excerpted form:

  • The entry MUST use one or more of the following technologies: Silverlight, Gestalt or HTML5…
  • The entry MUST function in 3 or more of the following browsers: Internet Explorer, Firefox, Safari, Opera, or Chrome…
  • The entry MAY use any of the following additional technology components…
    • CSS
    • JavaScript
    • XAML/XML
    • Ruby
    • Python
    • Text, Zip and Image files (e.g. png, jpg or gif)

Dig that:  not only is the contest open to HTML 5 submissions, but it has to be cross-browser compatible.  Okay, technically it only has to be three-out-of-five compatible, but still, that’s a great contest requirement.  Also note that while IE is one of the five, it is not a required one of the five.

I imagine there will be a fair number of Silverlight and Gestalt entries, and I might look at them, but I’m really there — was really asked — because of the HTML 5 entries.  By which I mean the open web entries, since any HTML 5 entry is also going to use CSS, JavaScript, and so on.

The downside here is that the contest ends in just one week, at 3pm U.S. Pacific time on 29 January.  I know that time is tight, but if you’ve got a cool HTML 5-based application running around in your head, this just might be the time to let it out.


Correcting Corrupted Characters

Published 15 years, 1 month past

At some point, for some reason I cannot quite fathom, a WordPress or PHP or MySQL or some other upgrade took all of my WordPress database’s UTF-8 and translated it to (I believe) ISO-8859-1 and then dumped the result right back into the database.  So “Emil Björklund” became “Emil BjÃ¶rklund”.  (If those looked the same to you, then I see “BjÃ¶rklund” for the second one, and you should tell me which browser and OS you’re using in the comments.)  This happened all throughout the WordPress database, including to commonly-used characters like ‘smart’ quotes, both single and double; em and en dashes; ellipses; and so on.  It also apparently happened in all the DB fields, so not only were posts and comments affected, but commenters’ names as well (for example).

And I’m pretty sure this isn’t just a case of the correct characters lurking in the DB and being downsampled on their way to me, as I have WordPress configured to use UTF-8, the site’s head contains a meta that declares UTF-8, and a peek at the HTTP response headers shows that I’m serving UTF-8.  Of course, I’m not really expert at this, so it’s possible that I’ve misunderstood or misinterpreted, well, just about anything.  To be honest, I find it deeply objectionable that this kind of stuff is still a problem here on the eve of 2010, and in general, enduring the effluvia of erroneous encoding makes my temples throb in a distinctly unhealthy fashion.

Anyway.  Moving on.

I found a search-and-replace plugin—ironically enough, one written by a person whose name contains a character that would currently be corrupted in my database—that lets me fix the errors I know about, one at a time.  But it’s a sure bet there are going to be tons of these things littered all over the place and I’m not likely to find them all, let alone be able to fix them all by hand, one find-and-replace at a time.

What I need is a WordPress plugin or something that will find the erroneous character strings in various fields and turn them back into good old UTF-8.  Failing that, I need a good table that shows the ISO-8859-1 equivalents of as many UTF-8 characters as possible, or else a way to generate that table for myself.  With that table in hand, I at least have a chance of writing a plugin to go through and undo the mess.  I might even have it monitor the DB to see if it happens again, and give me a big “Clean up!” button if it does.
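
As a starting point for generating that table, here is a minimal sketch.  It assumes the corruption really was UTF-8 bytes being misread as ISO-8859-1, which is my belief rather than an established fact (if it was actually Windows-1252, the mapping for some bytes would differ).  The function names corrupt, repair, and buildTable are made up; TextEncoder and TextDecoder are standard in current browsers and Node.

```javascript
// Simulate the corruption: encode as UTF-8, then misread every byte
// as a single ISO-8859-1 character.
function corrupt(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, (b) => String.fromCharCode(b)).join('');
}

// Undo it: turn each character back into a byte, then decode as UTF-8.
// Throws if the input could not have come from that kind of corruption.
function repair(text) {
  const bytes = Uint8Array.from(text, (ch) => {
    const code = ch.charCodeAt(0);
    if (code > 0xff) throw new RangeError('not an ISO-8859-1 character: ' + ch);
    return code;
  });
  return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
}

// Build a lookup table of corrupted string -> correct character.
function buildTable(chars) {
  return new Map(chars.map((c) => [corrupt(c), c]));
}

console.log(repair(corrupt('Emil Bj\u00f6rklund')) === 'Emil Bj\u00f6rklund'); // true
```

A real cleanup would have to run something like repair over every affected field, and skip any text that is already correct, since well-formed text will usually make the decoder throw.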

So: anyone got some pointers they could share, information that might help, even code that might make the whole thing go away?


Pseudo-Phantoms

Published 15 years, 1 month past

In the course of a recent debugging session, I discovered a limitation of web inspectors (Firebug, Dragonfly, Safari’s Web Inspector, et al.) that I hadn’t quite grasped before: they don’t show pseudo-elements and they’re not so great with pseudo-classes.  There’s one semi-exception to this rule, which is Internet Explorer 8’s built-in Developer Tool.  It shows pseudo-elements just fine.

Here’s an example of what I’m talking about:

p::after {content: " -\2761-"; font-size: smaller;}

Drop that style into any document that has paragraphs.  Load it up in your favorite development browser.  Now inspect a paragraph.  You will not see the generated content in the DOM view, and you won’t see the pseudo-element rule in the Styles tab (except in IE, where you get the latter, though not the former).

The problem isn’t that I used an escaped Unicode reference; take that out and you’ll still see the same results, as on the test page I threw together.  It isn’t the double-colon syntax, either, which all modern browsers handle just fine; and anyway, I can take it back to a single colon and still see the same results.  ::first-letter, ::first-line, ::before, and ::after are all basically invisible in most inspectors.

This can be a problem when developing, especially in cases such as having a forgotten, runaway generated-content clearfix making hash of the layout.  No matter how many times you inspect the elements that are behaving strangely, you aren’t going to see anything in the inspector that tells you why the weirdness is happening.

The same is largely true for dynamic pseudo-classes.  If you style all five link states, only two will show up in most inspectors—either :link or :visited, depending on whether you’ve visited the link’s target; and :focus.  (You can sometimes also get :hover in Dragonfly, though I’ve not been able to do so reliably.  IE8’s Developer Tool always shows a:link even when the link is visited, and doesn’t appear to show any other link states.  …yes, this is getting complicated.)

The more static pseudo-classes, like :first-child, do show up pretty well across the board (except in IE, which doesn’t support all the advanced static pseudo-classes; e.g., :last-child).

I can appreciate that inspectors face an interesting challenge here.  Pseudo-elements are just that, and aren’t part of the actual structure.  And yet Internet Explorer’s Developer Tool manages to find those rules and display them without any fuss, even if it doesn’t show generated content in its DOM view.  Some inspectors do better than others with dynamic pseudo-classes, but the fact remains that you basically can’t see some of them even though they will potentially apply to the inspected link at some point.

I’d be very interested to know what inspector teams encountered in trying to solve this problem, or if they’ve never tried.  I’d be especially interested to know why IE shows pseudo-elements when the others don’t—is it made simple by their rendering engine’s internals, or did someone on the Developer Tool team go to the extra effort of special-casing those rules?

For me, however, the overriding question is this: what will it take for the various inspectors to behave more like IE’s does, and show pseudo-element and pseudo-class rules that are associated with the element currently being inspected?  And as a bonus, to get them to show in the DOM view where the pseudo-elements actually live, so to speak?

(Addendum: when I talk about IE and the Developer Tool in this post, I mean the tool built into IE8.  I did not test the Developer Toolbar that was available for IE6 and IE7.  Thanks to Jeff L for pointing out the need to be clear about that.)


HTML5 And You

Published 15 years, 3 months past

I mentioned in my previous post that I “had come away with my head reeling from the massive length and depth of the often-changing specification”, which is entirely true.  Printouts of the current draft of the HTML5 spec can reach, depending on your operating system and installed fonts, somewhere north of 900 pages.  Yes: nine hundred.  There are unabridged Stephen King novels that run shorter.

You might well say to yourself: “Self, is it just me, or are the people doing this completely off their everlovin’ rockers?  Because the specification for something as fundamentally simple as HTML should reach maybe 200 pages, max.”  You might even despair that the entire enterprise is doomed to failure precisely because nobody sane will ever sit down to read that entire doorstop.

But there’s no real reason to panic, because here’s the thing about the HTML5 specification that might not be obvious right away:  it’s not for you.  It’s for implementors.  And that’s a good thing.

If you do start reading the HTML5 draft, you’ll start running into really lengthy, excruciatingly detailed algorithms for, say, parsing a time component.  Or moving through the browser’s history.  Or submitting a form.  There’s an entire (long) chapter on how to process the HTML syntax.

Those are all good things, actually.  They greatly increase the chances of interoperability actually happening within our lifetimes.  There’s no guessing about, well, much of anything.  It’s all been exactingly defined, to the extent that one can exactingly define anything using a human language.  A browser team doesn’t have to wonder, or even guess, what to do when the document has been completely parsed.  It’s all spelled out.  And the people on those browser teams will, in the end, be the people who read that entire doorstop.  (Their sanity is another matter, and not discussed here.)

How is all that stuff relevant to you, the author?  In the sense that when browser teams follow the spec, their products will be interoperable, which is to say consistent.  (Just imagine that for a moment.)

Beyond that, though, the detailed implementation stuff isn’t relevant to you.  You are not expected to know all those algorithms in order to write HTML documents.  Pretty much all you need to know is the markup.  That’s the part that should be no more than 200 pages, yeah?

Turns out it is, and by a comfortable margin.  Michael(tm) Smith’s HTML5: The Markup Language is a version of the HTML5 draft with all of those eye-wateringly pedantic implementor sections stripped out, and when I generated a PDF it came in at 147 pages.  That’s what you really need in order to get up to speed on what’s in HTML5.  It’s for you.


Nine Into Five

Published 15 years, 3 months past

Like so many others, I had tried to dig into the meat of HTML5 and figure out just what the heck was going on.  Like so many others, I had come away with my head reeling from the massive length and depth of the often-changing specification, unsure of the real meaning of much of what I had read.  And like so many others, I had gone to read the commentary surrounding HTML5 and come away deeply dispirited by the confusion, cross-claims, and rancor I found.

Then I received an invitation to join a small, in-person gathering of like-minded people, many of them just as confused and dispirited as I, to turn our collective focus to the situation and see what we found.  I already had plans for the meeting’s scheduled dates.  I altered the plans.

Over two long days, we poked and prodded and pounded on the HTML5 specification—doing our best to figure out what was meant by, and what would result from, this phrase or that example; trying to reconcile seemingly arbitrary design choices with what we knew of the web and its history and the stated goals of the HTML5 specification; puzzling over the implications of example code and detailed algorithms and non-normative notes.

In the end, we came away with a better understanding of what’s going on, and out of that arose some concerns and suggestions.  But in the main, we felt much better about what’s going on in HTML5, and have now said so publicly.

Personally, there are two markup changes I’d like most to see:

  1. The content model of footer should match that of header. As others have said, the English-language name of the footer element creates expectations about what it is and how it should work.  As the spec now stands, most of those expectations will be wrong.  To wit: if your page’s footer includes navigation links, and especially if you have an HTML5-structured “fat footer”, you can’t use footer to contain it.

    If this feels a little familiar, it should: the same problem happened with address, which was specified to mean only the contact information for the author of a page.  It was quite explicitly specified to not accept mailing addresses.  Of course, tons of people used it for mailing addresses anyway, because they had an address and there was an address element, so of course they went together!

    A lot of us cringed every time this came up in the last ten years of conducting training, because it meant we’d have to spend a few minutes explaining that the meaning of the element’s name clashed with its technical design.  We saw a lot of furrowed brows, rolled eyes, and derisively shaken heads.  That will be magnified a millionfold with footer if things are allowed to stand as they are.

    As I said, the fix is simple: just change the content model of footer to state:

    Flow content, but with no header or footer element descendants.

    That’s exactly the same content model as header, and for the same reasons.

  2. time needs to be less restrictive.  That’s not very precise, I know.  But as things stand now, you can only apply time to Gregorian datetimes, and you’re not supposed to use it for anything that couldn’t be easily represented in a calendaring program.  The HTML5 specification says:

    The time element is not intended for encoding times for which a precise date or time cannot be established.

    That makes me wonder, in a manner not at all like Robert Plant, how precise do we have to be?  The answer, I’m sorry to say, is too much.

    To pick an example: I have what I think of as a great use case for the time element, and while it uses the Gregorian calendar, it’s only accurate to whole months (as is Wikipedia’s version).  In some cases I could get the values down to specific days; but in others, maybe not.  So I can’t use the datetime attribute, which requires at least year-month-day, if not actual hours and minutes.  I could omit the attribute, and just have this:

    <time>October 2007</time>
    

    In that case, the content has to be a valid date string in content—which is to say, a valid date string with optional whitespace.  So that won’t work.

    I’ve pondered how best to tackle this, as did the Super Friends.  Our suggestion is to allow bare year and month-day values as permitted in ISO 8601.  In addition, I think we should allow a valid date string to only require a year, with month, day, and time optional.  That seems good enough as long as we’re going to go with the idea that the Gregorian calendar contains all the time we ever want to structure.

    But what about other, older dates, some of which are fairly precisely known within their own calendars?  On that point, though the historian in me clamors for a fix, I’m uncertain as to what.  PPK, on the other hand, has put a lot of thought into this and written a piece that I have skimmed but never, perhaps ironically, found the time to read in its entirety.
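
To make the time complaint a bit more concrete, here is a minimal sketch of the two rules side by side.  It is an illustration, not the specification’s algorithm: the function names are hypothetical, the strict check assumes the draft’s valid date string is a four-or-more-digit year greater than zero followed by a two-digit month and day, and the relaxed check covers only the “year required, month and day optional” idea.  Bare month-day values and times are left out.

```javascript
// Hypothetical helper: number of days in a Gregorian month (month is 1-12).
function daysInMonth(year, month) {
  var isLeap = (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;
  return [31, isLeap ? 29 : 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31][month - 1];
}

// Hypothetical helper: range-check the parts of a date.
function isRealDate(year, month, day) {
  if (year < 1 || month < 1 || month > 12 || day < 1) return false;
  return day <= daysInMonth(year, month);
}

// Strict rule (assumed reading of the draft): YYYY-MM-DD, nothing less.
// Surrounding whitespace is trimmed, as with a date string in content.
function isStrictDateString(text) {
  var m = /^(\d{4,})-(\d{2})-(\d{2})$/.exec(String(text).trim());
  if (!m) return false;
  return isRealDate(Number(m[1]), Number(m[2]), Number(m[3]));
}

// Relaxed rule (the suggestion above): year required; month and day optional.
function isRelaxedDateString(text) {
  var m = /^(\d{4,})(?:-(\d{2})(?:-(\d{2}))?)?$/.exec(String(text).trim());
  if (!m) return false;
  var year = Number(m[1]);
  // Missing parts default to 1 so the range check still applies to the rest.
  var month = m[2] === undefined ? 1 : Number(m[2]);
  var day = m[3] === undefined ? 1 : Number(m[3]);
  return isRealDate(year, month, day);
}
```

Under these assumptions, a month-level value such as 2007-10 fails the strict check and passes the relaxed one.  The human-readable “October 2007” fails both, so it would still need a machine-readable value alongside it.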

These are not my only concerns, but they’re the big ones.  For the rest, I concur with the hiccups guide, though of course to varying degrees.  I’m still trying to decide how much I care (or don’t) about the subtle differences between article and section, for example, or the way aside fits (or doesn’t) with its cousin elements.  And dialog just bugs me, but I’m not sure I have a better proposal, so I’ll leave it be for the time being.

At the other end of the two days, I felt a good deal more calm and hopeful than I did going in.  As Jeffrey said, “the more I study the direction HTML5 is taking, the better I like it”.  While there are still rough edges to be smoothed, there is time to smooth them.  We’ve already seen responsiveness on some of the points we addressed in the hiccups guide, and discussions around others.  The specification itself is daunting, especially to those who might remember the compact simplicity of the HTML2 spec.  Fortunately, it has good internal cross-linking so that you can, with effort, track down exactly what’s meant by “valid date string with optional time” or “sectioning content” or “formatBlock candidate”.

With HTML5, the web is not ending, nor is it starting over.  It’s evolving, slowly and in full view of the public, with an opportunity for anyone to have their say (which is not, of course, the same as having one’s proposals accepted).  It’s the next step, and I feel quite a bit more confident that it’s a step onto solid ground.

