Posts from Wednesday, September 15th, 2004

When Blog Software Attacks!

Published 15 years, 6 months ago

Heard several times in 2002 and 2003: “Hey, Eric, how come you don’t use a blogging package for your site instead of that goofy XML/XSLT thing you cooked up?  It would make your archive URLs easier to remember, you could let people search, and comments would be possible.”

Oh, I don’t know… maybe because once I pour everything into a system, I’m subject to its quirks and whims, whereas if I roll my own system, it’s subject to my quirks and whims—and is thus tuned to my personal expectations, built to account for what I might choose to do even if I don’t realize it?

Despite this totally sensible attitude, I eventually did migrate to a system (WordPress).  That decision just bit me, thanks to its handling of markup in a post title.  Had the markup just been stripped out, that would be one thing—annoying, but totally understandable.  But to only sort-of strip it in the post slug and drop the markup raw into the title element, thus breaking any hope of validation?  Instead of, oh, I don’t know, maybe stripping the markup from both the title element and the post slug, but otherwise leaving it alone, so that the post title could remain marked up in the document itself?

<sigh type="frustrated" />

For the record, this is not evidence that WordPress sucks.  All packages suck in some way, and each one sucks in unique ways.  Eventually, you’re going to trip over something undesirable.  Today it was my turn to take that trip.

Now the post slug of emreallyem-undoing-htmlcss is enshrined forever, because you don’t change permalinks if you can possibly avoid it.  I can avoid it here by gritting my teeth, sucking it up, and wishing I had the programming moxie (not to mention spare time) to implement my own full-featured system.

Really Undoing html.css

Published 15 years, 6 months ago

There’s an aspect of document presentation most of us don’t consider: the browser defaults.  If you take an HTML or XHTML document—for the purposes of this exercise, assume it contains no presentational markup—and load it up in a Web browser with no CSS applied, there will still be some presentational effects.  A level-one heading, for example, is usually boldfaced and a good deal larger than other text, thus leading to the old stereotype of headings being “big and ugly”; the pre element typically honors whitespace and uses a monospace font; a paragraph is separated from other elements by a “blank line”; and so on.  From a CSS point of view, all this happens because the browser has built-in styles.

Tantek recently wrote about his creation of a file called undohtml.css, whose sole purpose is to strip away some of the default browser styles applied to common elements.  By resetting all headings to the same size, for example, he avoids the inconsistencies of heading sizes across browsers and brings everything to a common baseline.  If a different size is desired, the author has to do it manually.  He also zeroes out the headings’ margins, which likewise tend to be inconsistent across browsers, using a separate rule that I overlooked when I first posted.  Apologies to Tantek for my initial claim that he didn’t take that step.
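
As a rough sketch of the idea (this is not Tantek’s actual file, and the font-weight line in particular is my own assumption), a reset of this kind might look something like:

```css
/* Hypothetical reset in the spirit of undohtml.css; not the real file. */

/* Bring every heading level to the same size as the surrounding text. */
h1, h2, h3, h4, h5, h6 {
  font-size: 1em;
  font-weight: normal; /* assumption: also drop the default boldface */
}

/* A separate rule zeroes out the heading margins. */
h1, h2, h3, h4, h5, h6 {
  margin: 0;
}
```

A design built on top of this would then have to state its heading sizes explicitly, for example h1 {font-size: 2em;}.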

Of course, Tantek isn’t really removing all the default styles, but (so far) just those that have given him the most trouble.  When it comes to Gecko-based browsers like Firefox and Mozilla, however, you can completely eliminate all built-in styles.  These browsers use a series of style sheets to control the presentation of documents, forms, MathML markup, and so on.  In OS X, you can find most of these style sheets by showing the package contents of the browser’s application file and navigating to Contents > MacOS > res.  On just about any other OS, it’s even easier; just search your hard drive for html.css and open the directory that contains that file.

If you look in html.css, you’ll find all of the styles that make what we think of as “unstyled” documents act the way they do.  Consider, for example:

area, base, basefont, head, meta, script, style, title,
noembed, noscript, param {
   display: none;
}

That rule is why the head element and all its contents don’t appear in your browser (as well as all those other “invisible” elements).  From a CSS standpoint, there’s nothing special about those elements as compared to others like div or ul.  The fact that they’re traditionally “invisible” is irrelevant—but with that one rule, the tradition is preserved.  You can always override the rule, of course; try style {display: block;} on a test document that contains an embedded style sheet and load it up in Firefox/Mozilla.  It isn’t magic.  It’s just a change from the usual way that documents are presented.  (See this test document for an example.)
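
For anyone who wants to try the override, here is a slightly fuller sketch to drop into a test document’s own style sheet.  Only the display: block on style comes from the text above; the rest is my own guesswork to make the result readable.  Note that if the style element sits inside head, head itself has to be displayed too, since display: none on an ancestor suppresses all of its descendants.

```css
/* Reveal the embedded style sheet as if it were ordinary content. */
head, style {
  display: block;
}

/* Assumed cosmetic touches so the revealed sheet is legible. */
style {
  white-space: pre;        /* keep the sheet's own line breaks */
  font-family: monospace;
  border: 1px dashed gray;
  padding: 0.5em;
}
```

The other children of head (title, meta, and so on) stay hidden, because the default rule still sets them to display: none individually.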

There’s also:

/* nested lists have no top/bottom margins */
ul ul,   ul ol,   ul dir,   ul menu,   ul dl,
ol ul,   ol ol,   ol dir,   ol menu,   ol dl,
dir ul,  dir ol,  dir dir,  dir menu,  dir dl,
menu ul, menu ol, menu dir, menu menu, menu dl,
dl ul,   dl ol,   dl dir,   dl menu,   dl dl {
  margin-top: 0;
  margin-bottom: 0;
}

So in order to remove the top and bottom margins from nested lists, which is a traditional behavior of HTML browsers, that rule needs to be in the default style sheet.  Remove it, and nested lists would have top and bottom margins thanks to another rule in the style sheet:

ul, menu, dir {
  display: block;
  list-style-type: disc;
  margin: 1em 0;
  -moz-padding-start: 40px;
  -moz-counter-reset: -html-counter 0;
}

That rule not only sets the usual margins and such, but also includes some Mozilla-proprietary properties that help lists act in accordance with our expectations.  There are certain aspects of traditional presentation that aren’t (yet) fully describable using CSS, so the Mozilla folks have had to add properties.  In accordance with CSS2.1, section 4.1.2, these proprietary extensions are marked with a vendor prefix; here, it’s -moz-.  So any property or value you see starting with that string is a proprietary extension.  (For the record, I have no objection to extensions so long as they’re clearly marked as such; it’s the “silent” extensions that bug me.)

There is more to the presentation story than just html.css.  In the same directory, you can find quirk.css, which is applied in addition to html.css when the browser is in “quirks” mode.  Another style sheet, viewsource.css, affects the presentation of any view source window.  All the nifty color-coding happens as a result of that style sheet, which is applied to automatically-generated markup that underlies the actual source you see.

So how do you completely strip out the default styles for an (X)HTML document?  Quit the browser application and rename the file html.css to something like html222.css.  Do the same for quirk.css.  Now re-launch the browser and find out just how much you’ve been taking for granted.  Feel free to browse around the Web and see what happens on various sites, but you’ll have to type blind, because the address bar won’t show any text.  You can still drag HTML documents on your hard drive into the window and see what happens.  If a document has any CSS applied to it, then the browser will use it.  It just won’t have any of the default styles available, so you’ll be applying your styles on top of nothing, instead of the usual foundation of expected presentation.
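
To get a feel for “applying your styles on top of nothing,” here is a minimal sketch of an author style sheet that rebuilds a little of the usual foundation.  The choice of elements and the specific values are my own guesses at something reasonable, not a copy of html.css.

```css
/* Hide what is traditionally invisible. */
head, script, style {
  display: none;
}

/* Without defaults, everything is inline; restore block boxes. */
html, body, div, p, h1, h2, h3, h4, h5, h6, ul, ol, pre {
  display: block;
}
li {
  display: list-item;
}

/* Assumed spacing values, chosen to resemble familiar defaults. */
body { margin: 8px; }
p, ul, ol, pre { margin: 1em 0; }
ul, ol { padding-left: 40px; }
pre { white-space: pre; font-family: monospace; }
h1 { font-size: 2em; font-weight: bold; margin: 0.67em 0; }
```

Even this short list leaves out tables, forms, links, and a good deal more, which gives some sense of how much the built-in style sheets are quietly doing.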

So what exactly is the point of all this?  As it turns out, I believe there are four:

  1. By studying html.css, beginning CSS users can compare the rules to the “unstyled” presentation of documents, and thus get a much better idea of how CSS works.
  2. By removing the default styles, you can come to a much greater realization of how much presentation is taken for granted, and how much there is to be dealt with when creating a new design.
  3. On a related note, the absolute bare minimum presentation of a document is to render all elements with inline boxes and to show every scrap of content available.  Even something as basic as making a paragraph generate a block box is a style effect.
  4. It helps us to realize that what we often think of as the “special handling” of HTML is anything but: in Firefox/Mozilla, HTML documents are just a case of some markup that happens to have some pre-defined CSS applied to it.  Granted, the proprietary extensions needed to keep things in line with expectations are a case of special handling, and those tell us one of two things:
    • CSS still has a long way to go before it can be called a full-fledged layout tool, since it can’t fully recreate traditional HTML layout.
    • Old-school HTML layout was so totally wack, it’s no surprise that it’s hard to describe even with a tool like CSS at your disposal.
    I’ll leave it to the reader to decide which of those two they prefer.  Or, heck, choose both.  It isn’t as though they’re mutually exclusive.

Have fun fiddling with or completely removing the built-in styles!  Just remember: modifications like those described are made at your own risk, I’m not responsible if you do this and your hard drive vaporizes, no warranty is expressed or implied, not a flying toy, blah blah blah.