Characteristic Confusion
Published 16 years, 7 months past
In the course of building my line-height: normal test page, I settled on defaulting to an unusual but still pervasive font family: Webdings. The idea was that if you picked a font family in the dropdown and you didn’t have it installed, you’d fall back to Webdings and it would be really obvious that it had happened.
Except in Firefox 3b5, there were no dings, web or otherwise. Instead, some serif-family font (probably my default serif, Times) was being used to display the text “Oy!”.
It’s a beta, I thought with a mental shrug, and moved on. When I made mention of it in my post on the subject, I did so mainly so I didn’t get sixteen people commenting “No Webdings in Firefox 3 betas!” when I already knew that.
So I didn’t get any of those comments. Instead, Smokey Ardisson posted that what Firefox 3 was doing with my text was correct. Even though the declared fallback font was Webdings, I shouldn’t expect to see it being used, because Firefox was doing the proper Unicode thing and finding me a font that had the character glyphs I’d requested.
Wow. Ignoring a font-family declaration is kosher? Really?
Well, yes. It’s been happening ever since the CSS font rules were first implemented. In fact, it’s the basis of the whole list-of-alternatives syntax for font-family. You might’ve thought that CSS says browsers should look to see if a requested family is available, and if not look at the next one on the list, and only then go on to render text. And it does, but it says they should do that on a per-character basis.
That is, if you ask for a character and the primary font face doesn’t have it, the browser goes to the next family in your list looking for a substitute. It keeps doing that until it finds the character you wanted, either in your list of preferred families or else in the browser’s default fonts. And if the browser just can’t find the needed symbol anywhere at all, you get an empty box or a question mark or some other symbol that means “FAIL” in font-rendering terms.
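The per-character lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the family names (PrimaryFace, SecondFace, BrowserDefault) and their coverage sets are invented, and a real browser reads coverage from each font's character map rather than from a table like this.

```python
# Hypothetical coverage tables: which code points each family can draw.
# These names and ranges are made up for illustration only.
FONT_COVERAGE = {
    "PrimaryFace": set(range(0x41, 0x5B)),           # assumed: A-Z only
    "SecondFace": set(range(0x20, 0x7F)),            # assumed: printable ASCII
    "BrowserDefault": set(range(0x20, 0x7F))
    | set(range(0x0600, 0x0700)),                    # assumed: ASCII plus Arabic
}


def pick_fonts(text, families, default_fonts=("BrowserDefault",)):
    """Return (character, family) pairs, choosing a family per character.

    The author's list is tried first, then the browser defaults.
    A family of None stands for the "can't render that" symbol.
    """
    result = []
    for ch in text:
        chosen = None
        for family in list(families) + list(default_fonts):
            if ord(ch) in FONT_COVERAGE.get(family, set()):
                chosen = family
                break
        result.append((ch, chosen))
    return result


# Each character is resolved on its own, so one run of text can mix families.
oy = pick_fonts("Oy!", ["PrimaryFace", "SecondFace"])
arabic = pick_fonts("\u0633", ["PrimaryFace", "SecondFace"])
missing = pick_fonts("\u4e2d", ["PrimaryFace", "SecondFace"])
```

Here "O" comes from the first family, "y" and "!" from the second, the Arabic letter from the assumed browser default, and the Chinese character from nowhere at all.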
A commonly-cited case for this is specifying a CJKV character in a page and then trying to display it on a computer that doesn’t have non-Romance language fonts installed. The same would hold true for any page with any characters that the installed fonts can’t support. But think about it: if you browse to a page written in, say, Arabic, and your user style sheet says that all elements’ text should be rendered in New Century Schoolbook, what will happen? If you have fonts that support Arabic text, you’re going to see Arabic, not New Century Schoolbook. If you don’t, then you’re going to see a whole lot of “I can’t render that” symbols. (Though I don’t know what font those symbols will be in. Maybe New Century Schoolbook? Man, I miss that font.)
So: when I built my test, I typed “Oy!” for the example text, and then wrote styles to use Webdings to display that text. Here’s how I represented that, mentally: the same as if I’d opened up a text editor like, oh, MS Word 5.1a; typed “Oy!”; selected that text; and then dropped down the “Font” menu and picked “Webdings”.
But here’s how Firefox 3 dealt with it: I asked for the character glyphs “O”, “y”, and “!”; I asked for a specific font family to display that text; the requested font family doesn’t contain those glyphs or anything like them; the CSS font substitution rules kicked in and the browser picked those glyphs out of the best alternative. (In this case, the browser’s default fonts.)
In other words, Firefox 3 will not show me the ear-Death Star-spider combo unless I put those exact symbols into my document source, or at least Unicode references that call for those symbols. Because that’s what happens in a Unicode world: you get the glyphs you requested, even if you thought you were requesting something else.
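As a rough sketch of what "Unicode references that call for those symbols" might look like: the code below assumes the common convention that symbol fonts expose their glyphs at 0xF000 plus the old single-byte code. That offset is an assumption here, not something verified against the Webdings font file, and whether a given browser then draws the Webdings glyph is not something this sketch can confirm.

```python
import unicodedata

# Assumption: symbol fonts place glyphs at 0xF000 + the legacy character code.
SYMBOL_OFFSET = 0xF000


def symbol_code_point(ch):
    """Map a typed character to the code point a symbol font is assumed to use."""
    return chr(SYMBOL_OFFSET + ord(ch))


def as_reference(ch):
    """Format a character as an HTML hexadecimal character reference."""
    return f"&#x{ord(ch):X};"


# The references one would write in the source instead of typing "Oy!".
refs = [as_reference(symbol_code_point(c)) for c in "Oy!"]

# 'Co' is the Unicode general category for private-use code points.
categories = [unicodedata.category(symbol_code_point(c)) for c in "Oy!"]
```

Under that assumption, the typed "!" (U+0021) and the glyph's code point (U+F021) are simply different characters, which is why asking for one does not get you the other.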
The problem, of course, is that we don’t live in a Unicode world—not yet. If we did, I wouldn’t keep seeing line noise on every web page where someone wrote their post in Word with smart quotes turned on and then just did a straight copy-and-paste into their CMS. Ged knows I would love to be in a Unicode world, or indeed any world where such character-incompatibility idiocy was a thing of the past. The fact that we still have those problems in 2008 almost smacks of willful malignance on the part of somebody.
Furthermore, in most (but not all) of the text editors I have available and tested, I can type “Oy!” with the font set to Webdings and get the ear, Death Star, and spider symbols. So mentally, it’s very hard to separate those glyphs from the keyboard characters I type, which makes it very hard to just accept that what Firefox 3 is doing is correct. Instinctively, it feels flat-out wrong. I can trace the process intellectually, sure, but that doesn’t mean it has to make sense to me. I expect a lot of people are going to have similar reactions.
Having gone through all that, it’s worth asking: which is less correct? Text editors for letting me turn “Oy!” into the ear-Death Star-spider combo, or Firefox for its rigid glyph substitution? I’m guessing that the answer depends entirely on which side of the Unicode fence you happen to stand. For those of us who didn’t know there was a fence, there’s a bit of a feeling of a slip-and-fall right smack onto it, and that’s going to hurt no matter who you are.
Comments (32)
What’s the unicode reference for a spider?
[trying to figure out what an explicit ‘unicode’ reference might look like, so one can see it, or any, webding in ff3, because I can’t, um, picture it]
According to the OS X Character Palette, it’s Unicode F021, Chris. (And F023 is a “no pirates” symbol, in case you ever need that for your ninja clubhouse signs.)
I hear their album is pretty good.
Why is this about Unicode? Isn’t this more a question of meaning? If you start using random characters such as “y” to mean “Death Star” how will non-visual interfaces ever be able to deal with that other than implementing some weird hack to cope with the madness that is Webdings?
that this partly explains some cases of strange rendering of pages I experienced with Firefox 3.0b.
Some websites have single glyphs rendered in a different font.
I experienced this only with sites that use strange font stacks, like:
font-family:'Palatino Linotype',Palatino,'Zapf Calligraphic','URW Palladio L','Book Antiqua',serif;
This example is taken from: http://www.safalra.com/web-design/typography/web-safe-fonts-myth/
On this page in FF 3.0b on Ubuntu Hardy Heron every e is from another font than the rest of the text.
Interestingly, in German there is a special term for a single letter that is erroneously set in a different font. It is called a Zwiebelfisch.
Well if I’m working on a site that needs to be Chinese or Arabic or Hindi (or partly in one of those), I’d much prefer the unicode glyphs to override any font that might destroy the readability of it. I doubt anyone would have a non-easy-to-read font as their fallback in a program or user stylesheet… but…you never know what a user will do. So I have to say Firefox 3’s approach, is best.
I wrote a post about that recently.
Btw, what’s really broken here is Wing-/Webdings, not any of the other parts of the machinery. Although, the fact that a font can sport two different mapping tables, such that they can be inconsistent with each other, both of which are used, but each in different contexts, is certainly braindead in its own way. Even so, the braindeath lies with Wing-/Webdings for using a spider as the glyph mapped to the exclamation mark in one of these tables.
Unfortunately the blame game won’t help, as there are hysterical raisins for that brokenness – Unicode is a relatively recent invention, and character sets prior to it had no code points for the likes of the Wing-/Webdings glyphs. Now that Unicode exists and provides for those, the pre-Unicode mapping table cannot just be disappeared from history: lots of old documents in Office format need the brokenness to render correctly. Of course, you have grown accustomed to the way things were, so what Firefox does with the aid of the new mapping table, although better, feels wrong.
How it should have been, had Unicode been there from the start to make this possible, is that you would have picked the ear, Death Star and spider characters from a palette, and they would have shown up as the ear, Death Star and spider regardless of what font you selected in the dropdown box. Only if you disliked the system’s default choice of symbol font for such glyphs would you ever have to specifically choose Wing- or Webdings from the dropdown – otherwise you would never actively choose those fonts at all.
But maybe we will soon..
http://googleblog.blogspot.com/2008/05/moving-to-unicode-51.html
Of course, the google blog gets minus points for calling UTF-8 Unicode. Not in the graph, where it’s placed as in a category with “US only” or “Western”, but in the text, where it’s placed in a category with ASCII, Latin-1 and Windows-1252.
If the editor is Unicode aware it should tie the symbol to the specific code, the problem is that font selection was created before Unicode and most programs created the interface based on the idea that a letter is a glyph corresponding to a code, but any font can define the glyph for the code in different ways, so the “meaning” of the text is tied to the font, not to the particular code sequence. The Unicode fence only exists because the tech world moved on to a better solution (i.e. Unicode) but the editors and OSes are still tied to their old ways. BTW I find the spec behavior to be “intuitive”, but you could guess it from the rest of my comment.
Aristotle is right in that Webdings is the issue. For this to work as you desire you’d need a Webdings equivalent that’s based on Unicode; the “ear” would really be U+004F and all software would recognize it as an “O”, just a funny looking one.
I agree with others that Webdings is the cause of the problem. If you look in Mac OS X character palette, changing the View to Glyph, you will notice that in the font Webdings, the spider is assigned to the exclamation point. So, with that in mind, Firefox 3 should show that glyph when the font is Webdings. That is certainly what I would expect. So I also agree with Eric, that Firefox may be behaving correctly, but not how we want, and not how is best for users (in this case).
Just to reiterate something, but in a slightly different wording.
As I explained in your previous post, and Smokey went to great lengths to explain it better (I thought you would have known this, hence I didn’t go into details), the character (U+F021, that spider thing) you expect is in the private use tables (check with Unicode Checker). The character you input (‘!’, U+0021) is not available in the specified font (Webdings). Thus Gecko 1.9 falls back on your default font. It is unreasonable to expect the browser to do anything else. The fact that WebKit displays the character you expect is imho a bug, and WebKit is inconsistent, as it doesn’t do that for WingDings or Symbol.
Similarly, on a Mac keyboard, you can type shift + option + k, and this will yield the apple logo (U+F8FF, private use table) with a bunch of Mac fonts. You can’t be sure that users on any other OS will see that character in their browser.
Ultimately there is a mismatch between symbol fonts such as Webdings and Unicode, which is an encoding standard for “written characters and text” (cf. first line in Chapter 1 of the Unicode standard). Symbol fonts were invented as an easy way to insert pictures into text (“pictorial items”), and as such most of the glyphs included within them are not eligible for encoding in Unicode (cf. Chapter 15). For legacy purposes, many glyphs in pre-existing fonts have been encoded, but now that Unicode is up to its fifth major release, I wouldn’t expect much slack to be given to new symbol fonts. I believe the only standards-compliant course of action is for symbol fonts to map their glyphs to the private-use area, but this would of course make entry by users more difficult.
Can we blame Microsoft? :-) Well…
*ONLY* Firefox 3? i thought Firefox had been behaving in a unicode-strict manner for a while now. Opera has been since Opera 6, and I’ve been using a cut-and-paste explanation of why good browsers are correct not to display their webdings (etc) character for years now.
I get the Webdings characters in Firefox 2.0.0.14 and Opera 9.27, Richard. And this would be a great place to paste that explanation (or a URL to it, if you prefer)! I’d really like to see it.
I get the Webdings characters only in IE7, but not in Firefox 2.0.0.14 nor in Opera 9.27 and 9.50.
Prince 6.0r6, interestingly enough, displays “Oy [spider]”
Perhaps your system is somehow misconfigured, then, Eric? I’m seeing substituted glyphs (Times New Roman, by all appearances) in Opera 9.2 and Firefox 2, as I always have (I’ve been aware of this property of the *ding fonts for several years).
I worked with online scientific journals for a few years a while back, and if we wanted a certain character, we had to include a meta content-type charset=utf-8, and then always use unicode to specify desired characters, otherwise all sorts of problems similar to this would occur.
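A small sketch of why the declared charset matters, and of the "line noise" smart quotes can turn into. It assumes the two encodings involved are UTF-8 and Windows-1252, which is a common pairing but not the only possible one.

```python
# A curly apostrophe, as a word processor's "smart quotes" would produce.
text = "It\u2019s"

# Case 1: the page is saved as UTF-8 but read as Windows-1252.
utf8_bytes = text.encode("utf-8")
misread_as_cp1252 = utf8_bytes.decode("cp1252")

# Case 2: the page is saved as Windows-1252 but read as UTF-8.
cp1252_bytes = text.encode("cp1252")
misread_as_utf8 = cp1252_bytes.decode("utf-8", errors="replace")

# When the declared charset matches the bytes, the text survives intact.
read_correctly = utf8_bytes.decode("utf-8")
```

In the first case the single apostrophe becomes three unrelated characters; in the second it becomes the replacement character U+FFFD.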
Eric,
The fact that those Webdings characters are displayed in Gecko 1.8.1/Fx 2.0.0.14 is considered a bug (but I forgot the bugnumber…). And Yes ! Opera 9.2x and 9.5b also display those Webdings thingies. But only on OS X, go figure (perhaps the same issue that plagued Gecko 1.8.1 – in the QuickDraw layer).
The right way to do what Eric tried to do (having a crazy fallback font, so it would be obvious what happened) is to use a font that is crazy but actually has the crazy characters in the places of normal letters. That means, that “a”, “b”, “c” etc. should be those actual letters, but looking obviously freaky. So just get one of those fonts (I’m sure there are lots of them out there), install it in your work machine, and specify it in the CSS. That should work according to all the Unicode rules. (And the Firefox Webdings behaviour is indeed the only correct one. It’s also logical and good.)
Very true Anne. Accessibility goes out the window. Another fine example of proprietary standards which destroy interoperability and accessibility.
@Eric
May I suggest using an image with alt text for your ear-Death Star-spider combo. Sounds like some name for a hack.
Off-topic, but unless I’m missing something, Firefox 3 still doesn’t handle OS X windows appropriately. Closing the last tab doesn’t close the entire window. In fact, you can’t close the last tab at all, unless I’m missing something.
Pingback ::
Browsersphere » Blog Archive » How Web Browsers Utilize CSS Font Rules
[…] was pointed to a very interesting blog post by Eric Meyer titled Characteristic Confusion, which reveals how web browsers like Firefox utilize CSS font rules. It’s been happening ever […]
How does Firefox know that the Death Star character is not the same glyph as y? (Surely Firefox is not opening the font and using character recognition to determine that it doesn’t look like a y.) Does the Webdings font somehow tell the browser something like, “The Death Star is the glyph for Unicode F021 but you should use it for the letter y?”
Pingback ::
Links of Interest (May 9th 2008 through May 29th 2008) · All the Billion Other Moments (Jason Penney)
[…] Characteristic Confusion While investigating line-height Eric Meyer used font-family: Webdings to display “Oy!” (Webdings doesn’t contain ‘O’, ‘y’, or ‘!’). Firefox 3 unexpectedly displayed “Oy!”, which, it seems, is technically correct, leaving him asking “which is less correcTags: CSS, font-family, line-height, Eric Meyer, unicode, webdesign, […]
I agree with Richards: I get the “Oy”s in Firefox 2.0.0.14 as well.
So then – is there a proper (or, FF3-accepted) way to request the webdings/spider character in html? I can’t seem to get it to work, using the code reference.
I don’t know if I’m missing something obvious, but all this seems more complicated than everybody’s assuming here…
First of all: I’m using Firefox 3, final release, on Ubuntu 8.04, and the Webdings font on your test page renders as you expected, in the way Philippe described as a bug. That is: I see the ear, Death Star and spider.
Second, the explanation doesn’t make sense to me. Sure, the Unicode behaviour is completely logical: if the font doesn’t have this Chinese glyph, find a font that does. However, Webdings actually has the glyphs you are asking for: they just don’t look the way they should, but as Randy pointed out, the browser doesn’t know that… A font should not represent the letter “o” as an ear, but apparently for historical reasons, it does.
Why does it work a certain way on my Ubuntu Linux, and another way on other computers/systems, including my brother’s Windows XP box? Is my Firefox different? (Could be, since it’s from Ubuntu’s official repositories) Or are our Webdings fonts different? Or does it depend on some configuration?
Okay, I’m lost.
Right! It’s as if FF3 is somehow figuring out which fonts contain “approved” representations of the described characters. For instance, U+0021 is the Unicode representation of the exclamation point. This can be represented in encoded html as
& #x21;
(I’m leaving a space between the & and the #x21 character code on purpose because of the way this comments form is parsing it – pretend it’s not there)
This is a simple, and obviously non-standard example set, but this line:
<font face="Arial">& #x21;</font>
produces an exclamation point in Arial. This one:
<font face="Trebuchet MS">& #x21;</font>
produces a differently-styled exclamation point in Trebuchet. This line, however:
<font face="Webdings">& #x21;</font>
which should produce the Webdings font’s version of the exclamation point (which, per the font designer’s choice, just happens to look a lot like a spider), does not. Instead it falls back to Times (default font) and renders that font’s version of the exclamation point.
What gives?
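One thing that can be checked independently of any browser: a numeric character reference is decoded to a plain character before fonts come into it. The sketch below uses Python's standard html.unescape as a stand-in for an HTML parser, which is an assumption about equivalence for these simple cases and not a claim about how Firefox is implemented.

```python
import html

# The reference from the comment, written without the deliberate space.
reference = "&#x21;"
decoded = html.unescape(reference)

# The literal character typed directly into the source.
literal = "!"

# After decoding, both routes yield the same code point, U+0021.
same_character = decoded == literal
code_point = f"U+{ord(decoded):04X}"

# A reference to the private-use code point is a different character entirely.
private_use = html.unescape("&#xF021;")
```

So writing the reference instead of typing "!" changes nothing about which character the font is asked for; only a different code point, such as the private-use one, would.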
1) how does ffx know wing/webdings do not display code point `o` as ‘a donut’?—it does not, can hardly, and should not (imagine wanting to use a black letter font and suddenly some browsers bail out thinking those `o` glyphs are not ’round enough’…). but, those *ding fonts were specifically marked up as ‘symbol’ fonts, and my guess is that ffx looks at that point of the font metadata. it is like, you didn’t have unicode back when that was done, and a font could only render a single codepage, and maybe mappings to parts of other code pages (which is a laugh, since it is surely not the task of an outlines collection to do code page arithmetics), and it could also claim to render no specific script at all—mere SYMBOLS! so those are the histerical reasons.
2) should a text renderer such as a browser favor the specific font face? or should it fall back and try to render from another font that promises to ‘display a donut-shaped o where `o` was in the text’?
for a vast range of applications, falling back character by character to something ‘probably correct’ is, i think, indeed the way to go. i work a lot with chinese characters from unicode astral planes, and most fonts don’t have those characters. additionally, the very few fonts that do have those astral characters lack huge parts of the (16bit) bmp characters (for reasons of file size). also, a LOT of chinese fonts only have the most frequent characters as glyphs, for the sheer number of them (almost 21 thousand chinese characters in the bmp alone), so failure on a character-by-character base is as frequent as a leaf blowing in the autumn wind as far as chinese typesetting is concerned. only those applications that are apt at font substitution are future-proof; others will fail when going beyond basic western needs.
3) it would be nice if applications could make configuring character substitutions more configurable, and provide more diagnostics. the only software that i am aware of that allows me to specify which font to use, unicode block for unicode block, is BabelMap (of http://www.babelstone.co.uk/Software/BabelMap.html).
the font configuration menu of all browsers i know of is a joke—you are expected to configure fonts *by language* and have no way to specify unicode code blocks. now what webpage contains a proper language markup? who in this world actually produces html that marks language switches for each paragraph? this would be important for chinese character variant selection (see http://en.wikipedia.org/wiki/Han_unification#Examples_of_some_non-unified_Han_ideographs), but even then it requires an informed user intervention (to be sure, the unicode consortium did the wrong thing here and embraced the technologically wrong kind of solution—code points are the only reliable way to distinguish character forms. you might differ, but looking at said table you will see that it is only *some* important characters that have their variants *not* reflected in unicode; when you go into detail, its a royal mess.)
4) sometimes you have the need to display elements of writing that are not encoded by unicode (i have hundreds of examples). well, unicode will never have a codepoint for everything that people want to write. but, it does provide a private use area—a vast array of codepoints that are meant to be free of interpretation, so you can put there whatever you like. i currently work a lot with typeface (of http://typeface.neocracy.org) to make sure i get exactly the outlines i intended. if downloadable fonts were a reality, then maybe browsers would actually honor the css font-family *when rendering material from a private use area*, for in those cases it is impossible to assume any other intended rendering. of course, that leaves accessibility in the open (but no screenreader today will help you to translate astral plane characters or major parts of non-western unicode anyway).
(@Sebastian Redl: the subject “is unicode an encoding? is utf-8?’ has been discussed in depth over at http://www.artima.com/forums/flat.jsp?forum=106&thread=230157)