Don’t Read; Speak!

Published 12 years, 7 months ago

With the debut of the WaSP’s ATF, a vigorous conversation has gotten underway.  Joe Clark weighed in with some suggestions, Andy Clarke got some rousing comment action, and more have spoken up.  This follows some recent and widely cited thoughts from Matt May on WCAG 2.0 (with an opposing view from Gez Lemon), and from Andy Clarke regarding accessibility and legislation (which inspired the publication of a different view from Andy Budd, not to mention another from Chris Kaminski).  I’ll join the chorus with some points of my own.  (Apparently, my recent post Liberal vs. Conservative was taken as a contribution to the discussion, which it wasn’t meant to be, although the points raised there are definitely worth considering in this context.)

This past May, I delivered a keynote at the 2nd International Cross-Disciplinary Workshop on Web Accessibility in Tokyo, and one of the major points I made was basically this: “Screen readers are broken as designed, and need to become speaking browsers”.

The problem is that screen readers are just that: they read what’s displayed on the screen for a sighted user.  In other words, they let Internet Explorer render the Web page, scrape the visual result, and read that.  I will acknowledge that in the tables-and-spacers era of design, this made a certain amount of sense.  That era is ending; in an important sense, it’s already over and we’re just cleaning up the mess it left.  Which is not to say that table markup is never and should not presently be used for layout purposes, nor is this to say that such markup should be used.  Okay?

What I’m saying is that screen readers need to become speaking browsers: they need to ignore how the page is visually displayed, and read the content.  Use semantic markup when it exists, and otherwise ignore the markup in favor of the actual words, whether it’s plain text or alt text.  Go from the beginning of the document to the end of the document, and ignore the CSS—at least that CSS which is meant for visual media, which these days is pretty much all of it.
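To make that concrete, here is a minimal sketch of the linearization I have in mind, written in Python with only the standard library. It is an illustration under stated assumptions, not a description of how any existing tool works: the names `SpeechLinearizer` and `speakable_text` are hypothetical, and the phrase "Heading level N" is just one possible way to announce semantic markup. The sketch walks the source from beginning to end, never consults CSS, speaks alt text for images, and therefore also speaks content that a screen style sheet has hidden.

```python
from html.parser import HTMLParser


class SpeechLinearizer(HTMLParser):
    """Collects speakable text in source order and never looks at styling."""

    # Elements whose text is not page content (assumption: only these two).
    SKIPPED = {"script", "style"}
    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.HEADINGS:
            # Use semantic markup when it exists: announce the heading level.
            self.chunks.append("Heading level " + tag[1])
        elif tag == "img":
            # Images are spoken through their alt text, if there is any.
            alt = dict(attrs).get("alt")
            if alt and alt.strip():
                self.chunks.append(alt.strip())

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = " ".join(data.split())
            if text:
                self.chunks.append(text)


def speakable_text(html):
    """Return the list of text chunks a speaking browser would read aloud."""
    parser = SpeechLinearizer()
    parser.feed(html)
    parser.close()
    return parser.chunks
```

Note that the inline `style="display:none"` on a navigation block makes no difference here: the links inside it are read like any other content, which is the whole point.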

You might wonder how a speaking browser should deal with a table-driven site, of which there are still quite a few, he said with some understatement.  One distinct possibility is to do what I just said: ignore the non-semantic markup and read the content.  I can accept that might fail in many cases, so I’ll present a fallback: DOCTYPE switching.  If a document has a DOCTYPE that would put a visual browser into standards mode, then be a speaking browser.  If not, then be a screen reader.
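As a rough sketch of that fallback, the decision could look something like the following Python. This is a deliberately simplified heuristic, loosely modeled on how visual browsers sniff the DOCTYPE; the real rules are longer and differ between browsers. The function name `choose_mode`, the two mode labels, and the short lists of public identifiers are assumptions made for illustration only.

```python
import re

# Matches <!DOCTYPE html>, optionally with a public and a system identifier.
_DOCTYPE = re.compile(
    r'<!DOCTYPE\s+html(?:\s+PUBLIC\s+"([^"]*)"(?:\s+"([^"]*)")?)?\s*>',
    re.IGNORECASE,
)

# Assumption: public identifiers containing these strings mean "legacy page".
LEGACY_MARKERS = (
    "html 2.0",
    "html 3.2",
    "html 4.0 transitional",
    "html 4.0 frameset",
)

# Assumption: these count as legacy only when no system identifier is given.
LOOSE_401 = ("html 4.01 transitional", "html 4.01 frameset")


def choose_mode(document):
    """Pick 'speaking browser' or 'screen reader' from the document's DOCTYPE."""
    match = _DOCTYPE.search(document)
    if match is None:
        # No recognizable DOCTYPE: treat the page like a quirks-mode page.
        return "screen reader"
    public_id = (match.group(1) or "").lower()
    system_id = match.group(2)
    if any(marker in public_id for marker in LEGACY_MARKERS):
        return "screen reader"
    if any(marker in public_id for marker in LOOSE_401) and not system_id:
        return "screen reader"
    return "speaking browser"
```

A page with a strict DOCTYPE would be spoken from its source, while a page with no DOCTYPE at all would get the old screen-reading behavior.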

DOCTYPE switching has been, despite a few hiccups, incredibly successful in helping designers move toward standards, and allowing browsers to permit standards-based design without sacrificing every page that’s come before.  The same, or at least a very similar, mechanism could help audible-Web tools.

The WaSP has done great things in their efforts to show vendors why Web design tools should produce standards-oriented markup and CSS.  I sincerely hope they can produce similar results with audible-Web vendors.

  1. I have to say this – I don’t know much about Audible Web Tools (perhaps I should?), but the creation of the WaSP Task Force is a good step in the right direction. I appreciated what you mention on DOCTYPE switching – it’s been quite useful for the same situations you are referring to.
    I suspect the accessibility topic(s) will rage for a while longer as the ‘Tables’ era of layouts fades away…ungracefully…like that lady who was in the Brit TV show ‘Absolutely Fabulous’, smoking too many cigarettes and swearing at things all the time (good metaphor eh?!)

  2. I agree with this.

  3. …ignore the CSS—at least that CSS which is meant for visual media, which these days is pretty much all of it.

    And it will continue to be pretty much all of it until auditory styling is supported. If combined with speech recognition, I think that even people with normal vision would use a speaking browser from time to time.

    But alas, with no support, why should I even try to learn to write them? And how could I practice?

  4. I’ve been thinking for a while that the Mozilla codebase probably has most of the ingredients for a good screenreader, er, speaking browser, already. And I agree on DOCTYPE switching (or probably something more difficult to accidentally trigger). I wonder if there is a microformat solution here, something to explicitly tell the browser and the user that the page has been designed with visual impairments in mind.

  5. I think Apple’s VoiceOver is moving in that direction (speak, rather than read). This article seems to suggest as much. I haven’t done much testing myself, having a hell of a time even understanding the sounds of those synthetic voices. Opera also seems to have done some work on this, for Opera 8 on Windows.
    The DocType switch is a good idea, I think, mimicking what browsers already try for standards based code.

  6. Agreed. DOCTYPE switching for screen readers is what I’ve been advocating recently after attending @media 2005. However I’m seeking further information on this specific kind of browsers, such as how do they support languages other than English? Do screen readers pay attention to ‘hreflang’, ‘lang’ and ‘xml:lang’ attributes? How do they handle “multilingual, same content” Web sites? Perhaps the WaSP ATF should get in touch with the W3C I18N Activity group(s) (!?)

  7. There’s no question that screen readers need to modernize to understand web-standard coding techniques.

    I’m not sure that !DOCTYPE switching is the answer. Your hypothesis is that screen readers are currently screen scrapers, that they somehow glean stuff from the screen left to right, top to bottom. Maybe the older ones behave that way, but later versions are much more DOM dependent and don’t literally scrape the screen. Yes, they base themselves on IE. But, they extract the DOM tree and work from that. For the most part that is going to be very similar to the document source, no matter what the CSS does.

    To illustrate, I’m running some tests based on the first part of Joe Clark’s question, a question not dissimilar from some of what you have said. Over on Access-Matters, we’re testing 4 of the Zen Garden pages with assistive technology. I selected pages that have very different visual layouts. Of course, we all know they have exactly the same source. Results so far show that the latest screen readers (JAWS 6.01 and IBM HPR 3.04) read those four pages almost identically. The differences are in the areas where image replacement techniques are done with inaccessible methods.

    Next week, we’ll do more test cases with source that varies more.

    So in one sense, the latest screen readers are the “speaking browsers” you seek. They faithfully speak the source document independent of how CSS says it should be displayed. This is what you ask for in your fourth paragraph.

    What I think we really need is not mode switching, but standards compliance. The real problems are (1) inability to parse imported style sheets, and (2) total ignorance of aural style sheets. Fix those first. Then, add a configuration option where users can tell the screen reader to ignore CSS positioning.

    The last thing I want is yet another piece of software that warrants quirks mode considerations and pampering.

  8. I must agree with this. When reading this I was thinking of the Opera browser, which is already capable of being exactly what you say: a speaking browser. Much better than any screen reader can ever be. That is, only if the web developer is actually capable of doing something good.

    As you said, the era of table-layout (tag soup) is ending. Yet, you must agree that badly coded pages will exist as long as humans exist. The simple fact web developers are switching to CSS layout is not enough.

  9. iCab has been a “speaking browser” on the Mac for quite a long time. It reads the page source and passes the text content to the text-to-speech engine of the MacOS. Images are spoken using their alt attributes, and elements hidden by CSS screen rules are spoken, too.

    You may try it with or for example. It’s fun to hear how it speaks meyerweb’s hidden navigation links or how stopdesign’s page structure presents the most interesting content first (articles) and less interesting content later (such as lists of links).

  10. The results are in for the most popular current screen readers. JAWS 6.1, Window-Eyes 5.0, and IBM Home Page Reader 3.04 all SPEAK web pages just as you ask. They handle the HTML from top to bottom, paying no attention to CSS layout.

    Read the full results at Access Matters.

  11. “… all SPEAK web pages just as you ask.”

    No, they don’t, as you yourself point out:

    “In every case, save one, the cause for material not being spoken was the use of display : none.”

    When I say screen readers need to become speaking browsers, and ignore the CSS, I mean that in every sense. It isn’t just about where things appear on the page. It’s about audibly rendering content that’s been visibly hidden for the screen medium.

    Until these tools change that behavior, they’re still at some level screen readers, and they’re still broken.

  12. DOCTYPE switching is not the way to go because, even when I first started designing web pages in 1997, even the frame-based ones (oooh, did I just type that?), I always (occasional exceptions) stated the DOCTYPE, which means that your suggestion would not work on my pages.

    Personally, I feel that screenreaders—primarily built for the blind—should ignore CSS entirely (aural properties being the exceptions). (The content property is a bit of an issue: I have never liked the property because it allowed the designer to add content that wasn’t in the HTML.) On the other hand (getting back to screenreaders), there are some people who use them who are not blind: dyslexic users benefit from them and perhaps visually impaired persons too.

  13. Screenreaders should honour CSS media types and media attributes in LINK elements, and stay away from things that are meant for screen/print/etc.

  14. While the short term of “making it speak” is a worthwhile and important goal it makes good sense to place it in the context of a longer term goal.

    Long Term Goal – Total Accessibility – 100% Usability

    A good test case then might be someone who is very old and has multiple disabilities. They are adjusting to the latest disability, blindness, and are not on the technological cutting edge.

    The user would need to be able to easily do the following using a public browser:

    Choose a vehicle of transmission (how the user communicates with the browser): spoken, keyboard, signs, sounds, gestures / movement, binary interface (sip & puff switch / big red button ;)

    Choose a language
    ie. English

    Choose a Dialect (perhaps less important / optional?)
    ie. Canadian

    Choose the data reception vehicle / mode
    Spoken (in this case)
    But it might be
    Gesture (optional?)
    Pictures / icons

    Then the user might require, and the site might offer, varying degrees of complexity or detail depending on the user’s abilities and desires.

    Data Detail a.k.a. “view”
    In Depth

    So we are talking about a mode of transmitting information. By designing it for “multiple disabilities” we end up making it accessible to everyone and super flexible for all users.

    Hope this is of interest to you tech headz…

  15. […] t 1 From Roger Johansson at 456 Berea Street: To-do list for the WaSP ATF From Eric Meyer: Don’t Read, Speak! From Joe Clark: ATF: Not ‘Alcohol, Tobacco and Firea […]
