Acid Redux
Published 16 years, 9 months past
So the feeds I read have been buzzing the past few days with running commentary of the WebKit and Opera teams’ race to be the first to hit 100/100 on Acid3, and then after that the effort to get a pixel-perfect match with the reference image. Last I saw, Opera claimed to have gotten to 100 first but it looked like WebKit had gotten both with something publicly available, but I haven’t verified any of this for myself. Nor do I have any particular plans to do so.
Because as lovely as it is to see that you can, in fact, get one or more browser implementation teams to jump in a precisely defined sequence through a series of cunningly (one might say sadistically) placed hoops, half of which are on fire and the other half lined with razor wire, it doesn’t strike me as the best possible use of the teams’ time and energy.
No, I don’t hate standards, though I may hate freedom (depends on who’s asking). What I disagree with is the idea that if you cherry-pick enough obscure and difficult corners of a bunch of different specifications and mix them all together into a spicy meatball of difficulty, it constitutes a useful test of the specifications you cherry-picked. Because the one does not automatically follow from the other.
For example, suppose I told you that WebKit had implemented just the bits of SMIL-related SVG needed to pass the test, and that in doing so they exposed a woefully incomplete SVG implementation, one that gets something like 2% pass rates on actual SMIL/SVG tests. Laughable, right? Yes, well.
Of course, that’s in a nightly build and they might totally support SMIL by the time the corresponding final version is released and we’ll all look back on this and laugh the carefree laugh of children in springtime. Maybe. The real point here is that the Acid3 test isn’t a broad-spectrum standards-support test. It’s a showpiece, and something of a Potemkin village at that. Which is a shame, because what’s really needed right now is exhaustive test suites for specifications– XHTML, CSS, DOM, SVG, you name it. We’ve been seeing more of these emerge recently, but they’re not enough. I’d have been much more firmly in the cheering section had the effort that went into Acid3 gone into, say, an obsessively thorough DOM test suite.
I’d had this post in mind for a while now, really ever since Acid3 was released. Then the horse race started to develop, and I told myself I really needed to get around to writing that post—and I got overtaken. Well, that’s being busy for you. It’s just as well I waited, really, because much of what I was going to say got covered by Mike Shaver in his piece explaining why Firefox 3 isn’t going to hit 100% on Acid3. For example:
Ian’s Acid3, unlike its predecessors, is not about establishing a baseline of useful web capabilities. It’s quite explicitly about making browser developers jump… the Acid tests shouldn’t be fair to browsers, they should be fair to the web; they should be based on how good the web will be as a platform if all browsers conform, not about how far any given browser has to stretch to get there.
That’s no doubt more concisely and clearly stated than I would have managed, so it’s all for the best that he got to say it first.
By the by, I was quite intrigued by this part of Mike’s post:
You might ask why Mozilla’s not racking up daily gains, especially if you’re following the relevant bugs and seeing that people have produced patches for some issues that are covered by Acid3.
The most obvious reason is Firefox 3. We’re in the end-game of building what I really do believe is the best browser the web has ever known, and we expect to be putting it in the hands of more than 170 million users in a pretty short period of time. We’re still taking fixes for important issues, but virtually none of the issues on the Acid3 list are important enough for us to take at this stage. We don’t want to be rushing fixes in, or rushing out a release, only to find that we’ve broken important sites or regressed previous standards support, or worse introduced a security problem. Every API that’s exposed to content needs to be tested for compliance and security and reliability… We think these remaining late-stage patches are worth the test burden, often because they help make the web platform much more powerful, and reflect real-web compatibility and capability issues. Acid3’s contents, sadly, are not as often of that nature.
You know, it’s weird, but that seems really familiar, like I’ve heard or read something like that before. Now if only I could remember… Oh yeah! It’s basically what the IE team said about not passing Acid2 when the IE7 betas came out, for which they were promptly excoriated.
Huh.
Well, never mind that now. Of course it was a totally different set of circumstances and core motivations, and I’m sure there’s absolutely no parallel to be drawn between the two situations. At all.
Returning to the main point here: I’m a little bit sad, to tell the truth. The original acid test was a perfect example of what I think makes for a good stress test. Recall that the test’s original name, before it got shorthanded, was the “Box Model Acid Test”. It was a test of CSS box model handling, including floats. That’s all it was designed to do. It did that fairly well for its time, considering it was part of a CSS1 test suite. It didn’t try to combine box model testing with tests for PNG support, HTML parse error recovery, and DOM scripting.
To me, the ideal CSS test suite is one that has a bunch of basic property/value tests, like the ones I’ve been responsible for creating (1, 2), along with a bunch of acid tests for specific areas or concepts in that specification. So an acidified CSS test suite would have individual acid tests for the box model, positioning, fonts, selectors, table layout, and so on. It would not involve scripting or markup parsing (beyond what’s needed to handle selectors). It would not use animated SVG icons. Hell, it probably wouldn’t even use PNGs, except possibly alphaed PNGs when testing opacity and RGBA colors. And maybe not even then.
So in a DOM test suite, you’d have one test page for each method or attribute, and then build some acid tests out of related bits (say, on an entire interface or set of closely related interfaces). And maybe, at the end, you’d build an overarching acid test that rolled everything in the DOM spec into one fiendishly difficult test. But it would be just about the DOM and whatever absolute minimum of other stuff you needed, like text rendering and maybe GIF support. (Similarly, the CSS tests had to assume some basic HTML and CSS selector support, or else everything else fell down.)
And then, after all those test suites have been built up and a series of acid tests woven into them, with each one culminating in its own spec-spanning acid test, you might think about taking those end-point acid tests and slamming them all together into one super-ultra-hyper-mega acid test, something that even the xenomorphs from the Alien series would look at and say, “That’s gonna sting”. That would be awesome. But that’s not what we have.
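To make that layering a bit more concrete, here is a minimal sketch in Python. It is not taken from any real test suite: `Suite`, `supports`, `acid`, and the short feature list are all hypothetical names, and treating a browser engine as nothing more than a set of supported feature names is a deliberate oversimplification. All it tries to show is per-feature tests sitting underneath an acid test, and how an engine can pass the acid test while covering only a fraction of the features.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set

# A check takes the set of feature names an engine supports (a toy model).
Check = Callable[[Set[str]], bool]


def supports(feature: str) -> Check:
    """Single-feature check: the 'one test page per method or attribute' layer."""
    return lambda engine: feature in engine


def acid(*features: str) -> Check:
    """Acid test over a group of related features; passes only if all are present."""
    return lambda engine: all(f in engine for f in features)


@dataclass
class Suite:
    """Hypothetical layered suite: feature tests first, acid tests built on top."""
    spec: str
    feature_tests: Dict[str, Check]
    acid_tests: Dict[str, Check]

    def coverage(self, engine: Set[str]) -> float:
        """Fraction of the per-feature tests that the engine passes."""
        results = [check(engine) for check in self.feature_tests.values()]
        return sum(results) / len(results)

    def acid_results(self, engine: Set[str]) -> Dict[str, bool]:
        """Pass/fail for each acid test."""
        return {name: check(engine) for name, check in self.acid_tests.items()}


# A handful of DOM method names, picked only for illustration.
DOM_FEATURES = [
    "getElementById", "createElement", "appendChild", "removeChild",
    "addEventListener", "removeEventListener", "dispatchEvent", "cloneNode",
]

dom_suite = Suite(
    spec="DOM",
    feature_tests={f: supports(f) for f in DOM_FEATURES},
    acid_tests={
        "events": acid("addEventListener", "removeEventListener", "dispatchEvent"),
    },
)

# An engine that implements only what the acid test happens to touch.
cherry_picked = {"addEventListener", "removeEventListener", "dispatchEvent"}

print(dom_suite.acid_results(cherry_picked))  # the "events" acid test passes
print(dom_suite.coverage(cherry_picked))      # yet only 3 of 8 feature tests pass
```

In this toy model the acid result and the coverage figure are reported separately, so a pass on the showpiece test can’t be mistaken for broad support.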
I fully acknowledge that a whole lot of very clever thinking went into the construction of Acid3 (as was true of Acid2), and that a lot of very smart people have worked very hard to pass it. Congratulations all around, really. I just can’t help feeling like some broader and more important point has been missed. To me, it’s kind of like meeting the general challenge of finding an economical way to loft broadband transceivers to an altitude of 25,000 feet (in order to get full coverage of large metropolitan areas while avoiding the jetstream) by daring a bunch of teams to plant a transceiver near the summit of Mount Everest—and then getting them to do it. Progress toward the summit can be demonstrated and kudos bestowed afterward, but there’s a wider picture that seems to have been overlooked in the process.
Comments (35)
Thank you for putting into words exactly why I’ve found this whole race-to-Acid-3 less than captivating.
Awesome post title, too. Less than an hour after the post, it’s on the first page of Google results for that phrase and after 1,000 people link to this tonight and tomorrow it’ll almost certainly be #1.
I’ve always thought of this too – why can’t there be a way to have an exhaustive test? I thought this might be a bit of a show-stopper: how does one test certain aspects of the specification that leave it up to the browser for interpretation? Do we not include these pieces, or leave it there to have a comparison between browsers? Perhaps leaving it out there could promote a “standard” in how these pieces are implemented?
Interesting read.
Without having studied the Acid3 test yet, I’d have to agree that it might be more useful if the browser devs implemented full sets of standards, not just those that happen to be in the latest Acid test.
This is what I noticed. I see Firefox, Opera and Safari working hard to accomplish a goal. To compete. To make the best possible product.
Then I look at Microsoft …
Of course, one can always criticize ACID3 or any other test suite. No test can be everything to everybody. Mr. Meyer and the Mozilla community can always design a test that comes closer to their understanding of a good test.
I think ACID3 is good because it drives web development. It will take a few years before we know how helpful ACID3 actually was, but right now, I am glad to see that the test provides an incentive for developers.
As long as browsers don’t commit to Acid tests they are indeed not testing functionality relevant to the Web. The same goes for standards. Kind of a self-fulfilling prophecy.
And if you’d like to see more DOM tests: make them! :-)
Eric might be overestimating the kind of effort that is needed to pass Acid 3 when a browser is already ‘almost there’. Also, it is just fun for developers to spend some time on a friendly little race with possible PR benefits. I wonder how the current nay sayers would have reacted if the Firefox developers had gotten to 100/100 first…
That SMIL issue in Safari is annoying, because it seems to trick people into thinking these Acid tests are pointless. But would IE8 have gotten support for DATA urls, proper OBJECT fallback, proper transparency if they hadn’t been packaged together like this? Likewise, passing Acid 3 will require support for SVG animations. I find it highly unlikely Safari will keep it at this, and I can understand them wanting to score a few points by enabling their SMIL work-in-progress in nightly builds before it is really useful.
And there *are* more complete test suites out there. Opera and Safari do really good on the Selectors tests as well for example, the Firefox devs don’t seem to be so interested in getting that right. Would they spend effort in this decade on getting SVG animation support if it doesn’t get included in such a PR-friendly Acid test?
Lastly, several of the fixes Safari and Opera had to do were useful anyway because it makes their handling of edge cases more compatible with Firefox… I don’t see the harm in that. All those little edge cases might seem trivial for casual observers, and at the same time cause real sites not to work because developers inadvertently rely on them.
@Rob:
I disagree – and reading Eric’s post/article, I am not sure that’s what he thinks either.
I read it like this: WebKit and Opera shouldn’t be focusing on completing the Acid3 test, but instead do what Mozilla are doing: making a great browser.
I believe that the IE Team are working on that as well, but obviously they work a bit slower as their user base is a lot bigger and they don’t want to/can’t ruin all those people’s experience with the web, because everybody will be saying “This browser doesn’t work” instead of “This website doesn’t work” – just like a lot of people did when Firefox/Opera/Safari first popped on to the scene.
As I twittered yesterday; I do not understand why people take an irrational pride in their browser engine of choice being _first_ to pass a test. Why does that matter?
As for the ACID3 test itself, it’s a useful novelty, but as has been pointed out many times, it’s just an overview of a browser’s capabilities, it is not a thorough test. Which is fine in and of itself, but it’s very bad when the test is taken to mean ‘full and awesome support for the latest standards’ when it is in fact nothing of the sort. It’s become a trophy, and in so doing that’s robbed it of the significance it ought to have (because it no longer indicates full and complete support, it just indicates support for the bits that ACID3 tests).
I’m more inclined to go with Mozilla’s approach. Get a great browser out the door, make sure what you have already works exactly like it’s supposed to, and concentrate on edge cases (which is what ACID 3 largely is) later.
If 100/100 ACID score comes at the price of stability, I’m not interested.
Microsoft wrote an entirely new rendering engine for IE 8, and worked out a way for the browser to still include the IE 7 rendering engine for people who relied on its quirks.
If that’s not working hard to accomplish a goal, I really don’t know what is.
I think it’s kind of a twofold deal. You need to make a great browser (for the less literate web users) and you need to make a standards compliant browser for the power users and developers.
Internet Explorer, despite popular belief, does its job for the less techy users but fails in the developer category, while (I personally believe) Opera & Safari have a higher learning curve.
Firefox seems to be in the middle of them all in my opinion. It supports standards to a point (not to prove a point like other browsers) while still being easy enough to use (auto update feature is one example of how smoothly it all comes together).
In a perfect world all browsers would have the same backend i.e rendering engine, js code etc. and they all would just be a different interface. This could be accomplished if they all followed the standards EXACTLY the same, although nearly impossible from a business perspective.
Pingback ::
signal response» Blog Archive » Eric Meyer on Acid 3.
[…] been very excited about who’s winning and who’s losing the race for Acid 3, Eric Meyer makes a very important point that I had wondered about myself. It’s all well and good that Acid 3 requires at least some […]
Most ‘standards’ overlook two of the most basic of ‘standards’: Keep things simple and stupid. If it ain’t broke, don’t be trying to fix it. The biggest problem in this business is that too many don’t set aside that ‘pocket protector’ often enough.
Unless the browser developers are out-and-out cheating by detecting the Acid 3 page and throwing out a precomputed image, what’s the problem? So it isn’t exhaustive, but passing it gets us closer to the goal.
A decade ago I was with a search engine company that in fact did out-and-out cheat: We were able to infer the pet test searches of various tech journalists from their previous writings, and we hand-crafted results pages just for those searches, leaving other search results at the mercy of our rather flaky relevancy ranking algorithms. :-)
I’m not remotely near a level where I comment on the technical usefulness of these tests, as a designer of webpages I’m amateur at best, but I do take an interest. To me, the more standards that a browser complies to the better – as long as they’re not noticeably lacking in lower level standards.
The overall quality of this test will probably be determined down the line when we have the chance to see how the progress has continued, when we can see what comes next. I don’t think you can argue that the progress in recent weeks has been pretty hectic, albeit in very specific areas but if acid3 helps to push forward development then I’m not going to knock it. On its own, without anything to build on it then yes it could be better.
Personally I think the iPhone will be the biggest pusher towards overall increases in web browser development. It uses Safari and although it can display full webpages people are also developing pages specifically for it, to suit the UI. With other browsers, in a more traditional desktop environment it takes far longer for new things to be adopted since web developers aren’t going to add some new thing that most users won’t be able to see. If Apple add something to Safari then iPhone specific sites can take advantage of it knowing full well that every person who visits that version of the site will see it. The success of the iPhone could then dramatically boost adoption of things in the rest of the web.
Unfortunately, Anne, when browsers commit to passing acid tests, they’re still not testing functionality relevant to the Web.
And telling me I should write DOM tests is like telling me I should write medical-doctor certification exams. I’m about as qualified to do either one. In areas where I’m qualified, I’ve done a lot of test authoring, as I’m sure you recall. Beyond that, all I can do is advocate that those with the appropriate knowledge step up to the plate and invest their efforts in writing tests: because really good, broad, deep test suites are the fastest path to widespread implementation.
You mean like the two test suites I did write, ADAXL?
Pingback ::
сила против ловкости — очередной раунд — software simian's typewritings
[…] that literally a couple of hours later I read Eric Meyer’s post “Acid Redux”, in which he says very similar things. But even […]
So, as far as I understand you, what actually bothers you is not the Acid3 itself but that it can’t stop a browser from “cheating”.
Those eccentric tests in Acid3 like SVG animation are as eccentric as some CSS properties or base64 images in Acid2.
More or less, Frederico; more that it’s almost designed (consciously or not) to encourage “cheating”. I felt the same way about Acid2 when it was released, but I didn’t realize at the time it was going to establish a trend and so didn’t make a big deal out of it. Perhaps I should have.
Why do they not? Are you asserting that @font-face and DOM Level 2 Events for instance are not relevant?
Anne: Yes and no, in that order. Thanks for asking.
Pingback ::
Implementeringsjuks for ACID3 godkjenning - bza.no
[…] is the 100/100 stamp in the ACID3 test actually proof that the standards have been implemented. As Eric Meyer points out, for one thing the selection of features tested in ACID3 is limited compared to the standard’s […]
It’s a good point. People always howl at the IE team but let the FF team get away with the exact same behaviour(s). Personally I’ve noticed a lot of hypocritical actions from the Firefox fan squad. They’re happy for a site to ignore other non-IE browsers so long as it works in FF; they’re happy for FF to ignore an acid test when they screamed about IE7; etc.
My read on that is that Opera (not sure about WebKit) aren’t focussed on Acid3 to the exclusion of making a great browser. They’re actually doing a good enough job of their core business that they can also spend time on Acid3.
The Firefox project doesn’t appear to have that luxury. It looks like they have to ignore the test to make sure they get FF3 out as soon as possible. The honest truth is that Firefox 2 is in a bad way, with memory leaks and so on – the new version is far more stable and it’s important to get that out into the market.
Yes, it’s the same result, I just don’t think the Firefox project’s approach is based on a laudable decision – just necessity. The community didn’t cut IE any slack on those grounds so I see no reason FF should get a pass ;)
Frankly no browser team should really get beaten up over this sort of stuff so long as they’re genuinely working on improvements.
Eric, have you seen the CSS3.info Selectors test?
578 individual (but put in sequence via JS so that it’s not 578 pages) tests of CSS selectors, including the CSS3 ones. Sounds to me like that’s exactly the kind of thing you’re looking for, but I reckon you may not have seen it or you probably would’ve linked to it, no?
(btw, don’t know about the other browsers but Webkit passes it 100%)
It’s almost the sort of thing I’m looking for, Faruk.
It bothers me that a CSS test would be that dependent on JavaScript– a user agent could potentially support one without the other (or be configured that way for a reason). Furthermore, it’s set up so that without JavaScript, I can’t run through the tests manually, nor even see them to check for accuracy. I’m also forced to analyze the JS itself to see if it could have any effect on the individual results or the reporting of those results. For example, does the way it puts things in sequence have any possibility whatsoever of spoiling one or more of the results? And what if the JS uses some feature that breaks in one browser under unusual but reasonable conditions?
In effect, it isn’t a CSS test suite; it’s a CSS and DOM test suite, even though it claims to be otherwise.
This is why I think absolutely minimal dependency is a virtue. If the CSS3.info selectors test were something I could go through manually, plus it offered the JS quick-summary function as an optional feature, I’d be a lot happier with it.
While working for Opera, Ian Hickson made reduced individual tests out of Acid2. If one looks at the bug reports from Acid3 (here is one from me) they become individual reduced tests as well.
When writing unit tests for back end code, I am sure glad there are ways to make a test harness so I can run multiple tests in a batch. What I am trying to say is do not be too afraid of automated tests. Acid3 serves a purpose. Now we need to raise awareness about the formal test suites as well. Anyone who wishes may help out adding info to my Wikipedia article in the making. On that page I reference the formal W3C selectors test suite, BTW.
Eric: The Selectors test was done by someone in their spare time, and it’s not been updated in a while. Your suggestions sound like good ones. You (or anyone else) are welcome to help us improve it, if you do so wish. I’d like to do this for other modules as well. I think Background and Borders is probably the most important at the moment (due to developer interest) and Basic UI (due to the lack of a test suite to progress the module forward).
For browser vendors the javascript isn’t really needed as we can plug the tests into our automated test harnesses, but automating the tests on the site increases the visibility hugely – Not many people are willing to sit through clicking each test, but are willing to press a button and look at the results. Allowing both, with a note about js dependencies for the automated version, would probably be best.
Faruk: Konqueror was first to pass the test, shortly followed by Opera. Safari passed recently, but I’ve no idea if that credit goes to the fine folks at KDE or if it is a different implementation done by the Apple guys. Gecko and whatever the new IE engine is called have yet to pass.
David: aye, I thought it was something like that but I didn’t have the time to check anymore other than in the browser I had open.
Eric: good points you make; it would definitely have been better if the tests were all available as separate files with no JS whatsoever, and the existing test would serve simply as a shortcut / added feature.
Good feedback :-)
Most of your points about testing are dead on, but I think some of them stop at “I don’t understand why” instead of actually trying to understand and go forward.
IE7+Acid2 being treated differently than FF+Acid3? In the exact same post, you point out some weaknesses of the Acid3 test — maybe Acid2 was a more important test, and ignoring it was a more powerful statement? Maybe web developers felt that the quality of IE’s CSS support was behind the industry standard and that supporting Acid2 seemed like a goal clearly in line with the IE team’s promise to improve CSS support?
So let’s ask, why isn’t there a comprehensive browser testing suite that is as widely known as Acid2? Is there a suite yet, and it’s just not widely known? Then let’s work on promoting it. Is there no suite yet? Then let’s work on making one. But I’m tired of feeling like little league parents yelling from the stands.
Dave: There are a number of test suites, although more are always needed. The CSS3.info selectors test has been mentioned in the comments. Daniel Glazman also has an interesting Selectors test here (Opera is the only browser I know that passes, though Konqueror may, and WebKit is improving). David Baron supplied a number of CSS3 Color tests here (if you discount flavor and color-profile, which may be dropped, all non-IE browsers fare well – Opera having added HSLA, RGBA and CSS3 transparent in the ACID3 build). There is a full CSS2.1 test suite, but there are many test cases, so I haven’t gone through them all by hand. SVG1.1 has a full test suite, with results here. SVG 1.2 Tiny also has a beta test suite, but the implementation report only mentions one browser.
I don’t really know many good ECMAScript or DOM tests.
With what little I know, Acid tests seem born out of pain points web developers have and pushing those standards we’ve all been waiting for. And I can only see that as a good thing, exhaustive tests would be great too, but I am very glad for the Acid tests and the progress we’ve seen lately.
It’s almost like a standards marketing campaign, and I feel the web is better for it. Exhaustive test suites will probably never make headlines, they don’t put out the same amount of pressure.
Honestly, the only browser I worry about hoop jumping is IE. I trust WebKit and Opera and Firefox to make something out of any initial hoop jumping they do. And AcidN can hold them to it. Microsoft, not so much.
Pingback ::
Alexander Sperl (dorkydesign) — Blog — Blog Archive » Acid3
[…] http://meyerweb.com/eric/thoughts/2008/03/27/acid-redux/ […]
Exhaustive tests actually already exist, for example the plethora of test suites available at the W3C; it just seems that no one publicizes them much, since there are too many and no one can dream of passing them all in the foreseeable future.
To me any test that includes a call to addEventListener (thus forcing IE to implement it if they want to pass) is driving the web forward.
I see a lot of people that underestimate the amount of time that goes into fixing up the little edge cases. If Acid 3 helps fix some of those cases it will be a big win!