Liberal vs. Conservative
Published 19 years, 7 months past
So it turns out that crackers can mess up your Web site with nothing more than a malformed HTTP packet. You might think something as simple as HTTP would be basically risk-free, but no, I’m afraid not. All it takes is interaction between programs that handle HTTP data slightly differently, and hey presto, you’ve got a security hole.
Ben Laurie weighed in on this:
“It is interesting that being liberal in what you accept is the base cause of this misbehaviour,” Laurie says. “Perhaps it is time the idea was revisited.”
That’s a reference to the late Jon Postel’s dictum (from RFC 793) of “be conservative in what you do, be liberal in what you accept from others”. This is done in the name of robustness: if you’re liberal in what you accept, you can recover from data corruption caused by unanticipated problems.
Laurie’s right. The problem is that being liberal in what you accept inevitably leads to a systemic corruption. Look at the display layer of the Web. For years, browsers have been liberal in what markup they accept. What did it get us? Tag soup. The minute browsers allowed authors to be lazy, authors were lazy. The tools written to help authors encoded that laziness. Browsers had to make sure they could deal with even more laziness, and the tools kept up. Just to get CSS out of that death spiral, we (as a field) had to invent, implement, and explain DOCTYPE switching.
In XML, it’s defined that a user agent must throw an error on malformed markup and stop. No error recovery attempts, just a big old “this is broken” message. Gecko already does this, if you get it into full-on XML mode. It won’t do it on HTML and XHTML served as text/html, though, because too many Web pages would just break. If you serve up XHTML as application/xhtml+xml, and it’s malformed, you’ll be treated to an error message. Period.
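To make the contrast concrete, here is a minimal sketch using two parsers from Python's standard library as stand-ins: `xml.etree.ElementTree` for the strict XML rule and `html.parser.HTMLParser` for forgiving tag-soup handling. Neither is what Gecko uses; the `SOUP` sample and the helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Lenient HTML parser that records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def strict_parse(markup):
    """Return True if the markup is well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # The strict parser gives up at the first well-formedness error.
        return False


def lenient_tags(markup):
    """Return the start tags a forgiving HTML parser manages to find."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


# Hypothetical tag-soup sample: the <b> and <p> elements are never closed.
SOUP = "<html><body><p>Hello <b>world</body></html>"

print(strict_parse(SOUP))   # the XML parser refuses the document
print(lenient_tags(SOUP))   # the HTML parser carries on regardless
```

The same bytes are an error to one parser and a perfectly usable document to the other, which is the whole tension in a nutshell.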
And would that be so bad, even for HTML? After all, if IE did it, you can be sure that people would fix their markup. If browsers had done it from the beginning, markup would not have been malformed in the first place. (Weird and abnormal, perhaps, but not actually malformed.) Håkon said five years ago that “be liberal in what you accept” is what broke the Web, markup- and style-wise. It’s been a longer fight than that to start lifting it out of that morass, and the job isn’t done.
Authors of feed aggregators have similar dilemmas. If someone subscribes to a feed, thus indicating their interest in it, and the feed is malformed, what do you do? Do you undertake error recovery in an attempt to give the user what they want, or do you just throw up an error message? If you go the error route, what happens when a competitor does the error recovery, and thus gets a reputation as being a better program, even though you know it’s actually worse? That righteous knowledge won’t pay the heating bills, come winter.
“So what?” you may shrug. “It’s not like RSS feeds can be used to breach security”.
Which is just what anyone would have said about HTTP, until very recently.
In the end, the real problem is that liberal acceptance of data will always be used. Even if every single HTTP implementor in the world got together and made sure all their implementations did exactly the same strictly correct conservatively defined thing, there would still be people sending out malformed data. They’d be crackers, script kiddies—the people who have incentive to not be conservative in what they send. The only way to stop them from sending out that malformed data is to be conservative in what your program accepts.
Even then, it might be possible to exploit loopholes, but at least they’d be flaws in the protocol itself. Finding and fixing those is important. Attempting to cope with the twisted landscape of bizarrely interacting error-recovery routines is a fool’s errand at best. Unfortunately, it’s an errand we’re all running.
Comments (21)
Fair warning: anyone thinking about twisting this post into an argument about political ideologies is heading the right way for an edited comment. And “edited” can easily mean “deleted”.
The article talks about a proxy server in the middle being presented with malformed HTTP requests. This is really a “man-in-the-middle” exploit, correct? The proxy server in the middle could as easily replace the content entirely. This has nothing to do with Postel’s law.
Even if the HTTP spec were updated to force refusal of a packet with 2 Content-Length headers, the script kiddie that you speak of will just drop the 2 Content-Length headers, and just put out theirs, with their content to match. That doesn’t change the fact that he is in the middle of the conversation.
Remember, Postel’s Law says to be ‘liberal’ in what you accept from others, it doesn’t say to be a doormat.
Scott, as outlined in the paper by Watchfire, HTTP Request Smuggling is an umbrella term for a class of attacks that share some core ideas. However, none of them is exactly a Man-in-the-Middle attack. Here’s a brief example of one of the attacks:
Here we have a web server, and a proxy server. Both you and I use the same proxy server to access the web server (perhaps we use the same ISP, work in the same company, or the proxy acts as a reverse proxy for the web server). The web server acts as a virtual host for two sites, Good Site, run by some trustworthy third-party, and Evil Site, run by me. If certain conditions are met, by sending a special HTTP request to the web server via the proxy server, I can poison the cache of the proxy server, making the proxy server think that some page at the Good Site is actually identical to some page at the Evil Site.
Now, these cache poisoning attacks are not a new idea, but previously they relied on bugs in the server implementation. The highlight of HTTP Request Smuggling is that both the web server and the proxy server may work as designed, and yet the problem wouldn’t be there if either the web server or the proxy server were switched to some other product. It’s the combination of them and the differences in the way they differ from the spec or handle borderline cases and slightly against-the-spec input that causes the problem.
Usually you can work around HTTP Request Smuggling by disabling some HTTP features or changing the server configuration. It can be mitigated by using SSL or TLS, provided that some conditions are met. However, the best solution would require that all the software implementing HTTP would implement it strictly, accepting no malformed input.
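As a rough illustration of the point about borderline input, the sketch below takes the duplicate Content-Length case mentioned earlier in the thread and shows two liberal readings disagreeing while a strict reading refuses the request. The function names and the `RAW` sample are hypothetical, and no claim is made that any particular proxy or web server behaves this way.

```python
def parse_headers(raw):
    """Split a raw header block into a list of (name, value) pairs."""
    pairs = []
    for line in raw.split("\r\n"):
        if not line:
            continue
        name, _, value = line.partition(":")
        pairs.append((name.strip().lower(), value.strip()))
    return pairs


def content_length_first(pairs):
    """Liberal reading A: the first Content-Length header wins."""
    return next(int(v) for n, v in pairs if n == "content-length")


def content_length_last(pairs):
    """Liberal reading B: the last Content-Length header wins."""
    return [int(v) for n, v in pairs if n == "content-length"][-1]


def content_length_strict(pairs):
    """Strict reading: anything but exactly one numeric value is an error."""
    values = [v for n, v in pairs if n == "content-length"]
    if len(values) != 1 or not (values[0].isascii() and values[0].isdigit()):
        raise ValueError("malformed Content-Length")
    return int(values[0])


# Hypothetical request headers carrying two conflicting lengths.
RAW = "Host: example.com\r\nContent-Length: 0\r\nContent-Length: 44\r\n"

pairs = parse_headers(RAW)
print(content_length_first(pairs), content_length_last(pairs))
```

If one hop applies reading A and the next applies reading B, they disagree about where the request body ends, and that disagreement is the opening an attacker needs. The strict version leaves nothing to disagree about.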
To apply the same idea to XHTML, if either my XHTML sanitizer or your browser has a liberal parser, some third party might be able to submit XHTML that conducts a Cross Site Scripting attack, even though the attack wouldn’t be possible if both of them shared the same parser or had a strict parser.
I used to think HTML was revolutionary for allowing errors – it would still display something. Then I learnt XHTML and cursed all pages with missing end tags and broken markup. Now I see that even broken XML should be capable of being parsed. Why? Because on the web there are many conditions when it is desirable not to throw an error and simply stop (the Yellow Screen Of Death). Consider you have a massive page of XML. Now what happens when:
a) the page stops loading half-way?
b) the page loads with an error
c) the page sticks during loading due to network congestion
d) the user hits STOP half-way through loading
e) some other fault occurs that means the document is not fully-formed
In all cases, the parser should attempt to rescue the document, if it can. Modern browsers are superb at fixing bad HTML, so why not XML?
Because the code won’t be correct? Yes, but at least the user will see something. Also, it will allow browsers to display documents as they load (as opposed to waiting for the whole file to load before showing a single word). XML files can get pretty big and not everyone is on broadband.
The only concern is that it doesn’t lead to people writing XHTML and XML in the way early web pioneers butchered HTML. But let’s face it – does it really matter if someone misses off a doctype? The browser can just assume a default one, like happens today. If the page is displayed wrongly as a result, then it’s the author’s job to add the doctype.
I know someone who preached about XHTML but is now experimenting with condensing HTML by leaving out as much as he can, including the html and body tags, if I recall. The funny thing is, it still displays! Indeed, you don’t even have to have any HTML in your page at all. Type some text and save it as an HTML file and it will still be readable.
Chris, there are shades of gray between including special tag soup parsing code and rejecting all malformed content. For example, the browser might cope with congestion or Stop command by implicitly closing any open elements and displaying the resulting tree. The standard should describe the specifics of how this must be done, the browser should discard any incomplete element that is not static content, and the parser shouldn’t attempt to guess omitted parts.
I’ll have to confess that I’m not sure if this is enough. If even incremental loading is a security risk, I’m not sure where that leaves us.
Anyway, in plain text or strictly presentational language, “anything goes” is a fine strategy. However, (X)HTML has interactive and programmatic features and not being strict with them means that harmful content may slip through. One might argue that this is a shortcoming with SGML and XML, but a language with inviolable bounds is necessarily either strictly parsed or always manipulated with well-behaving software tools.
These types of attacks may seem insignificant now, compared to the havoc created by Windows worms and phishing, but I’d prefer making the foundation solid now instead of five years from now.
Eric, thanks for using the correct, unambiguous term for a criminal network intruder — “cracker” — even though the source article used a confusingly wrong term.
http://en.wikipedia.org/wiki/Hacker
Exactly. It isn’t Postel’s Law that’s at fault. The problem is what this broken software is doing with the input.
If IE did it, the flow of users to Firefox would become a full-on stampede.
If browsers had done it from the beginning, the WWW would consist of a collection of about 10,000 physics papers.
The web is successful because from the beginning, it has always been very easy to create content. The language of the web matches the web’s main use cases. Obviously the hardcore geeks want to open up more sophisticated use cases, and a more rigorous syntax would make this easier. However, for the vast, vast majority of existing use cases (searching for stuff, reading stuff, submitting form data so you can buy stuff) the display layer does not need to be rigorous. Forcing rigor into the display layer would be like forcing UNIX sys admins to replace all their shell scripts with Java JARs. Shell scripts do the job. Java would be massive overkill. Pick your tools wisely.
I disagree. In the past eleven years, I’ve seen browsers tighten up their behavior on multiple occasions, and on every occasion, authors made the necessary corrections. Users didn’t stampede anywhere, even when there were choices. They didn’t have to, because sites didn’t stay broken for very long.
I think that’s just a wee bit of an overstatement.
Browsers should do something like the pop-up blocker notification. At the top of the page, say that the HTML is malformed and Gecko is doing its best to display it. Then the user can see the page and, if worried, knows it is malformed. The user then has the choice of what they want to do.
Chris,
The scenarios you pointed out are not really valid, IMO. If a document fails to load completely for any reason, it is the same situation as if the document had been transmitted in full, but was incomplete in itself. I for one would not like to see the client app make any “best guesses” about the rest of the document’s content.
If you liken the document content to a recipe (in that it contains instructions for what the client application displays / processes), you would not want it to get to an instruction to “put the thing into a hot oven”, and then not be aware of how long to leave it there, or how to take it out. In HTML, the error-correction is not so bad – there aren’t many instances where a misinterpreted block of code will cause undesirable behaviour – but if you are expecting an XML document of application configuration data, or some other set of instructions, the implications of a missing tag or block of data are enormous.
I think that Anne has a very good point about handling of malformed information: handling it is important, but shouldn’t break the user experience (XML is therefore not really good). The standards should define very precisely how error handling of malformed content is done to allow consistency in implementation (from what I read, the problem of the HTTP exploit is the differences between various nodes of a network in handling malformed packets/headers), therefore getting very strict languages as far as the user agents are concerned (strict definition of the use cases, and how to handle them), but much more forgiving from a user/coder standpoint (errors are handled neatly, you don’t just get a massive Yellow Screen of Death breaking everything, even though some kind of light warning would be a good thing).
Forcing well-formed (let alone valid) HTML on the world would be orders of magnitude bigger than any of the small changes that people have made in the past. It would break well over 99% of the sites on the web. Even if webmasters took the time to upgrade their markup to be valid, there’d still be a huge problem – very few backend systems are designed to enforce wellformedness. That would make creating DoS-style attacks very very easy – just get the page to include some content in the wrong character set or find a way to include some unclosed tags. Sure, it wouldn’t work everywhere but I bet a huge number of systems have this kind of problem somewhere simply because very few server-side applications have been designed to ensure well-formedness (for example by using a compliant XML parser).
Indeed the fact that content must be well-formed is, IMHO, responsible for the current failure of XML on the web. Until XML gets CSS-style well-defined error handling it is unlikely to gain any sort of widespread adoption. Indeed I still feel that it’s more likely that a vendor will release a ‘liberal’ XML-parser than XML in its current form will be widely used for human-consumed documents.
Indeed. And if it changed now, all that would be left is about 1000 blogs.
So, if I load a large document, have it incrementally render, notice that the part I’m interested in has loaded, and press stop, you think that the document should be replaced by the YSOD? Or you think that incremental rendering of XML documents should be forbidden?
What I love about the comments so far is that everyone’s acting like I advocated forcing rigor onto the display layer starting tomorrow, and damn the consequences. Did I? No. What I said was that if the display layer had been rigorous from the outset, we’d be in a much better position today. No, the Web would not be 10,000 physics papers, either. It would be just as varied and commercial as it is today, because everyone would have learned valid markup as they went. It would be second nature by now.
After all, it isn’t like well-formed HTML is substantially more difficult to grasp than is tag-soup HTML. It’s a pretty simple language either way; I’d even argue it’s simpler with rigor than without.
But the actual point of the post was that I believe the principles behind Postel’s Law need to be re-examined. Perhaps they’re still sound, but they’re taken much too far (as Scott said, “Remember, Postel’s Law… doesn’t say to be a doormat”). Perhaps they’re no longer appropriate for the Internet. Perhaps they aren’t taken far enough. I personally think it’s one of the first two, but without a serious contemplation of how the Law has been applied and the consequences of doing so, we’ll never even have a chance to do better.
Another example of where liberalness in what you accept causes the well-known mess: SVG.
There’s a lot of broken SVG out there on web pages now, because Adobe’s XML parser is non-compliant and accepts non-well-formed XML, without namespaces and using undefined namespace prefixes, unclosed tags, etc.
But, I think that given that Firefox 1.1 will require the SVG to be well-formed for it to work (yeah and Opera too), Firefox will be able to force this problem in the right direction, and cause people to fix their SVG. After all, Firefox is the first major browser distribution with SVG built-in.
(and no, I don’t consider Opera a major browser distribution right now, and besides it’s only got SVG Tiny, and that not without issues).
You’re absolutely right :).
I tried to send a trackback using an online tool (because my blog can’t do trackbacks), but that didn’t really work. So an additional comment it is then :).
I elaborated a little on well-formedness in XHTML and SVG on my weblog.
~Grauw
But by saying “if IE did [start to enforce wellformedness], you can be sure that people would fix their markup”, you did rather invite the discussion of what the actual consequences of such a move would be.
I’m also curious as to why you limit your discussion to wellformedness – after all it’s not like the parser is the only UA layer that is affected by the content received. Something like
<table><h1>Foo <li> </li> </h1> </table>
is totally well-formed yet utterly invalid code. So in the context of your overall thesis I assume you would have browsers reject any HTML that contained this type of code. With that in mind, I find it very hard to accept that “it isn’t like well-formed HTML is substantially more difficult to grasp than is tag-soup HTML.” For static, hand authored pages, sure, it’s not that much more difficult. The real difference is that server side tools that can be relied on to produce valid (or even, much easier, well-formed) markup are much harder to produce. SSI and PHP, the two easiest ways into building dynamic sites, would never have been produced because messing with strings isn’t reliable enough where the choice is between correctness and total failure. Instead people would be forced to use more complex frameworks like XSLT (clearly this didn’t exist at the time but the point is that this is the type of technology that would have developed instead of the simple technologies that are prevalent today). The additional complexity of making sure that a site used valid code would require effort to be spent on that rather mundane task rather than improving the actual content. The combination of a lack of accessible technologies and high difficulty of site building would, I expect, have killed the kind of pioneer-style low budget startup that drove innovation in the early commercial web. That, not the difficulty of hand authoring valid code, is why a web that enforced validity would have been closer to Evan’s 10,000 physics papers than the web of today.
In any case, expecting the browser makers to produce a browser that enforces validity is like expecting a pencil to balance on its tip – sure it’s an equilibrium but it’s highly unstable to small perturbations. In this case, a small perturbation might be a bug – say a browser continuing rendering when an <img> element with no alt attribute was encountered. Since, as experience shows, people test browsers rather than checking specs, there would be significant pressure for other browser makers to implement the quirk (this still happens of course, with almost every oddity in IE being replicated in code for all other browsers). Worse, there’s a significant competitive advantage in being able to render pages that no other browser can render since users care more about sites rendering than the browser they use. The people making browsers know this and have used it to their competitive advantage. So it’s practically impossible to imagine how strict error checking could have been enforced through the explosive growth of the web.
So, whilst I agree that we’d all be better off today if people used correct markup, I’m deeply sceptical that this would have been desirable in the past and furthermore don’t believe, capitalism being what it is, that such a situation could have been maintained. I prefer to think of Postel’s law as a dictum for authors: Clients will be liberal in what they accept but not all clients are created equal so you should be conservative in what you send.
/me whistles quietly.
Trackback ::
Grauw.nl
Well-formedness
Eric Meyer wrote a post on his weblog that being liberal in what you accept as is currently done with HTML is bad. I of course wholeheartedly agree. This post elaborates on well-formedness in XHTML and SVG.
I think people are missing something here; XML only requires that the parser throw a fatal error and stop processing the stream. It doesn’t require applications to throw away what they have received from the parser thus far. If a browser wants to display half a page, it can.
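A minimal sketch of that idea, using Python's `xml.etree.ElementTree.XMLPullParser` as a stand-in for a browser's parser: the parser stops at the first fatal error, but the application keeps everything it was handed before that point. The `CHUNKS` sample and the `parse_until_error` helper are hypothetical, and real incremental rendering is of course far more involved.

```python
import xml.etree.ElementTree as ET


def parse_until_error(chunks):
    """Feed chunks to a strict XML parser, keeping what arrived before any error.

    Returns (texts, error): the text of every completed <p> element, plus the
    ParseError that stopped parsing (or None if no error was hit).
    """
    parser = ET.XMLPullParser(events=("end",))
    texts = []
    try:
        for chunk in chunks:
            parser.feed(chunk)
            # read_events() re-raises any parse error queued by feed().
            for _event, elem in parser.read_events():
                if elem.tag == "p":
                    texts.append(elem.text)
    except ET.ParseError as exc:
        # The parser is done, but the application keeps what it already has.
        return texts, exc
    return texts, None


# Hypothetical document arriving in three network chunks; the last is malformed.
CHUNKS = ["<doc><p>one</p>", "<p>two</p>", "<p>three</oops></doc>"]

texts, error = parse_until_error(CHUNKS)
print(texts, error is not None)
```

The same function also covers the "user hits STOP" case: a stream that merely ends early produces no error at all, just fewer completed elements.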
As I’ve said before, Postel’s Law is only beneficial when both sides keep up their end of the deal. Web authors aren’t conservative in what they send, so accepting it liberally (“being a doormat”) causes chaos. XML’s rule is obviously an attempt to stop being a doormat.
Whether something is well-formed or not is a question of syntax. It’s difficult to derive structure from a syntactically broken document. The job of transforming syntax into structure is entirely the XML parser’s job, and it should happen the same way for all document types.
Whether a document is valid or not is a question of structure. It may or may not be possible to recover from structural errors – whether it’s possible or not is application-dependent, so it makes no sense to apply a general rule to all XML document types.
Validity and well-formedness are two different things and are properties of two entirely different conceptual layers. Assuming that what applies to one also applies to the other is a mistake.
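The two layers can be sketched separately in a few lines, using the `<table><h1>` snippet quoted earlier in the thread. The well-formedness check is just Python's standard XML parser; the `ALLOWED_CHILDREN` table is a hypothetical, drastically simplified content model invented for this example, not the real HTML rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified content model: which child elements each
# parent may contain. Real HTML validity rules are far richer than this.
ALLOWED_CHILDREN = {
    "table": {"tr", "thead", "tbody", "tfoot", "caption"},
    "tr": {"td", "th"},
    "ul": {"li"},
    "h1": set(),
}


def is_well_formed(markup):
    """Syntax layer: can the parser turn the text into a tree at all?"""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def structural_errors(markup):
    """Structure layer: list (parent, child) pairs the content model forbids."""
    root = ET.fromstring(markup)
    errors = []
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag)
        if allowed is None:
            continue  # no rule recorded for this element in the toy model
        for child in parent:
            if child.tag not in allowed:
                errors.append((parent.tag, child.tag))
    return errors


SAMPLE = "<table><h1>Foo <li> </li> </h1> </table>"

print(is_well_formed(SAMPLE))      # syntax is fine
print(structural_errors(SAMPLE))   # structure is not
```

What to do with the second list (reject, repair, or ignore) is exactly the application-dependent decision; the first check has no such freedom.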