Thursday, June 16, 2011

It's Hard to Build a Web Site Secure Against Untrusted User Input

Recently it was reported that a group of hackers broke into Citibank. One security expert was quoted as saying, "It would have been hard to prepare for this type of vulnerability." Maybe so, but the problems are hardly new if this part is true:

They simply logged on to the part of the group's site reserved for credit card customers - and substituted their account numbers which appeared in the browser's address bar with other numbers.


I'm sure others have stumbled onto this problem many times (it would be hard not to, given how many websites expose sequential numbers and accept unauthenticated data in user input fields), but I know of at least two research projects that examined the technical issues behind such vulnerabilities.

Prof. Nickolai Zeldovich at MIT worked on the RESIN research project, which helps web application developers specify security assertions more clearly. For example, a common web application error is forgetting to include an authentication check. It is a simple error, but it is pervasive and cumbersome to prevent without a way to express such assertions. The RESIN language helps programmers control information flow.
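To make the "forgotten check" concrete, here is a minimal Python sketch. It is not RESIN code and not anything taken from the Citibank site; the account table, the function names, and the Forbidden exception are all hypothetical, chosen only to contrast a handler that trusts the account number in the URL with one that asks whether the logged-in user owns that account.

```python
# Hypothetical in-memory data; the names and numbers are illustrative only.
ACCOUNTS = {
    1001: {"owner": "alice", "balance": 250},
    1002: {"owner": "bob", "balance": 990},
}


class Forbidden(Exception):
    """Raised when the logged-in user may not see the requested account."""


def view_account_unsafe(session_user, account_id):
    # Trusts the account number taken from the URL. The session user is
    # never consulted, so any logged-in user can read any account.
    return ACCOUNTS[account_id]


def view_account_checked(session_user, account_id):
    account = ACCOUNTS.get(account_id)
    # The server, not the URL, decides whether this user owns the account.
    if account is None or account["owner"] != session_user:
        raise Forbidden(account_id)
    return account
```

With this data, view_account_unsafe("alice", 1002) hands Bob's record to Alice, while view_account_checked("alice", 1002) raises Forbidden. The hard part in a real application is remembering to make that check on every code path, which is the kind of assertion the paragraph above says is cumbersome to express.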

A second project is my own, so I am more familiar with it. Back in 2001 (when we said "web site" rather than "website"), our small team analyzed the authentication mechanisms in several websites. It was hard not to find websites that let an attacker impersonate a user. Relevant to the Citibank incident, there were cases where websites assumed the user would not change certain elements of the query string or HTTP POST request. Embedded in these requests were sequential identifiers. Here's one example from a talk several years ago:




After prodding from people working on web application toolkits, I took a new look at web authentication in 2004. Alas, even the web toolkits had authentication flaws. In one product used by large chain retailers, it was possible for a would-be thief to impersonate others to download retail receipts by changing hidden HTML code. I wonder if this problem is similar to the flaw at Citibank. This work was never published, but did appear in the WSJ.
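One common defense against tampering with values that round-trip through the browser is to attach a keyed MAC that only the server can compute. The Python sketch below illustrates that general technique; it is an assumption of mine for illustration, not the fix used by the product mentioned above, and the key, function names, and field layout are hypothetical.

```python
import hashlib
import hmac

# Hypothetical server-side secret; a real deployment would load it from
# protected configuration and never send it to the browser.
SECRET_KEY = b"server-side-secret-change-me"


def _tag(user: str, receipt_id: str) -> str:
    # Bind the receipt number to the user it was issued to.
    message = f"{user}\n{receipt_id}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_receipt_field(user: str, receipt_id: str) -> str:
    """Build the value to place in the hidden form field."""
    return f"{receipt_id}.{_tag(user, receipt_id)}"


def verify_receipt_field(user: str, field: str):
    """Return the receipt id if the field is untampered for this user, else None."""
    receipt_id, sep, tag = field.rpartition(".")
    if not sep:
        return None
    expected = _tag(user, receipt_id)
    # Compare as bytes in constant time.
    if hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        return receipt_id
    return None
```

If a would-be thief changes the receipt number in the field, or replays a field issued to a different user, the tag no longer matches and verify_receipt_field returns None. A server-side ownership check on the receipt is still the more direct fix; the MAC only ensures the browser returns what the server sent.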

For more information, I encourage the interested reader to consume these documents:


There are probably other historical technical documents of relevance from OWASP or the RISKS Digest. Happy searching!