Jun 29, 2009


Preventing semantic URL attacks in Web applications


Damn pragmatists...

The problem

A semantic URL attack is a type of security exploit that takes advantage of the fact that a web application cannot tell, from an HTTP request alone, whether its parameters are ones the user was meant to send. Take the following URL for example:

http://www.mywebapplication.com/viewPaymentDetails?receiptId=7892

Let us assume that when the URL is processed, the specified receiptId is used to query a receipts table and display the matching receipt’s details to the user (1). Now, what would stop someone from simply changing the value of receiptId and retrieving the details of someone else’s receipt? Surely that shouldn’t happen. But it can, and it does. Unless the application checks the request’s semantics itself, it ends up blindly trusting whatever arrives, and therein lies the vulnerability.

The example above would allow an attacker to retrieve and display arbitrary details, but the issue can be much more serious when an application relies on request parameters for, say, branching, or for updates to or deletes from tables. This is why there is a whole other discussion about always assuming that user input is unsafe and about the need to sanitize, validate and filter it before anything is done with it.

A solution

Our solution should be two-fold:

  1. Ensure that requests that should only ever be sent from our application’s forms are checked and confirmed to have come from those forms
  2. Ensure that requests sent by clicking hyperlinks that contain parameters are checked and deemed permissible for the logged-in user

Tokens, tokens, tokens

Our problem is that our application has no way of knowing whether a request was 1) formed and sent by one of its own forms, or 2) constructed arbitrarily by the user. Our solution should therefore involve providing extra information that allows the application to differentiate between the two types of requests. One solution is to use tokens: random, unique values that tie a request to a user’s session. Here is how it works:

In every page where we have a form, we generate a unique and random number (a token) and then

  • Set it to the session, and
  • Add it as a hidden input field to the page containing the form

Then, in the server-side class that takes care of processing the form (Action class, form handler, etc.), simply check that the value of the token stored in the session is equal to the value of the token sent in the HTTP request (the hidden field). If so, we can consider the token valid; if not, we can take an alternative course of action such as logging the user out. Finally, we reset the token so that it cannot be reused.
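The steps above can be sketched in PHP roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names (issueToken, hiddenTokenField, consumeToken) and the 'token' key are made up for illustration, and the session is passed in as a plain array so the logic can be followed without a running web server. In a real handler you would pass $_SESSION and $_POST instead. It also assumes PHP 7 or later for random_bytes().

```php
<?php
// Hypothetical helpers illustrating the token flow; not part of any framework.

// Generate a random token, store it in the session, and return it.
function issueToken(array &$session): string
{
    $token = bin2hex(random_bytes(32)); // 64 hex characters
    $session['token'] = $token;
    return $token;
}

// Render the hidden input field that carries the token inside the form.
function hiddenTokenField(string $token): string
{
    return '<input type="hidden" name="token" value="'
        . htmlspecialchars($token, ENT_QUOTES, 'UTF-8') . '">';
}

// Compare the submitted token with the one in the session, then reset it
// so that the same token cannot be used for a second request.
function consumeToken(array &$session, array $post): bool
{
    $expected  = $session['token'] ?? null;
    $submitted = $post['token'] ?? null;
    unset($session['token']); // reset whether or not the check passes

    return is_string($expected)
        && is_string($submitted)
        && hash_equals($expected, $submitted);
}

// Usage: simulate one page load followed by two submissions.
$session = [];
$token = issueToken($session);
echo hiddenTokenField($token), "\n";

var_dump(consumeToken($session, ['token' => $token])); // bool(true)
var_dump(consumeToken($session, ['token' => $token])); // bool(false): token already used
```

Resetting the token even when the check fails is a design choice in this sketch: it means a failed guess also invalidates the pending form, which is stricter than the text requires.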

Since the token is regenerated every time the page containing the form loads, and is reset both in the session and in the page’s HTML source, a user who accesses the form handler’s URL directly, modified or not, cannot proceed: the request will carry no token, and any token left in the session will either not match or have expired.

For your eyes only

Our second issue exists if we, say, have a table of receipts for a particular user with each receipt number linked to a URL similar to the one shown above. When such a link is followed, our request handler needs to verify that the receipt number passed to it is in fact a receipt number that belongs to the logged-in user, as identified by his or her credentials in the session. Therefore, we need to add logic that queries the database every time an HTTP request is received, before its parameters are used for anything, to ensure that the logged-in user has the necessary privileges to access the records those parameters refer to.
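One way to express this check is to make ownership part of the query itself, so that a receipt belonging to another user is simply never returned. The following is a sketch only: the receipts table, its id, user_id and amount columns, and the findReceiptForUser function are hypothetical, and an in-memory SQLite database (which requires the pdo_sqlite extension) stands in for the real one.

```php
<?php
// Hypothetical schema: receipts(id, user_id, amount). All names are assumptions.

// Return the receipt only if it belongs to the given user; otherwise null.
function findReceiptForUser(PDO $db, int $receiptId, int $userId): ?array
{
    $stmt = $db->prepare(
        'SELECT id, amount FROM receipts WHERE id = :id AND user_id = :user_id'
    );
    $stmt->bindValue(':id', $receiptId, PDO::PARAM_INT);
    $stmt->bindValue(':user_id', $userId, PDO::PARAM_INT);
    $stmt->execute();

    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}

// Usage with an in-memory SQLite database standing in for the real one.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE receipts (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL)');
$db->exec('INSERT INTO receipts VALUES (7892, 1, 49.95), (7893, 2, 12.00)');

$loggedInUserId = 1; // would come from the session in a real handler

var_dump(findReceiptForUser($db, 7892, $loggedInUserId) !== null); // bool(true): own receipt
var_dump(findReceiptForUser($db, 7893, $loggedInUserId) !== null); // bool(false): someone else's
```

Returning null for both "does not exist" and "belongs to someone else" avoids telling an attacker which receipt numbers are valid.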

Encrypting URLs or individual parameters may also be an option, albeit a limited one. Reversible (two-way) encryption may leave the application vulnerable if an attacker works out the algorithm and, more importantly, the key. One-way hash functions are only useful if the application knows the set of values a particular parameter can take: if we’re going to hash a parameter using something like, say, MD5, we need a fairly small set of candidate values to hash on the server side so that we can compare hashes and know when we have a match.
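To make the one-way option concrete, here is a sketch of matching a hashed parameter against a small, known set of values. Note that it swaps MD5 for a keyed HMAC-SHA-256 (PHP’s hash_hmac), since an unkeyed hash of a guessable value can be recomputed by anyone. The secret, the function names and the list of report types are all hypothetical.

```php
<?php
// Hypothetical server-side secret and value set, for illustration only.
const URL_SECRET = 'replace-with-a-long-random-secret';

// Produce the opaque value that goes into the URL in place of the real one.
function opaqueParam(string $value): string
{
    return hash_hmac('sha256', $value, URL_SECRET);
}

// Recover the original value by hashing each candidate and comparing.
// Only practical when the candidate set is small and known to the server.
function resolveParam(string $opaque, array $candidates): ?string
{
    foreach ($candidates as $candidate) {
        if (hash_equals(opaqueParam($candidate), $opaque)) {
            return $candidate;
        }
    }
    return null; // unknown or tampered value
}

// Usage: the URL carries $param instead of the literal value 'weekly'.
$reportTypes = ['daily', 'weekly', 'monthly'];
$param = opaqueParam('weekly');

var_dump(resolveParam($param, $reportTypes));            // string(6) "weekly"
var_dump(resolveParam('not-a-real-hash', $reportTypes)); // NULL
```

This only obscures the value in the URL; it does not replace the per-user permission check described above.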

Code

Instead of rewriting code that others have already written, I’ll point you to two places where you can find sample code listings of how to implement tokens in Struts and PHP. For the former, check out Romain Guay’s article here and for the latter, check out the section titled Safeguarding Against CSRF in Chris Shiflett’s article here. I would strongly recommend reading Chris’ blog posts and book (referenced below) for succinct and clear descriptions of security issues to look out for in Web applications along with the best practices for mitigating them.

For completeness, note that to check the validity of a token in PHP in your form handler, a simple guard such as the following would suffice:

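if (isset($_SESSION['token'], $_POST['token'])
    && is_string($_POST['token'])
    && hash_equals($_SESSION['token'], $_POST['token'])) {
   // token is valid; hash_equals() (PHP 5.6+) compares in constant time
}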

Footnotes and references

(1) The URL need not be exposed as such in the application. It may be that “receiptId” is sent via a POST request from a form to viewReceipts.do, but by simply viewing the page’s source an attacker could construct such a URL and either request it directly or conceal it in an image by setting it as the image’s src attribute. This works when the handler does not distinguish between request methods: in Java, HttpServletRequest.getParameter(), and in PHP, $_REQUEST, return parameters regardless of whether they were submitted via POST or GET. Also note that using POST instead of GET (which places the parameters in the URL) may give the impression of added security because the parameters are not visible in the URL, but it is only slightly more inconvenient to view or modify POST variables with a tool such as Fiddler. Nevertheless, it is best to use POST for forms that perform actions other than simple retrievals of data, as per RFC 2616 [Shiflett].

[Shiflett] Chris Shiflett, Essential PHP Security, O’Reilly, 2006

