What I learned about Cookies this week

A lot of people reading this are probably using frameworks that have decent session storage and don’t often need to worry about cookies. But over at CIM the thought of session state for many tens of millions of requests per day is daunting. As a result we use a lot of cookies. Probably too many, but that’s a story for another day.

Ever read the RFC for cookies? Neither had I, and I probably never would have if our company hadn’t decided to standardize on Apache Tomcat for our application containers. What we thought would be an easy conversion turned out to take a few days because some of our important cookies apparently violate the RFC, and no browser or web server had complained until now. What rules did we break?

The RFC is a lot to wade through, but some information from Tomcat’s issue tracker made the relevant part pretty clear. Basically, a cookie definition boils down to:


   av-pairs    =     av-pair *(";" av-pair)
   av-pair     =     attr ["=" value]              ; optional value
   attr        =     token
   value       =     token | quoted-string

The definitions of token and quoted-string come from the HTTP/1.1 spec:


    token          = 1*<any CHAR except CTLs or separators>
    separators     = "(" | ")" | "<" | ">" | "@"
                   | "," | ";" | ":" | "\" | <">
                   | "/" | "[" | "]" | "?" | "="
                   | "{" | "}" | SP | HT

    quoted-string  = ( <"> *(qdtext | quoted-pair ) <"> )
    qdtext         = <any TEXT except <">>
    quoted-pair    = "\" CHAR

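So there you have it: a cookie value is either a token, which can’t contain any of the separators (so no spaces, commas, slashes or equals signs), or a quoted string wrapped in double quotes. Most web servers and browsers don’t seem to complain when you break these rules. But if you ever find yourself looking at your server logs or a debugger wondering where your cookies went or why they’re truncated, it might be time to double-check the RFC.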
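If you want to check your own cookie values against the grammar above, something like the following rough Python sketch would do it. The function names are mine (nothing here comes from Tomcat or any library), and the quoted-string check is simplified: it treats every backslash as the start of a quoted-pair and ignores the folding-whitespace rules in the HTTP/1.1 spec.

```python
# Separators from the HTTP/1.1 grammar quoted above.
SEPARATORS = '()<>@,;:\\"/[]?={} \t'


def is_token(s: str) -> bool:
    """One or more US-ASCII characters, none of them a control char or separator."""
    return len(s) > 0 and all(
        32 < ord(c) < 127 and c not in SEPARATORS for c in s
    )


def is_quoted_string(s: str) -> bool:
    """A double-quoted string whose body is qdtext and quoted-pairs (simplified)."""
    if len(s) < 2 or s[0] != '"' or s[-1] != '"':
        return False
    body = s[1:-1]
    i = 0
    while i < len(body):
        c = body[i]
        if c == '\\':
            # quoted-pair: a backslash followed by exactly one US-ASCII char
            if i + 1 >= len(body) or ord(body[i + 1]) > 127:
                return False
            i += 2
            continue
        if c == '"':
            return False  # an unescaped quote would end the string early
        if (ord(c) < 32 and c != '\t') or ord(c) == 127:
            return False  # control characters are not allowed in qdtext
        i += 1
    return True


def is_valid_cookie_value(s: str) -> bool:
    """value = token | quoted-string"""
    return is_token(s) or is_quoted_string(s)
```

With this, is_valid_cookie_value("abc123") passes, while "a=b" and "hello world" both fail until you wrap them in double quotes.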

Oh, and the fix? Well, ideally we’ll fix our cookies to comply with the RFC, but that will take some time since we don’t control the creation of all the cookies in question. So the plan for now is to patch Tomcat’s Cookie class to be a bit less strict when it’s parsing the request.
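To give a feel for what "less strict" could mean, here is a hypothetical Python illustration. This is not Tomcat’s code or our actual patch, and the function names and the sample header are made up. The strict reader treats an unquoted value as a token and stops at the first separator, which is one way a value can end up truncated. The lenient reader just takes everything up to the next ";".

```python
# Separators from the HTTP/1.1 grammar; hypothetical helpers, not Tomcat's code.
SEPARATORS = '()<>@,;:\\"/[]?={} \t'


def read_value_strict(header: str, start: int) -> str:
    """Read an unquoted value as a token: stop at the first separator or control char."""
    end = start
    while (
        end < len(header)
        and header[end] not in SEPARATORS
        and 32 < ord(header[end]) < 127
    ):
        end += 1
    return header[start:end]


def read_value_lenient(header: str, start: int) -> str:
    """Read everything up to the next ';' (the av-pair delimiter) and trim whitespace."""
    end = header.find(';', start)
    if end == -1:
        end = len(header)
    return header[start:end].strip()


# Made-up Cookie header whose first value contains "=" and "/".
header = "prefs=lang=en/US; theme=dark"
start = len("prefs=")
strict_value = read_value_strict(header, start)
lenient_value = read_value_lenient(header, start)
```

Here strict_value comes out as "lang" and lenient_value as "lang=en/US", which is the kind of difference that shows up as a truncated cookie.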