tomcat-dev mailing list archives

From Jeremy Boynes <jboy...@apache.org>
Subject Using HttpParser for Cookie header?
Date Mon, 20 Jan 2014 21:38:13 GMT
I started to look at using HttpParser for the Cookie header, but it works differently from
the existing parser in Cookies, so I wanted to check direction before getting too far in.

The area I’m concerned about is the need to copy the bytes in order to parse the header.
The Cookies parser relies heavily on MessageBytes and avoids copying to a String as far as
possible. HttpParser, however, operates on a StringReader which requires converting to a String
before parsing.

After digging into the usage of Cookies I think there are only two places that read them:
1) Request#getCookies(), which needs to copy to Strings anyway in order to create the Cookie
instances it returns
2) CoyoteAdapter#parseSessionCookiesId(), which parses the header and compares names as MessageBytes,
only allocating a String for the value if the session cookie is found

It’s this second one that has me concerned about switching to HttpParser as this gets called
for every request. If we switch then there is going to be allocation and copying of the header
that we currently don’t do. 
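
To illustrate the pattern in (2), here is a minimal sketch of scanning the raw header bytes for one cookie name and allocating a String only when it is found. This is not the Tomcat code: the class SessionCookieScan and its methods are hypothetical, it works on a plain byte[] rather than MessageBytes, and it ignores quoting and trailing whitespace.

```java
import java.nio.charset.StandardCharsets;

public class SessionCookieScan {

    // Returns the value of the named cookie, or null if it is absent.
    // Names are compared byte-by-byte; a String is created only on a match.
    static String findValue(byte[] header, byte[] name) {
        int pos = 0;
        while (pos < header.length) {
            // Skip spaces that follow the ';' separator.
            while (pos < header.length && header[pos] == ' ') {
                pos++;
            }
            // Find the end of this name=value pair.
            int end = pos;
            while (end < header.length && header[end] != ';') {
                end++;
            }
            // Find the '=' inside the pair.
            int eq = pos;
            while (eq < end && header[eq] != '=') {
                eq++;
            }
            if (eq < end && eq - pos == name.length && regionMatches(header, pos, name)) {
                // The only allocation: the value of the matching cookie.
                return new String(header, eq + 1, end - eq - 1, StandardCharsets.US_ASCII);
            }
            pos = end + 1;
        }
        return null;
    }

    // True if data, starting at offset, begins with the bytes of name.
    static boolean regionMatches(byte[] data, int offset, byte[] name) {
        for (int i = 0; i < name.length; i++) {
            if (data[offset + i] != name[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] header = "a=1; JSESSIONID=abc; b=2".getBytes(StandardCharsets.US_ASCII);
        byte[] name = "JSESSIONID".getBytes(StandardCharsets.US_ASCII);
        System.out.println(findValue(header, name));
    }
}
```

A request without a session cookie goes through this path with no String allocation at all, which is the property that would be lost by converting the whole header first.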

Having said that, the current parser relies heavily on the assumption that the header is US-ASCII
encoded and that it is only dealing with 7-bit characters (it freely casts bytes to chars).
The cookie change proposal has us supporting UTF-8 as specified by HTML5 which means a more
robust decoder will be needed and the copy may not be avoidable.
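
A small sketch of why the cast stops working once non-ASCII data is allowed. The class CookieDecodeSketch and its methods are made up for illustration; the cast here masks each byte to 0-255, which is an assumption about how such a cast would be written rather than a copy of the existing code.

```java
import java.nio.charset.StandardCharsets;

public class CookieDecodeSketch {

    // Treats every byte as one char, which is only correct for 7-bit input.
    static String castBytes(byte[] raw) {
        StringBuilder sb = new StringBuilder(raw.length);
        for (byte b : raw) {
            sb.append((char) (b & 0xFF));
        }
        return sb.toString();
    }

    // Decodes the bytes as UTF-8, combining multi-byte sequences into chars.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] ascii = "name=value".getBytes(StandardCharsets.UTF_8);
        byte[] nonAscii = "name=caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Same result for pure ASCII input.
        System.out.println(castBytes(ascii).equals(decodeUtf8(ascii)));       // true
        // Different result once a multi-byte UTF-8 sequence appears.
        System.out.println(castBytes(nonAscii).equals(decodeUtf8(nonAscii))); // false
    }
}
```

For ASCII-only headers the two approaches agree, so the cast is safe today; the UTF-8 case is where a real decode, and hence a copy, becomes necessary.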

My plan here is to KISS and implement a parser similar to the others in HttpParser assuming
the header has already been decoded so it can just deal with the chars. Then if we notice
any performance degradation we can focus on improving HttpParser which will have the benefit
of working for the other header parsers as well. I’ll implement this alongside the existing
code (actually, in the parser package) to make it easier to do an A-B comparison.
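
As a rough idea of what a parser over already-decoded chars could look like, here is a minimal sketch. It is not HttpParser and does not use its API: SimpleCookieParser is a hypothetical class, it takes a String rather than a StringReader, and it skips quoted values and strict validation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SimpleCookieParser {

    // Parses "name=value; name2=value2" into an ordered map.
    // Pairs without a name or without '=' are skipped.
    static Map<String, String> parse(String header) {
        Map<String, String> cookies = new LinkedHashMap<>();
        int pos = 0;
        int len = header.length();
        while (pos < len) {
            // Each pair runs up to the next ';' or the end of the header.
            int end = header.indexOf(';', pos);
            if (end < 0) {
                end = len;
            }
            String pair = header.substring(pos, end).trim();
            int eq = pair.indexOf('=');
            if (eq > 0) {
                String name = pair.substring(0, eq).trim();
                String value = pair.substring(eq + 1).trim();
                cookies.put(name, value);
            }
            pos = end + 1;
        }
        return cookies;
    }

    public static void main(String[] args) {
        Map<String, String> cookies = parse("JSESSIONID=abc123; theme=dark");
        System.out.println(cookies.get("JSESSIONID"));
        System.out.println(cookies.get("theme"));
    }
}
```

Because it only sees chars, a parser of this shape does not care how the header was encoded, which is what keeps it simple; the cost is the decode step in front of it.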

There would likely be some follow-on changes. Cookies and ServerCookie are recyclable
objects associated with the request. Once we move away from MessageBytes, their contents
could be replaced by plain String values, and the classes themselves may no longer be needed.
For example, Request already caches the array of Cookie values returned from getCookies(),
and that array could be populated directly from the parse. These classes may end up going away.

Thanks
Jeremy

