tomcat-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jeremy Boynes <jboy...@apache.org>
Subject Support RFC6265 cookie processing
Date Tue, 24 Dec 2013 01:21:50 GMT
In comments on issue #55917, there was suggestion for refactoring cookie support along the
lines described in RFC6265. Reading this RFC, it appears to be more of an effort to standardize
the actual behaviour seen on the Internet for different browser and server implementations.
The observation is the RFC2109 has received limited adoption and RFC2965 virtually none at
all, with most implementations falling back to the original specification released by Netscape
that contains certain ambiguities. 

The Servlet spec’s JavaDoc for Cookie refers to RFC2109 behaviour with caveats around interoperability.
It defines version 0 as complying with Netscape’s original specification and version 1 as
complying RFC2109 (with the note “Since RFC 2109 is still somewhat new, consider version
1 as experimental; do not use it yet on production sites”).

The current implementation uses a number of system properties to control how cookies are validated.
In implementing RFC6265 I hope that some of these can be eliminated. If not, I would propose
to add configuration options on the Connector or Host objects to allow the configuration to
be set separately for different host domains.

RFC6265 has separate sections in respect for generating and parsing cookie headers. It follows
the practice that generation be strict but parsing be more tolerant of invalid input. Our
current implementation generally follows that trend by suppressing invalid input data (after
logging). However, for some input data, primary CTLs, it throws an IllegalArgumentException
from the connector which does not allow the application to recover. In refactoring, I would
propose to simply ignore that input thereby allowing the application to handle it, for example
by parsing the header field manually. Cookie parsing in particular needs to be tolerant of
cookies set by other sources, including different servers handling other parts of the domain
and JavaScript or other client-side code setting values in the browser.

In light of this, I propose separating the “Set-Cookie” generation side from the “Cookie”
parsing side.

Generation
==========
The general principle here would be to use the version property of Cookie to determine the
level of verification to perform: if 0 follow RFC6265, if 1 use RFC2109. The primary verification
point would be in HttpServletRequest#addCookie() which would use the version in the Cookie
instance. Characters will always be converted to octets using the ISO-8859-1 charset; unmappable
values will result in an IAE.

The Servlet spec requires an IAE be thrown in Cookie’s constructor if the name is not valid
pre RFC2109. Both RFC6265 and RFC2109 define the name to be a “token” (per RFC2616 HTTP/1.1)
so I would propose to always validate by those rules; this would allow US-ASCII characters
except CTLs and separators. This will different from the current implementation that slash
“/“ would be treated as a separator which would not be allowed in a name by default; this
is consistent with the RFC’s and Glassfish’s implementation and I’m assuming that allowing
it in our current implementation is a hangover from where we enabled use of “/“ in values.


The spec allows vendors to provide "a configuration option that allows cookie names conforming
to the original Netscape Cookie Specification to be accepted” and I propose to retain the
system property “org.apache.tomcat.util.http.ServerCookie.STRICT_NAMING” for that. If
explicitly set to false, it will verify names using Netscape’s rules and allow "a sequence
of characters excluding semi-colon, comma and white space” but also excluding “=“ and
CTLs per RFC2616; note this *would* allow 8-bit ISO-8859-1 characters in the name and relax
the RFC2109 constraint that "NAMEs that begin with $ are reserved for other uses and must
not be used by applications.” 

The value would not be checked until addCookie() was called and the cookie version is known.
This would in principle use RFC6265’s “cookie-value” rule if version == 0 or RFC2109’s
“value” rules if version == 1; values that do not conform would result in an IAE from
addCookie(). Unlike the current implementation, this would not automatically upgrade the version
or add quotes around RFC2109 “values” that did not match the “token” rule.

If STRICT_SERVLET_COMPLIANCE is set, the rule for version 0 values would be relaxed to allow
any value conforming to the Netscape specification except CTLs; this would effectively add
DQUOTE, backslash, and 0x80-0xFF. For more granular control, I propose adding the system property
“org.apache.tomcat.util.http.ServerCookie.ALLOW_IN_VALUE” which would take one of the
following enum values to determine what octets were allowed:
* Netscape
* RFC2616_token
* RFC2109_value
* RFC6265_cookie_octet
* Netscape_restricted (limits the permitted characters as recommended in the Servlet spec)
* RFC6265_ISO-8859-1 (adds 0x80-0xff to cookie_octet)

RFC6265 does allow value to be omitted so if value is null then a name-only cookie will be
produced. This will contain the “=“ character required by the “cookie-pair” rule.
RFC2109 does not allow the value to be omitted so a null value will result in an IAE unless
“org.apache.tomcat.util.http.ServerCookie.ALLOW_NAME_ONLY” is set to true.

Max-Age and Expires will always be sent.

Parsing
=======
RFC6265 says the user-agent MUST send only a single Cookie header, and RFC2109 is written
to imply that assumption. Netscape says “a line” is added to the request. Our current
implementation processes all Cookie headers in a request independently which leads to a difference
in behaviour if the headers are folded by an intermediate proxy. Specifically any $Version
value specified in one header is lost when processing the next whereas if the headers were
folded the version information would apply when processing the values from the second header.
To avoid this inconsistency, I propose only processing the first header received.

RFC2109 requires the header start with “cookie-version” which can be used to discriminate
between RFC2109 and RFC6265/Netscape formats. Specifically, if the line starts with “$Version”
then it would be processed using RFC2109 rules, otherwise processing would use RFC6265 rules.
The analysis behind RFC6265 indicates that most user agents simply ignore any “Version”
attribute in a RFC2109 Set-Cookie so we would expect most requests to simply contain a RFC6265-format
header.

When parsing a RFC2109 Cookie header, we will assume conformance to the specification. Any
invalid “cookie-value” will simply be dropped and the parser will attempt to recover at
the next potential “cookie-value” based on the current parse state (i.e. in a quoted-string
or not). A missing VALUE will be considered invalid unless “org.apache.tomcat.util.http.ClientCookie.ALLOW_NAME_ONLY”
is set to true. The version, path and domain properties in Cookie will be set based on attributes
in the cookie-value.

When parsing a RFC6265 Cookie header, we will assume it is comprised of “cookie-pair”
separated by the sequence “;” SP. A “cookie-pair” must start with a valid “cookie-name”
followed by an “=“ character. The name must be a valid “token” unless “org.apache.tomcat.util.http.ClientCookie.STRICT_NAMING”
is explicitly set to false. The will would be considered valid if it complies with the “cookie-value”
rule unless “org.apache.tomcat.util.http.ClientCookie.ALLOW_IN_VALUE” is set to one of
the alternatives above. This will allow name-only cookies (in the form "name=“). It will
also mean cookies with unencoded JSON values will normally be suppressed so any application
expecting such a value would need to parser the Cookie header directly. Any invalid cookie-pair
will not be returned to the application and will not cause an exception to be thrown by the
parser; a user-data error will be logged. To recover from an invalid cookie-pair the parser
will look for the next “;” SP sequence. The path and domain properties in Cookie will
always be null and the version will always be 0.

======

I plan to start looking at this for trunk/TC8.0 starting with a cleanup of the current tests.
It should be possible to back-port to TC7.0.x if that is desirable.

Thanks
Jeremy


Mime
View raw message