From: Jeremy Boynes
To: Tomcat Developers List
Subject: Support RFC6265 cookie processing
Date: Mon, 23 Dec 2013 17:21:50 -0800
Message-Id: <13137FAE-FED3-44B7-BDAB-CCEC51DD8AD3@apache.org>

In comments on issue #55917, there was a suggestion to refactor cookie support along the lines described in RFC6265. On reading this RFC, it appears to be more of an effort to standardize the actual behaviour seen on the Internet for different browser and server implementations. The observation is that RFC2109 has received limited adoption and RFC2965 virtually none at all, with most implementations falling back to the original specification released by Netscape, which contains certain ambiguities.

The Servlet spec's JavaDoc for Cookie refers to RFC2109 behaviour with caveats around interoperability. It defines version 0 as complying with Netscape's original specification and version 1 as complying with RFC2109 (with the note "Since RFC 2109 is still somewhat new, consider version 1 as experimental; do not use it yet on production sites").

The current implementation uses a number of system properties to control how cookies are validated. In implementing RFC6265 I hope that some of these can be eliminated. If not, I would propose to add configuration options on the Connector or Host objects to allow the configuration to be set separately for different host domains.

RFC6265 has separate sections for generating and parsing cookie headers. It follows the practice that generation be strict but parsing be more tolerant of invalid input. Our current implementation generally follows that trend by suppressing invalid input data (after logging). However, for some input data, primarily CTLs, it throws an IllegalArgumentException from the connector, which does not allow the application to recover. In refactoring, I would propose to simply ignore that input, thereby allowing the application to handle it, for example by parsing the header field manually. Cookie parsing in particular needs to be tolerant of cookies set by other sources, including different servers handling other parts of the domain and JavaScript or other client-side code setting values in the browser.

In light of this, I propose separating the "Set-Cookie" generation side from the "Cookie" parsing side.

Generation
==========

The general principle here would be to use the version property of Cookie to determine the level of verification to perform: if 0 follow RFC6265, if 1 use RFC2109. The primary verification point would be in HttpServletResponse#addCookie(), which would use the version in the Cookie instance. Characters will always be converted to octets using the ISO-8859-1 charset; unmappable values will result in an IllegalArgumentException (IAE).

The Servlet spec requires an IAE be thrown in Cookie's constructor if the name is not valid per RFC2109.
Both RFC6265 and RFC2109 define the name to be a "token" (per RFC2616 HTTP/1.1) so I would propose to always validate by those rules; this would allow US-ASCII characters except CTLs and separators. This will differ from the current implementation in that slash "/" would be treated as a separator, which would not be allowed in a name by default; this is consistent with the RFCs and Glassfish's implementation, and I'm assuming that allowing it in our current implementation is a hangover from where we enabled use of "/" in values.

The spec allows vendors to provide "a configuration option that allows cookie names conforming to the original Netscape Cookie Specification to be accepted" and I propose to retain the system property "org.apache.tomcat.util.http.ServerCookie.STRICT_NAMING" for that. If explicitly set to false, it will verify names using Netscape's rules and allow "a sequence of characters excluding semi-colon, comma and white space" but also excluding "=" and CTLs per RFC2616; note this *would* allow 8-bit ISO-8859-1 characters in the name and relax the RFC2109 constraint that "NAMEs that begin with $ are reserved for other uses and must not be used by applications."

The value would not be checked until addCookie() was called and the cookie version is known. This would in principle use RFC6265's "cookie-value" rule if version == 0 or RFC2109's "value" rules if version == 1; values that do not conform would result in an IAE from addCookie(). Unlike the current implementation, this would not automatically upgrade the version or add quotes around RFC2109 "values" that did not match the "token" rule.

If STRICT_SERVLET_COMPLIANCE is set, the rule for version 0 values would be relaxed to allow any value conforming to the Netscape specification except CTLs; this would effectively add DQUOTE, backslash, and 0x80-0xFF.
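
To make the two name-checking modes above concrete, here is a rough sketch in Java. The class and method names (CookieNameRules, isToken, isNetscapeName, validate) are hypothetical and are not part of Tomcat or the Servlet API; the strictNaming argument stands in for the STRICT_NAMING system property. The sketch does not attempt to enforce the RFC2109 reservation of names beginning with "$".

```java
/** Hypothetical helper illustrating the proposed name checks; not Tomcat's API. */
final class CookieNameRules {

    // RFC 2616 separators: ( ) < > @ , ; : \ " / [ ] ? = { } SP HT
    private static final String SEPARATORS = "()<>@,;:\\\"/[]?={} \t";

    /** True if name is a non-empty RFC 2616 token: US-ASCII, no CTLs, no separators. */
    static boolean isToken(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            // 0x00-0x1F and 0x7F are CTLs; anything above 0x7F is not US-ASCII.
            if (c <= 0x20 || c >= 0x7f || SEPARATORS.indexOf(c) >= 0) {
                return false;
            }
        }
        return true;
    }

    /**
     * Relaxed check: Netscape's rule (no semi-colon, comma or white space)
     * plus the exclusion of '=' and CTLs. 8-bit ISO-8859-1 characters pass.
     */
    static boolean isNetscapeName(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c <= 0x20 || c == 0x7f || c > 0xff) {
                return false; // CTL, white space, or not representable in ISO-8859-1
            }
            if (c == ';' || c == ',' || c == '=') {
                return false;
            }
        }
        return true;
    }

    /** Throws IllegalArgumentException if the name fails the selected rule. */
    static void validate(String name, boolean strictNaming) {
        boolean ok = strictNaming ? isToken(name) : isNetscapeName(name);
        if (!ok) {
            throw new IllegalArgumentException("Invalid cookie name: " + name);
        }
    }
}
```

With these rules a name such as "a/b" is rejected in strict mode because "/" is a separator, but accepted in the relaxed Netscape mode.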

For more granular control, I propose adding the system property "org.apache.tomcat.util.http.ServerCookie.ALLOW_IN_VALUE" which would take one of the following enum values to determine what octets were allowed:
* Netscape
* RFC2616_token
* RFC2109_value
* RFC6265_cookie_octet
* Netscape_restricted (limits the permitted characters as recommended in the Servlet spec)
* RFC6265_ISO-8859-1 (adds 0x80-0xff to cookie_octet)

RFC6265 does allow the value to be omitted, so if value is null then a name-only cookie will be produced. This will contain the "=" character required by the "cookie-pair" rule. RFC2109 does not allow the value to be omitted, so a null value will result in an IAE unless "org.apache.tomcat.util.http.ServerCookie.ALLOW_NAME_ONLY" is set to true.

Max-Age and Expires will always be sent.

Parsing
=======

RFC6265 says the user-agent MUST send only a single Cookie header, and RFC2109 is written to imply that assumption. Netscape says "a line" is added to the request. Our current implementation processes all Cookie headers in a request independently, which leads to a difference in behaviour if the headers are folded by an intermediate proxy. Specifically, any $Version value specified in one header is lost when processing the next, whereas if the headers were folded the version information would apply when processing the values from the second header. To avoid this inconsistency, I propose only processing the first header received.

RFC2109 requires the header start with "cookie-version" which can be used to discriminate between RFC2109 and RFC6265/Netscape formats. Specifically, if the line starts with "$Version" then it would be processed using RFC2109 rules, otherwise processing would use RFC6265 rules.
The analysis behind RFC6265 indicates that most user agents simply ignore any "Version" attribute in a RFC2109 Set-Cookie so we would expect most requests to simply contain a RFC6265-format header.

When parsing a RFC2109 Cookie header, we will assume conformance to the specification. Any invalid "cookie-value" will simply be dropped and the parser will attempt to recover at the next potential "cookie-value" based on the current parse state (i.e. in a quoted-string or not). A missing VALUE will be considered invalid unless "org.apache.tomcat.util.http.ClientCookie.ALLOW_NAME_ONLY" is set to true. The version, path and domain properties in Cookie will be set based on attributes in the cookie-value.

When parsing a RFC6265 Cookie header, we will assume it is composed of "cookie-pair" elements separated by the sequence ";" SP. A "cookie-pair" must start with a valid "cookie-name" followed by an "=" character. The name must be a valid "token" unless "org.apache.tomcat.util.http.ClientCookie.STRICT_NAMING" is explicitly set to false. The value would be considered valid if it complies with the "cookie-value" rule unless "org.apache.tomcat.util.http.ClientCookie.ALLOW_IN_VALUE" is set to one of the alternatives above. This will allow name-only cookies (in the form "name="). It will also mean cookies with unencoded JSON values will normally be suppressed, so any application expecting such a value would need to parse the Cookie header directly. Any invalid cookie-pair will not be returned to the application and will not cause an exception to be thrown by the parser; a user-data error will be logged. To recover from an invalid cookie-pair the parser will look for the next ";" SP sequence. The path and domain properties in Cookie will always be null and the version will always be 0.
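
As an illustration of the RFC6265 parsing behaviour described above, the following is a minimal sketch in Java under default settings (strict naming, "cookie-value" rule). The class Rfc6265CookieParser and its methods are hypothetical names, not existing Tomcat code. Two simplifications are assumptions of the sketch rather than part of the proposal: it keeps only the first occurrence of a name, and it returns quoted values with their quotes intact. Logging of dropped pairs is omitted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of tolerant RFC 6265 Cookie header parsing. */
final class Rfc6265CookieParser {

    // RFC 2616 separators: ( ) < > @ , ; : \ " / [ ] ? = { } SP HT
    private static final String SEPARATORS = "()<>@,;:\\\"/[]?={} \t";

    static boolean isToken(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c <= 0x20 || c >= 0x7f || SEPARATORS.indexOf(c) >= 0) {
                return false;
            }
        }
        return true;
    }

    // cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
    static boolean isCookieOctet(char c) {
        return c == 0x21 || (c >= 0x23 && c <= 0x2b) || (c >= 0x2d && c <= 0x3a)
                || (c >= 0x3c && c <= 0x5b) || (c >= 0x5d && c <= 0x7e);
    }

    // cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
    static boolean isCookieValue(String v) {
        int start = 0;
        int end = v.length();
        if (end >= 2 && v.charAt(0) == '"' && v.charAt(end - 1) == '"') {
            start = 1;
            end--;
        }
        for (int i = start; i < end; i++) {
            if (!isCookieOctet(v.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /** Returns the valid cookie-pairs in header order; invalid pairs are skipped. */
    static Map<String, String> parse(String header) {
        Map<String, String> cookies = new LinkedHashMap<>();
        // Splitting on ";" SP means a bad pair is abandoned at the next
        // ";" SP sequence, which is the recovery rule described above.
        for (String pair : header.split("; ")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // missing name or missing '='
            }
            String name = pair.substring(0, eq);
            String value = pair.substring(eq + 1);
            if (isToken(name) && isCookieValue(value)) {
                cookies.putIfAbsent(name, value); // sketch-only: first occurrence wins
            }
            // A real implementation would log a user-data error here.
        }
        return cookies;
    }
}
```

For a header such as `a=1; prefs={"a":1,"b":2}; b=`, this sketch returns a and the name-only cookie b, and drops prefs because its unencoded JSON value contains DQUOTE and comma characters.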

======

I plan to start looking at this for trunk/TC8.0 starting with a cleanup of the current tests. It should be possible to back-port to TC7.0.x if that is desirable.

Thanks
Jeremy