Subject: Re: CVE-2014-3577 postmortem
From: Dirk-Willem van Gulik
In-Reply-To: <9FF97A53-E481-4914-A2D9-1D625861F4F9@webweaving.org>
Date: Fri, 22 Aug 2014 12:47:00 +0200
Message-Id: <24699C1E-A868-4662-8AF5-85CE548853F6@webweaving.org>
To: HttpComponents Project

>> Found that some of below are indeed able to hang the regex stack (e.g. #2). However the more elaborate regexes are blocked by:
>>
>>     private final static Pattern WILDCARD_PATTERN = Pattern.compile( "^[a-z0-9\\-\\*]+(\\.[a-z0-9\\-]+){2,}$", Pattern.CASE_INSENSITIVE);
>>     ..
>>     WILDCARD_PATTERN.matcher(identity).matches()
>>
>> which we apply to the subjectAltName, CN, etc. So that is not too bad then - assuming that that regexp does not let them through. Which is likely - as the only dangerous thing I see in there is a *.
>>
>
> Thank you so much for your feedback. What I could do is validate both
> the identity and the subjectAltName pattern by making sure they consist
> of characters legal for domain names (alphanumeric, dash and asterisk in
> case of subjectAltName) prior to doing regexp matching with them.

Right - but I am wondering if that means we end up in a rear guard battle.
As we then find IPv6 addresses containing ':', and god knows what new TLDs may do 5+ years hence.

Now *all* that is allowed is '*' - and, as far as I know, only in string-based (and not IPv4/IPv6) entries.

So perhaps it is an option to compare things from the TLD down with a very, very simple loop.

	if (starts with a star) then
		@a = array of FQDN (host) split on '.'
		@b = array of FQDN (certificate entry) split on '.'
		if not right lengths - bail
		working from the topmost side down to the last but one:
			bail if not the same.
		check that we have left just one entry on a and a wildcard on b.

i.e. avoid wildcard pattern matching completely.

> Obviously - as we get into UTF8 internationalized domain names - we may accidentally break that protection at some point.
>
> Would not internationalized domain names be Punycode encoded instead?

Yes - but I am worried about the easy-to-make mistakes, and about the normalization* that some people may accidentally not apply. The CAs do not have a good track record.

But none of this is very urgent/key - just more a robustness thing.

Dw.

*: e.g. things like invisible Unicode chars to do a 'backspace' or visually wipe text.
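
For illustration only, the label-by-label comparison proposed above might look roughly like this in Java. This is a sketch under assumptions, not the HttpClient implementation: the class and method names are hypothetical, the wildcard is honoured only as the entire leftmost label, and the three-label minimum for wildcard entries is borrowed from the `{2,}` in the quoted WILDCARD_PATTERN. The only regular expression involved is the fixed separator passed to `String.split`; nothing taken from the certificate is ever compiled into a pattern.

```java
import java.util.Locale;

// Hypothetical helper; not part of HttpClient.
final class LabelWiseHostnameMatcher {

    private LabelWiseHostnameMatcher() {
    }

    /**
     * Compares a host name against a certificate identity (CN or
     * subjectAltName DNS entry) label by label, from the TLD down.
     */
    static boolean matches(String host, String identity) {
        if (host == null || identity == null) {
            return false;
        }
        // Limit -1 keeps empty labels so "a..b" or a trailing dot is rejected below.
        String[] a = host.toLowerCase(Locale.ROOT).split("\\.", -1);
        String[] b = identity.toLowerCase(Locale.ROOT).split("\\.", -1);

        // "if not right lengths - bail"
        if (a.length != b.length) {
            return false;
        }
        // Work from the topmost label down to the last but one.
        for (int i = a.length - 1; i >= 1; i--) {
            if (a[i].isEmpty() || !a[i].equals(b[i])) {
                return false;
            }
        }
        // Exactly one label is left on each side.
        if (a[0].isEmpty()) {
            return false;
        }
        // Assumption: a wildcard needs at least three labels, e.g. "*.example.org".
        if (b[0].equals("*") && b.length >= 3) {
            return true;
        }
        return a[0].equals(b[0]);
    }
}
```

With this sketch, `matches("www.example.org", "*.example.org")` is accepted, while `matches("a.b.example.org", "*.example.org")`, `matches("example.org", "*.org")` and `matches("www.example.org", "w*.example.org")` are all rejected. IP address entries and Punycode/Unicode normalization are deliberately left out and would need separate handling.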