Message-ID: <1340696770.5475.6.camel@ubuntu>
Subject: Re: URLEncodeUtils - change in format behaviour since 4.2
From: Oleg Kalnichevski
To: HttpComponents Project
Date: Tue, 26 Jun 2012 09:46:10 +0200

On Tue, 2012-06-26 at 02:00 +0100, sebb wrote:
> The escaping of non-alphabetic characters by the format methods is no
> longer quite the same as that done by java.net.URLEncoder.encode.
>
> The former allows the chars in ".-*_!'()" to pass through without
> conversion, whereas the latter only allows ".-*_" unchanged.
> The latter is also how browsers behave when escaping form fields.
>
> I think the behaviour should be consistent with URLEncoder and browsers.
> That was in fact the behaviour with 4.2, which delegated the escaping
> to URLEncoder.
> I think the code should revert to using URLEncoder/URLDecoder.
>
> There is still a need for the extended path, query and fragment
> escape/unescape methods, but perhaps these belong in URIBuilder?
> If not, maybe they should be in a separate class anyway?
>

Would not that lead to inconsistent behavior when the same query form
gets encoded differently depending on whether it is enclosed in the
request URI or in the request body?

Browsers do a lot of silly stuff to maximize compatibility with all
sorts of broken software out there. I am not sure we need to do
likewise.

Oleg

> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
> For additional commands, e-mail: dev-help@hc.apache.org
>
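
For reference, the difference under discussion can be reproduced with a
small standalone sketch. Only java.net.URLEncoder is real library code
here. The class name FormEscapingDemo and the method lenientEncode are
hypothetical stand-ins written for illustration: lenientEncode is not the
HttpClient implementation, it only mimics the safe-character set
".-*_!'()" described in the quoted message, and it assumes UTF-8 and
'+' for spaces.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class FormEscapingDemo {

    // Punctuation the quoted message says is passed through unescaped
    // by the newer format methods (assumption taken from the thread).
    private static final String LENIENT_SAFE = ".-*_!'()";

    /** Form escaping as done by java.net.URLEncoder, using UTF-8. */
    public static String jdkEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Hypothetical lenient escaping; an illustration, not HttpClient code. */
    public static String lenientEncode(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean alnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alnum || LENIENT_SAFE.indexOf(c) >= 0) {
                out.append((char) c);          // passed through unchanged
            } else if (c == ' ') {
                out.append('+');               // form-style space
            } else {
                out.append(String.format("%%%02X", c)); // percent-escape the byte
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String value = "a b.-*_!'()";
        System.out.println(jdkEncode(value));     // a+b.-*_%21%27%28%29
        System.out.println(lenientEncode(value)); // a+b.-*_!'()
    }
}
```

The two results differ only in the characters "!'()", which is the
inconsistency raised in the quoted message.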