hc-dev mailing list archives

From sebb <seb...@gmail.com>
Subject Re: URLEncodeUtils - change in format behaviour since 4.2
Date Tue, 26 Jun 2012 10:41:23 GMT
On 26 June 2012 08:46, Oleg Kalnichevski <olegk@apache.org> wrote:
> On Tue, 2012-06-26 at 02:00 +0100, sebb wrote:
>> The escaping of non-alphabetic characters by the format methods is no
>> longer quite the same as that done by java.net.URLEncoder.encode.
>>
>> The former allows the chars in ".-*_!'()" to pass through without
>> conversion, whereas the latter only allows ".-*_" unchanged.
>> The latter is also how browsers behave when escaping form fields.
>>
>> I think the behaviour should be consistent with URLEncoder and browsers.
>> That was in fact the behaviour with 4.2, which delegated the escaping
>> to URLEncoder.
>> I think the code should revert to using URLEncoder/URLDecoder.
>>
>> There is still a need for the extended path, query and fragment
>> escape/unescape methods, but perhaps these belong in URIBuilder?
>> If not, maybe they should be in a separate class anyway?
>>
>
> Would not that lead to inconsistent behavior when the same query form
> gets encoded differently depending on whether it is enclosed in the
> request URI or in the request body?

I don't think so. I think encodeFormFields could use a different safe
character set without problems, so long as that set is a subset of
all possible safe query characters. In fact, the UNRESERVED BitSet is
currently used only in URLEncodedUtils#encodeFormFields(), so I don't
see how changing encodeFormFields to use a different safe set could
affect anything else.
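
To illustrate the subset argument, here is a minimal sketch. It is not
the HttpClient code: the class SafeSetSketch, the NARROW and WIDE sets
and the encode/isSubset helpers are hypothetical names made up for this
example. Only the two character lists (".-*_" and ".-*_!'()") come from
the discussion above.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical illustration only; not part of HttpClient.
class SafeSetSketch {

    // Builds a safe set from ASCII letters, digits and the given extra characters.
    static BitSet safeSet(String extras) {
        BitSet set = new BitSet(128);
        for (char c = 'a'; c <= 'z'; c++) set.set(c);
        for (char c = 'A'; c <= 'Z'; c++) set.set(c);
        for (char c = '0'; c <= '9'; c++) set.set(c);
        for (char c : extras.toCharArray()) set.set(c);
        return set;
    }

    // Narrow set: the characters java.net.URLEncoder leaves unchanged.
    static final BitSet NARROW = safeSet(".-*_");
    // Wider set: the characters reported above as passing through unescaped.
    static final BitSet WIDE = safeSet(".-*_!'()");

    // Form-style encoding: safe bytes pass through, space becomes '+',
    // every other UTF-8 byte is written as %XX.
    static String encode(String text, BitSet safe) {
        StringBuilder out = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xff;
            if (safe.get(v)) {
                out.append((char) v);
            } else if (v == ' ') {
                out.append('+');
            } else {
                out.append('%')
                   .append(Character.toUpperCase(Character.forDigit(v >> 4, 16)))
                   .append(Character.toUpperCase(Character.forDigit(v & 0xf, 16)));
            }
        }
        return out.toString();
    }

    // True if every character in 'inner' is also in 'outer'.
    static boolean isSubset(BitSet inner, BitSet outer) {
        BitSet copy = (BitSet) inner.clone();
        copy.andNot(outer);
        return copy.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(encode("a!b c", NARROW)); // a%21b+c
        System.out.println(encode("a!b c", WIDE));   // a!b+c
        System.out.println(isSubset(NARROW, WIDE));  // true
    }
}
```

Because NARROW is a subset of WIDE, switching the form encoder to NARROW
only adds escapes; it never emits a character the wider set would have
escaped.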

Besides, AFAIK 4.2 did not have a problem with using a more limited safe set.

> Browsers do a lot of silly stuff to maximize compatibility with all
> sorts of broken software out there. I am not sure we need to do
> likewise.

Well-written software will be able to deal with form data that has
some additional safe characters encoded, so I don't think there is any
problem in playing safe here.
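
As a rough check of that claim, the sketch below decodes the same value
from both a conservatively escaped form and a more permissive one. It
uses only java.net.URLEncoder and java.net.URLDecoder; the class name,
the helper names and the sample string are assumptions made up for this
example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical illustration only; not part of HttpClient.
class ConservativeEncodingSketch {

    // Escapes everything except letters, digits and ".-*_" (URLEncoder's rule).
    static String encodeConservatively(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Standard form decoding: '+' becomes space, %XX becomes the byte XX.
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String value = "it's (fine)!";
        String conservative = encodeConservatively(value); // it%27s+%28fine%29%21
        String permissive = "it's+(fine)!";                // leaves !'() unescaped

        System.out.println(conservative);
        System.out.println(decode(conservative).equals(value)); // true
        System.out.println(decode(permissive).equals(value));   // true
    }
}
```

Both forms decode to the same value, so a receiver that decodes
correctly should not care which safe set the sender used.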

[But if we do decide to change the safe list from the one previously
used, it needs to be flagged up in the release notes.]

> Oleg
>
>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
>> For additional commands, e-mail: dev-help@hc.apache.org
>>
