Message-ID: <1482851517.23225.2.camel@apache.org>
Subject: Re: Problem parsing non-ASCII in query component
From: Oleg Kalnichevski
To: HttpComponents Project
Date: Tue, 27 Dec 2016 16:11:57 +0100

On Sat, 2016-12-24 at 18:26 -0500, Jaime Hablutzel Egoavil wrote:
> Currently something like this:
>
>     public class ProblemWithNonAscii {
>         public static void main(String[] args) {
>             List<NameValuePair> pairs = URLEncodedUtils.parse("foo=bár",
>                     StandardCharsets.UTF_8);
>             System.out.println(pairs);
>         }
>     }
>
> produces this output:
>
>     [foo=b�r]
>
> where the 'á' character has been scrambled.
>
> I can see that this is related to the following narrowing primitive
> conversion:
> https://github.com/apache/httpclient/blob/4.5.2/httpclient/src/main/java/org/apache/http/client/utils/URLEncodedUtils.java#L570
>
> This is a bug, isn't it?
>

Jaime,

URL-encoded content is not supposed to contain non-ASCII characters in
the first place, is it?

Oleg
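
To illustrate the point about URL-encoded content, here is a minimal sketch that uses only JDK classes (java.net.URLEncoder and java.net.URLDecoder), not HttpClient itself. The class and method names are hypothetical. narrowThenDecode is an assumption: it only mimics the char-to-byte narrowing described in the quoted report and is not a copy of the HttpClient code. The sketch shows that percent-encoding the value first keeps the query pure ASCII, so the value survives decoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it uses only the JDK, not HttpClient.
class NonAsciiQueryDemo {

    // Assumption: mimics a char-to-byte narrowing followed by a UTF-8
    // decode, as described in the quoted report.
    static String narrowThenDecode(String raw) {
        byte[] bytes = new byte[raw.length()];
        for (int i = 0; i < raw.length(); i++) {
            bytes[i] = (byte) raw.charAt(i); // keeps only the low 8 bits
        }
        // A lone 0xE1 byte is not valid UTF-8, so it decodes to U+FFFD.
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Percent-encodes a value so that the result contains only ASCII.
    static String encodeValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Reverses encodeValue.
    static String decodeValue(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String value = "b\u00e1r"; // "bár", escaped to avoid source-encoding issues

        // Raw non-ASCII input: the 'á' is lost.
        System.out.println("foo=" + narrowThenDecode(value));

        // Percent-encoded input: pure ASCII on the wire.
        String encoded = encodeValue(value);
        System.out.println("foo=" + encoded); // prints foo=b%C3%A1r

        // Decoding restores the original value.
        System.out.println("foo=" + decodeValue(encoded));
    }
}
```

With the value encoded this way, the string handed to a parser would be "foo=b%C3%A1r", which contains no non-ASCII characters.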