Date: Tue, 30 Oct 2012 10:40:12 +0000 (UTC)
From: "Jon Moore (JIRA)"
To: dev@hc.apache.org
Message-ID: <1875028430.44020.1351593612298.JavaMail.jiratomcat@arcas>
In-Reply-To: <1277503271.40003.1351533852594.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HTTPCLIENT-1257) Header location automatically converted to ASCII even though location can contain UTF-8 encoded urls

    [ https://issues.apache.org/jira/browse/HTTPCLIENT-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486784#comment-13486784 ]

Jon Moore commented on HTTPCLIENT-1257:
---------------------------------------

Can you add the wire log section for the GET request HttpClient sends initially to receive the 303 response? I'm curious to see what the wire log has for the path segment of the initial request.

> Header location automatically converted to ASCII even though location can contain UTF-8 encoded urls
> ----------------------------------------------------------------------------------------------------
>
> Key: HTTPCLIENT-1257
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1257
> Project: HttpComponents HttpClient
> Issue Type: Bug
> Components: HttpClient
> Affects Versions: 4.2.2
> Reporter: Thibaut
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> I'm trying to fetch:
> http://handheld.vn/content.php?4052-Đánh-giá-máy-tính-bảng-Kindle-Fire-HD-7-inch
> Which returns:
> 2012-10-29 18:54:29,355 DEBUG http.wire: << "HTTP/1.1 303 See Other[\r][\n]" [main]
> 2012-10-29 18:54:29,355 DEBUG http.wire: << "Date: Mon, 29 Oct 2012 17:55:57 GMT[\r][\n]" [main]
> 2012-10-29 18:54:29,355 DEBUG http.wire: << "Server: Apache[\r][\n]" [main]
> 2012-10-29 18:54:29,355 DEBUG http.wire: << "Expires: Thu, 19 Nov 1981 08:52:00 GMT[\r][\n]" [main]
> 2012-10-29 18:54:29,356 DEBUG http.wire: << "Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0[\r][\n]" [main]
> 2012-10-29 18:54:29,356 DEBUG http.wire: << "Pragma: no-cache[\r][\n]" [main]
> 2012-10-29 18:54:29,356 DEBUG http.wire: << "Set-Cookie: bb_lastactivity=0; expires=Tue, 29-Oct-2013 17:55:57 GMT; path=/[\r][\n]" [main]
> 2012-10-29 18:54:29,356 DEBUG http.wire: << "Location: http://handheld.vn/content/4052-????nh-gi??-m??y-t??nh-b???ng-Kindle-Fire-HD-7-inch[\r][\n]" [main]
> 2012-10-29 18:54:29,357 DEBUG http.wire: << "Content-Length: 0[\r][\n]" [main]
> 2012-10-29 18:54:29,357 DEBUG http.wire: << "Connection: close[\r][\n]" [main]
> 2012-10-29 18:54:29,357 DEBUG http.wire: << "Content-Type: text/html[\r][\n]" [main]
> 2012-10-29 18:54:29,357 DEBUG http.wire: << "[\r][\n]" [main]
> 2012-10-29 18:54:29,357 DEBUG conn.DefaultClientConnection: Receiving response: HTTP/1.1 303 See Other [main]
> 2012-10-29 18:54:29,357 DEBUG http.headers: << HTTP/1.1 303 See Other [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Date: Mon, 29 Oct 2012 17:55:57 GMT [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Server: Apache [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Expires: Thu, 19 Nov 1981 08:52:00 GMT [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0 [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Pragma: no-cache [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Set-Cookie: bb_lastactivity=0; expires=Tue, 29-Oct-2013 17:55:57 GMT; path=/ [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Location: http://handheld.vn/content/4052-ÄÃ¡nh-giÃ¡-mÃ¡y-tÃ­nh-báº£ng-Kindle-Fire-HD-7-inch [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Content-Length: 0 [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Connection: close [main]
> 2012-10-29 18:54:29,359 DEBUG http.headers: << Content-Type: text/html [main]
> Unfortunately, I can't get the resolved URL through the following code:
> Header locationHeader = response.getFirstHeader("location");
> which will return http://handheld.vn/content/4052-ÄÃ¡nh-giÃ¡-mÃ¡y-tÃ­nh-báº£ng-Kindle-Fire-HD-7-inch
> The header has already been extracted using the wrong character encoding. I will never be able to get the redirect URL!
> I understand that this is not RFC-compliant behavior, but the above URL and redirect work fine in all browsers.
> Is it possible to access the raw header (byte array) so that I can choose the encoding on my own? This would help a lot. Alternatively, a parameter to optionally specify the encoding when fetching a header value would also work.

--
This message is automatically generated by JIRA.
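
The reporter's request to choose the header encoding can be illustrated with a small sketch. It assumes the garbled Location value arose because the UTF-8 bytes on the wire were decoded one byte per character (ISO-8859-1 style), which is what the http.headers log above suggests but which the thread does not confirm. The class name `LocationHeaderRecoding` and the method `recoverUtf8` are hypothetical names for illustration and are not part of HttpClient.

```java
import java.nio.charset.StandardCharsets;

public class LocationHeaderRecoding {

    /**
     * Re-interprets a string that was decoded one byte per char
     * (ISO-8859-1 style) as UTF-8 instead.
     * Assumption: every char in the input is in the range 0x00-0xFF.
     */
    public static String recoverUtf8(String misdecoded) {
        // ISO-8859-1 maps chars 0x00-0xFF back to identical byte values,
        // which restores the bytes that were originally received.
        byte[] rawBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Simulate the garbling: take the UTF-8 bytes of a path fragment
        // from the issue and read them as ISO-8859-1.
        String original = "\u0110\u00e1nh-gi\u00e1"; // first words of the path
        String garbled = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        // prints true
        System.out.println(recoverUtf8(garbled).equals(original));
    }
}
```

In application code, the string passed to `recoverUtf8` would be the value of the header returned by `response.getFirstHeader("location")`. The round trip only works if no bytes were replaced or dropped during the first decoding, so it is a workaround under that assumption rather than a substitute for access to the raw header bytes.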