hc-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thibaut (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HTTPCLIENT-1257) Header location automatically converted to ASCII even though location can contain UTF-8 encoded urls
Date Mon, 12 Nov 2012 18:35:12 GMT

    [ https://issues.apache.org/jira/browse/HTTPCLIENT-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495483#comment-13495483
] 

Thibaut commented on HTTPCLIENT-1257:
-------------------------------------

Thanks a lot! 
                
> Header location automatically converted to ASCII even though location can contain UTF-8
encoded urls
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HTTPCLIENT-1257
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1257
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient
>    Affects Versions: 4.2.2
>            Reporter: Thibaut
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I'm trying to fetch:
> http://handheld.vn/content.php?4052-Đánh-giá-máy-tính-bảng-Kindle-Fire-HD-7-inch
> Which returns:
> 2012-10-29 18:54:29,355 DEBUG http.wire: << "HTTP/1.1 303 See Other[\r][\n]" [main]
> 2012-10-29 18:54:29,355 DEBUG http.wire: << "Date: Mon, 29 Oct 2012 17:55:57 GMT[\r][\n]"
[main]
> 2012-10-29 18:54:29,355 DEBUG http.wire: << "Server: Apache[\r][\n]" [main]
> 2012-10-29 18:54:29,355 DEBUG http.wire: << "Expires: Thu, 19 Nov 1981 08:52:00
GMT[\r][\n]" [main]
> 2012-10-29 18:54:29,356 DEBUG http.wire: << "Cache-Control: no-store, no-cache,
must-revalidate, post-check=0, pre-check=0[\r][\n]" [main]
> 2012-10-29 18:54:29,356 DEBUG http.wire: << "Pragma: no-cache[\r][\n]" [main]
> 2012-10-29 18:54:29,356 DEBUG http.wire: << "Set-Cookie: bb_lastactivity=0; expires=Tue,
29-Oct-2013 17:55:57 GMT; path=/[\r][\n]" [main]
> 2012-10-29 18:54:29,356 DEBUG http.wire: << "Location: http://handheld.vn/content/4052-????nh-gi??-m??y-t??nh-b???ng-Kindle-Fire-HD-7-inch[\r][\n]"
[main]
> 2012-10-29 18:54:29,357 DEBUG http.wire: << "Content-Length: 0[\r][\n]" [main]
> 2012-10-29 18:54:29,357 DEBUG http.wire: << "Connection: close[\r][\n]" [main]
> 2012-10-29 18:54:29,357 DEBUG http.wire: << "Content-Type: text/html[\r][\n]" [main]
> 2012-10-29 18:54:29,357 DEBUG http.wire: << "[\r][\n]" [main]
> 2012-10-29 18:54:29,357 DEBUG conn.DefaultClientConnection: Receiving response: HTTP/1.1
303 See Other [main]
> 2012-10-29 18:54:29,357 DEBUG http.headers: << HTTP/1.1 303 See Other [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Date: Mon, 29 Oct 2012 17:55:57
GMT [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Server: Apache [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Expires: Thu, 19 Nov 1981 08:52:00
GMT [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Cache-Control: no-store, no-cache,
must-revalidate, post-check=0, pre-check=0 [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Pragma: no-cache [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Set-Cookie: bb_lastactivity=0; expires=Tue,
29-Oct-2013 17:55:57 GMT; path=/ [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Location: http://handheld.vn/content/4052-Đánh-giá-máy-tính-bảng-Kindle-Fire-HD-7-inch
[main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Content-Length: 0 [main]
> 2012-10-29 18:54:29,358 DEBUG http.headers: << Connection: close [main]
> 2012-10-29 18:54:29,359 DEBUG http.headers: << Content-Type: text/html [main]
> Unfortunately I can't get the resolve Url through the following code:
> Header locationHeader = response.getFirstHeader("location");
> which will return http://handheld.vn/content/4052-Đánh-giá-máy-tính-bảng-Kindle-Fire-HD-7-inch
> The header has already been extracted in the wrong content encoding. I will never be
able to get the redirect url!
> I understand that this is not RFC normalised behavior, but the above url and redirect
works fine in all browsers.
> Is it possible to access the raw header (byte array) so that I can chose the encoding
on my own? This would help a lot. Or a parameter to optionally specify the encoding when fetching
a header value.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


Mime
View raw message