hc-dev mailing list archives

From Oleg Kalnichevski <ol...@apache.org>
Subject Re: Content-Encoding header is missing in httpclient's response
Date Tue, 17 Dec 2013 08:50:57 GMT
On Tue, 2013-12-17 at 12:16 +0530, Dhruvakumar P G wrote:
> On 12/9/2013 7:43 PM, Oleg Kalnichevski wrote:
> > On Mon, 2013-12-09 at 19:15 +0530, Dhruvakumar P G wrote:
> >> On 12/9/2013 4:41 PM, Oleg Kalnichevski wrote:
> >>> On Mon, 2013-12-09 at 13:09 +0530, Dhruvakumar P G wrote:
> >>>> Hello,
> >>>>
> >>>> I'm in the middle of upgrading Httpclient, mime, core libraries to
> >>>> latest version. I haven't been able to figure out any solution to the
> >>>> following problem.
> >>>> When Httpclient downloads a text file(icité Àâqë-withmultibytechars.txt)
> >>>> which contains multibyte characters from another server and sends it to
> >>>> the browser.
> >>>> *The server returns the response headers as below :*
> >>>>
> >>>> HTTP/1.1 200 OK
> >>>> X-Powered-By: Servlet/2.5
> >>>> Content-Disposition: attachment;       filename="icité
> >>>> Àâqë-withmultibytechars.txt"
> >>>> Content-Type: application/octet-stream
> >>>> Content-Length: 162
> >>>>
> >>>> *Browser receives the headers as below and shows the filename rightly:*
> >>>>
> >>>> Content-Disposition    attachment; filename="icité
> >>>> Àâqë-withmultibytechars.txt"
> >>>> Content-Type    application/octet-stream
> >>>> Transfer-Encoding    chunked
> >>>>
> >>>> When Httpclient downloads an image file(ウェ.jpg) from another server
> >>>> and sends it to the browser.
> >>>> *The server returns the response headers as below : *
> >>>> HTTP/1.1 200 OK
> >>>> X-Powered-By: Servlet/2.5
> >>>> Content-Disposition: attachment; filename="ウェ.jpg"
> >>>> Content-Encoding: gzip
> >>>> Content-Type: application/octet-stream
> >>>> Transfer-Encoding: chunked
> >>>>
> >>>> Even though  "Content-Encoding: gzip" header is returned by the server,
> >>>> the response object doesn't have this header.
> >>>> Somehow this header has been removed from the response when the request
> >>>> gets executed,  _response = _httpClient.execute(_httpHost, _httpMethod,
> >>>> _httpContext);
> >>>>
> >>>> *Browser will not receive this header, non-ascii characters aren't
> >>>> recognized in the filename of download dialogue, it just shows empty
> >>>> characters:*
> >>>> Content-Disposition    attachment; filename="   .jpg"
> >>>> Content-Type    application/octet-stream
> >>>> Transfer-Encoding    chunked, chunked
> >>>>
> >>>> Am I missing something here ? How do I make sure that the Httpclient
> >>>> doesn't ignore this header and browser get to show the filename rightly?
> >>>>
> >>> HTTP message headers may not have non-ASCII per requirements of the HTTP
> >>> protocol. The target server is in violation of the HTTP specification.
> >> Yes indeed,  the target server should return encoded filename :
> >> *Content-disposition: attachment; filename="=?utf-8?B?44Km44KnLmpwZw==?="*
> >> But instead it is returning unencoded filename : Content-Disposition:
> >> attachment; filename="ウェ.jpg"
> >> Can't I resolve my issue unless target server returns encoded filename ?
> >>
> >> Thanks,
> >> Dhruva
> >>> One can force HttpClient, though, to use a non-standard charset for HTTP
> >>> messages by using a custom ConnectionConfig.
> >>>
> >>> Oleg
> >>>
> >> I have set the charset to UTF-8,
> >> connectionConfigBuilder.setCharset(Consts.UTF_8)
> >> Will Setting charset to any other make httpclient to not to lose
> >> 'Content-Encoding' response header ?
> >>
> > I am not aware of a single confirmed case of HttpClient losing headers.
> > You can use wire / context logging to see what data packets are
> > transmitted across the wire.
> >
> > Oleg
> Hello,
> To narrow down the problem, I have disabled the compression in target 
> server. Now the target server doesn't return Content-Encoding header.
> 
> Given that the target server always returns Non-ASCII filename without 
> being encoded in MIME header(Content-Disposition: attachment; filename=" 
> ウェ.jpg") which is a violation to the HTTP specification. My 
> requirement here is to show the multibyte character file name when user 
> downloads the attachment across all the browsers without losing any 
> character in the name.
> 
> With earlier version of HttpClient(4.0.1), when target server returns 
> the non-ascii filename without being encoded as below :
> Content-Disposition: attachment; filename="ウェ - multibyte.txt"
> Content-Type: text/plain;charset=utf-8
> 
> Filename will be kind of encoded in the response of HttpClient(4.0.1) as 
> below :
> Content-Disposition    attachment; filename="ウェ - multibyte.txt"
> Content-Type    text/plain;charset=utf-8
> 
> And as a result of the above behaviour, browser is able to decode the 
> filename and show correctly in the download dialogue.
> 
> But in the response of HttpClient(4.3.1), filename will be exactly same 
> as what we got from target server. Not changed into any encoded form 
> unlike in HttpClient(4.0.1) :
> Content-Disposition: attachment; filename="ウェ - multibyte.txt"
> Content-Type: text/plain;charset=utf-8
> 
> And as a result of the above behaviour, browser is not able to show the 
> filename rightly. Download dialogue shows *'- multibyte.txt'* and
> response headers in Firebug shows:
> Content-Disposition    attachment; filename=" - multibyte.txt"
> Content-Type    text/plain;charset=utf-8
> 
> 
> 
> Is the above change-in-behaviour from 4.0 to 4.3 expected ?
> If so, *how do I make sure that the multibyte character filename is 
> displayed correctly across all the browsers given that the target server 
> always returns it in unencoded form?*
> 
> 

As I said in my previous message, you need to force HttpClient to use a
non-standard charset for protocol elements (either for all connections
or for connections to a specific host only):

---
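import org.apache.http.Consts;
import org.apache.http.config.ConnectionConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

// Decode and encode protocol elements (headers) as UTF-8 instead of the default
ConnectionConfig connectionConfig = ConnectionConfig.custom()
        .setCharset(Consts.UTF_8)
        .build();
// Apply the config to all connections created by this client
CloseableHttpClient httpclient = HttpClients.custom()
        .setDefaultConnectionConfig(connectionConfig)
        .build();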
---
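
To see why the charset used for protocol elements matters, here is a
minimal sketch that uses only plain JDK string decoding, not HttpClient's
own buffer code, so the exact substitute characters HttpClient produces
may differ. The class name HeaderCharsetDemo and the helper decode() are
hypothetical and exist only for this illustration. It takes the UTF-8
bytes of the Content-Disposition value from this thread and decodes them
once as US-ASCII and once as UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class HeaderCharsetDemo {

    // Hypothetical helper: decode raw header bytes with the given charset,
    // roughly what happens when a header line is read off the wire.
    public static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Header value as the target server sends it (U+30A6 U+30A7 = "ウェ").
        String sent = "attachment; filename=\"\u30A6\u30A7.jpg\"";
        byte[] wire = sent.getBytes(StandardCharsets.UTF_8);

        String asUtf8 = decode(wire, StandardCharsets.UTF_8);
        String asAscii = decode(wire, StandardCharsets.US_ASCII);

        // The UTF-8 decode round-trips the filename; the ASCII decode cannot
        // represent the multibyte characters and substitutes them.
        System.out.println("UTF-8 round-trips:    " + asUtf8.equals(sent));
        System.out.println("US-ASCII round-trips: " + asAscii.equals(sent));
    }
}
```

With a single-byte charset the multibyte part of the filename cannot
survive the decode step, which is consistent with the browser ending up
with only " - multibyte.txt".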

Oleg

> Thanks & Regards,
> Dhruva
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
> > For additional commands, e-mail: dev-help@hc.apache.org
> >
> 




