hc-dev mailing list archives

From Oleg Kalnichevski <ol...@apache.org>
Subject Re: [HTTPClient 3.0.1] Bug: Multipart posts with files named using UTF-8 characters
Date Thu, 19 Oct 2006 13:36:03 GMT
On Thu, 2006-10-19 at 14:29 +0200, Tumidajewicz, Przemyslaw wrote:
> Hello everyone,
> 
> First post here, hope I'm doing it right ;)
> 
> I've been having problems with sending multipart posts containing files 
> named using UTF-8 characters - all non-ASCII characters are turned into 
> question marks. I've tried to specify the charset when creating the 
> FilePart like this
> 
> FilePart fp = new FilePart(name, file, null, "UTF-8");
> 
> as well as setting the charset later on like this
> 
> fp.setCharSet("UTF-8");
> 
> with no result. So I took a deeper look at the HttpClient code (thank 
> god for open source!) and found that the loss of special characters 
> happens in the FilePart.sendDispositionHeader method, at line
> 
> out.write(EncodingUtil.getAsciiBytes(filename));
> 
> where the filename is forced to fit into the US-ASCII charset.
> 
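The loss described above can be reproduced with the standard library alone. This is a minimal sketch, not HttpClient code: it assumes that `EncodingUtil.getAsciiBytes` behaves like `String.getBytes` with the US-ASCII charset, and the class name, method name, and sample filename are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class AsciiLossDemo {
    // Hypothetical helper: encode the name as US-ASCII and decode it again,
    // which approximates what a receiver sees after an ASCII-only write.
    public static String roundTripAscii(String filename) {
        // getBytes(Charset) replaces every unmappable character with '?'.
        byte[] bytes = filename.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Sample name with four Polish letters, written as Unicode escapes
        // so the source file encoding does not matter.
        String original = "za\u017C\u00F3\u0142\u0107.txt";
        System.out.println(roundTripAscii(original)); // prints "za????.txt"
    }
}
```

Each character outside US-ASCII becomes a single question mark, so the original name cannot be recovered on the receiving side.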

Przemyslaw,

This behavior is in line with the requirements of the MIME specification
as outlined in RFC 1521 and RFC 1522 (since superseded by RFC 2045
through RFC 2049). Non-ASCII characters are not permitted in MIME
headers; they are supposed to be escaped using BASE64 or
Quoted-Printable encoding.
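To make the BASE64 option concrete, here is a rough sketch of the "encoded-word" form defined in RFC 2047 (the successor of RFC 1522). It uses only the JDK; the class and method names are hypothetical, and it ignores the 75-character limit on a single encoded-word. RFC 2047 also restricts where encoded-words may appear, so treat this as an illustration of the escaping itself rather than a recommendation for the filename parameter.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class EncodedWordSketch {
    // Hypothetical helper: wrap the UTF-8 bytes of the text in an
    // RFC 2047 "B" (BASE64) encoded-word, producing ASCII-only output.
    public static String encodeB(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String b64 = Base64.getEncoder().encodeToString(utf8);
        return "=?UTF-8?B?" + b64 + "?=";
    }

    public static void main(String[] args) {
        // Prints an ASCII-only token of the form =?UTF-8?B?...?=
        System.out.println(encodeB("za\u017C\u00F3\u0142\u0107.txt"));
    }
}
```

The result contains only ASCII characters, so it survives an ASCII-only write; the receiver has to recognize and decode the encoded-word to recover the name.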

See this feature request for details:

https://issues.apache.org/jira/browse/HTTPCLIENT-293  

Oleg


> My workaround for this problem is to substitute the above line with a 
> charset-aware version:
> 
> out.write(EncodingUtil.getBytes(filename, getCharSet()));
> 
> but I'm not sure if it's the correct way to do it.
> 
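The effect of the charset-aware variant can be sketched outside HttpClient as follows. The helper below is hypothetical and only mimics the shape of a multipart `Content-Disposition` line; it assumes the receiver decodes the header bytes with the same charset the sender used, which the MIME rules do not guarantee.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DispositionSketch {
    // Hypothetical helper: build a Content-Disposition line and encode it
    // with the given charset instead of forcing US-ASCII.
    public static byte[] dispositionBytes(String fieldName, String filename, Charset charset) {
        String header = "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + filename + "\"";
        return header.getBytes(charset);
    }

    public static void main(String[] args) {
        String filename = "za\u017C\u00F3\u0142\u0107.txt";
        byte[] raw = dispositionBytes("file", filename, StandardCharsets.UTF_8);
        // A receiver that also assumes UTF-8 sees the original name again.
        System.out.println(new String(raw, StandardCharsets.UTF_8));
    }
}
```

The name survives only when both sides agree on the charset; a receiver that assumes a different one will see garbled characters instead.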
> What I'm quite sure of at this point is that it works for UTF-8 and 
> results are consistent with what I get out of IE6 when posting the same 
> file using a form like this:
> 
> <form action="http://localhost:1235" method="POST" 
> enctype="multipart/form-data" accept-charset="UTF-8">
> <input type="file" name="file"></input>
> <input type="submit"></input>
> </form>
> 
> It's also parsed correctly by FileUpload 1.1.
> 
> I've had a look at the HTTPClient 3.1-alpha1 source and the problematic 
> line in FilePart looks the same - which means that either my fix is a 
> heresy and/or there is a better way of doing this - or that this bug has 
> not been reported before (I failed to find anything on this in the archive).
> 
> Please let me know if this is the right way of fixing this problem and, 
> if so, whether this fix will make it into HTTPClient 3.1.
> 
> Thanks and best regards!
> --Przemek
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org
> 
> 

