hc-dev mailing list archives

From Ortwin Glück <...@odi.ch>
Subject Re: [HTTPClient 3.0.1] Bug: Multipart posts with files named using UTF-8 characters
Date Thu, 19 Oct 2006 13:45:23 GMT
Guys,

Look at RFC 2047, which supersedes RFC 1522 (the RFC 1521/1522 pair was
replaced by RFCs 2045-2049). Its encoded-word method is quite popular in
e-mail traffic. Maybe real-world HTTP servers and clients support it?
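
For illustration, here is a minimal sketch of what an RFC 2047 encoded-word
looks like when built with plain JDK calls. The class and method names are
hypothetical and not part of HttpClient. The sketch always uses UTF-8 with
the "B" (BASE64) encoding and does not split long input, although RFC 2047
limits a single encoded-word to 75 characters. Note also that RFC 2047
restricts where encoded-words may appear, so whether a given server accepts
one as a multipart filename is exactly the open question.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration only; not HttpClient API.
final class EncodedWordSketch {

    // Wraps the text in a single RFC 2047 encoded-word:
    // "=?" charset "?" encoding "?" encoded-text "?="
    // Assumption: UTF-8 charset and "B" (BASE64) encoding, no length check.
    static String encodeWord(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return "=?UTF-8?B?" + Base64.getEncoder().encodeToString(utf8) + "?=";
    }

    public static void main(String[] args) {
        // \u00fc is the u-umlaut; the printed result is pure ASCII.
        System.out.println(encodeWord("Gl\u00fcck.txt"));
    }
}
```

The result contains only ASCII characters, so it would survive a call such
as EncodingUtil.getAsciiBytes unchanged; the receiving side has to decode it.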

Odi


Oleg Kalnichevski wrote:
> On Thu, 2006-10-19 at 14:29 +0200, Tumidajewicz, Przemyslaw wrote:
>> Hello everyone,
>>
>> First post here, hope I'm doing it right ;)
>>
>> I've been having problems with sending multipart posts containing files 
>> named using UTF-8 characters - all non-ASCII characters are turned into 
>> question marks. I've tried to specify the charset when creating the 
>> FilePart like this
>>
>> FilePart fp = new FilePart(name, file, null, "UTF-8");
>>
>> as well as setting the charset later on like this
>>
>> fp.setCharSet("UTF-8");
>>
>> with no result. So I took a deeper look at the HttpClient code (thank 
>> god for open source!) and found that the loss of special characters 
>> happens in the FilePart.sendDispositionHeader method, at line
>>
>> out.write(EncodingUtil.getAsciiBytes(filename));
>>
>> where the filename is forced to fit into the US-ASCII charset.
>>
> 
> Przemyslaw,
> 
> This behavior is in line with the requirements of the MIME specification
> as outlined in RFC 1521 and RFC 1522. The use of non-ASCII characters in
> MIME headers is not permitted. One is supposed to escape non-ASCII
> characters using BASE64 or Quoted-Printable encoding. 
> 
> See this feature request for details
> 
> https://issues.apache.org/jira/browse/HTTPCLIENT-293  
> 
> Oleg
> 
> 
>> My workaround for this problem is to substitute the above line with a 
>> charset-aware version:
>>
>> out.write(EncodingUtil.getBytes(filename, getCharSet()));
>>
>> but I'm not sure if it's the correct way to do it.
>>
>> What I'm quite sure of at this point is that it works for UTF-8 and 
>> results are consistent with what I get out of IE6 when posting the same 
>> file using a form like this:
>>
>> <form action="http://localhost:1235" method="POST" 
>> enctype="multipart/form-data" accept-charset="UTF-8">
>> <input type="file" name="file"></input>
>> <input type="submit"></input>
>> </form>
>>
>> It's also parsed correctly by FileUpload 1.1.
>>
>> I've had a look at the HTTPClient 3.1-alpha1 source and the problematic 
>> line in FilePart looks the same - which means that either my fix is a 
>> heresy and/or there is a better way of doing this - or that this bug has 
>> not been reported before (I failed to find anything on this in the archive).
>>
>> Please let me know if this is the right way of fixing this problem and, 
>> if so, whether this fix will make it into HTTPClient 3.1.
>>
>> Thanks and best regards!
>> --Przemek
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: httpclient-dev-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: httpclient-dev-help@jakarta.apache.org
>>
>>
> 
> 
> 
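
The question marks described in the quoted report can be reproduced with
plain JDK calls. This is a sketch under one assumption: that
EncodingUtil.getAsciiBytes behaves like String.getBytes with the US-ASCII
charset, which matches the reported symptom. The class and method names
below are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only; not HttpClient API.
final class FilenameBytesSketch {

    // Encodes with US-ASCII and decodes again. String.getBytes(Charset)
    // replaces every unmappable character with the charset's replacement
    // byte, which for US-ASCII is '?'.
    static String asAscii(String filename) {
        byte[] bytes = filename.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    // Number of bytes written when the same name is encoded as UTF-8,
    // as in the charset-aware workaround from the quoted message.
    static int utf8Length(String filename) {
        return filename.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String name = "Gl\u00fcck.txt"; // \u00fc is the u-umlaut
        System.out.println(asAscii(name));    // the umlaut is lost
        System.out.println(utf8Length(name)); // the umlaut takes two bytes
    }
}
```

The UTF-8 variant keeps the character but puts raw non-ASCII bytes into the
header, which is the behavior the MIME RFCs cited above do not permit.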

-- 
[web]  http://www.odi.ch/
[blog] http://www.odi.ch/weblog/
[pgp]  key 0x81CF3416
        finger print F2B1 B21F F056 D53E 5D79 A5AF 02BE 70F5 81CF 3416
