tomcat-users mailing list archives

From André Warnier ...@ice-sa.com>
Subject Re: [ win xp and win server 2003 ] tomcat utf8 encoding
Date Fri, 08 Apr 2011 15:50:46 GMT
Tomislav Brkljačić wrote:
> The remote machine gives the wrong "result".
> 
> I wrote on the mailing list of the BPM software, the discussion is still
> alive.
> 
> Maybe I could try to force a CharacterEncodingFilter on Tomcat.
> Something like this:
> http://www.onthoo.com/blog/programming/2005/07/characterencodingfilter.html

Don't do that.  Your problem is with the file *name*, not with the file content.
Filters work on the content.
I think you could make a real mess of everything by adding a content filter.  I don't 
think that Tomcat would use it in this case, but if it does, it will filter the whole 
multi-part body (headers and contents), which is certainly not what you want here.

A question: how exactly is the file name retrieved and used by that BPM upload module?
I mean, can you see whether it gets it as a byte array or as a String?
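
To illustrate why the byte array versus String distinction matters, here is a minimal sketch. It assumes the browser sent the file name as UTF-8, which has not been confirmed yet. The class and method names are made up for illustration and are not part of Tomcat or of the BPM software. The same bytes give three different Strings depending on which charset the receiving code uses to decode them, and the hex line is what you would compare against a Wireshark capture.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; nothing here comes from Tomcat or the BPM module.
public class FilenameDecodingDemo {

    /** Hex dump of raw bytes, for comparing against a network capture. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** Shows non-ASCII chars as escapes so the output ignores the console encoding. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c >= 0x20 && c < 0x7f) sb.append(c);
            else sb.append(String.format("\\u%04x", (int) c));
        }
        return sb.toString();
    }

    /** Decodes the raw header bytes with the given charset. */
    static String decode(byte[] raw, Charset cs) {
        return new String(raw, cs);
    }

    public static void main(String[] args) {
        // Assumption: the file name from the thread, encoded by the client as UTF-8.
        byte[] raw = "pri\u010duva.txt".getBytes(StandardCharsets.UTF_8);

        System.out.println("bytes      : " + hex(raw));
        System.out.println("UTF-8      : " + escape(decode(raw, StandardCharsets.UTF_8)));
        System.out.println("ISO-8859-1 : " + escape(decode(raw, StandardCharsets.ISO_8859_1)));
        System.out.println("US-ASCII   : " + escape(decode(raw, StandardCharsets.US_ASCII)));
    }
}
```

If the module hands you a byte array, you can still pick the right charset yourself. If it hands you a String, the decoding has already happened somewhere, and the question becomes which charset was used for it.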

And what about that "locale=default" query parameter?
What is it supposed to mean, according to the BPM documentation?


> 
> I will definitely try with Wireshark.
> 
> thx
> 
> 
> Christopher Schultz-2 wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Tom,
>>
>> On 4/8/2011 4:19 AM, Tomislav Brkljačić wrote:
>>> Ok, this is what I did.
>>>
>>> 1. updated the java runtime so they match on both machines
>> Not a bad idea, but probably didn't affect anything.
>>
>>> Tried to run the examples, but still the same result.
>>>
>>> 2. installed LiveHTTPHeaders for Firefox and ran the examples upload.
>>> This is the output from LiveHTTPHeaders on my local machine (the same
>>> is on the server machine):
>> So... is the local machine the one that does or does not work? Comparing
>> the two that DO work would be a good idea.
>>
>>> Content-Type: multipart/form-data;
>>>     boundary=---------------------------55652821543
>> Note the lack of a character encoding (in the main request header). This
>> is appropriate for multipart/form-data content.
>>
>>> Content-Disposition: form-data; name="attach_file";
>>> filename="pričuva.txt"
>>> Content-Type: text/plain
>>>
>>> asdasdasd
>>> -----------------------------55652821543--
>> A couple of things:
>>
>> 1. I'm surprised that no Content-Length was sent along with the file.
>>
>> 2. Note that the filename has non-US-ASCII characters shown there.
>>    I wonder if that's LiveHttpHeaders's interpretation of the header
>>    (and in what encoding) or if that's what's on the wire.
>>
>>
>> I suspect that Firefox is just using UTF-8 to send the filename. Tomcat may
>> interpret it as US-ASCII and give you an odd result. Actually... for
>> multipart, Tomcat shouldn't be involved: this may be a problem with the
>> library you are using for file uploads. You should definitely ask on the
>> BPM mailing list.
>>
>> Here's one thing you can do:
>>
>> String brokenString = part.getFilename();  // or whatever
>>
>> String fixedString
>>    = new String(brokenString.getBytes("US-ASCII"), "UTF-8");
>>
>> That will re-decode the bytes sent from the client as UTF-8. This will
>> only work if:
>>
>> 1. The client actually sent the data in UTF-8
>>
>> 2. Your multipart handler actually assumed that US-ASCII was correct
>>
>> 3. No alteration of the bytes has occurred by the interpretation
>>    as US-ASCII
>>
>> If any of the above are NOT true, you are basically stuck.
>>
>> It would be worth it to look at the bytes as they are traversing the
>> network -- say, with Wireshark -- to determine whether the filename is
>> actually encoded in UTF-8 or some other encoding.
>>
>> Hope that helps,
>> - -chris
>> -----BEGIN PGP SIGNATURE-----
>> Version: GnuPG v1.4.10 (MingW32)
>> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
>>
>> iEYEARECAAYFAk2fHlAACgkQ9CaO5/Lv0PAJpwCeLrK7QVnL8bEkyfXow8Thj6UD
>> TpEAoJgmtujwwN+VvvCHQzUHZsf9e2qO
>> =9LWc
>> -----END PGP SIGNATURE-----
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>> For additional commands, e-mail: users-help@tomcat.apache.org
>>
>>
>>
> 
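
Regarding the re-decoding suggestion quoted above, here is a minimal sketch of when it can and cannot recover the name. It assumes the client really sent UTF-8, and it uses ISO-8859-1 as one possible charset that an upload library might have applied. That is only an assumption, since nobody has looked at the BPM code yet. The class and method names are hypothetical. As far as I know, decoding as ISO-8859-1 keeps every byte, so the round trip is lossless, whereas decoding as US-ASCII replaces every byte above 0x7F and the original bytes are gone for good (the third condition in the quoted list).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not taken from Tomcat or any upload library.
public class FilenameRecoveryDemo {

    /**
     * Undoes a wrong decode: turns the String back into bytes using the
     * charset that was (presumably) applied, then decodes those bytes
     * with the charset the client actually used.
     */
    static String redecode(String broken, Charset assumed, Charset actual) {
        return new String(broken.getBytes(assumed), actual);
    }

    public static void main(String[] args) {
        String original = "pri\u010duva.txt";
        // Assumption: this is what travels over the network.
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        // Two ways a multipart handler might have turned the bytes into a String.
        String viaLatin1 = new String(wire, StandardCharsets.ISO_8859_1);
        String viaAscii = new String(wire, StandardCharsets.US_ASCII);

        // Lossless mis-decode: the original name comes back.
        System.out.println(redecode(viaLatin1,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8).equals(original)); // true

        // Lossy mis-decode: the two bytes of the accented letter are already lost.
        System.out.println(redecode(viaAscii,
                StandardCharsets.US_ASCII, StandardCharsets.UTF_8).equals(original));   // false
    }
}
```

So before trying any such repair in the application, it is worth knowing both what is on the wire and which charset the upload module applied. Without those two facts the re-decoding is guesswork.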



