tomcat-users mailing list archives

From André Warnier
Subject Re: [ win xp and win server 2003 ] tomcat utf8 encoding
Date Thu, 07 Apr 2011 16:26:25 GMT
Tomislav Brkljačić wrote:
> awarnier wrote:
>> Tomislav Brkljačić wrote:
>>> Hi to all,
>>> this is my scenario and problem. 
>>> Situation 1. - local machine, win xp
>>> I have a web app deployed to tomcat, and the app has a webform for
>>> uploading attachments.
>>> Attachments can have funny letters (š,ć,čćžđ ) in the filename.
>>> I have set the file.encoding=UTF8 and UriEncoding = UTF8 for jvm and
>>> inside the server.xml.
>>> Everything works as expected, no anomalies in displaying the filenames of
>>> the uploaded files.
>>> Situation 2. - client machine, win server 2003
>>> Same webapp as in Situation 1, same tomcat configuration in all matters.
>>> But there is a problem.
>>> After I upload the files with funny names through the app, the filenames
>>> are scrambled and garbled.
>>> I checked the location of the files in the file system, and of course
>>> the uploaded filenames are scrambled in the file system too.
>>> Obviously there is some other setting I need to check and synchronize, but
>>> it eludes me so far.
>>> Any help is very much appreciated.
>> Hi.
>> Can you provide the *exact* versions of Java, Tomcat, and whichever file
>> uploading mechanism you are using ?
>> (meaning : to process the multi-part POST with the file upload, your
>> webapp uses some additional mechanism; which is it ?)
> 1.Situation - local win xp machine
> Java : java version "1.6.0_22"
> Tomcat : 6.0.29
> This is the scenario where everything works as expected.
> 2. Situation - customer win server 2003 machine
> Java : java version "1.6.0_20"
> Tomcat : 6.0.29
> The deployed web application is developed with Bonita open Solution (BPM
> framework).
> I'm not that fluent in the java world but looking at the downloaded source
> code, I guess it would be a basic fileupload servlet.

Right. But that may be the important part.
Are you familiar with the format in which a browser sends a multipart/form-data POST ?
(MIME multipart, similar to the basic .eml format of an email with attachments)
Briefly : the data is sent by the browser in a format like :

request line (POST)
header "Content-Type: multipart/form-data; boundary=----xyz"
(blank line)
(boundary line)
header of part 1
(blank line)
body of part 1
(boundary line)
header of part 2
(blank line)
body of part 2
(closing boundary line)

where each "part" is one of the inputs of the <form>.  One of these parts is your uploaded
file, and it has a special header which specifies the file type, encoding, file name etc..
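
To make the layout above concrete, here is a minimal sketch in Java that builds such a body by hand. The class and method names, the field name "attachment", the file name and the boundary "xyz" are all made up for illustration; a real browser picks its own boundary and a real upload library does the reverse operation (parsing).

```java
class MultipartSketch {
    static final String CRLF = "\r\n";

    // Builds one file part: boundary line, part headers, blank line, body.
    public static String filePart(String boundary, String field, String filename,
                                  String type, String content) {
        return "--" + boundary + CRLF
             + "Content-Disposition: form-data; name=\"" + field
             + "\"; filename=\"" + filename + "\"" + CRLF
             + "Content-Type: " + type + CRLF
             + CRLF
             + content + CRLF;
    }

    // Concatenates the parts and appends the closing boundary line.
    public static String body(String boundary, String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.append("--").append(boundary).append("--").append(CRLF).toString();
    }

    public static void main(String[] args) {
        // Hypothetical values, chosen only to show where the file name travels.
        String part = filePart("xyz", "attachment", "report.txt", "text/plain", "hello");
        System.out.print(body("xyz", part));
    }
}
```

The point to notice is that the file name travels as plain bytes inside the Content-Disposition header of the part, so whoever reads that header has to decide how to turn those bytes back into characters.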

The job of the "fileupload servlet" (actually, it is a library capable of reading such a
POST and separating it into parts), is to read these headers and bodies, and make sense
out of them.  One of the things that it reads is the filename, and of course it
interprets that according to some character set.
For that, it uses some kind of Java reader or stream and, if it does it right, tells it
which character set to use to decode the input.
And it is possible that it does /not/ do it right in some cases (maybe even depending on
which JVM version it runs under). For example, if it does not specify the character set to
use to decode the input, Java may use the platform default, which may be different on
these two systems.  And if that is the case, it may wrongly decode the filename header,
and produce garbage.
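
The effect described above can be sketched in a few lines of Java. This is only an illustration, not the code of any particular upload library: the class name is made up, the bytes are assumed to arrive as UTF-8, and ISO-8859-1 merely stands in for "some single-byte platform default" (the real default on either of your machines may be something else).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class FilenameDecodingDemo {

    // Decodes raw header bytes with an explicitly named character set.
    public static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "šćčžđ.txt", written with escapes so the source file encoding does not matter.
        String original = "\u0161\u0107\u010d\u017e\u0111.txt";

        // Assumption: the browser sent the file name as UTF-8 bytes.
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        String right = decode(raw, StandardCharsets.UTF_8);
        // ISO-8859-1 is a stand-in for a platform default that is not UTF-8.
        String wrong = decode(raw, StandardCharsets.ISO_8859_1);

        System.out.println("platform default here : " + Charset.defaultCharset());
        System.out.println("UTF-8 decode matches  : " + right.equals(original));
        System.out.println("Latin-1 decode matches: " + wrong.equals(original));
        System.out.println("lengths               : " + right.length() + " vs " + wrong.length());
    }
}
```

Each accented letter takes two bytes in UTF-8, so the wrongly decoded name comes out longer than the original, with two junk characters per letter. Calling `new String(raw)` without a charset argument would use the platform default, which is exactly the case where two otherwise identical servers can behave differently.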

What I am saying is that, since you have the same Tomcat version on both systems, the code
which works differently is unlikely to be in Tomcat itself.  To my recollection (maybe
wrong), Tomcat 6.0 does not include any code that can deal with a multi-part POST.
(I think that Tomcat 7.0 does).
So the code which acts differently on your two servers above is either the file-upload
library used by that servlet, or the JVM functions that the library itself uses.

In other words again, my first stop for a solution would be whatever support list is 
available for the "Bonita open Solution (BPM framework)".

(you may further narrow down the problem first by updating your 2003 server to java 
version "1.6.0_22").
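
To compare the two machines, something like the following small program could be run on each of them with the same JVM options that Tomcat uses. It is only a sketch with a made-up class name. "file.encoding" and Charset.defaultCharset() are standard; "sun.jnu.encoding" is an implementation-specific property of Sun/Oracle JVMs and may be missing (null) on others.

```java
import java.nio.charset.Charset;

class EncodingReport {

    // Collects the JVM settings most likely to differ between the two servers.
    public static String report() {
        StringBuilder sb = new StringBuilder();
        sb.append("java.version     = ").append(System.getProperty("java.version")).append('\n');
        sb.append("os.name          = ").append(System.getProperty("os.name")).append('\n');
        sb.append("file.encoding    = ").append(System.getProperty("file.encoding")).append('\n');
        sb.append("defaultCharset   = ").append(Charset.defaultCharset().name()).append('\n');
        // Implementation-specific; prints "null" where the JVM does not define it.
        sb.append("sun.jnu.encoding = ").append(System.getProperty("sun.jnu.encoding")).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

If the output differs between the XP machine and the 2003 server, that difference is a good candidate for the cause.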

Now for another, more general comment :
According to your explanation, you upload a file from a browser, and then try to write it
to the local filesystem using the name which it had on the original workstation.
In my view, this is generally a bad idea.
One reason is the one you already found.
The second is that if 2 users upload a file with the same name, the second one will
overwrite the first.
The third is that you are leaving yourself open to all kinds of nasty things, such as a
user uploading a file with spaces in the name (always a problem at some point), or with
characters in the name that may be very dangerous (think of a file named "> /etc/passwd"
or "some.file|rm *").
So, if you have a chance to do that, give each uploaded file a name that you create, and
keep the original filename in some separate place if you need it, for display only.
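
A minimal sketch of that last suggestion, in Java. The class name, the ".bin" suffix and the idea of keeping both names in one small object are my own choices for illustration; where you actually store the display name (database, sidecar file, ...) is up to the application.

```java
import java.util.UUID;

class StoredUpload {
    final String storedName;   // generated by the server, used on disk
    final String displayName;  // supplied by the client, used for display only

    StoredUpload(String storedName, String displayName) {
        this.storedName = storedName;
        this.displayName = displayName;
    }

    // Creates a fresh on-disk name that contains nothing chosen by the user.
    public static StoredUpload forClientName(String clientName) {
        String generated = UUID.randomUUID().toString() + ".bin";
        return new StoredUpload(generated, clientName);
    }

    public static void main(String[] args) {
        // Two users upload a file with the same (awkward) name.
        StoredUpload first = forClientName("some.file|rm *");
        StoredUpload second = forClientName("some.file|rm *");
        System.out.println(first.storedName + "  <-  " + first.displayName);
        System.out.println(second.storedName + "  <-  " + second.displayName);
    }
}
```

The on-disk name consists only of hexadecimal digits, hyphens and the fixed suffix, so it cannot collide with another upload in practice and never reaches the filesystem in a form the user controls. The original name still has to be escaped properly wherever it is displayed.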
