commons-user mailing list archives

From Robert Priest
Subject [FileUpload] Unicode Encoding for a Form
Date Wed, 17 Sep 2003 15:19:19 GMT
Hello all,

I have a simple html form which has an <INPUT TYPE="FILE"/> field in it.

Now when I select a file whose name contains Scandinavian characters (such as
umlauts), the name is not being URL-encoded properly before being sent. As a
result, my JSP page, which accepts file posts via the FileUpload package,
does not interpret the file name correctly.

Has anyone seen this problem before? And does anyone have a solution for
this issue?

For example, if I select a file such as:

filename="C:\Documents and Settings\Robert.Priest\Desktop\äää.txt"

what is sent in the request is:

filename="C:\Documents and Settings\Robert.Priest\Desktop\???.txt"

and what FileItem.getName() returns is:

C:\Documents and Settings\Robert.Priest\Desktop\???.txt

So the method FileUploadBase.getFileName(Map /* String, String */ headers)
does not see the correct filename when it executes:

    if (start != -1 && end != -1)
        fileName = cd.substring(start + 10, end).trim();
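
To make the extraction step concrete, here is a minimal standalone sketch of
that kind of substring parsing. It is not the FileUpload source: the class
name FileNameSketch and the method fileNameOf are hypothetical, and the only
detail taken from the snippet above is the offset of 10, which is the length
of the text filename=" (including the opening quote).

```java
final class FileNameSketch {
    // The marker that precedes the file name in a Content-Disposition value.
    // Its length is 10, matching the "start + 10" in the snippet above.
    private static final String KEY = "filename=\"";

    // Hypothetical helper: returns the text between filename=" and the next
    // double quote, or null if either delimiter is missing.
    static String fileNameOf(String contentDisposition) {
        int start = contentDisposition.indexOf(KEY);
        if (start == -1) {
            return null;
        }
        int end = contentDisposition.indexOf('"', start + KEY.length());
        if (end == -1) {
            return null;
        }
        return contentDisposition.substring(start + KEY.length(), end).trim();
    }

    public static void main(String[] args) {
        String cd = "form-data; name=\"oFile1\"; filename=\"C:\\Desktop\\???.txt\"";
        System.out.println(fileNameOf(cd));
    }
}
```

A parse of this shape only copies characters out of the header value, so if
the header already contains question marks by the time it is parsed, the
returned name contains them too.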

The following is the multipart request that IE sends when the file name
contains umlauts:

POST /jsp/upload.jsp HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
powerpoint, application/, application/msword,
e-flash, */*
Referer: http://localhost:8080/roberttest/rptest.html
Accept-Language: en-us
Content-Type: multipart/form-data;
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)
Host: localhost:2000
Content-Length: 349
Connection: Keep-Alive
Cache-Control: no-cache

Content-Disposition: form-data; name="oFile1"; filename="C:\Documents and
Content-Type: application/octet-stream

Content-Disposition: form-data; name="TestValue"
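
For comparison, the question marks in the file name above look like what Java
produces when text is encoded into a charset that cannot represent it. The
sketch below only illustrates that effect; whether the browser does something
equivalent is an assumption, and the class name LossyEncodingSketch and the
method roundTrip are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class LossyEncodingSketch {
    // Encodes the name into the given charset and decodes it again.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // (a question mark for US-ASCII) for characters it cannot represent.
    static String roundTrip(String name, Charset charset) {
        return new String(name.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Three a-umlauts, written as Unicode escapes so the source file
        // encoding does not matter.
        String name = "\u00e4\u00e4\u00e4.txt";

        // US-ASCII cannot represent the umlauts, so they are lost.
        System.out.println(roundTrip(name, StandardCharsets.US_ASCII));

        // UTF-8 can represent them, so the name survives unchanged.
        System.out.println(roundTrip(name, StandardCharsets.UTF_8).equals(name));
    }
}
```

Once a character has been replaced this way, the original cannot be recovered
from the bytes on the wire, so no change in how the server parses the header
would bring the umlauts back.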

