commons-user mailing list archives

From "Brent Kynaston" <bkynas...@trivir.com>
Subject Re: FileUpload: Failure to parse non-ascii character sets
Date Tue, 21 Jun 2005 00:49:32 GMT
Ronald,

That did it!  I had previously tried setting the encoding type on the
PortletFileUpload object without success - but actually pulling the
FileItem string value by specifying encoding type worked like a champ!

Thanks a million!

--Brent

>>> ronald-freebsd8@klop.yi.org 6/20/2005 8:26:42 PM >>>
On Tue, 21 Jun 2005 01:10:06 +0200, Brent Kynaston <bkynaston@trivir.com> wrote:

> Ron,
>
> Below you'll find the headers for two separate tests.  The first test
> does not use the PortletFileUpload class; it simply uses a standard
> form with DocTitle data (and other form data) coming in as
> x-www-form-urlencoded.  Note the values after "ztest3".  These values
> are Russian Cyrillic characters that were posted successfully and stored
> in a database.
>
> In the second test, we used the PortletFileUpload API to parse the
> multipart/form-data.  Here we insert ztest5 into the DocTitle field,
> followed by Cyrillic characters again.  This time, however, we do not
> receive the proper DocTitle value from the FileUpload parser.
>
> Here is the header and data from the first test:
> *----------------------------------------------
> No.     Time        Source                Destination           Protocol  Info
>     332 22.560146   192.168.189.1         192.168.189.201       HTTP      POST /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff HTTP/1.1 (application/x-www-form-urlencoded)
>
> Frame 332 (1035 bytes on wire, 1035 bytes captured)
> Ethernet II, Src: 00:50:56:c0:00:08, Dst: 00:0c:29:cc:ba:ff
> Internet Protocol, Src Addr: 192.168.189.1 (192.168.189.1), Dst Addr: 192.168.189.201 (192.168.189.201)
> Transmission Control Protocol, Src Port: 3293 (3293), Dst Port: http (80), Seq: 1, Ack: 1, Len: 981
> Hypertext Transfer Protocol
>     POST /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff HTTP/1.1\r\n
>         Request Method: POST
>         Request URI: /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff
>         Request Version: HTTP/1.1
>     Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*\r\n
>     Referer: http://192.168.189.201/GLPNetPortal/portal/portlet/COPDocuments?novl-inst=c373e902f9802206764b000c296f1d50\r\n
>     Accept-Language: en-us,ru;q=0.5\r\n
>     Content-Type: application/x-www-form-urlencoded\r\n
>     Accept-Encoding: gzip, deflate\r\n
>     User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)\r\n
>     Host: 192.168.189.201\r\n
>     Content-Length: 207\r\n
>     Connection: Keep-Alive\r\n
>     Cache-Control: no-cache\r\n
>     Cookie: JSESSIONID=994b099daff9dcf29b1956bfa65f9a57\r\n
>     \r\n
> Line-based text data: application/x-www-form-urlencoded
>     DocTitle=ztest3%D1%84%D1%8B%D0%B2%D0%B0%D1%84%D1%8B%D0%B2%D0%B0&fileData=&DocDesc=Please+add+a+description&viewable=on&peer-review=on&DocID=c373e90485927c3d323c000c29ccbaff&charset=UTF-8&updateWebLink=Submit
> *----------------------------------------------
>
> Here is the header and data for the second test (using
> multipart/form-data):
> *----------------------------------------------
> No.     Time        Source                Destination           Protocol  Info
>    1362 466.405031  192.168.189.1         192.168.189.201       HTTP      POST /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff HTTP/1.1
>
> Frame 1362 (866 bytes on wire, 866 bytes captured)
> Ethernet II, Src: 00:50:56:c0:00:08, Dst: 00:0c:29:cc:ba:ff
> Internet Protocol, Src Addr: 192.168.189.1 (192.168.189.1), Dst Addr: 192.168.189.201 (192.168.189.201)
> Transmission Control Protocol, Src Port: 3533 (3533), Dst Port: http (80), Seq: 1, Ack: 1, Len: 812
> Hypertext Transfer Protocol
>     POST /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff HTTP/1.1\r\n
>         Request Method: POST
>         Request URI: /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff
>         Request Version: HTTP/1.1
>     Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*\r\n
>     Referer: http://192.168.189.201/GLPNetPortal/portal/portlet/COPDocuments?novl-inst=c373e902f9802206764b000c296f1d50\r\n
>     Accept-Language: en-us,ru;q=0.5\r\n
>     Content-Type: multipart/form-data; boundary=---------------------------7d53891713065c\r\n
>     Accept-Encoding: gzip, deflate\r\n
>     User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)\r\n
>     Host: 192.168.189.201\r\n
>     Content-Length: 991\r\n
>     Connection: Keep-Alive\r\n
>     Cache-Control: no-cache\r\n
>     Cookie: JSESSIONID=994b099daff9dcf29b1956bfa65f9a57\r\n
>     \r\n
>
> No.     Time        Source                Destination           Protocol  Info
>    1363 466.405043  192.168.189.1         192.168.189.201       HTTP      Continuation or non-HTTP traffic
>
> Frame 1363 (1045 bytes on wire, 1045 bytes captured)
> Ethernet II, Src: 00:50:56:c0:00:08, Dst: 00:0c:29:cc:ba:ff
> Internet Protocol, Src Addr: 192.168.189.1 (192.168.189.1), Dst Addr: 192.168.189.201 (192.168.189.201)
> Transmission Control Protocol, Src Port: 3533 (3533), Dst Port: http (80), Seq: 813, Ack: 1, Len: 991
> Hypertext Transfer Protocol
>     Data (991 bytes)
>
> 0000  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 0010  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 37 64 35  -------------7d5
> 0020  33 38 39 31 37 31 33 30 36 35 63 0d 0a 43 6f 6e  3891713065c..Con
> 0030  74 65 6e 74 2d 44 69 73 70 6f 73 69 74 69 6f 6e  tent-Disposition
> 0040  3a 20 66 6f 72 6d 2d 64 61 74 61 3b 20 6e 61 6d  : form-data; nam
> 0050  65 3d 22 44 6f 63 54 69 74 6c 65 22 0d 0a 0d 0a  e="DocTitle"....
> 0060  7a 74 65 73 74 35 d1 84 d1 8b d0 b2 d0 b0 d1 84  ztest5..........
> 0070  d0 b2 d1 8b d0 b0 d1 84 d1 8b d0 b2 d0 b0 0d 0a  ................
> 0080  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 0090  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 37 64 35  -------------7d5
> 00a0  33 38 39 31 37 31 33 30 36 35 63 0d 0a 43 6f 6e  3891713065c..Con
> 00b0  74 65 6e 74 2d 44 69 73 70 6f 73 69 74 69 6f 6e  tent-Disposition
> 00c0  3a 20 66 6f 72 6d 2d 64 61 74 61 3b 20 6e 61 6d  : form-data; nam
> 00d0  65 3d 22 66 69 6c 65 44 61 74 61 22 3b 20 66 69  e="fileData"; fi
> 00e0  6c 65 6e 61 6d 65 3d 22 22 0d 0a 43 6f 6e 74 65  lename=""..Conte
> 00f0  6e 74 2d 54 79 70 65 3a 20 61 70 70 6c 69 63 61  nt-Type: applica
> 0100  74 69 6f 6e 2f 6f 63 74 65 74 2d 73 74 72 65 61  tion/octet-strea
> 0110  6d 0d 0a 0d 0a 0d 0a 2d 2d 2d 2d 2d 2d 2d 2d 2d  m......---------
> 0120  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 0130  2d 2d 2d 2d 37 64 35 33 38 39 31 37 31 33 30 36  ----7d5389171306
> 0140  35 63 0d 0a 43 6f 6e 74 65 6e 74 2d 44 69 73 70  5c..Content-Disp
> 0150  6f 73 69 74 69 6f 6e 3a 20 66 6f 72 6d 2d 64 61  osition: form-da
> 0160  74 61 3b 20 6e 61 6d 65 3d 22 44 6f 63 44 65 73  ta; name="DocDes
> 0170  63 22 0d 0a 0d 0a 50 6c 65 61 73 65 20 61 64 64  c"....Please add
> 0180  20 61 20 64 65 73 63 72 69 70 74 69 6f 6e 0d 0a   a description..
> 0190  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 01a0  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 37 64 35  -------------7d5
> 01b0  33 38 39 31 37 31 33 30 36 35 63 0d 0a 43 6f 6e  3891713065c..Con
> 01c0  74 65 6e 74 2d 44 69 73 70 6f 73 69 74 69 6f 6e  tent-Disposition
> 01d0  3a 20 66 6f 72 6d 2d 64 61 74 61 3b 20 6e 61 6d  : form-data; nam
> 01e0  65 3d 22 76 69 65 77 61 62 6c 65 22 0d 0a 0d 0a  e="viewable"....
> 01f0  6f 6e 0d 0a 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  on..------------
> 0200  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 0210  2d 37 64 35 33 38 39 31 37 31 33 30 36 35 63 0d  -7d53891713065c.
> 0220  0a 43 6f 6e 74 65 6e 74 2d 44 69 73 70 6f 73 69  .Content-Disposi
> 0230  74 69 6f 6e 3a 20 66 6f 72 6d 2d 64 61 74 61 3b  tion: form-data;
> 0240  20 6e 61 6d 65 3d 22 70 65 65 72 2d 72 65 76 69   name="peer-revi
> 0250  65 77 22 0d 0a 0d 0a 6f 6e 0d 0a 2d 2d 2d 2d 2d  ew"....on..-----
> 0260  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 0270  2d 2d 2d 2d 2d 2d 2d 2d 37 64 35 33 38 39 31 37  --------7d538917
> 0280  31 33 30 36 35 63 0d 0a 43 6f 6e 74 65 6e 74 2d  13065c..Content-
> 0290  44 69 73 70 6f 73 69 74 69 6f 6e 3a 20 66 6f 72  Disposition: for
> 02a0  6d 2d 64 61 74 61 3b 20 6e 61 6d 65 3d 22 44 6f  m-data; name="Do
> 02b0  63 49 44 22 0d 0a 0d 0a 63 33 37 33 65 39 30 34  cID"....c373e904
> 02c0  38 35 39 32 37 63 33 64 33 32 33 63 30 30 30 63  85927c3d323c000c
> 02d0  32 39 63 63 62 61 66 66 0d 0a 2d 2d 2d 2d 2d 2d  29ccbaff..------
> 02e0  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 02f0  2d 2d 2d 2d 2d 2d 2d 37 64 35 33 38 39 31 37 31  -------7d5389171
> 0300  33 30 36 35 63 0d 0a 43 6f 6e 74 65 6e 74 2d 44  3065c..Content-D
> 0310  69 73 70 6f 73 69 74 69 6f 6e 3a 20 66 6f 72 6d  isposition: form
> 0320  2d 64 61 74 61 3b 20 6e 61 6d 65 3d 22 64 6f 63  -data; name="doc
> 0330  46 69 6c 65 4e 61 6d 65 22 0d 0a 0d 0a 4b 45 59  FileName"....KEY
> 0340  53 0d 0a 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  S..-------------
> 0350  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 0360  37 64 35 33 38 39 31 37 31 33 30 36 35 63 0d 0a  7d53891713065c..
> 0370  43 6f 6e 74 65 6e 74 2d 44 69 73 70 6f 73 69 74  Content-Disposit
> 0380  69 6f 6e 3a 20 66 6f 72 6d 2d 64 61 74 61 3b 20  ion: form-data; 
> 0390  6e 61 6d 65 3d 22 75 70 64 61 74 65 57 65 62 4c  name="updateWebL
> 03a0  69 6e 6b 22 0d 0a 0d 0a 53 75 62 6d 69 74 0d 0a  ink"....Submit..
> 03b0  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 03c0  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 37 64 35  -------------7d5
> 03d0  33 38 39 31 37 31 33 30 36 35 63 2d 2d 0d 0a     3891713065c--..
> *----------------------------------------------
>
> Thanks,
>
> Brent
>
>>>> ronald-freebsd8@klop.yi.org 6/20/2005 6:36:33 PM >>>
> On Tue, 21 Jun 2005 00:24:52 +0200, Brent Kynaston <bkynaston@trivir.com> wrote:
>
>> Ronald,
>>
>> Thanks for the quick response.
>>
>> Here is the HTTP header (captured by Ethereal) from the post where I've
>> inserted some Finnish data for one of the fields:
>>
>> *-----------------------------------------------------
>> Frame 161 (856 bytes on wire, 856 bytes captured)
>> Ethernet II, Src: 00:50:56:c0:00:08, Dst: 00:0c:29:cc:ba:ff
>> Internet Protocol, Src Addr: 192.168.189.1 (192.168.189.1), Dst Addr: 192.168.189.201 (192.168.189.201)
>> Transmission Control Protocol, Src Port: 2631 (2631), Dst Port: http (80), Seq: 1, Ack: 1, Len: 802
>> Hypertext Transfer Protocol
>>     POST /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff HTTP/1.1\r\n
>>         Request Method: POST
>>         Request URI: /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff
>>         Request Version: HTTP/1.1
>>     Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*\r\n
>>     Referer: http://192.168.189.201/GLPNetPortal/portal/portlet/COPDocuments?novl-inst=c373e902f9802206764b000c296f1d50\r\n
>>     Accept-Language: en-us\r\n
>>     Content-Type: multipart/form-data; boundary=---------------------------7d522e2ec0e8a\r\n
>>     Accept-Encoding: gzip, deflate\r\n
>>     User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)\r\n
>>     Host: 192.168.189.201\r\n
>>     Content-Length: 968\r\n
>>     Connection: Keep-Alive\r\n
>>     Cache-Control: no-cache\r\n
>>     Cookie: JSESSIONID=aa88969a240158baa93362f89c55e4f3\r\n
>>     \r\n
>> *-----------------------------------------------------
>>
>> Thanks,
>>
>> Brent
>>
>>>>> ronald-freebsd8@klop.yi.org 6/20/2005 5:31:19 PM >>>
>> On Mon, 20 Jun 2005 22:29:34 +0200, Brent Kynaston <bkynaston@trivir.com> wrote:
>>
>>> I'm trying to post a multi-part form with file data and text input
>>> fields.  The Portlet FileUpload code is able to successfully parse the
>>> file data and text fields, except when I change my keyboard type to
>>> Finnish, Arabic, or any other non-English language.
>>>
>>> I've specified an http meta-equiv with UTF-8:
>>> META http-equiv="Content-Type" content="text/html; charset=UTF-8"
>>>
>>> I've tried setting the PortletFileUpload class instance to various
>>> encoding types, and have not been able to get it to work.  Is this
>>> broken in the current builds of commons-fileupload-1.1-dev.jar?
>>
>> Post a dump of the headers going over the wire. (See ngrep, ethereal or
>> another network sniffer.)
>
> A multipart/form-data post contains more headers in the body of the
> request. Those are the interesting ones.
> They are easiest to see with no file or a very small file upload.


Did you try FileItem.getString(String encoding)? getString("UTF-8") in
this case.
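
For archive readers, here is a minimal sketch of why the explicit encoding matters. It uses only the JDK, not FileUpload itself. The class name DecodeDemo and its two helper methods are made up for illustration, and the bytes are copied from the start of the DocTitle part in the capture quoted above (offsets 0060-006d). The assumption, which fits the symptoms but depends on the FileUpload version and platform, is that the no-argument getString() falls back to a single-byte default such as ISO-8859-1 when the part carries no charset of its own.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of commons-fileupload.
class DecodeDemo {
    // "ztest5" followed by the first four Cyrillic letters, as raw bytes from the capture.
    static final byte[] RAW = {
        0x7a, 0x74, 0x65, 0x73, 0x74, 0x35,
        (byte) 0xd1, (byte) 0x84, (byte) 0xd1, (byte) 0x8b,
        (byte) 0xd0, (byte) 0xb2, (byte) 0xd0, (byte) 0xb0
    };

    // Assumed behaviour of a single-byte fallback: every byte becomes one char.
    static String decodeLatin1() {
        return new String(RAW, StandardCharsets.ISO_8859_1);
    }

    // What getString("UTF-8") amounts to: each two-byte sequence becomes one letter.
    static String decodeUtf8() {
        return new String(RAW, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Lengths differ because the single-byte decode splits each Cyrillic letter in two.
        System.out.println("ISO-8859-1 length: " + decodeLatin1().length());
        System.out.println("UTF-8 length:      " + decodeUtf8().length());
        // Escapes are used so the source file stays pure ASCII.
        System.out.println("UTF-8 matches expected text: "
                + decodeUtf8().equals("ztest5\u0444\u044b\u0432\u0430"));
    }
}
```

The browser sent well-formed UTF-8 in the multipart body, so nothing is lost on the wire; the only question is which charset is used when the bytes are turned back into a String.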


-- 
  Ronald Klop
  Amsterdam, The Netherlands

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org 
For additional commands, e-mail: commons-user-help@jakarta.apache.org 

