commons-user mailing list archives

From "Brent Kynaston" <bkynas...@trivir.com>
Subject Re: FileUpload: Failure to parse non-ascii character sets
Date Mon, 20 Jun 2005 23:10:06 GMT
Ron,

Below you'll find the headers for two separate tests.  The first test does not use the PortletFileUpload
class; it simply uses a standard form with DocTitle data (and other form data) coming in as
application/x-www-form-urlencoded.  Note the values after "ztest3".  These values are Russian Cyrillic
characters that were posted successfully and stored in a database.

In the second test, we used the PortletFileUpload API to parse the multipart/form-data.  Here
we insert ztest5 into the DocTitle field, followed by Cyrillic characters again.  This time,
however, we do not receive the proper DocTitle value from the FileUpload parser.

Here is the header and data from the first test:
*----------------------------------------------
No.     Time        Source                Destination           Protocol Info
    332 22.560146   192.168.189.1         192.168.189.201       HTTP     POST /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff HTTP/1.1 (application/x-www-form-urlencoded)

Frame 332 (1035 bytes on wire, 1035 bytes captured)
Ethernet II, Src: 00:50:56:c0:00:08, Dst: 00:0c:29:cc:ba:ff
Internet Protocol, Src Addr: 192.168.189.1 (192.168.189.1), Dst Addr: 192.168.189.201 (192.168.189.201)
Transmission Control Protocol, Src Port: 3293 (3293), Dst Port: http (80), Seq: 1, Ack: 1, Len: 981
Hypertext Transfer Protocol
    POST /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff HTTP/1.1\r\n
        Request Method: POST
        Request URI: /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff
        Request Version: HTTP/1.1
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*\r\n
    Referer: http://192.168.189.201/GLPNetPortal/portal/portlet/COPDocuments?novl-inst=c373e902f9802206764b000c296f1d50\r\n
    Accept-Language: en-us,ru;q=0.5\r\n
    Content-Type: application/x-www-form-urlencoded\r\n
    Accept-Encoding: gzip, deflate\r\n
    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)\r\n
    Host: 192.168.189.201\r\n
    Content-Length: 207\r\n
    Connection: Keep-Alive\r\n
    Cache-Control: no-cache\r\n
    Cookie: JSESSIONID=994b099daff9dcf29b1956bfa65f9a57\r\n
    \r\n
Line-based text data: application/x-www-form-urlencoded
    DocTitle=ztest3%D1%84%D1%8B%D0%B2%D0%B0%D1%84%D1%8B%D0%B2%D0%B0&fileData=&DocDesc=Please+add+a+description&viewable=on&peer-review=on&DocID=c373e90485927c3d323c000c29ccbaff&charset=UTF-8&updateWebLink=Submit
*----------------------------------------------
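
For reference, the DocTitle value in this first capture is plain percent-encoded UTF-8. The following is a minimal sketch using only the JDK's java.net.URLDecoder; the class name FirstTestDecode is made up for illustration and is not part of the portlet or of FileUpload. It only shows that the bytes on the wire decode to Cyrillic when UTF-8 is assumed; how the portal itself decodes the parameter is not visible in the capture.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper for illustration only.
class FirstTestDecode {
    // DocTitle value copied from the urlencoded request body above.
    static final String RAW = "ztest3%D1%84%D1%8B%D0%B2%D0%B0%D1%84%D1%8B%D0%B2%D0%B0";

    // Percent-decodes the value, interpreting the resulting bytes as UTF-8.
    static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(RAW));
    }
}
```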

Here is the header and data for the second test (using multipart/form-data):
*----------------------------------------------
No.     Time        Source                Destination           Protocol Info
   1362 466.405031  192.168.189.1         192.168.189.201       HTTP     POST /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff HTTP/1.1

Frame 1362 (866 bytes on wire, 866 bytes captured)
Ethernet II, Src: 00:50:56:c0:00:08, Dst: 00:0c:29:cc:ba:ff
Internet Protocol, Src Addr: 192.168.189.1 (192.168.189.1), Dst Addr: 192.168.189.201 (192.168.189.201)
Transmission Control Protocol, Src Port: 3533 (3533), Dst Port: http (80), Seq: 1, Ack: 1, Len: 812
Hypertext Transfer Protocol
    POST /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff HTTP/1.1\r\n
        Request Method: POST
        Request URI: /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff
        Request Version: HTTP/1.1
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*\r\n
    Referer: http://192.168.189.201/GLPNetPortal/portal/portlet/COPDocuments?novl-inst=c373e902f9802206764b000c296f1d50\r\n
    Accept-Language: en-us,ru;q=0.5\r\n
    Content-Type: multipart/form-data; boundary=---------------------------7d53891713065c\r\n
    Accept-Encoding: gzip, deflate\r\n
    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)\r\n
    Host: 192.168.189.201\r\n
    Content-Length: 991\r\n
    Connection: Keep-Alive\r\n
    Cache-Control: no-cache\r\n
    Cookie: JSESSIONID=994b099daff9dcf29b1956bfa65f9a57\r\n
    \r\n

No.     Time        Source                Destination           Protocol Info
   1363 466.405043  192.168.189.1         192.168.189.201       HTTP     Continuation or non-HTTP traffic

Frame 1363 (1045 bytes on wire, 1045 bytes captured)
Ethernet II, Src: 00:50:56:c0:00:08, Dst: 00:0c:29:cc:ba:ff
Internet Protocol, Src Addr: 192.168.189.1 (192.168.189.1), Dst Addr: 192.168.189.201 (192.168.189.201)
Transmission Control Protocol, Src Port: 3533 (3533), Dst Port: http (80), Seq: 813, Ack: 1, Len: 991
Hypertext Transfer Protocol
    Data (991 bytes)

0000  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d   ----------------
0010  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 37 64 35   -------------7d5
0020  33 38 39 31 37 31 33 30 36 35 63 0d 0a 43 6f 6e   3891713065c..Con
0030  74 65 6e 74 2d 44 69 73 70 6f 73 69 74 69 6f 6e   tent-Disposition
0040  3a 20 66 6f 72 6d 2d 64 61 74 61 3b 20 6e 61 6d   : form-data; nam
0050  65 3d 22 44 6f 63 54 69 74 6c 65 22 0d 0a 0d 0a   e="DocTitle"....
0060  7a 74 65 73 74 35 d1 84 d1 8b d0 b2 d0 b0 d1 84   ztest5..........
0070  d0 b2 d1 8b d0 b0 d1 84 d1 8b d0 b2 d0 b0 0d 0a   ................
0080  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d   ----------------
0090  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 37 64 35   -------------7d5
00a0  33 38 39 31 37 31 33 30 36 35 63 0d 0a 43 6f 6e   3891713065c..Con
00b0  74 65 6e 74 2d 44 69 73 70 6f 73 69 74 69 6f 6e   tent-Disposition
00c0  3a 20 66 6f 72 6d 2d 64 61 74 61 3b 20 6e 61 6d   : form-data; nam
00d0  65 3d 22 66 69 6c 65 44 61 74 61 22 3b 20 66 69   e="fileData"; fi
00e0  6c 65 6e 61 6d 65 3d 22 22 0d 0a 43 6f 6e 74 65   lename=""..Conte
00f0  6e 74 2d 54 79 70 65 3a 20 61 70 70 6c 69 63 61   nt-Type: applica
0100  74 69 6f 6e 2f 6f 63 74 65 74 2d 73 74 72 65 61   tion/octet-strea
0110  6d 0d 0a 0d 0a 0d 0a 2d 2d 2d 2d 2d 2d 2d 2d 2d   m......---------
0120  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d   ----------------
0130  2d 2d 2d 2d 37 64 35 33 38 39 31 37 31 33 30 36   ----7d5389171306
0140  35 63 0d 0a 43 6f 6e 74 65 6e 74 2d 44 69 73 70   5c..Content-Disp
0150  6f 73 69 74 69 6f 6e 3a 20 66 6f 72 6d 2d 64 61   osition: form-da
0160  74 61 3b 20 6e 61 6d 65 3d 22 44 6f 63 44 65 73   ta; name="DocDes
0170  63 22 0d 0a 0d 0a 50 6c 65 61 73 65 20 61 64 64   c"....Please add
0180  20 61 20 64 65 73 63 72 69 70 74 69 6f 6e 0d 0a    a description..
0190  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d   ----------------
01a0  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 37 64 35   -------------7d5
01b0  33 38 39 31 37 31 33 30 36 35 63 0d 0a 43 6f 6e   3891713065c..Con
01c0  74 65 6e 74 2d 44 69 73 70 6f 73 69 74 69 6f 6e   tent-Disposition
01d0  3a 20 66 6f 72 6d 2d 64 61 74 61 3b 20 6e 61 6d   : form-data; nam
01e0  65 3d 22 76 69 65 77 61 62 6c 65 22 0d 0a 0d 0a   e="viewable"....
01f0  6f 6e 0d 0a 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d   on..------------
0200  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d   ----------------
0210  2d 37 64 35 33 38 39 31 37 31 33 30 36 35 63 0d   -7d53891713065c.
0220  0a 43 6f 6e 74 65 6e 74 2d 44 69 73 70 6f 73 69   .Content-Disposi
0230  74 69 6f 6e 3a 20 66 6f 72 6d 2d 64 61 74 61 3b   tion: form-data;
0240  20 6e 61 6d 65 3d 22 70 65 65 72 2d 72 65 76 69    name="peer-revi
0250  65 77 22 0d 0a 0d 0a 6f 6e 0d 0a 2d 2d 2d 2d 2d   ew"....on..-----
0260  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d   ----------------
0270  2d 2d 2d 2d 2d 2d 2d 2d 37 64 35 33 38 39 31 37   --------7d538917
0280  31 33 30 36 35 63 0d 0a 43 6f 6e 74 65 6e 74 2d   13065c..Content-
0290  44 69 73 70 6f 73 69 74 69 6f 6e 3a 20 66 6f 72   Disposition: for
02a0  6d 2d 64 61 74 61 3b 20 6e 61 6d 65 3d 22 44 6f   m-data; name="Do
02b0  63 49 44 22 0d 0a 0d 0a 63 33 37 33 65 39 30 34   cID"....c373e904
02c0  38 35 39 32 37 63 33 64 33 32 33 63 30 30 30 63   85927c3d323c000c
02d0  32 39 63 63 62 61 66 66 0d 0a 2d 2d 2d 2d 2d 2d   29ccbaff..------
02e0  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d   ----------------
02f0  2d 2d 2d 2d 2d 2d 2d 37 64 35 33 38 39 31 37 31   -------7d5389171
0300  33 30 36 35 63 0d 0a 43 6f 6e 74 65 6e 74 2d 44   3065c..Content-D
0310  69 73 70 6f 73 69 74 69 6f 6e 3a 20 66 6f 72 6d   isposition: form
0320  2d 64 61 74 61 3b 20 6e 61 6d 65 3d 22 64 6f 63   -data; name="doc
0330  46 69 6c 65 4e 61 6d 65 22 0d 0a 0d 0a 4b 45 59   FileName"....KEY
0340  53 0d 0a 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d   S..-------------
0350  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d   ----------------
0360  37 64 35 33 38 39 31 37 31 33 30 36 35 63 0d 0a   7d53891713065c..
0370  43 6f 6e 74 65 6e 74 2d 44 69 73 70 6f 73 69 74   Content-Disposit
0380  69 6f 6e 3a 20 66 6f 72 6d 2d 64 61 74 61 3b 20   ion: form-data; 
0390  6e 61 6d 65 3d 22 75 70 64 61 74 65 57 65 62 4c   name="updateWebL
03a0  69 6e 6b 22 0d 0a 0d 0a 53 75 62 6d 69 74 0d 0a   ink"....Submit..
03b0  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d   ----------------
03c0  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 37 64 35   -------------7d5
03d0  33 38 39 31 37 31 33 30 36 35 63 2d 2d 0d 0a      3891713065c--..
*----------------------------------------------
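
The DocTitle bytes in the dump above (offsets 0x0060 to 0x007d) are valid UTF-8, so the browser appears to be sending the data correctly. The sketch below uses only the JDK and decodes those same bytes two ways. It assumes, and this is only an assumption that the capture does not confirm, that the bad value comes from the bytes being read with a single-byte default such as ISO-8859-1. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not FileUpload code.
class DocTitleBytes {
    // Body of the DocTitle part, copied from the hex dump above.
    static final byte[] RAW =
            hex("7a7465737435d184d18bd0b2d0b0d184d0b2d18bd0b0d184d18bd0b2d0b0");

    // Converts a string of hex digit pairs into bytes.
    static byte[] hex(String s) {
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(s.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Decodes the captured bytes with the given charset.
    static String decode(Charset cs) {
        return new String(RAW, cs);
    }

    // Recovers the text from a string that was wrongly decoded as ISO-8859-1.
    // This works because ISO-8859-1 maps every byte to exactly one char.
    static String repair(String misdecoded) {
        return new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String utf8 = decode(StandardCharsets.UTF_8);
        String latin1 = decode(StandardCharsets.ISO_8859_1);
        System.out.println(utf8 + " (" + utf8.length() + " chars)");
        System.out.println(latin1.length() + " chars when read as ISO-8859-1");
        System.out.println(repair(latin1).equals(utf8));
    }
}
```

If the value returned by the parser has 30 characters instead of 18, that would be consistent with this single-byte reading, and asking the parser for UTF-8 explicitly would be the thing to try.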

Thanks,

Brent

>>> ronald-freebsd8@klop.yi.org 6/20/2005 6:36:33 PM >>>
On Tue, 21 Jun 2005 00:24:52 +0200, Brent Kynaston <bkynaston@trivir.com>  
wrote:

> Ronald,
>
> Thanks for the quick response.
>
> Here is the HTTP header (captured by Ethereal) from the post where I've  
> inserted some Finnish data for one of the fields:
>
> *-----------------------------------------------------
> Frame 161 (856 bytes on wire, 856 bytes captured)
> Ethernet II, Src: 00:50:56:c0:00:08, Dst: 00:0c:29:cc:ba:ff
> Internet Protocol, Src Addr: 192.168.189.1 (192.168.189.1), Dst Addr: 192.168.189.201 (192.168.189.201)
> Transmission Control Protocol, Src Port: 2631 (2631), Dst Port: http (80), Seq: 1, Ack: 1, Len: 802
> Hypertext Transfer Protocol
>     POST /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff HTTP/1.1\r\n
>         Request Method: POST
>         Request URI: /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff
>         Request Version: HTTP/1.1
>     Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*\r\n
>     Referer: http://192.168.189.201/GLPNetPortal/portal/portlet/COPDocuments?novl-inst=c373e902f9802206764b000c296f1d50\r\n
>     Accept-Language: en-us\r\n
>     Content-Type: multipart/form-data; boundary=---------------------------7d522e2ec0e8a\r\n
>     Accept-Encoding: gzip, deflate\r\n
>     User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)\r\n
>     Host: 192.168.189.201\r\n
>     Content-Length: 968\r\n
>     Connection: Keep-Alive\r\n
>     Cache-Control: no-cache\r\n
>     Cookie: JSESSIONID=aa88969a240158baa93362f89c55e4f3\r\n
>     \r\n
> *-----------------------------------------------------
>
> Thanks,
>
> Brent
>
>>>> ronald-freebsd8@klop.yi.org 6/20/2005 5:31:19 PM >>>
> On Mon, 20 Jun 2005 22:29:34 +0200, Brent Kynaston <bkynaston@trivir.com>
> wrote:
>
>> I'm trying to post a multi-part form with file data and text input
>> files.  The Portlet FileUpload code is able to successfully parse the
>> file data and text fields, except for when I change my keyboard type to
>> Finnish, Arabic, or any foreign language for that matter.
>>
>> I've specified a meta http-equiv tag with UTF-8:
>> <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
>>
>> I've tried setting the PortletFileUpload class instance to various
>> encoding types, and have not been able to get it to work.  Is this
>> broken in the current builds of commons-fileupload-1.1-dev.jar?
>
> Post a dump of the headers going over the wire. (See ngrep, ethereal or
> another network sniffer.)

A multipart/form-data post contains more headers in the body of the
request. Those are the interesting ones.
They are easiest to see with no file or a very small file upload.
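
As a rough illustration of what to look for in those per-part headers, the sketch below checks whether a part declares a charset on its own Content-Type header. It uses only the JDK; the class name, the method name, and the sample header with charset=UTF-8 are hypothetical. A text field sent by a browser typically carries just a Content-Disposition header, in which case the receiving code has no declared encoding to go on.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not FileUpload code.
class PartHeaderCharset {
    // Returns the charset parameter of the part's Content-Type header,
    // or null when the part has no Content-Type or no charset parameter.
    // Assumes ASCII headers separated by CRLF.
    static String charsetOf(String partHeaders) {
        for (String line : partHeaders.split("\r\n")) {
            String lower = line.toLowerCase(Locale.ROOT);
            if (!lower.startsWith("content-type:")) {
                continue;
            }
            int idx = lower.indexOf("charset=");
            if (idx < 0) {
                return null;
            }
            String value = line.substring(idx + "charset=".length()).trim();
            int semi = value.indexOf(';');
            if (semi >= 0) {
                value = value.substring(0, semi).trim();
            }
            // Strip optional surrounding double quotes.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            return value;
        }
        return null;
    }

    public static void main(String[] args) {
        // A text field with only a Content-Disposition header.
        System.out.println(charsetOf("Content-Disposition: form-data; name=\"DocTitle\""));
        // Made-up example of a part that does declare its charset.
        System.out.println(charsetOf("Content-Disposition: form-data; name=\"DocTitle\"\r\n"
                + "Content-Type: text/plain; charset=UTF-8"));
    }
}
```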

Ronald.

-- 
  Ronald Klop
  Amsterdam, The Netherlands

---------------------------------------------------------------------
To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org 
For additional commands, e-mail: commons-user-help@jakarta.apache.org 

