To: "Jakarta Commons Users List"
Subject: Re: FileUpload: Failure to parse non-ascii character sets
Date: Tue, 21 Jun 2005 02:26:42 +0200
From: "Ronald Klop"

On Tue, 21 Jun 2005 01:10:06 +0200, Brent Kynaston wrote:
> Ron,
>
> Below you'll find the headers for two separate tests. The first test
> does not use the PortletFileUpload class; it simply uses a standard
> form with DocTitle data (and other form data) coming in as
> x-www-form-urlencoded. Note the values after "ztest3". These values
> are Russian Cyrillic characters that were posted successfully and stored
> in a database.
>
> In the second test, we used the PortletFileUpload API to parse the
> multipart/form-data. Here we insert ztest5 into the DocTitle field,
> followed by Cyrillic characters again. This time, however, we do not
> receive the proper DocTitle value from the FileUpload parser.
>
> Here is the header and data from the first test:
> *----------------------------------------------
> No. Time Source Destination Protocol
> Info
> 332 22.560146 192.168.189.1 192.168.189.201 HTTP
> POST
> /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff
> HTTP/1.1 (application/x-www-form-urlencoded)
>
> Frame 332 (1035 bytes on wire, 1035 bytes captured)
> Ethernet II, Src: 00:50:56:c0:00:08, Dst: 00:0c:29:cc:ba:ff
> Internet Protocol, Src Addr: 192.168.189.1 (192.168.189.1), Dst Addr:
> 192.168.189.201 (192.168.189.201)
> Transmission Control Protocol, Src Port: 3293 (3293), Dst Port: http
> (80), Seq: 1, Ack: 1, Len: 981
> Hypertext Transfer Protocol
> POST
> /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff
> HTTP/1.1\r\n
> Request Method: POST
> Request URI:
> /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff
> Request Version: HTTP/1.1
> Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
> application/x-shockwave-flash, */*\r\n
> Referer:
> http://192.168.189.201/GLPNetPortal/portal/portlet/COPDocuments?novl-inst=c373e902f9802206764b000c296f1d50\r\n
> Accept-Language: en-us,ru;q=0.5\r\n
> Content-Type: application/x-www-form-urlencoded\r\n
> Accept-Encoding: gzip, deflate\r\n
> User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1;
> .NET CLR 1.1.4322)\r\n
> Host: 192.168.189.201\r\n
> Content-Length: 207\r\n
> Connection: Keep-Alive\r\n
> Cache-Control: no-cache\r\n
> Cookie: JSESSIONID=994b099daff9dcf29b1956bfa65f9a57\r\n
> \r\n
> Line-based text data: application/x-www-form-urlencoded
> DocTitle=ztest3%D1%84%D1%8B%D0%B2%D0%B0%D1%84%D1%8B%D0%B2%D0%B0&fileData=&DocDesc=Please+add+a+description&viewable=on&peer-review=on&DocID=c373e90485927c3d323c000c29ccbaff&charset=UTF-8&updateWebLink=Submit
> *----------------------------------------------
>
> Here is the header and data for the second test (using
> multipart/form-data):
> *----------------------------------------------
> No. Time Source Destination Protocol
> Info
> 1362 466.405031 192.168.189.1 192.168.189.201 HTTP
> POST
> /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff
> HTTP/1.1
>
> Frame 1362 (866 bytes on wire, 866 bytes captured)
> Ethernet II, Src: 00:50:56:c0:00:08, Dst: 00:0c:29:cc:ba:ff
> Internet Protocol, Src Addr: 192.168.189.1 (192.168.189.1), Dst Addr:
> 192.168.189.201 (192.168.189.201)
> Transmission Control Protocol, Src Port: 3533 (3533), Dst Port: http
> (80), Seq: 1, Ack: 1, Len: 812
> Hypertext Transfer Protocol
> POST
> /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff
> HTTP/1.1\r\n
> Request Method: POST
> Request URI:
> /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff
> Request Version: HTTP/1.1
> Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
> application/x-shockwave-flash, */*\r\n
> Referer:
> http://192.168.189.201/GLPNetPortal/portal/portlet/COPDocuments?novl-inst=c373e902f9802206764b000c296f1d50\r\n
> Accept-Language: en-us,ru;q=0.5\r\n
> Content-Type: multipart/form-data;
> boundary=---------------------------7d53891713065c\r\n
> Accept-Encoding: gzip, deflate\r\n
> User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1;
> .NET CLR 1.1.4322)\r\n
> Host: 192.168.189.201\r\n
> Content-Length: 991\r\n
> Connection: Keep-Alive\r\n
> Cache-Control: no-cache\r\n
> Cookie: JSESSIONID=994b099daff9dcf29b1956bfa65f9a57\r\n
> \r\n
>
> No. Time Source Destination Protocol
> Info
> 1363 466.405043 192.168.189.1 192.168.189.201 HTTP
> Continuation or non-HTTP traffic
>
> Frame 1363 (1045 bytes on wire, 1045 bytes captured)
> Ethernet II, Src: 00:50:56:c0:00:08, Dst: 00:0c:29:cc:ba:ff
> Internet Protocol, Src Addr: 192.168.189.1 (192.168.189.1), Dst Addr:
> 192.168.189.201 (192.168.189.201)
> Transmission Control Protocol, Src Port: 3533 (3533), Dst Port: http
> (80), Seq: 813, Ack: 1, Len: 991
> Hypertext Transfer Protocol
> Data (991 bytes)
>
> 0000  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 0010  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 37 64 35  -------------7d5
> 0020  33 38 39 31 37 31 33 30 36 35 63 0d 0a 43 6f 6e  3891713065c..Con
> 0030  74 65 6e 74 2d 44 69 73 70 6f 73 69 74 69 6f 6e  tent-Disposition
> 0040  3a 20 66 6f 72 6d 2d 64 61 74 61 3b 20 6e 61 6d  : form-data; nam
> 0050  65 3d 22 44 6f 63 54 69 74 6c 65 22 0d 0a 0d 0a  e="DocTitle"....
> 0060  7a 74 65 73 74 35 d1 84 d1 8b d0 b2 d0 b0 d1 84  ztest5..........
> 0070  d0 b2 d1 8b d0 b0 d1 84 d1 8b d0 b2 d0 b0 0d 0a  ................
> 0080  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 0090  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 37 64 35  -------------7d5
> 00a0  33 38 39 31 37 31 33 30 36 35 63 0d 0a 43 6f 6e  3891713065c..Con
> 00b0  74 65 6e 74 2d 44 69 73 70 6f 73 69 74 69 6f 6e  tent-Disposition
> 00c0  3a 20 66 6f 72 6d 2d 64 61 74 61 3b 20 6e 61 6d  : form-data; nam
> 00d0  65 3d 22 66 69 6c 65 44 61 74 61 22 3b 20 66 69  e="fileData"; fi
> 00e0  6c 65 6e 61 6d 65 3d 22 22 0d 0a 43 6f 6e 74 65  lename=""..Conte
> 00f0  6e 74 2d 54 79 70 65 3a 20 61 70 70 6c 69 63 61  nt-Type: applica
> 0100  74 69 6f 6e 2f 6f 63 74 65 74 2d 73 74 72 65 61  tion/octet-strea
> 0110  6d 0d 0a 0d 0a 0d 0a 2d 2d 2d 2d 2d 2d 2d 2d 2d  m......---------
> 0120  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 0130  2d 2d 2d 2d 37 64 35 33 38 39 31 37 31 33 30 36  ----7d5389171306
> 0140  35 63 0d 0a 43 6f 6e 74 65 6e 74 2d 44 69 73 70  5c..Content-Disp
> 0150  6f 73 69 74 69 6f 6e 3a 20 66 6f 72 6d 2d 64 61  osition: form-da
> 0160  74 61 3b 20 6e 61 6d 65 3d 22 44 6f 63 44 65 73  ta; name="DocDes
> 0170  63 22 0d 0a 0d 0a 50 6c 65 61 73 65 20 61 64 64  c"....Please add
> 0180  20 61 20 64 65 73 63 72 69 70 74 69 6f 6e 0d 0a   a description..
> 0190  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 01a0  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 37 64 35  -------------7d5
> 01b0  33 38 39 31 37 31 33 30 36 35 63 0d 0a 43 6f 6e  3891713065c..Con
> 01c0  74 65 6e 74 2d 44 69 73 70 6f 73 69 74 69 6f 6e  tent-Disposition
> 01d0  3a 20 66 6f 72 6d 2d 64 61 74 61 3b 20 6e 61 6d  : form-data; nam
> 01e0  65 3d 22 76 69 65 77 61 62 6c 65 22 0d 0a 0d 0a  e="viewable"....
> 01f0  6f 6e 0d 0a 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  on..------------
> 0200  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 0210  2d 37 64 35 33 38 39 31 37 31 33 30 36 35 63 0d  -7d53891713065c.
> 0220  0a 43 6f 6e 74 65 6e 74 2d 44 69 73 70 6f 73 69  .Content-Disposi
> 0230  74 69 6f 6e 3a 20 66 6f 72 6d 2d 64 61 74 61 3b  tion: form-data;
> 0240  20 6e 61 6d 65 3d 22 70 65 65 72 2d 72 65 76 69   name="peer-revi
> 0250  65 77 22 0d 0a 0d 0a 6f 6e 0d 0a 2d 2d 2d 2d 2d  ew"....on..-----
> 0260  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 0270  2d 2d 2d 2d 2d 2d 2d 2d 37 64 35 33 38 39 31 37  --------7d538917
> 0280  31 33 30 36 35 63 0d 0a 43 6f 6e 74 65 6e 74 2d  13065c..Content-
> 0290  44 69 73 70 6f 73 69 74 69 6f 6e 3a 20 66 6f 72  Disposition: for
> 02a0  6d 2d 64 61 74 61 3b 20 6e 61 6d 65 3d 22 44 6f  m-data; name="Do
> 02b0  63 49 44 22 0d 0a 0d 0a 63 33 37 33 65 39 30 34  cID"....c373e904
> 02c0  38 35 39 32 37 63 33 64 33 32 33 63 30 30 30 63  85927c3d323c000c
> 02d0  32 39 63 63 62 61 66 66 0d 0a 2d 2d 2d 2d 2d 2d  29ccbaff..------
> 02e0  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 02f0  2d 2d 2d 2d 2d 2d 2d 37 64 35 33 38 39 31 37 31  -------7d5389171
> 0300  33 30 36 35 63 0d 0a 43 6f 6e 74 65 6e 74 2d 44  3065c..Content-D
> 0310  69 73 70 6f 73 69 74 69 6f 6e 3a 20 66 6f 72 6d  isposition: form
> 0320  2d 64 61 74 61 3b 20 6e 61 6d 65 3d 22 64 6f 63  -data; name="doc
> 0330  46 69 6c 65 4e 61 6d 65 22 0d 0a 0d 0a 4b 45 59  FileName"....KEY
> 0340  53 0d 0a 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  S..-------------
> 0350  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 0360  37 64 35 33 38 39 31 37 31 33 30 36 35 63 0d 0a  7d53891713065c..
> 0370  43 6f 6e 74 65 6e 74 2d 44 69 73 70 6f 73 69 74  Content-Disposit
> 0380  69 6f 6e 3a 20 66 6f 72 6d 2d 64 61 74 61 3b 20  ion: form-data;
> 0390  6e 61 6d 65 3d 22 75 70 64 61 74 65 57 65 62 4c  name="updateWebL
> 03a0  69 6e 6b 22 0d 0a 0d 0a 53 75 62 6d 69 74 0d 0a  ink"....Submit..
> 03b0  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d  ----------------
> 03c0  2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 2d 37 64 35  -------------7d5
> 03d0  33 38 39 31 37 31 33 30 36 35 63 2d 2d 0d 0a  3891713065c--..
> *----------------------------------------------
>
> Thanks,
>
> Brent
>
>>>> ronald-freebsd8@klop.yi.org 6/20/2005 6:36:33 PM >>>
> On Tue, 21 Jun 2005 00:24:52 +0200, Brent Kynaston
> wrote:
>
>> Ronald,
>>
>> Thanks for the quick response.
>>
>> Here is the HTTP header (captured by Ethereal) from the post where I've
>> inserted some Finnish data for one of the fields:
>>
>> *-----------------------------------------------------
>> Frame 161 (856 bytes on wire, 856 bytes captured)
>> Ethernet II, Src: 00:50:56:c0:00:08, Dst: 00:0c:29:cc:ba:ff
>> Internet Protocol, Src Addr: 192.168.189.1 (192.168.189.1), Dst Addr:
>> 192.168.189.201 (192.168.189.201)
>> Transmission Control Protocol, Src Port: 2631 (2631), Dst Port: http
>> (80), Seq: 1, Ack: 1, Len: 802
>> Hypertext Transfer Protocol
>> POST
>> /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff
>> HTTP/1.1\r\n
>> Request Method: POST
>> Request URI:
>> /GLPNetPortal/portal/portlet/COPDocuments?urlType=Action&novl-inst=c373e902f9802206764b000c296f1d50&wsrp-mode=view&wsrp-windowstate=normal&action=updateDoc&DocID=c373e90485927c3d323c000c29ccbaff
>> Request Version: HTTP/1.1
>> Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
>> application/x-shockwave-flash, */*\r\n
>> Referer:
>> http://192.168.189.201/GLPNetPortal/portal/portlet/COPDocuments?novl-inst=c373e902f9802206764b000c296f1d50\r\n
>> Accept-Language: en-us\r\n
>> Content-Type: multipart/form-data;
>> boundary=---------------------------7d522e2ec0e8a\r\n
>> Accept-Encoding: gzip, deflate\r\n
>> User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1;
>> .NET CLR 1.1.4322)\r\n
>> Host: 192.168.189.201\r\n
>> Content-Length: 968\r\n
>> Connection: Keep-Alive\r\n
>> Cache-Control: no-cache\r\n
>> Cookie: JSESSIONID=aa88969a240158baa93362f89c55e4f3\r\n
>> \r\n
>> *-----------------------------------------------------
>>
>> Thanks,
>>
>> Brent
>>
>>>>> ronald-freebsd8@klop.yi.org 6/20/2005 5:31:19 PM >>>
>> On Mon, 20 Jun 2005 22:29:34 +0200, Brent Kynaston
>> wrote:
>>
>>> I'm trying to post a multi-part form with file data and text input
>>> fields. The Portlet FileUpload code is able to successfully parse the
>>> file data and text fields, except when I change my keyboard layout to
>>> Finnish, Arabic, or any other non-English language for that matter.
>>>
>>> I've specified a META http-equiv with UTF-8:
>>> META http-equiv="Content-Type" content="text/html; charset=UTF-8"
>>>
>>> I've tried setting various encodings on the PortletFileUpload
>>> instance, and have not been able to get it to work. Is this
>>> broken in the current builds of commons-fileupload-1.1-dev.jar?
>>
>> Post a dump of the headers going over the wire. (See ngrep, ethereal or
>> another network sniffer.)
>
> A multipart/form-data post contains more headers in the body of the
> request. Those are the interesting ones.
> They are best seen with no file or a very small file upload.

Did you try FileItem.getString(String encoding)? That would be
getString("UTF-8") in this case.

-- 
Ronald Klop
Amsterdam, The Netherlands
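
P.S. A minimal sketch of why the explicit encoding should matter here, using only the JDK and not FileUpload itself. The bytes are the DocTitle value copied from offsets 0x0060-0x007d of the dump above; that part carries no charset of its own. The class and method names are made up for illustration, and the assumption that a plain getString() falls back to a single-byte default such as ISO-8859-1 should be checked against your build.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DocTitleDecodingSketch {

    // DocTitle value from the capture: "ztest5" in ASCII followed by
    // twelve two-byte UTF-8 sequences (the Cyrillic characters).
    private static final int[] DOC_TITLE_BYTES = {
        0x7a, 0x74, 0x65, 0x73, 0x74, 0x35,
        0xd1, 0x84, 0xd1, 0x8b, 0xd0, 0xb2, 0xd0, 0xb0,
        0xd1, 0x84, 0xd0, 0xb2, 0xd1, 0x8b, 0xd0, 0xb0,
        0xd1, 0x84, 0xd1, 0x8b, 0xd0, 0xb2, 0xd0, 0xb0
    };

    // Convert the int literals above into the raw byte array a parser would hold.
    static byte[] docTitleBytes() {
        byte[] raw = new byte[DOC_TITLE_BYTES.length];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = (byte) DOC_TITLE_BYTES[i];
        }
        return raw;
    }

    // Hypothetical stand-in for a getString(encoding) call: turn the raw
    // field bytes into a String using the given charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        byte[] raw = docTitleBytes();

        // Assumed fallback when no encoding is passed: one char per byte.
        String garbled = decode(raw, StandardCharsets.ISO_8859_1);
        // What the browser actually sent: UTF-8.
        String intended = decode(raw, StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1 length: " + garbled.length()); // 30
        System.out.println("UTF-8 length:      " + intended.length()); // 18
    }
}
```

With FileUpload the equivalent would be to call getString("UTF-8") on each item for which isFormField() returns true, instead of the no-argument getString().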