Message-ID: <492E9679.5020104@ice-sa.com>
Date: Thu, 27 Nov 2008 13:45:45 +0100
From: André Warnier
To: Tomcat Users List
Subject: Re: Odd encoding of servlet parameters
In-Reply-To: <11e852b50811270217n3f743fdfr1a28e754ef084402@mail.gmail.com>

Chris Mannion wrote:
> Hi All
>
> I've recently started having a problem with one of the servlets I'm
> running on a Tomcat 5.5 system.  The code of the servlet hasn't
> changed at all so I'm wondering if there are any Tomcat settings that
> could affect this kind of thing or if anyone has come across a similar
> problem before.
>
> The servlet in question accepts XML data that is posted to it as a URL
> parameter called 'xml'.  The code to retrieve the XML as a String
> (which is then used to build a document object) is simply -
>
> String xmlMessage = req.getParameter("xml");
>
> - where req is the HttpServletRequest object.  Until recently this has
> worked fine with the XML being received properly formatted -
>
> <?xml version="1.0" encoding="UTF-8"?>
> <records>
> <record>
> ...
> etc.
>
> However, recently something has changed and the XML is now being
> retrieved from the request object with escape characters in, so the
> above has become -
> &lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
> &lt;records&gt;
> &lt;record&gt;
>
> Before sending the XML is encoded using the java.net.URLEncoder object
> and the UTF-8 character set, but using a java.net.URLDecoder on
> receiving it does not get rid of the encoded characters.  I did some
> reading about a possible Tomcat 6.0 bug and so tried explicitly
> setting the character encoding (req.setCharacterEncoding("UTF-8"))
> before retrieving the parameter but that had no effect either and even
> if there's something that could explicitly decode the &lt; &gt; etc. I
> couldn't use it as the XML data often contains characters like &amp;
> which have to remain encoded to keep the XML valid.
>
> As I said, this problem started without the servlet code having
> changed at all so is there any Tomcat setting that could be
> responsible for this?
>

Just a couple of indirect comments on the above.

In your post, you seem to indicate that you also control the client
which sends the request to Tomcat.  If so, and for that kind of data,
might it not be better to send the data in the body of a request,
instead of in the URL?
That is probably not the root cause of the issue you describe above,
but it may avoid similar encoding questions in the future (check the
HTTP POST method, and enctype=multipart/form-data).
It will also avoid the case where your data gets so long that the
request URLs (and thus your data) get cut off at a certain length.

Next, the way you indicate that the data is now received shows an
"HTML style" encoding, rather than a "URL style" encoding.
If the data were URL-encoded, it would not have (for example) "&quot;"
replacing a quotation mark; it would have some %xy sequence instead
(where xy is the byte value of the character in the character set
used, expressed in hexadecimal digits, e.g. %22 for a quotation mark).

What I mean is that it is very unlikely that this encoding just happens
"automatically" due to some protocol layer at the browser or HTTP
server level.  There must be something that explicitly encodes your
original request data in this way, before it even gets put in a URL.

I guess what I am trying to say is that maybe you are looking in the
wrong place for your problem by focusing on the receiving Tomcat side
first.  I believe you should first have a good look at the sending
side.
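
To illustrate the difference between the two styles, here is a minimal
Java sketch.  It is only an illustration, not your actual code: the
class name EncodingDemo, the helper htmlEscape and the sample string
are made up for this example.  Only java.net.URLEncoder and
java.net.URLDecoder are the real classes mentioned in your post.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of Tomcat or the servlet API.
class EncodingDemo {

    // "URL style" encoding: special characters become %xy sequences.
    static String urlEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory in every JVM.
            throw new IllegalStateException(e);
        }
    }

    // Reverses URL-style encoding only (%xy sequences and '+').
    static String urlDecode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // "HTML style" escaping: a hand-written stand-in (assumption) for
    // whatever component on the sending side produces the entities.
    static String htmlEscape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<record id=\"1\">";

        // prints %3Crecord+id%3D%221%22%3E
        System.out.println(urlEncode(xml));

        // prints &lt;record id=&quot;1&quot;&gt;
        String escaped = htmlEscape(xml);
        System.out.println(escaped);

        // prints &lt;record id=&quot;1&quot;&gt; again: URLDecoder
        // leaves HTML entities untouched.
        System.out.println(urlDecode(escaped));
    }
}
```

The last line is the point: a URLDecoder only undoes %xy sequences and
"+", so it cannot remove entities such as &lt; or &quot;.  That matches
what you observe, and suggests the entities are already in the data
before it is URL-encoded.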