tomcat-users mailing list archives

From "Chris Mannion" <chris.mann...@nonstopgov.com>
Subject Re: Odd encoding of servlet parameters
Date Thu, 27 Nov 2008 13:58:33 GMT
André

Thanks for the comments, I will definitely look into the approach of
sending the data in the request body, probably something that should
have been done originally.
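
As a rough illustration of the request-body approach, the sketch below
POSTs the XML as the raw body with a text/xml content type instead of
packing it into an 'xml' URL parameter.  The class name, the /upload
path and the throwaway echo server (the JDK's built-in
com.sun.net.httpserver, standing in for the real servlet) are
assumptions made only so the example runs on its own; they are not part
of the actual application.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class BodyPostSketch {
    // Reads a stream to the end and returns its bytes.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // POSTs the XML as the request body (no URL parameter, no URLEncoder)
    // and returns the response text.
    static String postXml(String url, String xml) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        try (OutputStream os = conn.getOutputStream()) {
            os.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream is = conn.getInputStream()) {
            return new String(readAll(is), StandardCharsets.UTF_8);
        }
    }

    // Starts a local echo server (a stand-in for the servlet), posts the
    // XML to it and returns exactly what the server received.
    static String echoRoundTrip(String xml) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/upload", exchange -> {
            byte[] body = readAll(exchange.getRequestBody());
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        try {
            int port = server.getAddress().getPort();
            return postXml("http://127.0.0.1:" + port + "/upload", xml);
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><records/>";
        System.out.println(echoRoundTrip(xml).equals(xml));
    }
}
```

On the servlet side the body would then be read from
HttpServletRequest.getInputStream() or getReader() rather than from
getParameter("xml").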

It's true that the program sending the data is ours as well, but I
don't think it is the culprit because the pattern of failures doesn't
fit.  For example, I can send data from my local client to my local
server and it arrives intact, but when I send the same data from the
same client to the problem server, it arrives HTML encoded.  The
sending program has also been distributed to several customers, who
see the same results: uploads to a test server arrive well formed,
while uploads to the problem server arrive HTML encoded.  Both servers
run exactly the same code to receive the upload, which is what made me
wonder whether a Tomcat setting could be causing the problem.
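
One way to narrow down where the escaping happens is sketched below.
It shows that a URL encode/decode round trip leaves both raw XML and
already-escaped XML unchanged, so entities such as &lt; seen on the
server must have been introduced by something other than the URL
layer.  The class and the looksHtmlEscaped helper are hypothetical
names for illustration, and the entity check is only a heuristic.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class EncodingCheck {
    // Hypothetical helper: guesses that text was HTML-escaped when it has
    // entity references such as &lt; but no literal '<' at all.
    static boolean looksHtmlEscaped(String s) {
        return s.indexOf('<') < 0 && (s.contains("&lt;") || s.contains("&quot;"));
    }

    // URL-encodes and then URL-decodes, mimicking the client encoding the
    // parameter and the server side decoding it again.
    static String urlRoundTrip(String s) throws UnsupportedEncodingException {
        return URLDecoder.decode(URLEncoder.encode(s, "UTF-8"), "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        String raw = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><records/>";
        String escaped = "&lt;?xml version=&quot;1.0&quot;?&gt;&lt;records/&gt;";

        // URL encoding produces %xy sequences, never &lt; or &quot;.
        System.out.println(URLEncoder.encode(raw, "UTF-8"));

        // The round trip changes neither input, so it cannot add or
        // remove HTML entities.
        System.out.println(looksHtmlEscaped(urlRoundTrip(raw)));
        System.out.println(looksHtmlEscaped(urlRoundTrip(escaped)));
    }
}
```

Logging the result of looksHtmlEscaped on the value returned by
req.getParameter("xml") on both the test server and the problem server
would show whether the data is already escaped when it reaches the
servlet.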

2008/11/27 André Warnier <aw@ice-sa.com>:
> Chris Mannion wrote:
>>
>> Hi All
>>
>> I've recently started having a problem with one of the servlets I'm
>> running on a Tomcat 5.5 system.  The code of the servlet hasn't
>> changed at all so I'm wondering if there are any Tomcat settings that
>> could affect this kind of thing or if anyone has come across a similar
>> problem before.
>>
>> The servlet in question accepts XML data that is posted to it as a URL
>> parameter called 'xml'.  The code to retrieve the XML as a String
>> (which is then used to build a document object) is simply -
>>
>> String xmlMessage = req.getParameter("xml");
>>
>> - where req is the HttpServletRequest object.  Until recently this has
>> worked fine with the XML being received properly formatted -
>> <?xml version="1.0" encoding="UTF-8"?>
>>  <records>
>>    <record>...
>> etc.
>>
>> However, recently something has changed and the XML is now being
>> retrieved from the request object with escape characters in, so the
>> above has become -
>> &lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
>>  &lt;records&gt;
>>    &lt;record&gt;
>>
>> Before sending, the XML is encoded using java.net.URLEncoder with
>> the UTF-8 character set, but using java.net.URLDecoder on receipt
>> does not get rid of the escaped characters.  I did some reading
>> about a possible Tomcat 6.0 bug and so tried explicitly setting the
>> character encoding (req.setCharacterEncoding("UTF-8")) before
>> retrieving the parameter, but that had no effect either.  Even if
>> there were something that could explicitly decode the &lt; &gt; etc.,
>> I couldn't use it, as the XML data often contains characters like
>> &amp; which have to remain encoded to keep the XML valid.
>>
>> As I said, this problem started without the servlet code having
>> changed at all so is there any Tomcat setting that could be
>> responsible for this?
>>
> Just a couple of indirect comments on the above.
>
> In your post, you seem to indicate that you also control the client which
> sends the request to Tomcat.
> If so, and for that kind of data, might it not be better to send the data in
> the body of a request, instead of in the URL ?
> That is probably not the root cause of the issue you describe above, but
> it may avoid similar questions of encoding in the future.
> (check the HTTP POST method, and enctype=multipart/form-data)
> It will also avoid the case where your data gets so long that the request
> URLs (and thus your data) get cut off at a certain length.
>
> Next, the way you indicate that the data is now received shows an
> "HTML style" encoding, rather than a "URL style" encoding.
> If the data was now URL-encoded, it would not have (for example) "&quot;"
> replacing a quotation mark, but it would have a %xy sequence instead
> (where xy is the byte value of the character in the chosen encoding,
> expressed in hexadecimal digits, e.g. %22 for a quotation mark).
> What I mean is that it is very unlikely that this encoding just happens
> "automatically" due to some protocol layer at the browser or HTTP server
> level.  There must be something that explicitly encodes your original
> request data in this way, before it even gets put in a URL.
>
> I guess what I am trying to say, is that maybe you are looking in the wrong
> place for your problem, by focusing on the receiving Tomcat side first. I
> believe you should first have a good look at the sending side.
>
>
>
> ---------------------------------------------------------------------
> To start a new topic, e-mail: users@tomcat.apache.org
> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: users-help@tomcat.apache.org
>
>



-- 
Chris Mannion
iCasework and LocalAlert implementation team
0208 144 4416


