tomcat-users mailing list archives

From "Terence M. Bandoian" <>
Subject Re: Problem with Transfer-Encoding
Date Mon, 07 Jul 2014 17:07:02 GMT
On 7/5/2014 6:36 PM, André Warnier wrote:
> Sushil Prusty wrote:
>> Dear User
>> Thanks for your input.
> You're welcome.
> First, a foreword : I will try my best to help you, but doing this is 
> very difficult, and doing it via email is even more difficult.
> I was not kidding when I wrote earlier that even looking at the data 
> may make it change.
> Of course, that is not really true, but the fact of cutting and 
> pasting this data, from your saved HTTPFox trace into an email that 
> you send to the Tomcat list, and then the Tomcat list server 
> forwarding this to other people in a new email, may again decode and 
> re-encode this data several times, and confuse the situation totally.
> So we need to be very, very systematic, and make sure that what we see 
> is really what we get, ok ?
> What you should really do, is to save the original HttpFox data to a 
> file, then save that file, then zip that file, then post it somewhere 
> where we can get this zip-file.
> So that we can download it, unzip it, and then be sure that we are 
> really seeing the same data as you do.
> In the meantime, a question :
>> I just debugged using HttpFox here is below you find header
>> (Request-Line)    POST /test/
> The above request line is triggered by something.
> By what ?
> Is that a link or button on a HTML page which is currently loaded in 
> your browser ?
> If yes, then before you actually click this link, can you in your 
> browser use the "View..Character set" function, and tell us what the 
> browser thinks about the current page loaded in the browser, before 
> you even send this request to the server ?
> The reason why I am asking, is that this is the character set which 
> the browser will most probably use to encode the text data that it 
> sends to the server (when you click the link).
> Then see the note below, in the text.

I agree with André about the difficulties of debugging character 
encodings.  A couple of things you might check are the character 
encodings of the page and the form.  The character encoding of the page 
may be set with the Content-type meta tag:

<meta http-equiv="Content-type" content="text/html;charset=UTF-8"/>

For the form, I believe the character encoding defaults to the character 
encoding of the page but may be explicitly set with the accept-charset 
attribute:

<form method="post" action="" accept-charset="utf-8"></form>
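
To illustrate why the page and form encodings matter, here is a minimal 
Java sketch that encodes the same word under two different charsets, the 
way a browser might when submitting a form. The class name FormCharsetDemo 
and the choice of ISO-8859-2 as the alternative charset are assumptions 
made for illustration only; nothing in this thread establishes which 
charset the page in question actually uses.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class FormCharsetDemo {
    // Hex dump of the bytes produced when text is encoded with the given charset.
    public static String hex(String text, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(cs)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // "g", e-ogonek, s-acute, "l", a-ogonek: one word from the test phrase,
        // written with Unicode escapes so the source file encoding does not matter.
        String word = "g\u0119\u015Bl\u0105";

        // Assumption: ISO-8859-2 stands in for "some non-UTF-8 page charset".
        Charset latin2 = Charset.forName("ISO-8859-2");

        System.out.println("UTF-8      : " + hex(word, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-2 : " + hex(word, latin2));
    }
}
```

The two byte sequences differ for every accented letter, so a server that 
decodes the body as UTF-8 will only recover the text if the browser 
actually encoded it as UTF-8.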

Hope that helps.

-Terence Bandoian

>> HTTP/1.1
>> Host    **********
>> User-Agent    Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:30.0)
>> Gecko/20100101 Firefox/30.0
>> Accept text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
>> Accept-Language    en-US,en;q=0.5
>> Accept-Encoding    gzip, deflate
>> Referer    https://s
>> ************

>> Cookie    JSESSIONID=******************; doNotShowStartupOnLoad=true
>> Connection    keep-alive
>> Content-Type    multipart/form-data;
>> boundary=---------------------------*******************
>> Content-Length    4039
>> In Post body
>> -----------------------------1550434539176507601876254213
>> Content-Disposition: form-data; name="disclaimerText"
>> Zażółć gęślÄ jaźń! ta funkcjonalność nie jest wspierana
> The line above may or may not have been further corrupted (compared to 
> the original that you see), by the simple fact of copying this text 
> into your email.
> But assuming for a moment that it was not, and that it really is what 
> it looks like above, there is some kind of a problem :
> (You'll have to follow carefully here)
> If I take the original text line which you posted in your first message :
> Zażółć gęślą jaźń! ta funkcjonalność nie jest wspierana*
> and I imagine that internally, this is encoded as UTF-8;
> Then if I look at that same series of UTF-8 characters, but now 
> examine the *bytes* that compose these characters and view them in 
> ASCII, I should see this :
> Zażółć gęślą jaźń! ta funkcjonalność nie jest
> But if you compare this carefully, with the string as it appears in 
> your HttpFox trace, you will see that it does not match exactly. For 
> example, look at the last 2 letters of the word "funkcjonalność", in 
> both versions.
> So there appears to be some discrepancy between the character set 
> which your browser is really using (to send data to the server), and 
> the UTF-8 that your server seems to expect.
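
The comparison described above (taking the UTF-8 bytes and viewing each 
byte as a separate character) can be reproduced with a short Java sketch. 
It is only an illustration under stated assumptions: ISO-8859-1 is used as 
the single-byte charset because it maps every byte to a character, the 
class and method names (MojibakeDemo, misread, repair, escape) are 
hypothetical, and the thread does not establish which charset was really 
involved. Applying the misreading twice imitates the "double encoding" 
suspected elsewhere in this thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode as UTF-8, then decode the bytes as if they were in a single-byte charset.
    public static String misread(String text, Charset wrong) {
        return new String(text.getBytes(StandardCharsets.UTF_8), wrong);
    }

    // Undo one misreading: recover the original bytes, then decode them as UTF-8.
    public static String repair(String garbled, Charset wrong) {
        return new String(garbled.getBytes(wrong), StandardCharsets.UTF_8);
    }

    // Show non-ASCII characters as <U+XXXX> so the output survives any console encoding.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c < 0x80) {
                sb.append(c);
            } else {
                sb.append(String.format("<U+%04X>", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1;
        // The word discussed above, ending in s-acute and c-acute (escaped).
        String original = "funkcjonalno\u015B\u0107";
        String once = misread(original, latin1);   // one wrong decode
        String twice = misread(once, latin1);      // "double encoding"

        System.out.println(escape(original));
        System.out.println(escape(once));
        System.out.println(escape(twice));
        // Two repairs bring the doubly garbled text back to the original.
        System.out.println(repair(repair(twice, latin1), latin1).equals(original));
    }
}
```

One misreading turns each of the two accented letters into two 
characters, and a second misreading doubles them again, which is why 
counting the garbage characters per accented letter in a raw trace is a 
useful hint about how many wrong decodes happened.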
> Furthermore (and put this on account of my suspicious nature if you 
> want) :
> The second part of that message, in Polish, means : "This 
> functionality is not supported".
> Which triggers the question : what kind of HTML page would be sending 
> this phrase, as part of the data, in a POST to a server ? Can you give 
> us some context as to what you are trying to do here ?
>> -----------------------------1550434539176507601876254213
>> I believe the POST data got changed before reaching the server.
>>   Do you have any ideas what's wrong here, where the error might be ?
>> On Sat, Jul 5, 2014 at 9:08 PM, André Warnier <> wrote:
>>> Konstantin Kolinko wrote:
>>>> 2014-07-05 9:24 GMT+04:00 Sushil Prusty <>:
>>>>> Hello,
>>>>> summary of my Problem:
>>>>> When a client POSTs Transfer-Encoding data using a browser,
>>>>> my server is not processing the request character encoding properly.
>>>>> I send the following request:
>>>>> Content-Type text/html;charset=UTF-8
>>>>> Date Sat, 05 Jul 2014 05:10:09 GMT
>>>>> Server Apache-Coyote/1.1
>>>>> Transfer-Encoding chunked
>>>>> *disclaimerTextZażółć gęślą jaźń! ta funkcjonalność nie jest

>>>>> wspierana*
>>>>>   Full details:
>>>>> My application is running on apache-tomcat-7.0.40 (and Java 1.6.x)
>>>>> on a Linux box.
>>>>> Below is how the text looks once it reaches my controller:
>>>>> *ZażóÅÄ gÄÅlÄ jaźÅ! ta funkcjonalnoÅÄ nie jest wspierana*
>>>>> I have below configuration  in server.xml
>>>>>  <Connector port="80" protocol="HTTP/1.1" connectionTimeout="20000"
>>>>> maxPostSize="5242880" maxParameterCount="25000"/>
>>>>>     <Connector
>>>>>             port="443"
>>>>>             protocol="HTTP/1.1"
>>>>>             scheme="https"
>>>>>             noCompressionUserAgents="gozilla, traviata"
>>>>> compressableMimeType="text/html,text/xml,text/javascript,
>>>>> text/css,application/javascript,application/json"
>>>>>             URIEncoding="UTF-8"
>>>>>     />
>>>>> and in my
>>>>> set JAVA_OPTS=-Djavax.servlet.request.encoding=UTF-8
>>>>> -Dfile.encoding=UTF-8
>>>>> (...)
>>>> As a sanity check:
>>>> 1) That "I send the following request" listing looks more like a
>>>> response, not a request. (E.g. the "Server Apache-Coyote/1.1" header
>>>> makes no sense in a request).
>>>> So you are lying somewhere.
>>>> There is no point for me to try guessing what you are doing. You may
>>>> have confused "reading" with "writing" somewhere, and without source
>>>> code one cannot verify your words.
>>>> You have to provide a step-by-step instruction and enough source code
>>>> so that a person who is not familiar with your system were able to
>>>> reproduce your problem.
>>>> 2) Content-Type says "text/html", but that line of text is not a valid
>>>> HTML document.
>>> +1
>>> Character encoding/decoding issues are hell to debug as it is, because
>>> they are like quantum physics : even looking at them can change them.(*)
>>> So you need to provide *accurate* and "raw" information, otherwise it is
>>> just a loss of time for everyone.
>>> Use a browser plugin like HttpFox, LiveHttpHeaders or similar to
>>> monitor the requests being sent and responses being received, at the
>>> browser level.  All these plugins allow you to selectively dump
>>> requests/responses to a file.  Do that.
>>> Also, check in your browser that when you receive a response page back
>>> from the server, your browser is really seeing this response in the proper
>>> character set (use "View.. Character encoding..").
>>> "Transfer Encoding" has nothing to do with the *character encoding* of
>>> either the request or the response.  The little imprecise data that the OP
>>> provided above /suggests/ that there is some double encoding taking place
>>> /somewhere/, but so far it could as well be in the email client that he
>>> used to post to the list, as anywhere else.
>>> (*) with the wrong editor, or the wrong locale e.g.
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail:
>>> For additional commands, e-mail:
