tomcat-users mailing list archives

From André Warnier
Subject Re: URIEncoding
Date Sat, 17 Dec 2011 14:37:25 GMT
starz10de wrote:
> Thanks a lot for your answer. I already did what you suggested:
> <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />

That's good.

> but unfortunately the problem remains. As I said, when the default in
> server.xml is "ISO-8859-1", all is fine.

Can you show *exactly* what you are doing in server.xml ?
(paste the relevant portion here, remove comments and passwords)

> I am dealing with English and
> German languages. The problem is with the umlaut: when I submit it to the
> servlet it is not recognized.
> Here is where I submit my request:
> <form action="http://localhost:8080//Search/main?name method="get"
> TARGET="result">

I do not see anything in the above that submits anything with an "umlaut".
This is a "GET" request, so anything submitted would have to be in the URL, as a query-string.
I only see "name" here.  The quotes appear wrong too.
There is also a double // after 8080, where it should not be.
Are you sure it is not simply the "action" of your <form> which is wrong ?

What are the <input> fields being submitted in your <form>, and what values do
you put in them ?

Try the following, directly in your browser's URL bar :
http://localhost:8080/Search/main?name=böse zeichen
(note the ö-umlaut)
What does that do ?
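
To make the expected difference concrete, here is a minimal sketch in plain Java (no Tomcat involved) of what a form submission would put in the query-string for "böse zeichen" under each page character set, and what a server would get back when it decodes with the wrong one. The class and method names are made up for this illustration, and URLEncoder only approximates browser behaviour (it writes a space as "+", as form submissions do).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class QueryStringCharsetDemo {

    // Percent-encode a form value using the given character set,
    // roughly as a browser does for a page in that character set.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Decode a percent-encoded value using the given character set,
    // roughly as a server does according to its configured URI encoding.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String value = "b\u00f6se zeichen"; // "böse zeichen"

        String sentAsLatin1 = encode(value, "ISO-8859-1");
        String sentAsUtf8 = encode(value, "UTF-8");
        System.out.println("ISO-8859-1 page sends: " + sentAsLatin1);
        System.out.println("UTF-8 page sends:      " + sentAsUtf8);

        // Matching character sets on both sides give the original value back.
        System.out.println(decode(sentAsLatin1, "ISO-8859-1").equals(value));
        System.out.println(decode(sentAsUtf8, "UTF-8").equals(value));

        // Mismatched character sets do not: the umlaut is damaged.
        System.out.println(decode(sentAsUtf8, "ISO-8859-1").equals(value));
        System.out.println(decode(sentAsLatin1, "UTF-8").equals(value));
    }
}
```

If the servlet shows the umlaut as two odd characters, the browser most likely sent UTF-8 and the server decoded it as ISO-8859-1. If it shows a replacement character instead, it is probably the other way around.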

> Any more hints?
> awarnier wrote:
>> starz10de wrote:
>>> I have an application which runs on my local machine and works
>>> perfectly.
>>> I installed my application on the server to make it available to all. On
>>> the server we have Tomcat running, providing services for many
>>> instances.
>>> After I deployed my application on the server, I had a problem with
>>> queries which contain language-specific characters. After a long time, I
>>> found where the problem is: in server.xml, URIEncoding is set to
>>> "UTF-8". As a test, I removed this line or set it to "ISO-8859-1",
>>> and all was perfect. My question: is it possible to set the URIEncoding
>>> for each instance, or is it possible to set it somewhere else? I send the
>>> query from a JSP page to the servlet. In my JSP page the charset is
>>> "ISO-8859-1".
>>> I tried to make everything UTF-8 but did not succeed. I tried the filter
>>> approach, but it does not help either:
>>> <filter> 
>>> <filter-name>Set Character Encoding</filter-name> 
>>> <filter-class>servlet.CharsetFilter</filter-class> 
>>> <init-param> 
>>> <param-name>encoding</param-name> 
>>> <param-value>ISO-8859-1</param-value> 
>>> </init-param> 
>>> </filter> 
>>> <!-- Define filter mappings for the defined filters --> 
>>> <filter-mapping> 
>>> <filter-name>Set Character Encoding</filter-name> 
>>> <servlet-name>action</servlet-name> 
>>> </filter-mapping> 
>>> Any hint will be appreciated. 
>> Hi.
>> 1) By default, under HTTP (and HTML), the character set is ISO-8859-1.
>> So, if you do not specify anything anywhere to say something else,
>> everything should be understood and processed as ISO-8859-1.
>> (**)
>> 2) When a browser submits the contents of a <form> to a server, it will
>> /generally/ use the same character set as the one which /it thinks/ is the
>> character set of the *current* page (the one which is currently shown on
>> the screen == the one which contains the link or button which will send
>> data to the server).
>> So, what you need to do is to look in the browser, in the "Page info" or
>> similar, to see which character set the browser believes is in effect for
>> the current page.
>> (*)
>> 3) Normally also, this character set will be the one which, in the page
>> source, is indicated by the following tag :
>> <meta http-equiv="content-type" content="text/html; charset=XXXXX" />
>> (it is the XXXXX above)
>> So make sure that all the pages that you send to the browser contain such
>> a tag, with the correct character set.
>> 4) Thus, if your pages are UTF-8, then any link in the page which "calls"
>> the server is going to send all values to the server in the UTF-8
>> character set.
>> That includes the "query-string" part of URLs, and also the POST
>> parameters which may be sent.
>> If that is the case, you need to tell the server that it is so, because
>> that is /not/ the default for HTTP.
>> So that is when you should use the "URIEncoding" parameter : if your forms
>> are sending requests to the server containing a query-string.
>> 5) If your forms are sending values by means of POST requests, then the
>> situation gets more complicated if you use a character set other than
>> ISO-8859-1.
>> But let's leave that for the next time.
>> A question maybe, for later : what is/are the (human) language(s) that are
>> used on your pages ?
>> (*) I also /strongly/ advise, for issues of that nature, that you get a
>> browser plug-in such as HttpFox or similar (for Firefox) or Fiddler2 (for
>> Internet Explorer), to be able to check exactly what is being sent from
>> the browser to the server and vice-versa.
>> (**) Unfortunately, in Java the internal representation for characters and
>> strings is Unicode, which can lead to mixups if you are not careful.
>> Or, let me turn this around : it is much better to use Unicode as a
>> character set than any other "alphabet".  But unfortunately, in the WWW,
>> for historical reasons, the default is still ISO-8859-1, which creates
>> many problems when one tries to deal with non-English languages.
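
As an illustration of the (**) note above, the following small sketch shows how one and the same Java string (Unicode internally) turns into different bytes depending on the character set chosen at the boundary, and how reading those bytes back with the wrong character set produces the typical garbage. The class name and the hex helper are invented for this example. It uses only java.nio.charset.StandardCharsets from the standard library.

```java
import java.nio.charset.StandardCharsets;

public class UmlautBytesDemo {

    // Render a byte array as upper-case hexadecimal, two digits per byte.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String umlaut = "\u00f6"; // "ö", a single Unicode character in Java

        // The same character becomes one byte in ISO-8859-1 ...
        byte[] latin1 = umlaut.getBytes(StandardCharsets.ISO_8859_1);
        // ... and two bytes in UTF-8.
        byte[] utf8 = umlaut.getBytes(StandardCharsets.UTF_8);
        System.out.println("ISO-8859-1: " + hex(latin1));
        System.out.println("UTF-8:      " + hex(utf8));

        // Reading the UTF-8 bytes as ISO-8859-1 yields two characters
        // instead of one: this is the classic "mixup".
        String misread = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println("misread length: " + misread.length());
    }
}
```

The point of the sketch is only that the character set matters wherever bytes are converted to or from a string. Inside the JVM, the string itself is not the problem.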