tomcat-users mailing list archives

From "Antonio Vidal Ferrer" <antonio.vi...@globalia-sistemas.com>
Subject RE: Migrating to tomcat 6 gives formatted currency amounts problem
Date Fri, 12 Sep 2008 15:27:14 GMT
Hi,

Have you checked how these CATALINA_OPTS settings are configured?

    -Duser.language=es
    -Duser.country=ES

Check that they are the same in both Tomcats. (In this example they are
configured for Spanish, Spain.)
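
For illustration, a small Java sketch of why these two options matter (the
class and method names are hypothetical, not taken from any webapp in this
thread). -Duser.language and -Duser.country set the JVM's default locale at
startup, and NumberFormat.getCurrencyInstance() called without arguments
formats with that default. If the two Tomcats start with different defaults,
the same amount can come out formatted differently. The sketch passes the
locales explicitly so that the difference shows up on any machine.

```java
import java.text.NumberFormat;
import java.util.Locale;

final class LocaleCurrencyDemo {

    /** Formats an amount as currency using the conventions of the given locale. */
    public static String formatCurrency(double amount, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(amount);
    }

    public static void main(String[] args) {
        // The default locale is what code gets when it does not pass one;
        // it is influenced by -Duser.language / -Duser.country.
        System.out.println("default locale: " + Locale.getDefault());
        System.out.println("default:        " + formatCurrency(1234.56, Locale.getDefault()));

        // Explicit locales: the same amount, formatted differently.
        System.out.println("es-ES:          " + formatCurrency(1234.56, Locale.forLanguageTag("es-ES")));
        System.out.println("en-GB:          " + formatCurrency(1234.56, Locale.UK));
    }
}
```

Running it once with and once without -Duser.language=es -Duser.country=ES
should show whether the "default" line changes on a given installation.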

Good Luck

Best,

Toni

-----Original Message-----
From: André Warnier [mailto:aw@ice-sa.com] 
Sent: Friday, 12 September 2008 16:58
To: Tomcat Users List
Subject: Re: Migrating to tomcat 6 gives formatted currency amounts problem

Konstantin Kolinko wrote:
> 2008/9/12 André Warnier <aw@ice-sa.com>
> 
>> Konstantin Kolinko wrote:
>>
>>> 2008/9/12 André Warnier <aw@ice-sa.com>:
>>>
>>>> Caldarale, Charles R wrote:
>>>>
>>>>> I'm not sure these days what the "normal web character set" really is.
>>>>> If you're referring to ASCII (aka Basic Latin), then no, the Pound
>>>>> Sterling symbol is not present.  However, for any of the ISO-8859-x
>>>>> variants, it is present, using the 163 (0xA3) value you noted (same as
>>>>> the Unicode code point).  It's also in UTF-8 of course, but requires
>>>>> two bytes (0xC2 0xA3) to represent the code point.
>>>>>
>>>> I love these discussions about character sets. They seem to confuse so
>>>> many people; even I, who have been involved in them for 30 years...
>>>>
>>>> Anyway, I have a related question, which I don't think constitutes a
>>>> hijack of this thread, because the underlying cause is probably similar.
>>>> Here it goes :
>>>>
>>>> Tomcat (v 4.1, v 5.0, v5.5, have not tried yet in 6.x)
>>>> The above Tomcat's running under the same Linux or Solaris, essentially
>>>> set up the same way. The JVM may vary, but I don't think that is the
>>>> problem, because of the consistency of the problem as explained below.
>>>> I am running a webapp from an external supplier, always the same binary
>>>> version.  I don't have the code, can't see what's in it.
>>>> The pages served by that webapp are the same html pages, all of them
>>>> having a declaration <meta http-equiv="Content-Type" content="text/html;
>>>> charset=iso-8859-1">.
>>>> The pages also *are* properly encoded as iso-8859-1 (100% positive, I
>>>> know the difference).
>>>> The browser receiving the pages is always the same one, same settings.
>>>>
>>>> Now,
>>>>
>>>> case a)
>>>> in the Tomcat startup files, I do nothing, meaning I just take Tomcat
>>>> out-of-the-box and run the webapp.
>>>> Result : in any such html page that contains characters with an
>>>> ISO-8859 codepoint above \xA0 (meaning the displayable characters of the
>>>> "high" part of the table, where one finds things like "uppercase A with
>>>> umlaut"), these characters
>>>>  - appear in the browser display as "?" (minus the quotes)
>>>>  - also if I save the page from the browser to disk, and look at them
>>>>    with an iso-8859-1 capable editor, they are effectively "?".
>>>> (So it's not the browser misunderstanding them, it is Tomcat sending
>>>> them that way).
>>>>
>>>> case b)
>>>> In one of the Tomcat startup files (e.g. tomcat_dir/bin/startup.sh or
>>>> even in /etc/init.d/tomcat5.5), I add the following line
>>>> LC_CTYPE="en_us.iso88591"
>>>> (or whatever is valid on that host to specify an iso-8859-1 LC_CTYPE)
>>>> (before the actual start of Tomcat)
>>>> and restart Tomcat
>>>> then the same page displays properly in the browser, and also is
>>>> correct iso-8859-1 when saved to disk and examined with the editor.
>>>> (In other words, what previously were "?" characters, are now the
>>>> correct iso-8859-1 character bytes).
>>>>
>>>> Now my question is :
>>>> How can it matter which LC_CTYPE Tomcat is started under, that would
>>>> have the result above ?
>>>> The behaviour above is consistent across different hosts, across the
>>>> same or different Tomcat versions, it is always the same webapp, always
>>>> the same html pages, always the same browser, etc.  Only that LC_CTYPE
>>>> line changes the behaviour.
>>>> On the face of it, the only thing I can think of that would explain
>>>> this, is that the webapp in question does something wrong, but what
>>>> exactly could it be doing ?
>>>> Any ideas ?
>>>>
>>>>
>>> It is <%@page pageEncoding="..." %> that is missing from those pages.
>>> Thus JSP compiler does not know what encoding they are using for their
>>> source and messes them at compilation time.
>>>
>> [...]
>>
>> But these pages, as far as Tomcat and the webapp are concerned, are not
>> dynamic in any way.  They are straight static html pages.
>> So is the JSP stuff relevant ?
>> (I'm genuinely asking, since I know nothing about JSP pages)
>>
>>
> The static HTML pages, as well as all the other static files, are served
> by the DefaultServlet. You should dig there. I think that fileEncoding
> initialization parameter of the servlet, as well as <mime-mapping>
> settings in web.xml come into play.
> 
> JSP settings are irrelevant for them, of course.
> 

Hi.
Thanks for the attempt and the answer above.
But I insist : these html pages are served by the webapp I am talking 
about, not by the DefaultServlet.
Those pages are being accessed via URLs like
http://myhost.mycompany.com/myservlet?..(additional parameters 
indicating which static file to serve)..
It is on the way through that servlet that they get "corrupted", unless 
I start Tomcat with LC_CTYPE="iso-8859-1".
That servlet, in its own web.xml config file in 
tomcat_dir/webapps/myservlet/WEB-INF/web.xml, has no fileEncoding nor 
mime-mapping section nor parameter.

So my question remains, I think : what could be going on in that servlet 
so that :
- if LC_CTYPE is not set in the environment *of Tomcat* when it starts, 
the upper iso-8859-1 characters in the pages are replaced by "?"
- if LC_CTYPE is set to "iso-8859-1" in the Tomcat environment when it 
starts, then the pages delivered by the servlet are correct
?

I am not very qualified in Java, but could it be something like :
- the servlet reads those documents with some InputStream, without 
specifying a character set or encoding, and by default that means to use 
Tomcat's idea of its default LC_CTYPE for those InputStreams ?
- or the servlet outputs the document via an OutputStream without 
specifying an encoding etc..
?
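
A minimal Java sketch of the kind of thing hypothesised above, for
illustration only: it is not the supplier's code, and the class and method
names are made up. When text is converted to or from bytes without naming a
charset (for instance through an InputStreamReader, OutputStreamWriter or
String.getBytes() with no charset argument), Java falls back to the JVM
default charset. JVMs of that era derive this default from the locale
environment (LC_ALL, LC_CTYPE, LANG) at startup; since JDK 18 the default is
UTF-8 regardless. Under a plain C/POSIX locale the default is typically
US-ASCII, and any character US-ASCII cannot represent is replaced by "?"
when encoded, which would match the symptom described. The sketch names each
charset explicitly so the result does not depend on the machine it runs on.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DefaultCharsetDemo {

    /** Encodes text to bytes with an explicitly named charset. */
    public static byte[] encode(String text, Charset charset) {
        // Unmappable characters are replaced by the charset's replacement
        // byte, which is '?' (0x3F) for US-ASCII.
        return text.getBytes(charset);
    }

    /** Renders bytes as space-separated upper-case hex, e.g. "C4 A3". */
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "A with umlaut" (U+00C4) followed by the pound sign (U+00A3).
        String text = "\u00C4\u00A3";

        // What code gets when it does not name a charset.
        System.out.println("default charset: " + Charset.defaultCharset());

        System.out.println("US-ASCII:   " + hex(encode(text, StandardCharsets.US_ASCII)));   // 3F 3F
        System.out.println("ISO-8859-1: " + hex(encode(text, StandardCharsets.ISO_8859_1))); // C4 A3
        System.out.println("UTF-8:      " + hex(encode(text, StandardCharsets.UTF_8)));      // C3 84 C2 A3
    }
}
```

If the first line printed inside the Tomcat JVM changes when LC_CTYPE is
set, that would support the idea that the servlet relies on the default
charset somewhere on its read or write path.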

André

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org

