tomcat-dev mailing list archives

From "Anil K. Vijendran" <>
Subject Re: JASPER: page charset handling broken
Date Tue, 02 Nov 1999 19:02:02 GMT

Michal Mosiewicz wrote:

> "Anil K. Vijendran" wrote:
> >
> > [Moving the discussion to tomcat-dev]
> >
> > I recently heard about this myself from one of the users of this JSP
> > engine. I believe the way it is supposed to work is that you read until you
> > encounter contentType and then re-read the file using the encoding you saw
> > in contentType. Right now, the JSP engine always uses the encoding obtained
> > using System.getProperty("file.encoding", "8859_1").
> It seems that there is more than one bug...

Quite possible :-)

> I have done exactly what you're talking about. I.e. I changed
> createJspReader to pass additional encoding parameter, and changed
> Compiler to check files twice if it appears that the file was read using
> a different encoding.

Let's investigate this a bit more and then I can commit your patch. I'm hoping to
hear from folks that implement XML parsers :-) since they have to deal with
similar issues.
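
The two-pass read discussed above could be sketched roughly as follows. This is only an illustration, not the patch itself: the class name, the method names and the regular expression are all made up for the example, and it decodes a byte array directly instead of going through createJspReader. The first pass uses ISO-8859-1 because it maps every byte to exactly one char, so the ASCII text of the page directive survives whatever the real page encoding is.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; none of these names come from Jasper.
final class PageCharsetSketch {
    // Assumed shape of the directive: contentType="...; charset=NAME"
    private static final Pattern CHARSET = Pattern.compile(
        "contentType\\s*=\\s*\"[^\"]*?charset\\s*=\\s*([A-Za-z0-9][A-Za-z0-9_.:+-]*)");

    // Pass 1: decode as ISO-8859-1 and look for a charset in contentType.
    static Charset sniffCharset(byte[] page, Charset fallback) {
        String firstPass = new String(page, StandardCharsets.ISO_8859_1);
        Matcher m = CHARSET.matcher(firstPass);
        if (m.find() && Charset.isSupported(m.group(1))) {
            return Charset.forName(m.group(1));
        }
        return fallback; // no usable charset found in the page
    }

    // Pass 2: decode the same bytes again with the charset found in pass 1.
    static String readPage(byte[] page, Charset fallback) {
        return new String(page, sniffCharset(page, fallback));
    }

    public static void main(String[] args) {
        byte[] page = "<%@ page contentType=\"text/html; charset=iso-8859-2\" %>"
            .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(sniffCharset(page, StandardCharsets.ISO_8859_1));
    }
}
```

The fallback argument stands in for whatever default the engine already uses, such as the file.encoding system property.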

> The result is somewhat strange... If I set 'charset=iso-8859-1', I can
> see that the content of the resulting page matches what I typed. However, if
> I try using iso-8859-2, I can see in the page source that it looks
> like it was interpreted as a unicode string...
> For example, by using (excuse the 8859-2 chars) the following
> characters: "¿¼ó³±¿¼±¿¼¼¼ó³±", I get them exactly the same in the resulting
> page if I set charset=iso-8859-1. Of course it is improperly interpreted
> by the browser, because the charset is obviously wrong, but the codes are
> matched. However, if I set iso-8859-2, I get something like:
> '|zóB|z|zzzóB' as result, and
> "...|z\u00f3B\u0005|z\u0005|zzz\u00f3B\u0005..." in the page source.
> It seems like setting iso-8859-2 makes my JVM interpret the stream as
> unicode???
> -- Mike
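
One reading that fits the output quoted above: the bytes are decoded correctly as ISO-8859-2, and each resulting char is then cut down to its low 8 bits somewhere on the way out. The sketch below only reproduces that arithmetic for the first five characters of the sample. It does not show where in the engine this would happen, and the class and method names are made up.

```java
import java.nio.charset.Charset;

// Hypothetical demo class; it only reproduces the arithmetic of the symptom.
final class LowByteDemo {
    // Keep only the low 8 bits of every char, as a byte-oriented writer would.
    static String truncateToLowByte(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append((char) (s.charAt(i) & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // First five bytes of the sample, as they sit in the file.
        byte[] raw = {(byte) 0xBF, (byte) 0xBC, (byte) 0xF3, (byte) 0xB3, (byte) 0xB1};
        String decoded = new String(raw, Charset.forName("ISO-8859-2"));
        // decoded is U+017C U+017A U+00F3 U+0142 U+0105
        String truncated = truncateToLowByte(decoded);
        // Print the code points so the control character stays visible.
        for (int i = 0; i < truncated.length(); i++) {
            System.out.printf("%04x ", (int) truncated.charAt(i));
        }
        System.out.println();
        // prints: 007c 007a 00f3 0042 0005
    }
}
```

Those five code points are '|', 'z', 'ó', 'B' and \u0005, which is the start of the string reported in the page source.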

Peace, Anil +<:-)
