cocoon-dev mailing list archives

From Niclas Hedhman <nic...@localbar.com>
Subject Re: BUG?: Writing NaN% to a ServletOutputStream
Date Mon, 13 Mar 2000 08:19:29 GMT

ISO 8859-1 is a Latin character set that handles Western European
characters, but not Russian or Greek ones (and possibly some others), and
as far as I know it contains fewer than 256 characters.

The check in Tomcat is to ensure that other Unicode characters are not
sent, since the Servlet stream is set to 8859-1 encoding.
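
As a minimal sketch of what that check amounts to, the snippet below compares the high-byte test from the quoted Tomcat loop with the answer given by the JDK's own ISO-8859-1 encoder. The class and method names are hypothetical, and it assumes a JDK recent enough to have java.nio.charset.StandardCharsets; it is not Tomcat code.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class Latin1Check {
    // Same test as the quoted Tomcat loop: the high-order byte must be zero.
    static boolean highByteIsZero(char c) {
        return (c & 0xff00) == 0;
    }

    // The same question asked of the JDK's ISO-8859-1 encoder.
    static boolean encoderAccepts(char c) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(c);
    }

    public static void main(String[] args) {
        // '%' and e-acute fit in one byte; U+221E (infinity) does not.
        char[] samples = {'%', '\u00e9', '\u221e'};
        for (char c : samples) {
            System.out.println(Integer.toHexString(c)
                + " highByteIsZero=" + highByteIsZero(c)
                + " encoderAccepts=" + encoderAccepts(c));
        }
    }
}
```

For these samples the two tests should agree, which suggests the check is a cheap way of asking "does this char fit in one 8859-1 byte?".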

So, the task at hand in Tomcat is to figure out what encoding is used in
the content, and set the response accordingly. I presume this should be
the responsibility of the Servlet writer, and Tomcat should only default
to 8859-1 (or a default property setting for it) if no encoding has been
set, and accordingly skip the check otherwise.
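
A rough sketch of the idea, outside any servlet container: the content is pushed through a Writer bound to an explicit charset, so the encoding decides which bytes go out instead of a per-character check. The ByteArrayOutputStream is only a stand-in for the servlet stream, and the class and method names are hypothetical. In a real servlet the corresponding step would presumably be declaring the charset on the response (for example via setContentType with a charset parameter) and writing through the response's writer; that part is an assumption and is not shown here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodedOutput {
    // Encode text through a Writer bound to an explicit charset and return the bytes.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for the servlet stream
        try (Writer writer = new OutputStreamWriter(sink, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u221e%"; // infinity sign followed by a percent sign
        // UTF-8 can represent both characters.
        System.out.println("UTF-8 bytes: " + encode(text, StandardCharsets.UTF_8).length);
        // ISO-8859-1 cannot represent U+221E; OutputStreamWriter substitutes a replacement byte.
        System.out.println("ISO-8859-1 bytes: " + encode(text, StandardCharsets.ISO_8859_1).length);
    }
}
```

Note that with ISO-8859-1 the writer silently substitutes the unmappable character rather than throwing, so picking an encoding that actually covers the content is still up to whoever writes the servlet.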

In fact, looking up the Unicode table shows
221E = Infinity
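
For anyone who wants to redo the lookup, here is a small sketch that recombines the two halves printed by the quoted debug loop (8704-30 and 0-37) and asks the JDK for the character names. The class and method names are hypothetical, and Character.getName assumes Java 7 or later.

```java
public class DecodeDebugOutput {
    // Recombine the two printed halves: (c & 0xff00) and (c & 0x00ff).
    static char recombine(int highPart, int lowPart) {
        return (char) (highPart | lowPart);
    }

    public static void main(String[] args) {
        char first = recombine(8704, 30);  // 0x2200 | 0x1e
        char second = recombine(0, 37);    // 0x0000 | 0x25
        System.out.println(Integer.toHexString(first) + " = " + Character.getName(first));
        System.out.println(Integer.toHexString(second) + " = " + Character.getName(second));
    }
}
```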

Interesting.
Niclas

"Stevenson, Chris (SSABSA)" wrote:

> Aha!
>
>         String s = "<p>" + nf.format(60.0/0) + "</p>";
>         for (int i=0; i<s.length(); i++) {
>             char c = s.charAt(i);
>             System.out.println((c & 0xff00) + "-" +(c & 0x00ff));
>         }
>
> Gives:
>
> 8704-30
> 0-37
>
> So that is what is causing the problem. Looking in the code for
> org.apache.tomcat.core.BufferedServletOutputStream
>
>         for (int i = 0; i < len; i++) {
>             char c = s.charAt (i);
>
>             //
>             // XXX NOTE:  This is clearly incorrect for many strings,
>             // but is the only consistent approach within the current
>             // servlet framework.  It must suffice until servlet output
>             // streams properly encode their output.
>             //
>             if ((c & 0xff00) != 0) {    // high order byte must be zero
>                 String errMsg = sm.getString(
>                     "servletOutputStream.fmt.not_iso8859_1",
>                      new Object[] {new Character(c)});
>                 throw new IOException(errMsg);
>             }
>             write(c);
>         }
>
> and this explains why it happens.
>
> I guess my question becomes:
>
> *When will* "servlet output streams properly encode their output"?
>
> Do they already?
>
> .. and is there anyone who knows Unicode enough to check
> how up to date the above note is, and if the check
> is still needed??
>
> I tried to look on CVS but I couldn't access the info
> and I have to leave work now ...
>
> ... (Off to rehearse for a
> show in the Adelaide Festival of the Arts - Copland's
> 'The second hurricane' and Brecht/Weill's 'Der
> LindberghFlug' (sp) :-)
>
> chris.
>
> -- Chris Stevenson ----------------------- SSABSA --
> Senior Secondary Assessment Board of South Australia
> 60 Greenhill Road, Wayville SA 5034, Australia
> email: chris@ssabsa.sa.gov.au
> phone: (08) 8372 7515
>   fax: (08) 8372 7590
> ----------------------------------------------------

