Return-Path:
Mailing-List: contact tomcat-dev-help@jakarta.apache.org; run by ezmlm
Delivered-To: mailing list tomcat-dev@jakarta.apache.org
Delivered-To: moderator for tomcat-dev@jakarta.apache.org
Received: (qmail 93209 invoked from network); 3 Jun 2000 23:51:29 -0000
Received: from w153.z209031224.sjc-ca.dsl.cnc.net (HELO edamame.stinky.com) (qmailr@209.31.224.153) by locus.apache.org with SMTP; 3 Jun 2000 23:51:29 -0000
Received: (qmail 15860 invoked by uid 510); 3 Jun 2000 16:45:35 -0000
Date: Sat, 3 Jun 2000 09:45:34 -0700
From: Alex Chaffee
To: tomcat-dev@jakarta.apache.org
Cc: Ken Flurchick, Armen Ezekielian, haupt@erc.msstate.edu, Jan Labanowski
Subject: Re: Tomcat bug
Message-ID: <20000603094534.D3663@edamame.stinky.com>
Reply-To: alex@jguru.com
References:
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0.1i
In-Reply-To: ; from ed@apache.org on Sat, Jun 03, 2000 at 04:47:14PM -0700
X-Spam-Rating: locus.apache.org 1.6.2 0/1000/N

On Sat, Jun 03, 2000 at 04:47:14PM -0700, Ed Korthof wrote:
> On Sat, 3 Jun 2000, Ed Korthof wrote:
>
> > This is not a valid Java statement:
> >
> >     char c = '\u000d';
> >
> > because the '\u000d' is not a valid character constant.
>
> The Java Language Spec is very helpful at times like this.  Here's what it
> has to say about this particular value:
>
>     Because Unicode escapes are processed very early, it is not
>     correct to write '\u000a' for a character literal whose value is
>     linefeed (LF); the Unicode escape \u000a is transformed into an
>     actual linefeed in translation step 1 (3.3) and the linefeed
>     becomes a LineTerminator in step 2 (3.4), and so the character
>     literal is not valid in step 3.  Instead, one should use the escape
>     sequence '\n' (3.10.6).  Similarly, it is not correct to write
>     '\u000d' for a character literal whose value is carriage return
>     (CR).  Instead, use '\r'.
>
> This is out of http://java.sun.com/docs/books/jls/html/3.doc.html#100960
> ...
> IMO, it is kinda lame (why special case these two characters?), but
> that's how the spec is written.

They're *not* special cased -- that's the problem :-)

The value specified by \uXXXX is literally placed into the parse
stream as that character.  So...

    char c = '\u000a';

becomes

    char c = '
    ';

Now you see why that doesn't parse correctly?

 - Alex
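
P.S. To make the above concrete, here is a minimal sketch of the workaround the spec recommends: use '\n' and '\r' for the two line terminators, and keep \uXXXX for everything else. The class and method names are made up for illustration. Note that the source never spells out the Unicode escape for LF or CR, not even inside a comment, since it would be turned into a real line break there too.

```java
// Illustrative sketch only; UnicodeEscapeDemo and its methods are
// hypothetical names, not part of Tomcat or any library.
class UnicodeEscapeDemo {

    // Linefeed, written with the escape sequence the spec recommends.
    static char lineFeed() {
        return '\n';
    }

    // Carriage return, likewise.
    static char carriageReturn() {
        return '\r';
    }

    // A Unicode escape for an ordinary character is fine: after the
    // early translation step this is simply the literal for capital A.
    static char capitalA() {
        return '\u0041';
    }

    public static void main(String[] args) {
        System.out.println((int) lineFeed());        // 10, i.e. U+000A
        System.out.println((int) carriageReturn());  // 13, i.e. U+000D
        System.out.println(capitalA());              // A
        // Doubling the backslash stops the early translation, so this
        // prints six characters: backslash, u, 0, 0, 0, a.
        System.out.println("\\u000a");
    }
}
```

If you really want the numeric form, a cast such as (char) 0x000a also works, because it is an ordinary integer literal and never goes through Unicode-escape translation.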