Return-Path:
Delivered-To: apmail-cocoon-dev-archive@www.apache.org
Received: (qmail 15067 invoked from network); 22 Dec 2003 11:03:09 -0000
Received: from daedalus.apache.org (HELO mail.apache.org) (208.185.179.12) by minotaur-2.apache.org with SMTP; 22 Dec 2003 11:03:09 -0000
Received: (qmail 25286 invoked by uid 500); 22 Dec 2003 11:02:48 -0000
Delivered-To: apmail-cocoon-dev-archive@cocoon.apache.org
Received: (qmail 25251 invoked by uid 500); 22 Dec 2003 11:02:47 -0000
Mailing-List: contact dev-help@cocoon.apache.org; run by ezmlm
Precedence: bulk
list-help:
list-unsubscribe:
list-post:
Reply-To: dev@cocoon.apache.org
Delivered-To: mailing list dev@cocoon.apache.org
Received: (qmail 25238 invoked from network); 22 Dec 2003 11:02:46 -0000
Received: from unknown (HELO exchange.sun.com) (192.18.33.10) by daedalus.apache.org with SMTP; 22 Dec 2003 11:02:46 -0000
Received: (qmail 13308 invoked by uid 50); 22 Dec 2003 11:03:00 -0000
Date: 22 Dec 2003 11:03:00 -0000
Message-ID: <20031222110300.13307.qmail@nagoya.betaversion.org>
From: bugzilla@apache.org
To: dev@cocoon.apache.org
Cc:
Subject: DO NOT REPLY [Bug 25694] New: - [PATCH] JSPEngineImplNamedDispatcherInclude incorrectly converts bytes to characters
X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N
X-Spam-Rating: minotaur-2.apache.org 1.6.2 0/1000/N

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT .
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=25694

           Summary: [PATCH] JSPEngineImplNamedDispatcherInclude incorrectly converts bytes to characters
           Product: Cocoon 2
           Version: Current CVS 2.1
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: sitemap components
        AssignedTo: dev@cocoon.apache.org
        ReportedBy: johan@hippo.nl

The MyServletOutputStream class in the JSPEngineImplNamedDispatcherInclude class is an output stream. All output streams have to implement the write(byte) method. The implementation documents this, but it also states that the method is not used. This is not true. The write(byte) method does get invoked when MyServletOutputStream is used as an output stream instead of through MyServletOutputStream's PrintWriter.

The current implementation writes the byte as a character to the PrintWriter. When the byte is in the range 0..127 this poses no problem; the UTF-8 encoding is the same byte. When the byte is in the range -128..-1 the data gets corrupted because of the conversion to an int and subsequently to a char: the negative value is first sign-extended to an int. Then the lower 16 bits are used as the char value, which results in a char value in the range 65408..65535. The PrintWriter (using the UTF-8 encoding) will output these characters using multiple bytes. If MyServletOutputStream is used to stream bytes in UTF-8 encoding and this data contains the representation for an 'e' with an umlaut (2 bytes in UTF-8 encoding), the final data contains six bytes.

Instead of writing the byte to the PrintWriter, during which it gets converted to a char, MyServletOutputStream should write to the underlying ByteArrayOutputStream. Before doing so, it should synchronize the PrintWriter and the ByteArrayOutputStream.
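
The corruption and the proposed fix can be reproduced outside Cocoon with a small stand-alone sketch. This is not the attached patch and not the actual MyServletOutputStream source: the class names ByteWriteDemo, CharConvertingStream and ByteSafeStream are made up for illustration, and "synchronize" is assumed here to mean flushing the PrintWriter before writing the raw byte. Note that java.io.OutputStream declares the single-byte method as write(int); the byte arrives sign-extended when it comes through write(byte[]).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ByteWriteDemo {

    /** Hypothetical stand-in for the reported behaviour: bytes go to the PrintWriter as chars. */
    static class CharConvertingStream extends OutputStream {
        final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        final PrintWriter writer =
            new PrintWriter(new OutputStreamWriter(buffer, StandardCharsets.UTF_8));

        @Override
        public void write(int b) {
            // Writer.write(int) keeps the low 16 bits: -61 (0xC3) becomes char 0xFFC3.
            writer.write(b);
        }

        byte[] toByteArray() {
            writer.flush();
            return buffer.toByteArray();
        }
    }

    /** Hypothetical stand-in for the proposed fix: bytes bypass the PrintWriter. */
    static class ByteSafeStream extends OutputStream {
        final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        final PrintWriter writer =
            new PrintWriter(new OutputStreamWriter(buffer, StandardCharsets.UTF_8));

        @Override
        public void write(int b) {
            // Assumption: "synchronize" means flushing pending characters first,
            // so text written via the PrintWriter keeps its position in the output.
            writer.flush();
            buffer.write(b);
        }

        byte[] toByteArray() {
            writer.flush();
            return buffer.toByteArray();
        }
    }

    /** Streams data through the char-converting variant and returns what ends up in the buffer. */
    static byte[] throughBuggy(byte[] data) {
        CharConvertingStream out = new CharConvertingStream();
        try {
            out.write(data); // OutputStream.write(byte[]) calls write(int) once per byte
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Streams data through the byte-safe variant and returns what ends up in the buffer. */
    static byte[] throughFixed(byte[] data) {
        ByteSafeStream out = new ByteSafeStream();
        try {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // 'e' with an umlaut (U+00EB) is the two bytes 0xC3 0xAB in UTF-8.
        byte[] eUmlaut = "\u00eb".getBytes(StandardCharsets.UTF_8);
        System.out.println("input bytes:  " + eUmlaut.length);
        System.out.println("buggy output: " + throughBuggy(eUmlaut).length);
        System.out.println("fixed output: " + throughFixed(eUmlaut).length);
    }
}
```

With the two-byte input from the description, the char-converting variant yields six bytes, while the byte-safe variant returns the original two bytes unchanged.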