Return-Path:
Mailing-List: contact tomcat-dev-help@jakarta.apache.org; run by ezmlm
Delivered-To: mailing list tomcat-dev@jakarta.apache.org
Received: (qmail 43450 invoked from network); 5 Feb 2001 10:25:06 -0000
Received: from unknown (HELO b0s3-000.bequbed.com) (212.115.185.3) by h31.sny.collab.net with SMTP; 5 Feb 2001 10:25:06 -0000
Received: from B0L9010 (dhcp-1-111.bequbed.com [192.168.1.111]) by b0s3-000.bequbed.com (Postfix) with SMTP id 0524557; Mon, 5 Feb 2001 11:24:53 +0100 (CET)
From: "Zhu Ming"
To: ,
Subject: RE: serializing XML to a ServletOutputStream fails
Date: Mon, 5 Feb 2001 11:24:55 +0100
Message-ID: <001601c08f5d$e264ef10$6f01a8c0@bequbed.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0
In-Reply-To: <20010204210410.C15825@bailey.dscga.com>
X-Mimeole: Produced By Microsoft MimeOLE V5.50.4133.2400
Importance: Normal
X-Spam-Rating: h31.sny.collab.net 1.6.2 0/1000/N

Hi,

Maybe you should not use the character set "UTF-8". I remember that it's
8-bit Unicode. As far as I know, Chinese and Korean have 16-bit codes, so
at least you should try 16-bit Unicode. I forgot the name; maybe it's
"UTF-16". But I'm not sure whether the JDK fully supports "UTF-16".

I'm not a Unicode expert. I'll be happy if what I say can be a hint
toward solving this problem.

Ming

-----Original Message-----
From: Michael Mealling [mailto:michael@bailey.dscga.com]
Sent: Monday, February 05, 2001 03:04
To: tomcat-dev@jakarta.apache.org
Subject: serializing XML to a ServletOutputStream fails

(This might be a bug so I'm cc-ing to tomcat-dev)

Hi,

I'm trying to serialize some XML out to a ServletOutputStream, but the
resulting XML on the client side contains corrupted Unicode characters
(the DOM I'm serializing out contains Chinese, Korean, English, etc).
Here's the code in question:

    response.setContentType("text/xml; charset=UTF-8");
    ServletOutputStream out = response.getOutputStream();
    out.print("\n" + "\n");
    out.flush();

    OutputFormat format = new OutputFormat(document);
    format.setOmitXMLDeclaration(true);
    format.setIndenting(true);    // it makes debugging easier
    format.setEncoding("UTF-8");  // this is the default anyway
    XMLSerializer serializer = new XMLSerializer(out, format);
    serializer.serialize(document.getDocumentElement());

The XML that the client gets is fine, except that the non-ASCII
characters in the UTF-8 encoded output are garbled. I can serialize the
XML out to a FileOutputStream and it works just fine.

I'm running Tomcat 3.2.1 as the backend for a remote Apache 1.3.17
server using ajp13 (and thus mod_jk).

This code looks like it's the right way to do this, but either I've hit
a bug or else I'm missing something (an encoding somewhere between a
Stream and a Writer?)

-MM

--
--------------------------------------------------------------------------------
Michael Mealling      | Vote Libertarian!      | www.rwhois.net/michael
Sr. Research Engineer | www.ga.lp.org/gwinnett | ICQ#: 14198821
Network Solutions     | www.lp.org             | michaelm@netsol.com

---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
For additional commands, email: tomcat-dev-help@jakarta.apache.org
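
On the question raised in the thread about an encoding step between a Stream and a Writer, here is a minimal sketch of what such a step might look like. It is not a diagnosis of the Tomcat 3.2.1 behaviour described above. It only shows that UTF-8 is a variable-length encoding that can carry Chinese and Korean text, and that routing characters through a Writer bound explicitly to UTF-8 leaves them intact. The class name Utf8WriterSketch, both helper methods, and the sample string are hypothetical; a ByteArrayOutputStream stands in for the servlet's output stream so the example runs without a container, and StandardCharsets assumes Java 7 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Utf8WriterSketch {

    /** Writes text to a byte stream through a Writer explicitly bound to UTF-8. */
    public static void writeAsUtf8(OutputStream out, String text) throws IOException {
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        writer.write(text);
        writer.flush(); // push the encoded bytes down to the underlying stream
    }

    /** Encodes text to UTF-8 bytes and decodes it again, to check nothing is lost. */
    public static String roundTrip(String text) throws IOException {
        // Stand-in for the servlet output stream (assumption: any OutputStream behaves alike here).
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeAsUtf8(sink, text);
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample: Chinese, Korean and English text, written with
        // Unicode escapes so the source file encoding does not matter.
        String sample = "<greeting>\u4E2D\u6587 \uD55C\uAD6D\uC5B4 English</greeting>";

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeAsUtf8(sink, sample);

        System.out.println(sample.length() + " chars encoded as " + sink.size() + " bytes");
        System.out.println("round trip intact: " + sample.equals(roundTrip(sample)));
    }
}
```

In a servlet, the Writer returned by response.getWriter() after the content type has been set with a charset plays a similar role, and the Xerces XMLSerializer also offers a constructor taking a Writer. Whether either one avoids the garbling reported here would need to be checked against the actual setup.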