From: "Jussi Koiranen"
To: "POI Users List"
Subject: Re: 'ä', 'ö' and 'å' with WordDocument
Date: Thu, 28 Aug 2003 09:24:45 +0300

I downloaded the sources from CVS and compiled them; now 'ä', 'ö' and 'å' are working. I don't know why it didn't work with the release version.

Jussi Koiranen

----- Original Message -----
From: "Ryan Ackley"
To: "POI Users List"
Sent: Wednesday, August 27, 2003 5:23 PM
Subject: Re: 'ä', 'ö' and 'å' with WordDocument

> Your StringWriter may not be using the correct character encoding. Character
> encoding determines how java converts bytes to characters. I think the
> default is utf-8 for java Strings. One way to test this out is to step
> through the code and find the actual bytes that are being read from the Word
> doc then go to http://www.unicode.org and verify that these are the correct
> bytes for 'ä', 'ö' and 'å'. If they are correct then the encoding you are
> using is wrong.
>
> If it's our fault I will try to address this issue in a future release.
>
> Ryan
>
> ----- Original Message -----
> From: "Jussi Koiranen"
> To:
> Sent: Wednesday, August 27, 2003 3:26 AM
> Subject: 'ä', 'ö' and 'å' with WordDocument
>
>
> > I am reading a Word document with org.apache.poi.hdf.extractor.WordDocument
> > as follows:
> >
> > WordDocument wordDoc = new WordDocument("test.doc");
> > StringWriter strWriter = new StringWriter();
> > wordDoc.writeAllText(strWriter);
> > System.out.println(strWriter); //for debugging
> >
> > But ä, ö and å are not read correctly from the Word document (test.doc).
> > Am I doing something wrong, or is this a bug?
> >
> > I have tested this with the jakarta-poi-1.5.1-final-bin and
> > jakarta-poi-1.8.0-dev-bin packages.
> >
> > Jussi Koiranen
>
---------------------------------------------------------------------
To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: poi-user-help@jakarta.apache.org
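
Ryan's suggestion in the quoted reply, to look at the actual bytes and compare them with the expected values for 'ä', 'ö' and 'å', can be sketched roughly as below. This is only an illustration and not POI code: the class name EncodingCheck and its helper methods are made up for this example, it assumes a Java version that has java.nio.charset.StandardCharsets (Java 7 or later), and which encoding a given .doc file actually uses is not established in this thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of POI: shows which bytes the three
// characters map to under different encodings.
public class EncodingCheck {

    // Hex dump of the bytes a string encodes to in the given charset.
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Decode bytes with an explicitly chosen charset instead of the
    // platform default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // U+00E4, U+00F6, U+00E5 written as escapes so the source file's
        // own encoding does not matter.
        String sample = "\u00e4\u00f6\u00e5";

        System.out.println("ISO-8859-1: " + hex(sample, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8:      " + hex(sample, StandardCharsets.UTF_8));
        System.out.println("UTF-16LE:   " + hex(sample, StandardCharsets.UTF_16LE));

        // Bytes decoded with the charset they were written in round-trip;
        // the same bytes decoded with a different charset do not.
        byte[] latin1 = sample.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("same charset round-trips: "
                + sample.equals(decode(latin1, StandardCharsets.ISO_8859_1)));
        System.out.println("wrong charset round-trips: "
                + sample.equals(decode(latin1, StandardCharsets.UTF_8)));
    }
}
```

If the bytes read from the document match one of these patterns but the printed text is still wrong, the decoding step (or the console's own encoding) is the more likely culprit than the document itself.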