Message-ID: <62a5541104121921365bddfdb8@mail.gmail.com>
Date: Mon, 20 Dec 2004 11:06:16 +0530
From: IndianAtTech
To: POI Users List
Subject: Re: 
Need Help!
In-Reply-To: <20041220015729.85397.qmail@web15810.mail.cnb.yahoo.com>
References: <1103376453.17198.7.camel@heike.rainer-klute.de> <20041220015729.85397.qmail@web15810.mail.cnb.yahoo.com>

I suggest you use the Text Mining API, which is built only on the POI
libraries.

Here is the site link:
http://www.textmining.org/

Here is an example:

    import java.io.FileInputStream;
    import java.io.InputStream;

    org.textmining.text.extraction.WordExtractor _word;
    // initialise the TEXTMINING-POI word object
    _word = new org.textmining.text.extraction.WordExtractor();
    // strDocName is the path of the Word document to read
    InputStream _wordInput = new FileInputStream(strDocName);
    String wordTextBuffer = _word.extractText(_wordInput);
    System.out.println(wordTextBuffer);
    _wordInput.close(); // close the input stream
    _word = null;
    _wordInput = null;

Best Regards
Sudhakar

On Mon, 20 Dec 2004 09:57:29 +0800 (CST), rec liu wrote:
> Hello,
> I got some code from the internet that extracts an MS Word file to a text
> file. When I try it with English text, it works correctly. But with Chinese
> characters, some are dropped; that is to say, only part of the content is
> saved and part of it is lost, no matter whether the file is short or long.
> Why? What can I do? My code is as follows:
> public boolean Extrator(){
>     try
>     {
>         file = new WordDocument(fileName);
>
>         //Writer out = new BufferedWriter(new FileWriter(outFileName));
>         Writer out = new OutputStreamWriter(new
>             FileOutputStream(outFileName),"utf-8");
>         file.writeAllText(out);
>
>         //file.closeDoc();
>         out.flush();
>         out.close();
>     } catch(Throwable t){
>         t.printStackTrace();
>         return false;
>     }
>     return true;
> }
> }
> thanks.
> jack
---------------------------------------------------------------------
To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: poi-user-help@jakarta.apache.org
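
As a side check on the character loss described in the quoted message, the sketch below (not from either poster) uses only the JDK to show that the charset given to OutputStreamWriter decides whether Chinese text survives being written out. The class name EncodingRoundTrip and the method roundTrip are hypothetical names chosen for illustration. It does not call WordDocument or WordExtractor, so it can only rule the output side in or out; it cannot tell whether the extractor itself drops characters.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class EncodingRoundTrip {

    // Encodes text through an OutputStreamWriter with the given charset,
    // then decodes the resulting bytes with the same charset.
    static String roundTrip(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, charset)) {
            out.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(bytes.toByteArray(), charset);
    }

    public static void main(String[] args) {
        // Two Chinese characters written as escapes so the source file
        // encoding does not matter.
        String sample = "POI \u4e2d\u6587";

        // UTF-8 can represent every character, so the text comes back intact.
        System.out.println(sample.equals(roundTrip(sample, StandardCharsets.UTF_8)));

        // ISO-8859-1 cannot represent Chinese characters; the writer
        // substitutes '?' for each one, so the text comes back altered.
        System.out.println(sample.equals(roundTrip(sample, StandardCharsets.ISO_8859_1)));
    }
}
```

If the UTF-8 round trip succeeds on your machine and the output file is still missing Chinese text, the loss is more likely happening during extraction than during writing.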