From: nacho210
To: user@poi.apache.org
Date: Mon, 19 May 2008 15:18:57 -0700 (PDT)
Subject: errors in text extraction
Message-ID: <17329385.post@talk.nabble.com>

I'm using HWPF to extract text from Word documents.

    WordExtractor extractor = new WordExtractor(fis);
    String body = extractor.getText();

The returned text contains invalid characters such as \u0013, \u0014 and \u000b.

Any suggestions on what the problem might be?

--
View this message in context: http://www.nabble.com/errors-in-text-extraction-tp17329385p17329385.html
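
For context, these characters are probably not corruption. In Word's binary text stream, \u0013, \u0014 and \u0015 mark the start, separator and end of a field (for example a hyperlink or page number), and \u000b is a manual line break. Below is a minimal sketch of post-processing the string returned by getText(), using only the JDK. The class name WordTextCleaner and the method clean are hypothetical names made up for this illustration and are not part of POI. POI's WordExtractor also has a static stripFields(String) helper in some releases, so it may be worth checking the Javadoc of the version in use before hand-rolling this.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of Apache POI: cleans the control
// characters that Word uses for fields and manual line breaks.
final class WordTextCleaner {
    private static final char FIELD_BEGIN = 0x13;     // starts a field; instruction text follows
    private static final char FIELD_SEPARATOR = 0x14; // separates instruction from visible result
    private static final char FIELD_END = 0x15;       // closes the field
    private static final char VERTICAL_TAB = 0x0B;    // manual line break inside a paragraph

    private WordTextCleaner() {
    }

    // Drops field instructions and field markers, keeps field results,
    // and turns manual line breaks into newlines.
    static String clean(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        // One entry per open field: true while we are in its instruction part.
        Deque<Boolean> openFields = new ArrayDeque<>();
        int hidden = 0; // number of open fields still in their instruction part

        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case FIELD_BEGIN:
                    openFields.push(true);
                    hidden++;
                    break;
                case FIELD_SEPARATOR:
                    if (!openFields.isEmpty() && openFields.peek()) {
                        openFields.pop();
                        openFields.push(false);
                        hidden--;
                    }
                    break;
                case FIELD_END:
                    if (!openFields.isEmpty() && openFields.pop()) {
                        hidden--;
                    }
                    break;
                case VERTICAL_TAB:
                    if (hidden == 0) {
                        out.append('\n');
                    }
                    break;
                default:
                    if (hidden == 0) {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample input built by hand; a real string would come from extractor.getText().
        String raw = "See \u0013 HYPERLINK \"http://example.com\" \u0014the link\u0015 now\u000Bnext";
        System.out.println(clean(raw));
    }
}
```

In use, this would be called as WordTextCleaner.clean(extractor.getText()). The sketch keeps the visible result of each field and discards only the instruction part, on the assumption that the visible text is what matters for extraction; stray markers without a matching partner are simply dropped.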