Message-ID: <44AD37C2.60201@wolfram.com>
Date: Thu, 06 Jul 2006 12:18:10 -0400
From: Suba Suresh
To: POI Users List
Subject: Re: PowerPoint extractor
References: <44A0154D.1030609@wolfram.com> <44A026F9.5010509@wolfram.com> <44A1A461.3090103@wolfram.com>

I tried the July 4th build. The warnings are gone. Thank you.

I used the following code to index a couple of small Excel files with Lucene. I don't know how effective the search is going to be, since it is still at the implementation stage. If there are any errors, please let me know.

public class ExcelHandler implements DocumentHandler {
    String fileName;

    public ExcelHandler(String name) {
        super();
        fileName = new String(name);
    }

    public Document getDocument(InputStream is) throws DocumentHandlerException {
        Document doc = new Document();
        POIFSDocument pdoc = new POIFSDocument(fileName,is);
        DocumentInputStream docis = new DocumentInputStream(pdoc);
        byte[] content = new byte[docis.available()];
        docis.read(content);
        docis.close();
        StringBuffer textBuf = new StringBuffer();
        for(int i =0; i

On Tue, 27 Jun 2006, Suba Suresh wrote:
>
>>Thank you for all the pointers. It is a great help. I used today's
>>build. It worked fine for WordDocument. I did not try the meta data yet.
>>For PowerPoint I am getting the following for powerpoint extractor just
>>for one file. Am I doing anything wrong? I didn't change my code.
>
> These errors should now have gone. Can you try a new svn checkout /
> tomorrow's SVN build?
>
>>Also since some of the excel files were not 97-2002 format I used the
>>POIFSFilesystem and read it as a bytestream and stored as text string. I
>>hope that is fine.
>
> If you have some code for getting some basic text out of Excel 95 files,
> we'd be interested in hosting it. I'm sure that something that outputs
> text that can be fed to lucene would be useful for a lot of people, even
> if that's all the excel 95 support we have.
>
> Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
Mailing List: http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta Poi Project: http://jakarta.apache.org/poi/
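
The getDocument loop in this message breaks off at "for(int i =0; i", so what it did with each byte is not known. One common way to get rough, indexable text out of a raw byte array is to keep the printable ASCII characters and collapse everything else into single spaces. The sketch below shows that approach only; it is not the code from the message, and the class and method names (PrintableTextSketch, printableText) are made up for illustration.

```java
public class PrintableTextSketch {

    // Hypothetical helper: keeps printable ASCII (0x21..0x7E) and turns
    // every run of other bytes into a single space.
    static String printableText(byte[] content) {
        StringBuilder textBuf = new StringBuilder();
        boolean lastWasSpace = true; // true at start so no leading space is added
        for (int i = 0; i < content.length; i++) {
            int b = content[i] & 0xFF; // treat the byte as unsigned
            if (b > 0x20 && b < 0x7F) {
                textBuf.append((char) b);
                lastWasSpace = false;
            } else if (!lastWasSpace) {
                textBuf.append(' ');
                lastWasSpace = true;
            }
        }
        return textBuf.toString().trim();
    }

    public static void main(String[] args) {
        byte[] sample = {0x00, 0x01, 'T', 'o', 't', 'a', 'l', 0x00, 0x00, '4', '2', (byte) 0xFF};
        System.out.println(printableText(sample)); // prints "Total 42"
    }
}
```

The resulting string could then be added to the Lucene Document as an indexed field. Note that text stored as 16-bit characters would come out of this sketch with a space between every letter, so it suits single-byte strings only.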