From: "mcarcelen"
Subject: RE: Lucene indexing PPT
Date: Fri, 30 Jun 2006 14:03:20 +0200

Hello Nick!

Thanks for your help, it's useful for me.

Bye

-----Original Message-----
From: Nick Burch [mailto:nick@torchbox.com]
Sent: Friday, 30 June 2006 12:19
To: java-user@lucene.apache.org
Subject: Re: Lucene indexing PPT

On Fri, 30 Jun 2006, mcarcelen wrote:
> I'm trying to build an index with PPT files. I have downloaded the POI API,
> "poi.bin.3.0" and "poi.src.3.0", but I don't know where I have
> to unzip them. I'd like to build the index from the command line, the same
> way as

I don't know about the Lucene demo, but I can help with your POI issue.

You only need the POI bin package, but you do need to unpack it. In there
you'll find three jar files - for PowerPoint stuff, you'll just need to put
the poi-3.0 and poi-scratchpad-3.0 jars on your classpath. You can then use
org.apache.poi.hslf.extractor.PowerPointExtractor to do your text
extraction.

Perhaps someone can advise you on how to integrate this into the demo.

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
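
As a rough illustration of the classpath advice in the reply, here is a minimal sketch that first checks whether org.apache.poi.hslf.extractor.PowerPointExtractor can be loaded and then pulls the slide text out of a .ppt file. The class name PptTextCheck and its methods are hypothetical helpers, not part of POI or Lucene. The sketch assumes the POI 3.0 extractor offers a constructor taking a file name and a getText() method, and it calls them reflectively only so that the file compiles even when the jars are missing.

```java
import java.lang.reflect.Method;

/** Hypothetical helper for illustration; not part of POI or Lucene. */
final class PptTextCheck {

    // Extractor class named in the reply; expected to live in the POI jars.
    static final String EXTRACTOR =
            "org.apache.poi.hslf.extractor.PowerPointExtractor";

    /** Returns true if the named class can be loaded from the current classpath. */
    static boolean onClasspath(String className) {
        try {
            // 'false' means: locate the class but do not run its static initializers.
            Class.forName(className, false, PptTextCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /**
     * Extracts slide text from a .ppt file.
     * Assumption: the extractor has a (String fileName) constructor and getText().
     */
    static String extractText(String pptPath) throws Exception {
        Class<?> type = Class.forName(EXTRACTOR);
        Object extractor = type.getConstructor(String.class).newInstance(pptPath);
        Method getText = type.getMethod("getText");
        return (String) getText.invoke(extractor);
    }

    public static void main(String[] args) throws Exception {
        if (!onClasspath(EXTRACTOR)) {
            System.err.println("POI jars not found: add the poi and "
                    + "poi-scratchpad jars to the classpath.");
            return;
        }
        if (args.length != 1) {
            System.err.println("Usage: java PptTextCheck <file.ppt>");
            return;
        }
        System.out.println(extractText(args[0]));
    }
}
```

To try it, compile the file and run it with the two unpacked jars plus the current directory on the classpath (the -cp option of the java command), passing the path of a .ppt file as the only argument. Once the jars are also on the compile classpath, the reflection can be dropped in favour of calling the extractor directly, and the returned string is what you would hand to a Lucene document field for indexing.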