lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nick Burch <n...@torchbox.com>
Subject Re: Lucene indexing PPT
Date Fri, 30 Jun 2006 10:19:27 GMT
On Fri, 30 Jun 2006, mcarcelen wrote:
> I´m trying to build a index with PPT files. I have downloaded the api
> POI, "poi.bin.3.0" and "poi.src.3.0", but I don´t know where may I have
> to unzip them. I´d like to build the index by the command line, the same
> way as

I don't know about the lucene demo, but I can help with your POI issue.
You only need the poi bin package, but you do need to unpack it. In there
you'll find three jar files - for PowerPoint stuff, you'll just need to
put the poi-3.0 and poi-scratchpad-3.0 jars on your classpath.

You can then use org.apache.poi.hslf.extractor.PowerPointExtractor to do
your text extraction.


Perhaps someone can advise you on how to integrate this into the demo.

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message