lucene-java-user mailing list archives

From Karl Wettin
Subject Re: Can POI provide reliable text extraction results for production search engine for Word, Excel and PowerPoint formats?
Date Mon, 12 May 2008 16:42:13 GMT
Lukas Vlcek wrote:
> Hi,
> I need to find a reliable way to extract content from Word, Excel and
> PowerPoint formats prior to indexing and I am not sure if POI is the best
> way to go. Can anybody share experience with POI and/or other [commercial]
> Java libraries for text extraction from MS formats?

I like Antiword for .doc files.
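
Since Antiword is a command-line program rather than a Java library, one way to use it ahead of indexing is to run it as an external process and capture its standard output. The following is only a rough sketch under some assumptions: the `antiword` binary is installed and on the PATH, its output can be decoded as UTF-8 (this depends on the character mapping Antiword is configured with), and the class and method names are hypothetical, made up for this illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.List;

// Hypothetical wrapper class; not part of Antiword, POI, or Lucene.
public class AntiwordExtractor {

    // Builds the command line "antiword <file>".
    // Assumption: the antiword binary is available on the PATH.
    static List<String> buildCommand(Path docFile) {
        return List.of("antiword", docFile.toString());
    }

    // Runs antiword on a .doc file and returns whatever it prints to stdout.
    static String extractText(Path docFile) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(docFile))
                // Discard stderr so an unread error stream cannot block the child process.
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();

        // Read all of stdout before waiting, so the process cannot stall on a full pipe.
        byte[] output = process.getInputStream().readAllBytes();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("antiword exited with code " + exitCode + " for " + docFile);
        }

        // Assumption: the output is UTF-8; adjust the charset to match your Antiword setup.
        return new String(output, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // Usage: java AntiwordExtractor some-file.doc
        System.out.println(extractText(Path.of(args[0])));
    }
}
```

The returned string could then be handed to whatever code builds the document for indexing. Note that this only covers .doc files; Excel and PowerPoint would still need another extractor.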
