lucene-java-user mailing list archives

From Karl Wettin <karl.wet...@gmail.com>
Subject Re: Can POI provide reliable text extraction results for production search engine for Word, Excel and PowerPoint formats?
Date Mon, 12 May 2008 16:42:13 GMT
Lukas Vlcek wrote:
> Hi,
> 
> I need to find a reliable way to extract content from Word, Excel and
> PowerPoint files prior to indexing, and I am not sure whether POI is the
> best way to go. Can anybody share experience with POI and/or other
> [commercial] Java libraries for text extraction from MS formats?

I like Antiword for .doc files.

http://www.winfield.demon.nl/
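
Since Antiword is a command-line tool rather than a Java library, one way to use it before indexing is to run it as an external process and capture its standard output. The following is only a minimal sketch under some assumptions: the class name AntiwordExtractor and its methods are hypothetical, an "antiword" executable is assumed to be on the PATH, and the output is assumed to be UTF-8 (the actual encoding depends on how Antiword is configured).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: shells out to antiword and returns the extracted text.
public class AntiwordExtractor {

    // Builds the command line; assumes "antiword" can be found on the PATH.
    static List<String> buildCommand(Path docFile) {
        return List.of("antiword", docFile.toString());
    }

    // Runs antiword on the given .doc file and returns what it wrote to stdout.
    static String extractText(Path docFile) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(docFile))
                // Discard stderr so a full error pipe cannot block the child process.
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        byte[] output = process.getInputStream().readAllBytes();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("antiword exited with code " + exitCode);
        }
        // Assumption: the output is UTF-8; adjust to match your antiword setup.
        return new String(output, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(extractText(Paths.get(args[0])));
    }
}
```

The returned string could then be added to a Lucene document as the content field. Starting one process per file is simple but may be slow for large collections.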


        karl


