I've been using Ryan's textmining in preference to POI, since internally TM uses
both POI and the Word6 extractor and so handles a greater variety of files.
Ryan, thanks for fixing your site. Do you have any plans or ideas for parsing
'fast-saved' files, or for Word files older than the Word 6 format?
Regards
Antony
Ryan Ackley wrote:
> As the author of both Word POI and textmining.org, I recommend using
> textmining.org. POI is for general purpose manipulation of Word
> documents. textmining's only purpose is extracting text.
>
> Also, people recommend using POI for text extraction but the only
> place I've seen an actual how-to on this is in the "Lucene in Action"
> book.