poi-user mailing list archives

From Nick Burch <n...@torchbox.com>
Subject Re: hwpf for text extraction
Date Tue, 27 Jun 2006 18:24:53 GMT
On Tue, 27 Jun 2006, Suba Suresh wrote:
> 	I just want to extract text from word doc to index with lucene.
> The version I downloaded from apache is year poi-3.0-alpha2005.. Can you
> tell me where I can get the current stable build from?

Your best bet is to use poi-3.0-alpha2 (from your favourite Apache
mirror), and then follow the basic text extraction steps documented at
	http://jakarta.apache.org/poi/hwpf/quick-guide.html
I use this with my own Lucene code, and it works fine.
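
For illustration only, here is a minimal sketch of what that kind of
extraction might look like. It assumes the WordExtractor class from
org.apache.poi.hwpf.extractor is on the classpath (in the 3.0 alpha
releases HWPF ships in the scratchpad jar) and a reasonably modern Java
compiler. The class name WordTextSketch and the joinParagraphs helper are
hypothetical names made up for this example; they are not part of POI or
of the quick guide.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hwpf.extractor.WordExtractor;

public class WordTextSketch {

    // Hypothetical helper (not part of POI): trims each paragraph, drops
    // the empty ones, and joins the rest with newlines so the result can
    // be handed to an indexer as a single string.
    static String joinParagraphs(String[] paragraphs) {
        StringBuilder sb = new StringBuilder();
        for (String p : paragraphs) {
            String t = p.trim();
            if (t.isEmpty()) {
                continue;
            }
            if (sb.length() > 0) {
                sb.append('\n');
            }
            sb.append(t);
        }
        return sb.toString();
    }

    // Opens a .doc file and returns its paragraph text as one string.
    static String extractText(String path) throws IOException {
        try (InputStream in = new FileInputStream(path)) {
            WordExtractor extractor = new WordExtractor(in);
            return joinParagraphs(extractor.getParagraphText());
        }
    }

    public static void main(String[] args) throws IOException {
        // The path to the Word document is taken from the command line.
        System.out.println(extractText(args[0]));
    }
}
```

The string returned by extractText could then be added to a Lucene
document as the body field. That step is left out here because the field
API differs between Lucene versions.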

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: poi-user-unsubscribe@jakarta.apache.org
Mailing List:     http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta Poi Project:  http://jakarta.apache.org/poi/