commons-user mailing list archives

From Martin Grogan <mgro...@keizensoftware.com>
Subject [Fileupload] Reading MS-Word docs
Date Wed, 07 Jun 2006 13:40:53 GMT
Hi all,
Forgive the slightly off-topic question, but if someone here has done 
this before, I'd appreciate a pointer.
I'm using FileUpload to let users upload MS-Word documents, and I would 
like to extract the text from them for indexing.
I have done this for PDF files using PDFBox and am looking for something 
similar for Word documents. I have looked at Lucene, but it looks too 
big and heavy for what we need.
Anyone have any ideas?
Thanks,
Martin

-- 
------------
Martin Grogan
Keizen Software

mgrogan@keizensoftware.com
www.keizensoftware.com



