commons-user mailing list archives

From Mark <elihusma...@gmail.com>
Subject Re: [Fileupload] Reading MS-Word docs
Date Wed, 07 Jun 2006 14:45:44 GMT
Apache POI is probably your best bet.
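
A rough sketch of how the upload side might be wired, in case it helps. Nothing here comes from Fileupload, POI, or PDFBox: the class UploadTextExtractors, the TextExtractor interface, and the register/extract methods are all made-up names for illustration. The idea is simply to pick a text extractor by file extension, so Word and PDF uploads go through the same indexing path. A plain-text stand-in is used so it runs without any third-party jars.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: routes an uploaded file to a text extractor by extension. */
class UploadTextExtractors {

    /** One extraction strategy per file type (illustrative, not a library API). */
    interface TextExtractor {
        String extract(InputStream in) throws IOException;
    }

    private final Map<String, TextExtractor> byExtension = new HashMap<>();

    /** Registers an extractor for an extension such as "doc" or "pdf" (case-insensitive). */
    void register(String extension, TextExtractor extractor) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Looks up the extractor from the file name's extension and runs it on the stream. */
    String extract(String fileName, InputStream in) throws IOException {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        TextExtractor extractor = byExtension.get(ext);
        if (extractor == null) {
            throw new IllegalArgumentException("No extractor registered for: " + fileName);
        }
        return extractor.extract(in);
    }

    public static void main(String[] args) throws IOException {
        UploadTextExtractors extractors = new UploadTextExtractors();
        // Stand-in extractor so the sketch runs on its own; a real setup would
        // register library-backed extractors for "doc" and "pdf" instead.
        extractors.register("txt", in -> new String(in.readAllBytes(), StandardCharsets.UTF_8));

        InputStream upload = new ByteArrayInputStream("hello index".getBytes(StandardCharsets.UTF_8));
        System.out.println(extractors.extract("notes.TXT", upload));
    }
}
```

With the POI jars on the classpath, the "doc" entry would presumably wrap org.apache.poi.hwpf.extractor.WordExtractor, whose getText() returns the document text, and the "pdf" entry would wrap the PDFBox code you already have.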


On 6/7/06, Martin Grogan <mgrogan@keizensoftware.com> wrote:
> Hi all,
> Forgive the slightly off-topic question, but if someone here has done
> this before, I'd appreciate a pointer.
> I'm using Fileupload to let a user upload an MS-Word document, and I
> would like to strip out the text for indexing.
> I have done this for PDF files using PDFBox and am looking for something
> similar for Word documents. I have looked at Lucene, but it looks too
> big and heavy for what we need.
> Anyone have any ideas?
> Thanks,
> Martin
>
> --
> ------------
> Martin Grogan
> Keizen Software
>
> mgrogan@keizensoftware.com
> www.keizensoftware.com
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-user-help@jakarta.apache.org
>
>


