lucene-java-user mailing list archives

From "Pete Lewis" <>
Subject Re: Parsers
Date Wed, 28 May 2003 13:01:35 GMT
Hi Adriano

Thanks.  Code samples would be nice :)

Will come back if I find something for .ppt.


----- Original Message -----
From: "Adriano Labate" <>
To: "'Lucene Users List'" <>
Sent: Wednesday, May 28, 2003 1:03 PM
Subject: RE : Parsers

The text extractors work very well for Word and PDF.
They use both PDFBox and POI.

For Excel, using POI directly is very easy. Tell me if you want to see
code samples.

I'm looking myself for a Powerpoint text extractor, if you know one...

Adriano Labate

-----Original Message-----
From: Pete Lewis []
Sent: Wednesday, May 28, 2003 12:48
To: Lucene Users List
Subject: Parsers

Hi all,

I have a rather nice HTML parser that I got from SourceForge.  Does
anyone know of any good parsers for PDF and the Microsoft Office suite
(.doc, .ppt, .xls, etc.)?  Any help would be much appreciated.

Pete Lewis

To unsubscribe, e-mail:
For additional commands, e-mail:
