lucene-java-user mailing list archives

From Andrei Melis <Andrei.Me...@snt.ro>
Subject RE: Parsers
Date Wed, 28 May 2003 11:02:12 GMT
Hi Pete,

For PDF, try PageMark - http://www.etymon.com (I'm not sure if it's free,
but there is some downloadable source code).
For Excel, try JExcelApi - http://www.andykhan.com or Jakarta POI -
http://jakarta.apache.org/poi/


Andrei
-----Original Message-----
From: Pete Lewis [mailto:pete@uptima.co.uk] 
Sent: Wednesday, May 28, 2003 1:48 PM
To: Lucene Users List
Subject: Parsers


Hi all,

I have a rather nice HTML parser that I got from SourceForge.  Does anyone
know of any good parsers for PDF and the Microsoft Office suite (.doc, .ppt,
.xls, etc.)?  Any help would be much appreciated.

Pete Lewis
