lucene-java-user mailing list archives

From Vinod Bhagat <vbha...@blastradius.com>
Subject RE: indexing other documents (.doc .pdf .txt ...)
Date Mon, 04 Nov 2002 16:12:00 GMT
Lucene cannot index PDF directly; you have to extract the text from the PDF
first. JPedal is a good library that I used to extract text from PDFs, and
the Lucene API can then index that text.

You need to download the JPedal library from http://www.jpedal.org/

The class I mentioned is in the examples section there.
Hope this helps.
Vinod
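
For illustration, here is a minimal sketch of the extract-then-index flow described above. It does not use the real Lucene or JPedal APIs: `ExtractThenIndex`, `TextExtractor`, and `TinyIndex` are hypothetical names, and the in-memory index is only a stand-in for a Lucene index. In a real setup, the PDF extractor would presumably wrap the text-extraction library and the `add` step would hand the text to Lucene.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ExtractThenIndex {

    /** Hypothetical hook: turns a file of one format into plain text. */
    public interface TextExtractor {
        String extract(String path) throws Exception;
    }

    /** Toy stand-in for a Lucene index: maps each lower-cased term to the paths containing it. */
    public static class TinyIndex {
        private final Map<String, Set<String>> postings = new HashMap<>();

        public void add(String path, String text) {
            for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, t -> new TreeSet<>()).add(path);
                }
            }
        }

        public Set<String> search(String term) {
            return postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
        }
    }

    // One extractor per file extension (assumption: the extension identifies the format).
    private final Map<String, TextExtractor> extractors = new HashMap<>();
    private final TinyIndex index = new TinyIndex();

    public void register(String extension, TextExtractor extractor) {
        extractors.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Extracts text, then indexes it. Returns false when no extractor is registered for the extension. */
    public boolean indexFile(String path) throws Exception {
        int dot = path.lastIndexOf('.');
        String ext = dot < 0 ? "" : path.substring(dot + 1).toLowerCase(Locale.ROOT);
        TextExtractor extractor = extractors.get(ext);
        if (extractor == null) {
            return false; // unsupported format: skipped, not indexed
        }
        index.add(path, extractor.extract(path));
        return true;
    }

    public Set<String> search(String term) {
        return index.search(term);
    }

    public static void main(String[] args) throws Exception {
        ExtractThenIndex indexer = new ExtractThenIndex();
        // Placeholder extractors returning canned text; a real PDF extractor would call a PDF library here.
        indexer.register("txt", path -> "plain text about lucene");
        indexer.register("pdf", path -> "text pulled out of a pdf");
        for (String path : new String[] {"notes.txt", "manual.pdf", "report.doc"}) {
            System.out.println(path + " indexed: " + indexer.indexFile(path));
        }
        System.out.println("pdf -> " + indexer.search("pdf"));
    }
}
```

The point of the design is that the index only ever sees plain text; supporting another format means registering one more extractor, and a file with no registered extractor is simply skipped.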


-----Original Message-----
From: Murthy, Suryanarayana (MED, TCS)
[mailto:Suryanarayana.Murthy@med.ge.com]
Sent: Monday, November 04, 2002 5:15 PM
To: 'Lucene Users List'
Subject: RE: indexing other documents (.doc .pdf .txt ...)


Where is this class?

-----Original Message-----
From: Vinod Bhagat [mailto:vbhagat@blastradius.com] 
Sent: Monday, November 04, 2002 4:59 PM
To: 'Lucene Users List'
Subject: RE: indexing other documents (.doc .pdf .txt ...)

Look at the ExtracttextObjects.java class; this is your answer for PDF.
vin.



-----Original Message-----
From: Friaa Nafaa [mailto:friaa@excite.com]
Sent: Monday, November 04, 2002 5:04 PM
To: lucene-user@jakarta.apache.org
Subject: indexing other documents (.doc .pdf .txt ...)


Can I index PDF, DOC, or TXT documents with Lucene, and how do I proceed?
I have installed a demo copy of Lucene, and when I index a set of documents,
Lucene indexes only the HTML documents and not the PDF or DOC files. Thanks.


--
To unsubscribe, e-mail:
<mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail:
<mailto:lucene-user-help@jakarta.apache.org>


