lucene-java-user mailing list archives

From "David Townsend" <david.towns...@magus.co.uk>
Subject RE: pdf search
Date Fri, 20 Aug 2004 12:37:13 GMT
Hi Santosh,

Lucene doesn't search PDFs per se.  To make anything searchable, you first have to extract
the content and then put it into Lucene in a form it understands (i.e. Document objects).  So
to search your PDFs, you first need to extract the text from them using something like
PDFBox.  Your battle plan should be: forget Lucene for a while and get the raw text out
of all the items you want to search.  Then look at the Lucene articles about creating simple
searchable indices.
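
To make the two steps concrete, here is a minimal sketch of "extract first, index second".
It deliberately uses no PDFBox or Lucene calls: TextExtractor and ToyIndex are hypothetical
names invented for this illustration, the extractor is stubbed with hard-coded strings, and
the toy inverted index only stands in for what a real Lucene index does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ExtractThenIndexSketch {

    /** Hypothetical extraction step; in practice a PDF library such as PDFBox would supply the text. */
    interface TextExtractor {
        String extract(String path);
    }

    /** Toy stand-in for a Lucene index: maps each lowercased term to the ids of documents containing it. */
    static class ToyIndex {
        private final Map<String, Set<String>> postings = new HashMap<>();

        /** Splits the extracted text into terms and records which document each term came from. */
        void add(String docId, String text) {
            for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
                }
            }
        }

        /** Returns the ids of documents containing the term, in sorted order (empty if none). */
        List<String> search(String term) {
            Set<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
            return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
        }
    }

    public static void main(String[] args) {
        // Assumption: these strings play the role of text already pulled out of two PDFs.
        Map<String, String> fakePdfText = new HashMap<>();
        fakePdfText.put("manual.pdf", "Lucene indexes plain text");
        fakePdfText.put("report.pdf", "Quarterly sales report");
        TextExtractor extractor = fakePdfText::get;

        // Step 1: extract the raw text. Step 2: hand it to the index.
        ToyIndex index = new ToyIndex();
        for (String path : fakePdfText.keySet()) {
            index.add(path, extractor.extract(path));
        }

        System.out.println(index.search("lucene")); // prints [manual.pdf]
    }
}
```

The point of the split is that the index never sees a PDF, only plain text, so the same
indexing code would work for Word documents or HTML once their text has been extracted.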

DT

"If we didn't train to fight, who'd fight the wars?" :)

-----Original Message-----
From: Santosh [mailto:santosh.s@softprosys.com]
Sent: 20 August 2004 13:30
To: Lucene Users List
Subject: Fw: pdf search


How can I search through PDF?
----- Original Message ----- 
From: Santosh 
To: Lucene Users List 
Sent: Friday, August 20, 2004 5:59 PM
Subject: pdf search


Hi,

I am a newbie to Lucene.

I have downloaded the zip file. Now, how can I give my own list of words to Lucene?
In the demo I saw that Lucene automatically creates an index when we run the Java program, but
I want to supply my own search words. How is that possible?


regards
Santosh kumar
SoftPro Systems
Hyderabad


"The harder you train in peace, the lesser you bleed in war"


