lucene-java-user mailing list archives

From Steven Rowe <sar...@syr.edu>
Subject Re: Searching pdf, getting page number
Date Mon, 16 Oct 2006 15:44:59 GMT
Hi Bill,

Bill Taylor wrote:
> On Oct 16, 2006, at 5:44 AM, Christoph Pächter wrote:
>> I know that I can index pdf-files (using a third-party library).
> 
> Could you please tell me where to find this library? 

There are several PDF extraction packages listed here (look under the
"Lucene Document Converters" heading):

<http://lucene.apache.org/java/docs/contributions.html>

I haven't personally used it, but the documentation for PDF Box (one of
the packages listed on the above-linked page) describes integration with
Lucene:

<http://www.pdfbox.org/userguide/text_extraction.html#Lucene+Integration>

Steve



