lucene-java-user mailing list archives

From Ioan Cocan <ico...@gmail.com>
Subject Re: Indexing Performance issue
Date Fri, 10 Nov 2006 14:16:18 GMT
You may want to use something like pdftotext, which is part of Xpdf
(http://www.foolabs.com/xpdf/download.html). It produces a plain-text
extract of a PDF. Indexing that text is fast and avoids the memory
consumption of PDFBox.
Regards,
Ioan
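
A minimal sketch of the suggestion above, assuming the pdftotext binary is
installed and on the PATH. The class and method names (PdfTextExtractor,
buildCommand, extract) are hypothetical and not part of Lucene or Xpdf. It
runs pdftotext as an external process, using "-" as the output file so the
text goes to stdout, and returns it as a String that can then be added to a
Lucene document as an ordinary text field.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: extracts plain text from a PDF by shelling out to pdftotext.
class PdfTextExtractor {

    // Builds the command line: pdftotext -q -enc UTF-8 <pdf> -
    // "-q" suppresses messages, "-enc UTF-8" sets the output encoding,
    // and "-" as the text file sends the extracted text to stdout.
    static List<String> buildCommand(Path pdf) {
        return Arrays.asList("pdftotext", "-q", "-enc", "UTF-8", pdf.toString(), "-");
    }

    // Runs pdftotext and returns the extracted text.
    // Assumes pdftotext is installed; requires Java 9+ for readAllBytes().
    static String extract(Path pdf) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(pdf))
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        byte[] output = process.getInputStream().readAllBytes();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("pdftotext exited with status " + exitCode);
        }
        return new String(output, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java PdfTextExtractor <file.pdf>");
            return;
        }
        String text = extract(Paths.get(args[0]));
        // The returned text is what would be handed to the indexer.
        System.out.println(text.length() + " characters extracted");
    }
}
```

Because the extraction runs in a separate process, the PDF parsing memory is
not held in the JVM that does the indexing.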

spinergywmy wrote:
> Hi,
>
>    I am having a performance issue when indexing PDF files. It takes more
> than 10 seconds to index a PDF file of about 200 KB. Is it because I only
> have one segment file? How can I improve the indexing performance?
>
>    Thanks
>
>
> regards,
> Wooi Meng
>   



