lucene-java-user mailing list archives

From Grant Ingersoll
Subject Re: indexing performance issue
Date Thu, 30 Nov 2006 15:31:04 GMT

[...] lists several PDF alternatives, but I can't speak to their
performance.  I am sure if you googled PDF converters you could find a
fair number of hits.

Perhaps w/ some more details about your app we might be able to find  
a workaround.  We often convert PDFs as a one time offline task  
(which doesn't get around the fact that you need to do it) so that  
they don't get in the way when indexing (and reindexing), but we  
generally deal w/ fixed collections, which may not be your case.
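
As a rough sketch of that offline conversion step (not our actual code): each
PDF gets a sidecar text file that is rebuilt only when it is missing or older
than the PDF, and the indexer reads the sidecar instead of parsing the PDF.
The class name, the ".txt" sidecar naming, and the `extractor` function are
all hypothetical; `extractor` stands in for whatever converter you end up
using (pdfbox or an alternative).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: caches extracted PDF text next to the source file.
final class OfflineTextCache {

    /** Sidecar file holding the extracted text, e.g. a.pdf -> a.pdf.txt (naming is an assumption). */
    static Path cachedTextFor(Path pdf) {
        return pdf.resolveSibling(pdf.getFileName() + ".txt");
    }

    /**
     * Converts every PDF under dir whose cached text is missing or older than the PDF.
     * The extractor is a placeholder for the real (slow) PDF-to-text conversion.
     * Returns the number of files converted on this run.
     */
    static int convertAll(Path dir, Function<Path, String> extractor) throws IOException {
        List<Path> pdfs;
        try (Stream<Path> walk = Files.walk(dir)) {
            pdfs = walk.filter(Files::isRegularFile)
                       .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".pdf"))
                       .collect(Collectors.toList());
        }
        int converted = 0;
        for (Path pdf : pdfs) {
            Path txt = cachedTextFor(pdf);
            boolean stale = !Files.exists(txt)
                    || Files.getLastModifiedTime(txt).compareTo(Files.getLastModifiedTime(pdf)) < 0;
            if (stale) {
                // The expensive step happens here, once per new or changed PDF.
                Files.write(txt, extractor.apply(pdf).getBytes(StandardCharsets.UTF_8));
                converted++;
            }
        }
        return converted;
    }

    /** What the indexer reads: cached text only, so no PDF parsing at (re)index time. */
    static String textForIndexing(Path pdf) throws IOException {
        return new String(Files.readAllBytes(cachedTextFor(pdf)), StandardCharsets.UTF_8);
    }
}
```

You would run convertAll as a batch job ahead of time and feed the result of
textForIndexing into your Lucene Document. The conversion cost is still paid
once per file, but reindexing no longer repeats it.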


On Nov 30, 2006, at 5:48 AM, spinergywmy wrote:

> Hi guys,
>    I have posted this question before, and this time I found that it
> could be a pdfbox problem; the pdfbox I downloaded doesn't use
> log4j.jar. Indexing a PDF file of approx. 2.13 MB took me 17s, and the
> total time to upload a file is 18s.
>    So, is there any way, or any software other than pdfbox, to solve
> the performance issue?
>    Thanks.
> regards,
> Wooi Meng

Grant Ingersoll
Center for Natural Language Processing
