lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Leo Galambos <Le...@seznam.cz>
Subject Re: Index advice...
Date Tue, 10 Feb 2004 12:29:47 GMT
Otis Gospodnetic napsal(a):

>Without seeing more information/code, I can't tell which part of your
>system slows down with time, but I can tell you that Lucene's 'add'
>does not slow over time (i.e. as the index gets larger).  Therefore, I
>would look elsewhere for causes of the slowdown.
>  
>

Otis, can you point me to some proofs that time of "insert" operation 
does not depend on the index size, please? Amortized time of "insert" is 
O(log(docsIndexed/mergeFac)), I think. Thus I do not know how it could 
be O(1).

Thank you.
Leo

AFAIK the issue with PDF files can be based on the PDF parser (I already 
encountered this with PDFbox).

>The easiest thing to do is add logging to suspicious portions of the
>code.  That will narrow the scope of the code you need to analyze.
>
>Otis
>
>
>--- kevin@ckhill.com wrote:
>  
>
>>Hey Lucene-users,
>>
>>I'm setting up a Lucene index on 5G of PDF files (full-text search). 
>>I've 
>>been really happy with Lucene so far but I'm curious what tips and
>>strategies 
>>I can use to optimize my performance at this large size.
>>
>>So far I am using pretty much all of the defaults (I'm new to
>>Lucene).
>>
>>I am using PDFBox to add the documents to the index.
>>I can usually add about 800 or so PDF files and then the add loop:
>>
>>for ( int i = 0; i < fileNames.length; i++ ) {
>>	Document doc = IndexFile.index(baseDirectory+documentRoot+"fileNames
>>[i]); 
>>	writer.addDocument(doc);
>>}
>>
>>
>>really starts to slow down.  Doesn't seem to be memory related.
>>Thoughts anyone?
>>
>>Thanks in advance,
>>CK Hill
>>
>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>    
>>
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
>
>
>  
>


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message