lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Morus Walter <>
Subject Re: Search performance with one index vs. many indexes
Date Mon, 28 Feb 2005 07:30:21 GMT
Jochen Franke writes:
> Topic: Search performance with large numbers of indexes vs. one large index
> My questions are:
> - Is the size of the "wordlist" the problem?
> - Would we be a lot faster, when we have a smaller number
> of files per index?

Index lookup of a word is O(ln(n)) where n is the number of words.
Index lookup of a word in k indexes having m words is O( k ln(m) )
In the best case all word lists are distict (purely theoretical), 
that is n = k*m or m = n/k
For n = 15 Mio, k = 800
ln(n) = 16.5
k*ln(n/k) = 7871
In a realistic case, m is much bigger since word lists won't be distinct.
But it's the linear factor k that bites you.
In the worst case (all words in all indices) you have
k*ln(n) = 13218.8


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message