lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: RangeFilter performance problem using MultiReader
Date Fri, 10 Apr 2009 19:22:06 GMT
On Fri, Apr 10, 2009 at 3:11 PM, Mark Miller <markrmiller@gmail.com> wrote:
> Mark Miller wrote:
>>
>> Michael McCandless wrote:
>>>
>>> which is why I'm baffled that Raf didn't see a speedup on
>>> upgrading.
>>>
>>> Mike
>>>
>>
>> Another point is that he may not have such a nasty set of segments - Raf
>> says he has 24 indexes, which sounds like he may not have the logarithmic
>> sizing you normally see. If you have a somewhat normal term distribution
>> across all 24 segments, the problem is not exacerbated nearly as much (and
>> it is also less severe because it's not using all of the terms for the field).
>
> Better clarify this: it will still be a problem - you still have all the
> extra seeks - but not as many of them are wasted seeks that we could avoid,
> unlike the case with the tailed logarithmic segments.

Right, I think "uniqueness" of terms may be the driving factor.  So,
if segment sizes are all the same (no logarithmic tail), but most terms
occur in only one segment, then for each such term you'll still have
N-1 SegmentTermEnums trying to seek to a term that they don't have.
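
To put a rough number on that, here is a toy model in plain Java. It is
not Lucene code: each "segment" is just a sorted set of strings, and the
class and method names (WastedSeekSketch, buildSegments, countSeeks) are
made up for illustration. It assumes every term in the range gets looked
up in every segment, and counts how many of those lookups miss.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy model only: a "segment" is a sorted set of terms, and a "seek" is a lookup.
final class WastedSeekSketch {

    /** Builds n equally sized segments whose terms are either all unique or all shared. */
    static List<TreeSet<String>> buildSegments(int n, int termsPerSegment, boolean unique) {
        List<TreeSet<String>> segments = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            TreeSet<String> terms = new TreeSet<>();
            for (int j = 0; j < termsPerSegment; j++) {
                // Unique terms carry the segment number; shared terms do not.
                terms.add(unique ? "s" + i + "_" + j : "t" + j);
            }
            segments.add(terms);
        }
        return segments;
    }

    /** Returns {totalSeeks, wastedSeeks} for the inclusive term range [lower, upper]. */
    static int[] countSeeks(List<TreeSet<String>> segments, String lower, String upper) {
        // Union of in-range terms across all segments (what a merged enumeration walks).
        TreeSet<String> merged = new TreeSet<>();
        for (TreeSet<String> segment : segments) {
            merged.addAll(segment.subSet(lower, true, upper, true));
        }
        int total = 0;
        int wasted = 0;
        for (String term : merged) {
            for (TreeSet<String> segment : segments) {
                total++;
                if (!segment.contains(term)) {
                    wasted++; // this segment was asked for a term it does not have
                }
            }
        }
        return new int[] {total, wasted};
    }

    public static void main(String[] args) {
        int n = 24; // the number of indexes mentioned in the thread
        int[] unique = countSeeks(buildSegments(n, 1000, true), "a", "z");
        int[] shared = countSeeks(buildSegments(n, 1000, false), "a", "z");
        System.out.println("unique terms: " + unique[1] + " of " + unique[0] + " seeks wasted");
        System.out.println("shared terms: " + shared[1] + " of " + shared[0] + " seeks wasted");
    }
}
```

In this model, fully unique terms waste (N-1)/N of the seeks no matter
how the segments are sized, while fully shared terms waste none. Real
indexes fall somewhere in between.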

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

