lucene-dev mailing list archives

From Tim Smith
Subject Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?
Date Tue, 22 Sep 2009 18:01:38 GMT
This sounds pretty interesting

Is there a proposed API for doing this warming yet?
Is there a ticket tracking this?

For my use cases, it would be really nice for applications to be able to
associate a custom "IndexCache" object with an index reader; this
pluggable "AutoWarmer" would then be in charge of initializing the cache
for a segment reader. I have a number of caches outside the realm of
regular field caches that I associate with a segment. Currently I do this
after getting the IndexReader, by iterating over its segments and getting
a cache object shared across all instances of the same logical segment.
It would be nice if I could just have my "cache" object subclass a Lucene
IndexCache class and drop it right into this auto-warming infrastructure
(it would greatly simplify things).

Then, once the index reader has been closed, it would call close() on any
attached IndexCache objects in order to free up memory/objects (so I
don't have to maintain reference counts anymore).
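
To make the idea concrete, here is a rough sketch of what such an
attach/warm/close lifecycle might look like. Everything in it is
hypothetical: IndexCache, SegmentHandle and TermCountCache are
illustrative names, not existing Lucene classes, and SegmentHandle only
stands in for a segment reader.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: nothing named IndexCache exists in Lucene itself.
interface IndexCache extends AutoCloseable {
    void warm(String segmentName);   // populate the cache for one segment

    @Override
    void close();                    // release memory; no checked exception
}

// Hypothetical stand-in for a segment reader that owns its attached caches.
final class SegmentHandle implements AutoCloseable {
    private final String name;
    private final List<IndexCache> caches = new ArrayList<>();
    private boolean closed;

    SegmentHandle(String name) { this.name = name; }

    // Attach a cache and warm it right away, as an "AutoWarmer" might do.
    void attach(IndexCache cache) {
        if (closed) throw new IllegalStateException("segment " + name + " is closed");
        cache.warm(name);
        caches.add(cache);
    }

    // Closing the segment closes every attached cache exactly once.
    @Override
    public void close() {
        if (closed) return;          // closing twice is a no-op
        closed = true;
        for (IndexCache cache : caches) cache.close();
        caches.clear();
    }
}

// Example application cache that just records its lifecycle events.
final class TermCountCache implements IndexCache {
    final List<String> events = new ArrayList<>();

    @Override public void warm(String segmentName) { events.add("warm:" + segmentName); }
    @Override public void close() { events.add("close"); }
}
```

With something along these lines the application would only attach its
cache to the segment; the owner of the segment decides when to warm and
when to close, so no reference counting is needed on the application side.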

It seems this could also greatly simplify the current field caching
mechanisms, as the field caches could be associated with an IndexReader
directly using the attached "IndexCache" object, instead of using static
weak-reference hash maps. (Methods like getFieldCache() could then be
added to the IndexReader.)
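
As a rough illustration of the difference, the sketch below keeps the
per-field cache as a plain field on the reader, so its lifetime is tied to
close() rather than to garbage collection of a weak key. CachingReader and
its loader function are hypothetical; only the getFieldCache() name comes
from the suggestion above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical reader that owns its per-field caches directly,
// instead of being a key in a static WeakHashMap.
final class CachingReader implements AutoCloseable {
    private final Map<String, int[]> fieldCache = new HashMap<>();
    private final Function<String, int[]> loader;  // assumed: reads a field's values
    private boolean closed;

    CachingReader(Function<String, int[]> loader) { this.loader = loader; }

    // Load a field's values on first use, then reuse the same array.
    synchronized int[] getFieldCache(String field) {
        if (closed) throw new IllegalStateException("reader is closed");
        return fieldCache.computeIfAbsent(field, loader);
    }

    synchronized int cachedFieldCount() { return fieldCache.size(); }

    // The cache is dropped deterministically when the reader closes.
    @Override
    public synchronized void close() {
        closed = true;
        fieldCache.clear();
    }
}
```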

 -- Tim Smith

Michael McCandless wrote:
> Well described, that's exactly it!  I like the concrete example :)
> Thanks Yonik.
> Mike
> On Tue, Sep 22, 2009 at 1:38 PM, Yonik Seeley wrote:
>> OK Mike, thanks for your patience - I understand now :-)
>> Here's an example that helped me understand - hopefully it will add to
>> others' understanding more than it confuses ;-)
>> IW.getReader() => segments={A, B}
>>  // something causes a merge of A,B into AB to start
>> addDoc(doc1)
>>  // doc1 goes into segment C
>> IW.getReader() => segments={A, B, C}
>>  // merge isn't done yet, so getReader() still returns A,B instead of
>> AB, but doc1 is still searchable!
>> OK, in this scenario, there's no advantage to warming in the IW vs the app.
>> Let's start over with a little different timing:
>> segments={A,B}
>>  // something causes a merge of A,B into AB to start
>> addDoc(doc1)
>>  // doc1 goes into segment C
>>  // merging of A,B into AB finishes
>> IW.getReader() => segments={AB, C}
>> Oh, no... with warming at the app level, we need to warm the huge AB
>> segment before doc1 is visible.  We could continue using the old
>> reader while the warming is ongoing, so no user requests will
>> experience long queries, but doc1 isn't in the old segment.
>> With warming in the IW (basically warming becomes part of the same
>> operation as merging), then getReader() would return segments={A,B,C}
>> and doc1 would still be instantly searchable.
>> The only way to duplicate this functionality at the app layer would be
>> to recognize that there is a new segment, try and figure out what old
>> segments were merged to create this new segment, and create a reader
>> that's a mix of old and new to avoid unwarmed segments - not nice.
>> -Yonik
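
The timing in Yonik's second scenario can be played through with a toy
model. This is only a sketch under stated assumptions: ToyWriter is a
hypothetical class, not IndexWriter, and it models "warming as part of the
merge" by refusing to publish the merged segment until warming is done.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of a writer that publishes a merged segment only after warming.
final class ToyWriter {
    private final List<String> visible = new ArrayList<>();
    private List<String> mergeSources;   // segments being merged, or null
    private String mergedName;
    private boolean mergeWarmed;

    ToyWriter(String... initial) { visible.addAll(Arrays.asList(initial)); }

    // A newly flushed segment (e.g. the one holding doc1) is visible at once.
    void addSegment(String name) { visible.add(name); }

    void startMerge(String merged, String... sources) {
        mergedName = merged;
        mergeSources = Arrays.asList(sources);
        mergeWarmed = false;
    }

    // Warming is treated as the final step of the merge.
    void finishWarming() { mergeWarmed = true; }

    // Returns the segment names a newly opened reader would see.
    List<String> getReader() {
        if (mergeSources != null && mergeWarmed) {
            int at = Math.max(visible.indexOf(mergeSources.get(0)), 0);
            visible.removeAll(mergeSources);
            visible.add(at, mergedName);   // swap the sources for the merged segment
            mergeSources = null;
        }
        return new ArrayList<>(visible);
    }
}
```

Starting from segments A and B, calling startMerge("AB", "A", "B") and
then addSegment("C") leaves getReader() returning A, B, C while AB is
still cold, so doc1 stays searchable. Only after finishWarming() does
getReader() switch to AB, C.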
