lucene-dev mailing list archives

From Tim Smith <tsm...@attivio.com>
Subject Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?
Date Tue, 22 Sep 2009 19:00:30 GMT
Jason Rutherglen wrote:
> For that you can subclass IW.mergeSuccess.
>
>   
Looks like that's package-private :(
It also doesn't look like it has the merged output SegmentReader, which
could be used for cache loading or as a cache key (it may not have been
opened yet, but with NRT it should be available?).
OneMerge looks heavily package-private as well.

 -- Tim
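
To make the request concrete, here is a rough sketch of the kind of merge callback being asked about in the quoted message below. None of these types are Lucene classes: `Segment`, `MergeWarmer`, and `AppCacheMerger` are hypothetical stand-ins, and only the `warmMerge(inputs, output)` shape is taken from the signature suggested in the thread. The cached value (document length) is a placeholder for whatever an application would actually cache.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a segment reader; NOT a Lucene class.
final class Segment {
    final String name;
    final List<String> docs;

    Segment(String name, List<String> docs) {
        this.name = name;
        this.docs = docs;
    }
}

// Hypothetical callback, shaped like the suggested
// warmMerge(IndexReader[] input, IndexReader output).
interface MergeWarmer {
    void warmMerge(List<Segment> inputs, Segment output);
}

// Application-level cache kept per segment: doc -> derived value.
final class AppCacheMerger implements MergeWarmer {
    final Map<String, Map<String, Integer>> cacheBySegment = new HashMap<>();

    // Placeholder for an expensive per-document computation.
    private static int compute(String doc) {
        return doc.length();
    }

    // Build the cache for a brand-new segment from scratch.
    void load(Segment s) {
        Map<String, Integer> cache = new HashMap<>();
        for (String doc : s.docs) {
            cache.put(doc, compute(doc));
        }
        cacheBySegment.put(s.name, cache);
    }

    @Override
    public void warmMerge(List<Segment> inputs, Segment output) {
        // Reuse entries already computed for the input segments.
        Map<String, Integer> known = new HashMap<>();
        for (Segment in : inputs) {
            Map<String, Integer> c = cacheBySegment.get(in.name);
            if (c != null) {
                known.putAll(c);
            }
        }
        // Keep only docs that survived the merge; compute anything missing.
        Map<String, Integer> merged = new HashMap<>();
        for (String doc : output.docs) {
            Integer v = known.get(doc);
            merged.put(doc, v != null ? v : compute(doc));
        }
        cacheBySegment.put(output.name, merged);
    }
}
```

The point of passing both the inputs and the output is that the application can carry its existing per-segment caches forward instead of rebuilding them for the merged segment.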

> On Tue, Sep 22, 2009 at 11:43 AM, Tim Smith <tsmith@attivio.com> wrote:
>   
>> Jason Rutherglen wrote:
>>
>> I have a working version of Simple FieldCache Merging LUCENE-1785 that
>> should go in real soon.
>>
>>
>>
>> Will this contain a callback mechanism I can register with to know which
>> segments are being merged?
>> That way I can merge my own caches at the application layer as well, perhaps
>> exposed through something like IndexReaderWarmer.warmMerge(IndexReader[]
>> input, IndexReader output).
>>
>> On Tue, Sep 22, 2009 at 11:14 AM, Mark Miller <markrmiller@gmail.com> wrote:
>>
>>
>> 1. see IndexWriter and the method/class that Mike pointed out earlier
>> for the warming.
>>
>> 2. See LUCENE-831 - I think some form of that will get in someday.
>>
>> Tim Smith wrote:
>>
>>
>> This sounds pretty interesting.
>>
>> Is there a proposed API for doing this warming yet?
>> Is there a ticket tracking this?
>>
>> For my use cases, it would be really nice for applications to be able
>> to associate a custom "IndexCache" object with an index reader; this
>> pluggable "AutoWarmer" would then be in charge of initializing the
>> cache for a segment reader. I have a number of caches outside the
>> realm of regular field caches that I associate with a segment.
>> Currently I do this after getting the IndexReader, by iterating over
>> its segments and getting a cache object shared across all instances
>> of the same logical segment. It would be nice if I could just have my
>> "cache" object subclass a Lucene IndexCache class and drop it right
>> into this auto-warming infrastructure (it would greatly simplify things).
>>
>> Then, once the index reader has been closed, it would call close on
>> any attached IndexCache objects in order to free up memory/objects
>> (so I don't have to maintain reference counts anymore).
>>
>> It seems this could also greatly simplify the current field caching
>> mechanisms, as the field caches could be associated with an
>> IndexReader directly using the attached "IndexCache" object instead
>> of using static weak-reference hash maps (methods like
>> getFieldCache() could then be added to the IndexReader).
>>
>>  -- Tim Smith
>>
>> Michael McCandless wrote:
>>
>>
>> Well described, that's exactly it!  I like the concrete example :)
>>
>> Thanks Yonik.
>>
>> Mike
>>
>> On Tue, Sep 22, 2009 at 1:38 PM, Yonik Seeley
>> <yonik@lucidimagination.com> wrote:
>>
>>
>>
>> OK Mike, thanks for your patience - I understand now :-)
>>
>> Here's an example that helped me understand - hopefully it will add to
>> others' understanding more than it confuses ;-)
>>
>> IW.getReader() => segments={A, B}
>>  // something causes a merge of A,B into AB to start
>> addDoc(doc1)
>>  // doc1 goes into segment C
>> IW.getReader() => segments={A, B, C}
>>  // merge isn't done yet, so getReader() still returns A,B instead of AB,
>>  // but doc1 is still searchable!
>>
>> OK, in this scenario, there's no advantage to warming in the IW vs the app.
>> Let's start over with a little different timing:
>>
>> segments={A,B}
>>  // something causes a merge of A,B into AB to start
>> addDoc(doc1)
>>  // doc1 goes into segment C
>>  // merging of A,B into AB finishes
>> IW.getReader() => segments={AB, C}
>>
>> Oh, no... with warming at the app level, we need to warm the huge AB
>> segment before doc1 is visible.  We could continue using the old
>> reader while the warming is ongoing, so no user requests will
>> experience long queries, but doc1 isn't in the old segment.
>>
>> With warming in the IW (basically warming becomes part of the same
>> operation as merging), then getReader() would return segments={A,B,C}
>> and doc1 would still be instantly searchable.
>>
>> The only way to duplicate this functionality at the app layer would be
>> to recognize that there is a new segment, try and figure out what old
>> segments were merged to create this new segment, and create a reader
>> that's a mix of old and new to avoid unwarmed segments - not nice.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>
>   
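
The two timings in Yonik's example above can be replayed with a toy model. This is only a sketch under stated assumptions: `ToyWriter` is a hypothetical class, not Lucene's IndexWriter, segments are plain strings, and "warming" is reduced to adding a name to a set. The flag `warmBeforePublish` stands for "warming happens inside the writer as part of the merge"; when it is false, the merged segment becomes visible as soon as the merge finishes and the application is left to warm it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the scenario; NOT Lucene's IndexWriter.
final class ToyWriter {
    private final List<String> live = new ArrayList<>(); // segments getReader() exposes
    private final Set<String> warm = new HashSet<>();    // segments already warmed
    private final boolean warmBeforePublish;
    private List<String> pendingInputs;
    private String pendingMerged;

    ToyWriter(boolean warmBeforePublish, String... initial) {
        this.warmBeforePublish = warmBeforePublish;
        for (String s : initial) {
            live.add(s);
            warm.add(s);
        }
    }

    // A new document lands in a small new segment, assumed cheap to warm.
    void addDoc(String newSegment) {
        live.add(newSegment);
        warm.add(newSegment);
    }

    // The merge has finished writing the merged segment.
    void mergeFinished(List<String> inputs, String merged) {
        if (warmBeforePublish) {
            // Hold the merged segment back until it has been warmed.
            pendingInputs = inputs;
            pendingMerged = merged;
        } else {
            // Publish immediately; the merged segment is still cold.
            publish(inputs, merged);
        }
    }

    // Warm the held-back segment, then swap it in for its inputs.
    void warmPending() {
        if (pendingMerged == null) {
            return;
        }
        warm.add(pendingMerged);
        publish(pendingInputs, pendingMerged);
        pendingInputs = null;
        pendingMerged = null;
    }

    // Assumes every input is currently live.
    private void publish(List<String> inputs, String merged) {
        int at = live.indexOf(inputs.get(0));
        live.removeAll(inputs);
        live.add(at, merged);
    }

    List<String> getReader() {
        return new ArrayList<>(live);
    }

    boolean isFullyWarm(List<String> segments) {
        return warm.containsAll(segments);
    }
}
```

With `warmBeforePublish` set to false, calling `mergeFinished` after `addDoc("C")` makes `getReader()` expose the cold merged segment alongside C. With it set to true, `getReader()` keeps returning the already-warm A, B, and C until `warmPending()` completes, which is the behaviour described for warming inside the IW.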

