lucene-dev mailing list archives

From Tim Smith <tsm...@attivio.com>
Subject Re: How to leverage the LogMergePolicy "calibrateSizeByDeletes" patch in Solr ?
Date Tue, 22 Sep 2009 18:43:33 GMT
Jason Rutherglen wrote:
> I have a working version of Simple FieldCache Merging LUCENE-1785 that
> should go in real soon.
>
>   
Will this include a callback mechanism I can register with to find out
which segments are being merged?
That way I can merge my own caches at the application layer as well,
perhaps exposed through something like
IndexReaderWarmer.warmMerge(IndexReader[] input, IndexReader output)
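
A rough sketch of how such a hook could be used, assuming it existed. Nothing below is Lucene API: MergeWarmer and AppCaches are hypothetical names, and segments are represented by plain strings instead of IndexReader instances so the example runs without Lucene on the classpath. It also ignores deleted documents, which a real merge would drop (shifting document positions).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical callback; not an existing Lucene interface.
interface MergeWarmer {
    // Called after a merge has produced `output` from `inputs`,
    // before the merged segment becomes visible to searches.
    void warmMerge(List<String> inputs, String output);
}

// Application-level caches keyed by segment name (a stand-in for a segment reader).
class AppCaches implements MergeWarmer {
    private final Map<String, List<Integer>> cacheBySegment = new HashMap<>();

    void put(String segment, List<Integer> values) {
        cacheBySegment.put(segment, new ArrayList<>(values));
    }

    List<Integer> get(String segment) {
        return cacheBySegment.get(segment);
    }

    @Override
    public void warmMerge(List<String> inputs, String output) {
        // Build the merged segment's cache by concatenating the input caches
        // in merge order, instead of recomputing it from scratch.
        List<Integer> merged = new ArrayList<>();
        for (String in : inputs) {
            List<Integer> values = cacheBySegment.get(in);
            if (values != null) {
                merged.addAll(values);
            }
        }
        // The input entries are left in place: old readers may still use them.
        cacheBySegment.put(output, merged);
    }
}
```

The point of the callback is that the application learns which inputs produced the new segment, so it can derive the new cache from the old ones rather than guessing the mapping after the fact.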

> On Tue, Sep 22, 2009 at 11:14 AM, Mark Miller <markrmiller@gmail.com> wrote:
>   
>> 1. see IndexWriter and the method/class that Mike pointed out earlier
>> for the warming.
>>
>> 2. See LUCENE-831 - I think we will get some form of that in someday.
>>
>> Tim Smith wrote:
>>     
>>> This sounds pretty interesting.
>>>
>>> Is there a proposed API for doing this warming yet?
>>> Is there a ticket tracking this?
>>>
>>> For my use cases, it would be really nice for applications to be able
>>> to associate a custom "IndexCache" object with an index reader; this
>>> pluggable "AutoWarmer" would then be in charge of initializing the
>>> cache for a segment reader. I have a number of caches outside the
>>> realm of regular field caches that I associate with a segment.
>>> Currently I do this after getting the IndexReader, by iterating over
>>> its segments and getting a cache object shared across all instances
>>> of the same logical segment. It would be nice if I could just have my
>>> "cache" object subclass a Lucene IndexCache class and drop it right
>>> into this auto-warming infrastructure (it would greatly simplify things).
>>>
>>> Then, once the index reader has been closed, it would call close on
>>> any attached IndexCache objects in order to free up memory/objects
>>> (so I don't have to maintain reference counts anymore).
>>>
>>> It seems this could also greatly simplify the current field caching
>>> mechanisms, as the field caches could be associated with an
>>> IndexReader directly using the attached "IndexCache" object, instead
>>> of using static weak-reference hash maps (methods like
>>> getFieldCache() could then be added to the IndexReader).
>>>
>>>  -- Tim Smith
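
A minimal sketch of the lifecycle proposed above, under stated assumptions: IndexCache here is a hypothetical interface (the proposal names it, but no such Lucene class is confirmed in this thread), and CacheOwningReader is a stand-in for a segment reader, not the real IndexReader.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical base type for caches whose lifetime is tied to a reader.
interface IndexCache extends AutoCloseable {
    @Override
    void close(); // narrowed: releasing a cache throws no checked exception
}

// Stand-in for a segment reader that owns the caches attached to it.
class CacheOwningReader implements AutoCloseable {
    private final Map<String, IndexCache> caches = new LinkedHashMap<>();
    private boolean closed = false;

    void attach(String name, IndexCache cache) {
        if (closed) {
            throw new IllegalStateException("reader is closed");
        }
        caches.put(name, cache);
    }

    IndexCache getCache(String name) {
        return caches.get(name);
    }

    @Override
    public void close() {
        if (closed) {
            return; // closing twice must not release caches twice
        }
        closed = true;
        // The reader releases every attached cache, so the application
        // no longer has to track reference counts itself.
        for (IndexCache cache : caches.values()) {
            cache.close();
        }
        caches.clear();
    }
}
```

Because the reader owns the caches, their lifetime follows the reader's: one close() call on the reader frees everything attached to it.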
>>>
>>> Michael McCandless wrote:
>>>       
>>>> Well described, that's exactly it!  I like the concrete example :)
>>>>
>>>> Thanks Yonik.
>>>>
>>>> Mike
>>>>
>>>> On Tue, Sep 22, 2009 at 1:38 PM, Yonik Seeley
>>>> <yonik@lucidimagination.com> wrote:
>>>>
>>>>         
>>>>> OK Mike, thanks for your patience - I understand now :-)
>>>>>
>>>>> Here's an example that helped me understand - hopefully it will add to
>>>>> others' understanding more than it confuses ;-)
>>>>>
>>>>> IW.getReader() => segments={A, B}
>>>>>  // something causes a merge of A,B into AB to start
>>>>> addDoc(doc1)
>>>>>  // doc1 goes into segment C
>>>>> IW.getReader() => segments={A, B, C}
>>>>>  // merge isn't done yet, so getReader() still returns A,B instead of
>>>>>  // AB, but doc1 is still searchable!
>>>>>
>>>>> OK, in this scenario, there's no advantage to warming in the IW vs the
>>>>> app.
>>>>> Let's start over with a little different timing:
>>>>>
>>>>> segments={A,B}
>>>>>  // something causes a merge of A,B into AB to start
>>>>> addDoc(doc1)
>>>>>  // doc1 goes into segment C
>>>>>  // merging of A,B into AB finishes
>>>>> IW.getReader() => segments={AB, C}
>>>>>
>>>>> Oh, no... with warming at the app level, we need to warm the huge AB
>>>>> segment before doc1 is visible.  We could continue using the old
>>>>> reader while the warming is ongoing, so no user requests will
>>>>> experience long queries, but doc1 isn't visible through the old reader.
>>>>>
>>>>> With warming in the IW (basically warming becomes part of the same
>>>>> operation as merging), then getReader() would return segments={A,B,C}
>>>>> and doc1 would still be instantly searchable.
>>>>>
>>>>> The only way to duplicate this functionality at the app layer would be
>>>>> to recognize that there is a new segment, try and figure out what old
>>>>> segments were merged to create this new segment, and create a reader
>>>>> that's a mix of old and new to avoid unwarmed segments - not nice.
>>>>>
>>>>> -Yonik
>>>>> http://www.lucidimagination.com
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-dev-help@lucene.apache.org
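
The two timelines in Yonik's example can be replayed with a toy model. This is only an illustration of the visibility argument, not Lucene code: ToyWriter and its methods are made up, and "warming inside the writer" is modeled simply as holding back the merged segment until finishWarm() is called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the segment list a writer exposes through getReader().
class ToyWriter {
    private final List<String> visible = new ArrayList<>();
    private final boolean warmInsideWriter;
    private List<String> mergeInputs; // inputs of the in-flight merge, or null
    private String mergeOutput;       // merged segment not yet swapped in, or null

    ToyWriter(boolean warmInsideWriter, String... initial) {
        this.warmInsideWriter = warmInsideWriter;
        visible.addAll(Arrays.asList(initial));
    }

    void startMerge(String... inputs) {
        mergeInputs = Arrays.asList(inputs);
    }

    void addSegment(String name) {
        visible.add(name); // e.g. doc1 flushed into new segment C
    }

    void finishMerge(String output) {
        mergeOutput = output;
        if (!warmInsideWriter) {
            swapIn(); // app-level warming: the unwarmed segment is exposed at once
        }
    }

    void finishWarm() {
        if (mergeOutput != null) {
            swapIn(); // writer-level warming: expose the segment only once warm
        }
    }

    List<String> getReader() {
        return new ArrayList<>(visible);
    }

    // Replace the merge inputs with the merged segment, keeping segment order.
    private void swapIn() {
        int at = visible.size();
        for (String in : mergeInputs) {
            at = Math.min(at, visible.indexOf(in));
        }
        visible.removeAll(mergeInputs);
        visible.add(at, mergeOutput);
        mergeInputs = null;
        mergeOutput = null;
    }
}
```

Replaying the second timeline (merge of A,B starts, doc1 lands in C, merge finishes): with warmInsideWriter set to false, getReader() yields [AB, C] and the application must warm AB before it can serve doc1; with it set to true, getReader() keeps yielding [A, B, C] until finishWarm() runs, so doc1 is searchable right away on already-warm segments.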