lucene-dev mailing list archives

From robert engels <reng...@ix.netcom.com>
Subject Re: Concurrent merge
Date Wed, 21 Feb 2007 23:29:56 GMT
I think when you start discussing background threads you need to  
think about a server environment.

It is fairly trivial there. I have pushed to move Lucene in that  
direction, rather than having multiple clients access a shared  
resource via a network filesystem. No decent server product works  
this way.

On Feb 21, 2007, at 5:23 PM, Yonik Seeley wrote:

> On 2/21/07, Doron Cohen <DORONC@il.ibm.com> wrote:
>> Ning Li wrote:
>>
>> > There are three main challenges in enabling concurrent merge:
>> >   1 a robust merge policy
>> >   2 detect when merge lags document additions/deletions
>> >   3 how to slow down document additions/deletions (and amortize
>> >     the cost) when merge falls behind
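[Editor's note: challenge (3) above is essentially backpressure. A minimal sketch in plain Java, not Lucene code and with hypothetical names, of blocking document additions once a background merger lags by more than a fixed number of buffered documents:]

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class BackpressureSketch {
    /** Feeds nDocs through a bounded buffer drained by a background
     *  "merge" thread; returns how many documents the merger consumed.
     *  put() blocks whenever the merger lags by more than capacity,
     *  amortizing the slowdown across additions. */
    static int runPipeline(int nDocs, int capacity) throws InterruptedException {
        BlockingQueue<String> buffered = new ArrayBlockingQueue<>(capacity);
        AtomicInteger merged = new AtomicInteger();
        Thread merger = new Thread(() -> {
            try {
                for (int i = 0; i < nDocs; i++) {
                    buffered.take();          // drain one buffered doc
                    merged.incrementAndGet(); // count it as merged
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        merger.start();
        for (int i = 0; i < nDocs; i++) {
            buffered.put("doc" + i);          // blocks when merger is behind
        }
        merger.join();
        return merged.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runPipeline(20, 4)); // prints 20
    }
}
```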
>>
>> I wonder what it means for current API semantics -
>>
>> - An application today can set max-buffered-docs to N, and after
>> the Nth (or N+1th?) call to addDoc returns, a newly opened searcher
>> would see these docs. With merges in a background thread this
>> might not hold.
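[Editor's note: the visibility contract described above can be pictured with a toy writer, not Lucene's IndexWriter and with hypothetical names, that flushes to a "segment" synchronously every N adds; a newly opened searcher sees only flushed docs, which is exactly the guarantee a background merge thread would weaken:]

```java
import java.util.ArrayList;
import java.util.List;

public class ToyWriter {
    private final int maxBufferedDocs;
    private final List<String> buffer = new ArrayList<>();
    private final List<String> flushed = new ArrayList<>(); // on-"disk" segment

    ToyWriter(int maxBufferedDocs) { this.maxBufferedDocs = maxBufferedDocs; }

    void addDocument(String doc) {
        buffer.add(doc);
        // synchronous flush on the Nth add -- the behavior a
        // background merge thread would no longer guarantee
        if (buffer.size() >= maxBufferedDocs) flush();
    }

    void flush() { flushed.addAll(buffer); buffer.clear(); }

    /** What a newly opened searcher sees: only flushed docs. */
    List<String> openSearcher() { return new ArrayList<>(flushed); }

    public static void main(String[] args) {
        ToyWriter w = new ToyWriter(3);
        w.addDocument("a"); w.addDocument("b");
        System.out.println(w.openSearcher().size()); // 0: nothing flushed yet
        w.addDocument("c");                          // Nth add triggers flush
        System.out.println(w.openSearcher().size()); // 3: all visible
    }
}
```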
>>
>> - Today, after add(), an application can call flush() or close(),
>> but with a background merge thread these calls would be blocked.
>> Mmm... this is probably not a behavior change, because today
>> these operations can trigger a merge that would take a long(er) time.
>
> We shouldn't advertise or guarantee that behavior.  This wasn't even
> true before the new merge policy was implemented.
>
>> - numRamDocs() and ramSizeInBytes() - not sure what they mean
>> once a background merge thread has started.
>
> IMO, for the current "batch" of documents being buffered.
> The "old" buffered documents should be flushed to disk ASAP.
>
>> Still, having non blocking adds is compelling.
>
> Somewhat... It would result in some performance increase...
> overlapping analysis of new documents with merging of other segments,
> resulting in a higher CPU utilization (esp on multi-processor
> systems).  The larger the maxBufferedDocs, the better.
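[Editor's note: the overlap described above, analysis of new documents running concurrently with merging of older segments, can be sketched in plain Java with hypothetical names; "analysis" is stubbed as uppercasing while the previous batch "merges" on a background thread:]

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class OverlapSketch {
    /** "Analyzes" the next batch on the calling thread while the
     *  previous batch "merges" asynchronously, then waits for the
     *  merge before returning the analyzed documents. */
    static List<String> analyzeWhileMerging(List<String> prev, List<String> next) {
        CompletableFuture<Void> merge = CompletableFuture.runAsync(() -> {
            // stand-in for merging the already-flushed previous batch
            prev.forEach(String::intern);
        });
        // analysis of new documents overlaps with the merge above
        List<String> analyzed = next.stream().map(String::toUpperCase).toList();
        merge.join(); // wait for the background merge to finish
        return analyzed;
    }

    public static void main(String[] args) {
        System.out.println(analyzeWhileMerging(List.of("a"), List.of("b", "c")));
    }
}
```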
>
> The downside is another complexity increase though.
>
> -Yonik
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>



