Date: Wed, 6 Sep 2006 11:26:41 -0400
From: "Ning Li"
To: java-dev@lucene.apache.org
Subject: Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)

> > "Less than M number of segments whose doc count n satisfies
> > B*(M^c) <= n < B*(M^(c+1)) for any c >= 0."
> > In other words, less than M number of segments with the same f(n).
>
> Ah, I had missed that. But I don't believe that lucene currently
> obeys this in all cases.

I think it does hold for n >= B, i.e. c >= 0. But not for n < B.

> The new IndexWriter changes add an additional constraint: to delete
> documents efficiently, the first merge must be on buffered documents
> only to ensure that ids don't change. We should also explore changing
> the index invariants to accommodate this.
>
> Do you have any ideas in this area? Is a monotonically decreasing
> segment level (your f(n)) really required?

Currently, the first merge always starts on buffered documents. Do you
want this constraint to be reflected in the index invariants, or do you
want to remove this constraint?

In any case, a monotonically decreasing f(n) is definitely a good thing.
Otherwise, cases like a sandwich (segments with small f(n) sandwiched
by two segments with large f(n)) make it even harder to come up with a
robust merge policy.

> > So between B-sum(L) and B? Once there are M segments with
> > docs less than B, they'll be merged. But what if L=0? Should B ram
> > docs be accumulated before flushed in that case?
>
> It seems like it. Examples are easier to visualize sometimes... do
> you have an example where this wouldn't be advisable?

I'm ok with it. I simply wish there were one strategy that would work
for both cases.
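
To make the invariants above concrete, here is a rough sketch of how they could be checked. It is not code from Lucene or from the LUCENE-565 patch. The class and method names are made up for illustration. It assumes B is the buffered-docs threshold and M the merge factor, that segments are listed oldest first, that segments with n < B get level -1 and are left out of the per-level count, and that "monotonically decreasing" means the level never goes up from one segment to the next.

```java
// Hypothetical helper for illustration only; none of these names exist in Lucene.
final class SegmentLevels {
    private SegmentLevels() {}

    /**
     * Level f(n) of a segment with n docs: the c >= 0 such that
     * B*M^c <= n < B*M^(c+1). Returns -1 when n < B (an assumption of
     * this sketch, since the invariant is only stated for c >= 0).
     */
    static int level(long n, long b, int m) {
        if (b < 1 || m < 2) {
            throw new IllegalArgumentException("need B >= 1 and M >= 2");
        }
        if (n < b) {
            return -1;
        }
        // c = floor(log_M(floor(n / B))), computed by repeated division
        // so that no intermediate product can overflow.
        long q = n / b;
        int c = 0;
        while (q >= m) {
            q /= m;
            c++;
        }
        return c;
    }

    /** True if fewer than M segments share each level c >= 0. */
    static boolean fewerThanMPerLevel(long[] docCounts, long b, int m) {
        int[] counts = new int[64]; // a long doc count never reaches level 64
        for (long n : docCounts) {
            int c = level(n, b, m);
            if (c >= 0 && ++counts[c] >= m) {
                return false;
            }
        }
        return true;
    }

    /** True if the level never increases from one segment to the next. */
    static boolean levelsNonIncreasing(long[] docCounts, long b, int m) {
        for (int i = 1; i < docCounts.length; i++) {
            if (level(docCounts[i], b, m) > level(docCounts[i - 1], b, m)) {
                return false;
            }
        }
        return true;
    }
}
```

With B = 10 and M = 10, the doc counts {1000, 100, 100, 10} have levels 2, 1, 1, 0 and pass both checks. The counts {1000, 10, 1000} are the "sandwich" case (levels 2, 0, 2) and fail levelsNonIncreasing.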