lucene-dev mailing list archives

From "Ning Li" <ning.li...@gmail.com>
Subject Re: [jira] Commented: (LUCENE-565) Supporting deleteDocuments in IndexWriter (Code and Performance Results Provided)
Date Tue, 12 Sep 2006 14:21:10 GMT
The new code does handle the case.

After mergeSegments(...) in maybeMergeSegments(), there is the following code:
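          // the mergeFactor segments just merged are no longer candidates
          numSegments -= mergeFactor;
          if (docCount > upperBound) {
            // merged segment is too big for this level: move past it
            minSegment++;
            exceedsUpperLimit = true;
          } else if (docCount > 0) {
            // merged segment still fits this level: count it again
            numSegments++;
          }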
Assume segment sizes = 30, 30, 30, 30, 30 just before the merge of the
leftmost 3 segments:
  - If the merge results in segment sizes = 60, 30, 30, i.e. the
merged size is greater than upperBound, continue with the rest of the
merge-worthy segments on this level. In this case only two are left,
so there is no further merge on this level.
  - If the merge results in segment sizes = 2, 30, 30, i.e. the merged
size is less than or equal to upperBound, consider this segment again
for further merges on this same level. In this case there are three
segments, so they are merged.
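
The two outcomes above can be replayed with a small stand-alone
simulation. This is only a sketch, not the code from the patch: the
class MergeBookkeepingSketch, the method mergeLevel() and the
mergedSizes parameter are hypothetical, minSegment is simplified to
"index of the first candidate on this level", and the size of each
merge result is passed in by hand rather than computed from deletions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the bookkeeping described above; not the patch code.
class MergeBookkeepingSketch {

    /**
     * Repeatedly merges the leftmost mergeFactor candidates on one level.
     * mergedSizes gives the doc count of each merge result, in order
     * (an assumption of this sketch; real code gets it from the merge).
     */
    static List<Integer> mergeLevel(List<Integer> sizes, int mergeFactor,
                                    int upperBound, int[] mergedSizes) {
        List<Integer> segs = new ArrayList<>(sizes);
        int minSegment = 0;            // index of the first candidate
        int numSegments = segs.size(); // candidates left on this level
        int next = 0;

        while (numSegments >= mergeFactor && next < mergedSizes.length) {
            int docCount = mergedSizes[next++];

            // Replace the merged segments with the (possibly empty) result.
            for (int i = 0; i < mergeFactor; i++) {
                segs.remove(minSegment);
            }
            if (docCount > 0) {
                segs.add(minSegment, docCount);
            }

            // Same bookkeeping as the snippet quoted in this message.
            numSegments -= mergeFactor;
            if (docCount > upperBound) {
                minSegment++;          // too big for this level, skip it
            } else if (docCount > 0) {
                numSegments++;         // still on this level, reconsider it
            }
        }
        return segs;
    }

    public static void main(String[] args) {
        List<Integer> start = Arrays.asList(30, 30, 30, 30, 30);
        // Merge result exceeds upperBound: two candidates left, no more merges.
        System.out.println(mergeLevel(start, 3, 30, new int[] {60}));
        // Merge result drops to 2: it is reconsidered and merged again.
        System.out.println(mergeLevel(start, 3, 30, new int[] {2, 62}));
    }
}
```

With mergeFactor 3 and upperBound 30, the first call stops at
60, 30, 30; the second goes through 2, 30, 30 and then merges those
three segments as well.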


Also worth noting is the while loop that finds merge-worthy segments
in maybeMergeSegments(). Merge-worthy segments start from the
rightmost segment whose doc count is in bounds, and end before the
rightmost segment whose doc count is greater than upperBound. So the
first merge-worthy segment found is in bounds and all the others are
<= upperBound, but the others are not necessarily > lowerBound. This
only happens if parameters such as mergeFactor and maxBufferedDocs
change.
So if segment sizes = 2, 30, 30 when the loop is executed with
lowerBound 10 and upperBound 30, all three will be considered
merge-worthy and be merged.
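
A minimal sketch of that right-to-left scan, written from the
description above rather than copied from the patch. The class
MergeWorthySketch and the method findMergeWorthy() are hypothetical
names, and "in bounds" is assumed to mean lowerBound < docCount <=
upperBound.

```java
import java.util.Arrays;

// Hypothetical sketch of the merge-worthy scan; not the patch code.
class MergeWorthySketch {

    /**
     * Scans doc counts from right to left and returns {first, last}
     * (inclusive indexes) of the merge-worthy segments, or null if no
     * segment is in bounds.
     */
    static int[] findMergeWorthy(int[] docCounts, int lowerBound, int upperBound) {
        int last = -1;                // rightmost in-bounds segment, once found
        int i = docCounts.length;
        while (--i >= 0) {
            int n = docCounts[i];
            if (last == -1 && n > lowerBound && n <= upperBound) {
                last = i;             // start of the merge-worthy run
            } else if (n > upperBound) {
                break;                // run ends before this segment
            }
            // Segments <= lowerBound to the left of 'last' stay in the run.
        }
        if (last == -1) {
            return null;
        }
        return new int[] {i + 1, last};
    }

    public static void main(String[] args) {
        // The example from this message: all three segments qualify.
        System.out.println(Arrays.toString(
            findMergeWorthy(new int[] {2, 30, 30}, 10, 30)));
    }
}
```

For segment sizes 2, 30, 30 with lowerBound 10 and upperBound 30 the
scan returns indexes 0 through 2, so the undersized segment is
included in the merge.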


In TestIndexWriterMergePolicy, testMaxBufferedDocsChange() (in fact,
both mergeFactor and maxBufferedDocs are changed in this test case)
tests both scenarios.


Ning


On 9/11/06, Yonik Seeley <yonik@apache.org> wrote:
> A strange case I just thought of.
> Does the new code handle the case where a merge can drop the resulting
> segment "down a level" (due to deletions)?
>
> Example: M=3, B=10, maxMergeDocs=30
>
> 1) segment sizes = 30, 30, 30, 30
> 2) set maxMergeDocs=1000000
> 3) add enough docs to cause a merge
> 4) the leftmost 3 segments will be merged first, resulting in segment
> sizes = 2, 30, n
>
> -Yonik
> http://incubator.apache.org/solr Solr, the open-source Lucene search server
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>
