lucene-dev mailing list archives

From "Tim Smith (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1076) Allow MergePolicy to select non-contiguous merges
Date Tue, 21 Jul 2009 21:17:14 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733831#action_12733831 ]

Tim Smith commented on LUCENE-1076:
-----------------------------------

I suppose you could do a preliminary round of merging that merges together segments that
share doc store/term vector data.

Once this preliminary round of merging is done, you would no longer need to slice the doc
stores up; you could just merge them together. Contiguous or non-contiguous wouldn't matter
anymore. However, if a "segmented session" still exists higher up, it would prevent you from
selecting those segments, or any newer segments.
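
A minimal sketch of how such a preliminary round might pick its merges, assuming each
segment records the name of the doc store it shares. The Seg class, its fields, and
preliminaryMerges are hypothetical names for illustration only; this is not Lucene's API
and not the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DocStoreGrouping {
    /** Hypothetical stand-in for a segment: its name and the doc store it writes into. */
    static final class Seg {
        final String name;
        final String docStore;

        Seg(String name, String docStore) {
            this.name = name;
            this.docStore = docStore;
        }
    }

    /**
     * Groups segment names by shared doc store, preserving index order.
     * Only groups with two or more segments need the preliminary merge.
     */
    static List<List<String>> preliminaryMerges(List<Seg> segments) {
        Map<String, List<String>> byStore = new LinkedHashMap<>();
        for (Seg s : segments) {
            byStore.computeIfAbsent(s.docStore, k -> new ArrayList<>()).add(s.name);
        }
        List<List<String>> merges = new ArrayList<>();
        for (List<String> group : byStore.values()) {
            if (group.size() > 1) {
                merges.add(group);
            }
        }
        return merges;
    }

    public static void main(String[] args) {
        List<Seg> segs = Arrays.asList(
            new Seg("_0", "_0"), new Seg("_1", "_0"),   // two flushes sharing doc store _0
            new Seg("_2", "_2"),                        // private doc store, nothing to do
            new Seg("_3", "_3"), new Seg("_4", "_3"));  // two flushes sharing doc store _3
        System.out.println(preliminaryMerges(segs));    // prints [[_0, _1], [_3, _4]]
    }
}
```

After each group is merged, every remaining segment would own its doc store outright, which
is the precondition described above for merging without slicing.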

It might even be desirable to have "commit()" optionally perform this merging before the
commit finishes, as this would result in each commit producing one segment, regardless of the
number of flushes that were done under the hood.

> Allow MergePolicy to select non-contiguous merges
> -------------------------------------------------
>
>                 Key: LUCENE-1076
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1076
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.3
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>         Attachments: LUCENE-1076.patch
>
>
> I started work on this but with LUCENE-1044 I won't make much progress
> on it for a while, so I want to checkpoint my current state/patch.
> For backwards compatibility we must leave the default MergePolicy as
> selecting contiguous merges.  This is necessary because some
> applications rely on "temporal monotonicity" of doc IDs, which means
> even though merges can re-number documents, the renumbering will
> always reflect the order in which the documents were added to the
> index.
> Still, for those apps that do not rely on this, we should offer a
> MergePolicy that is free to select the best merges regardless of
> whether they are contiguous.  This requires fixing IndexWriter to
> accept such a merge, and fixing LogMergePolicy to optionally
> select non-contiguous merges.
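
The "temporal monotonicity" concern in the description can be illustrated with a toy model.
This is only a sketch under stated assumptions: each document is represented by the sequence
number at which it was added, and the merged segment is assumed to take the position of the
first selected segment. The class and method names are hypothetical and do not come from
Lucene or from the attached patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MergeOrderSketch {
    /**
     * Replaces the selected segments (positions given in ascending order) with one merged
     * segment, placed at the first selected position (an assumption of this sketch).
     */
    static List<List<Integer>> merge(List<List<Integer>> segments, int... selected) {
        List<Integer> merged = new ArrayList<>();
        for (int i : selected) {
            merged.addAll(segments.get(i));
        }
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < segments.size(); i++) {
            if (i == selected[0]) {
                result.add(merged);
            } else if (!contains(selected, i)) {
                result.add(segments.get(i));
            }
        }
        return result;
    }

    static boolean contains(int[] xs, int x) {
        for (int v : xs) {
            if (v == x) {
                return true;
            }
        }
        return false;
    }

    /** True if reading the segments in order yields documents in the order they were added. */
    static boolean isTemporallyMonotonic(List<List<Integer>> segments) {
        int prev = Integer.MIN_VALUE;
        for (List<Integer> seg : segments) {
            for (int addOrder : seg) {
                if (addOrder < prev) {
                    return false;
                }
                prev = addOrder;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<Integer>> index = Arrays.asList(
            Arrays.asList(0, 1), Arrays.asList(2, 3), Arrays.asList(4, 5));
        // Contiguous merge of segments 0 and 1 keeps addition order.
        System.out.println(isTemporallyMonotonic(merge(index, 0, 1))); // prints true
        // Non-contiguous merge of segments 0 and 2 moves docs 4 and 5 ahead of 2 and 3.
        System.out.println(isTemporallyMonotonic(merge(index, 0, 2))); // prints false
    }
}
```

In this model a non-contiguous merge breaks the ordering wherever the merged segment is
placed, which is why such merges would have to be opt-in.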

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

