lucene-dev mailing list archives

From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4560) Support Filtering Segments During Merge
Date Mon, 19 Nov 2012 06:40:58 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500050#comment-13500050 ]

Shai Erera commented on LUCENE-4560:
------------------------------------

I don't see why you need to migrate segments gradually, rather than as a one-off step that
you perform when you decide to change the format of the index.

Is it really important to do this gradually? Or would a tool that lets you do an in-place
upgrade of certain segments be something to consider?
For instance, you can do something similar to:

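{code}
Directory dir; // directory with indexed documents
IndexWriter writer; // your instance of IndexWriter, opened on dir
IndexReader r = YourIndexReader.open(dir); // your own reader over the existing segments
writer.deleteAll();   // drop the current contents (not visible to searches until commit)
writer.addIndexes(r); // re-add everything through your reader
writer.commit();      // publish the rewritten index in one step
r.close();            // release the reader once the commit is done
{code}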

This is all transactional, so until you commit, searches don't see any of this work.

Note however that while it's seemingly done in-place, you need to copy all the segments, even
if they don't need to be upgraded.

I guess that I just can't think of a good reason to do a gradual upgrade of segments. Whenever
I had to upgrade old indexes, it was done as a one-off process and that's it. E.g. IndexUpgrader
is such a tool -- upgrades the index in place.

Having said that, if others think that it might be useful to let one extend e.g. IndexWriter
by providing reader instances other than SegmentReader (currently hard-coded in IndexWriter),
then I prefer that route to the MergedSegmentFilter. Today it's SegmentMerger, tomorrow it
will be something
else. If we want to handle it, let's handle it from the root. SegmentMerger itself really
has nothing to do with filtering readers. That way, you could write something like IndexUpgrader
(or UpgradeMergePolicy) and upgrade the index as a one-off process, in place, touching only
needed segments.
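
To make "touching only needed segments" concrete, here is a toy sketch in plain Java. It does not use Lucene at all: the `Segment` class, its `formatVersion` field, and the `upgrade` method are hypothetical stand-ins, and "rewriting" a segment is modeled simply as creating a new object with the target version. It only illustrates the selection logic a one-off upgrade tool might apply, assuming each segment can report the format it was written in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: nothing here is a Lucene class or API.
public class SelectiveUpgradeSketch {

    /** Hypothetical stand-in for an index segment, tagged with its format version. */
    static final class Segment {
        final String name;
        final int formatVersion;

        Segment(String name, int formatVersion) {
            this.name = name;
            this.formatVersion = formatVersion;
        }
    }

    /**
     * Returns a new segment list in which only segments older than targetVersion
     * are rewritten; up-to-date segments are carried over as the same objects.
     */
    static List<Segment> upgrade(List<Segment> segments, int targetVersion) {
        List<Segment> result = new ArrayList<>();
        for (Segment s : segments) {
            if (s.formatVersion < targetVersion) {
                result.add(new Segment(s.name, targetVersion)); // "rewritten"
            } else {
                result.add(s); // untouched, no copy
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Segment> index = Arrays.asList(
                new Segment("_0", 3), new Segment("_1", 4), new Segment("_2", 3));
        List<Segment> upgraded = upgrade(index, 4);
        for (int i = 0; i < index.size(); i++) {
            boolean untouched = upgraded.get(i) == index.get(i);
            System.out.println(upgraded.get(i).name + " v" + upgraded.get(i).formatVersion
                    + (untouched ? " (untouched)" : " (rewritten)"));
        }
    }
}
```

The contrast with the deleteAll/addIndexes approach above is that segments already in the target format are reused rather than copied.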
                
> Support Filtering Segments During Merge
> ---------------------------------------
>
>                 Key: LUCENE-4560
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4560
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Tim Smith
>         Attachments: LUCENE-4560.patch
>
>
> Spun off from LUCENE-4557
> It is desirable to be able to filter segments during merge.
> Most often, full reindex of content is not possible.
> Merging segments can sometimes have negative consequences when fields have different
> options (the most restrictive option is forced during merge)
> Being able to filter segments during merges will allow gradually migrating indexed data
> to new index settings, and support pruning/enhancing existing data gradually
> Use Cases:
> * Migrate IndexOptions for fields (See LUCENE-4557)
> * Gradually Remove index fields no longer used
> * Migrate indexed sort fields to DocValues
> * Support converting data types for indexed data
> * and so on
> patch will be forthcoming

