lucene-dev mailing list archives

From "Adrien Grand (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4752) Merge segments to sort them
Date Wed, 06 Mar 2013 12:00:17 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13594638#comment-13594638 ]

Adrien Grand commented on LUCENE-4752:
--------------------------------------

bq.  I took a look at MP API, and maybe if we change OneMerge from holding a List<SegmentReader>
to List<AtomicReader>, we could write an MP which sorts segments together by opening
a SortingAtomicReader over the segments that were picked for merge?

Making the sorting stuff part of MergePolicy makes sense. However, I think that the (package-private)
List<SegmentReader> in MergePolicy is only used to track the segment readers in use while
merging (this reference is only used in IndexWriter). What MP actually manipulates is a list
of SegmentInfoPerCommit. It is possible that no reader is open for a segment when MergePolicy
picks it, and Lucene should not force a reader to be open until the merge actually starts.
So maybe we should have an additional method in MergePolicy (or in OneMerge, for finer-grained
control?) to tell IndexWriter how to view a list of segment readers: either sequentially,
as today, or as a sorted view, as suggested in this issue description?
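
To make the suggestion concrete, here is a rough sketch of what such a hook could look like.
It is not Lucene code: SegmentView, DocRef, MergeView and MergeViews are hypothetical
stand-ins invented for illustration, and a "reader" is reduced to an array of per-document
sort keys. The point is only that the merge picks the segments, while a separate method
decides in which order the writer sees their documents.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a segment reader: one sort key per document. */
final class SegmentView {
    final String name;
    final long[] sortKeys; // sortKeys[docId] = value to sort by, e.g. a timestamp

    SegmentView(String name, long... sortKeys) {
        this.name = name;
        this.sortKeys = sortKeys;
    }
}

/** A document identified by its segment name and its doc ID inside that segment. */
final class DocRef {
    final String segment;
    final int docId;
    final long key;

    DocRef(String segment, int docId, long key) {
        this.segment = segment;
        this.docId = docId;
        this.key = key;
    }

    @Override
    public String toString() {
        return segment + ":" + docId;
    }
}

/** Hypothetical hook: how the writer should view the readers of one merge. */
interface MergeView {
    List<DocRef> docOrder(List<SegmentView> readers);
}

final class MergeViews {
    private MergeViews() {}

    /** Sequential view: segments are concatenated one after another. */
    static final MergeView SEQUENTIAL = readers -> {
        List<DocRef> docs = new ArrayList<>();
        for (SegmentView reader : readers) {
            for (int docId = 0; docId < reader.sortKeys.length; docId++) {
                docs.add(new DocRef(reader.name, docId, reader.sortKeys[docId]));
            }
        }
        return docs;
    };

    /** Sorted view: documents of all segments ordered by key (stable for ties). */
    static final MergeView SORTED = readers -> {
        List<DocRef> docs = new ArrayList<>(SEQUENTIAL.docOrder(readers));
        docs.sort(Comparator.comparingLong(d -> d.key));
        return docs;
    };
}
```

With segment "a" holding keys (30, 10) and segment "b" holding key (20), the sequential view
yields [a:0, a:1, b:0] and the sorted view yields [a:1, b:0, a:0]. Note that the views take
readers as input, so nothing here requires a reader to be open when the segments are picked.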



                
> Merge segments to sort them
> ---------------------------
>
>                 Key: LUCENE-4752
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4752
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: core/index
>            Reporter: David Smiley
>            Assignee: Adrien Grand
>
> It would be awesome if Lucene could write the documents out in a segment based on a configurable
> order.  This of course applies to merging segments too. The benefit is increased locality on
> disk of documents that are likely to be accessed together.  This often applies to documents
> near each other in time, but also spatially.

