lucene-java-user mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: segments and sorting
Date Wed, 19 Jun 2013 08:10:17 GMT
Hi,

On Wed, Jun 19, 2013 at 12:16 AM, Sriram Sankar <sankar@gmail.com> wrote:
> Is it possible to do this more efficiently using a merge sort?  Assuming
> the individual segments are already sorted, is there a wrapper that I can
> use where I can pass the same sorting function?  I'm guessing the
> SlowCompositeReaderWrapper does not assume that the individual segments are
> already sorted and therefore would repeat the work?

Given that online sorting is rather new to Lucene, we tried to keep it
simple. Merging sorted segments in parallel by maintaining a priority
queue is totally doable and is probably one of the next steps for online
sorting, but it would require some non-trivial work to reimplement
merging for all formats (postings lists especially) and to make it
possible to plug a custom SegmentMerger into the IndexWriter.
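
To illustrate the priority-queue idea only, here is a minimal sketch in
plain Java. It does not use any Lucene API: the class SortedSegmentMerge,
the Cursor helper and the long[] "segments" are hypothetical stand-ins for
per-segment sort values, and a real implementation would have to merge
postings lists and every other index format as well.

```java
import java.util.List;
import java.util.PriorityQueue;

public class SortedSegmentMerge {

    /** Hypothetical cursor over the sort values of one already-sorted segment. */
    private static final class Cursor {
        final long[] values;
        int pos = 0;

        Cursor(long[] values) {
            this.values = values;
        }

        long current() {
            return values[pos];
        }
    }

    /** K-way merge: each input array is assumed to be sorted in ascending order. */
    static long[] merge(List<long[]> segments) {
        // The queue always exposes the cursor whose current value is smallest.
        PriorityQueue<Cursor> queue =
            new PriorityQueue<>((a, b) -> Long.compare(a.current(), b.current()));
        int total = 0;
        for (long[] segment : segments) {
            total += segment.length;
            if (segment.length > 0) {
                queue.add(new Cursor(segment));
            }
        }

        long[] merged = new long[total];
        int out = 0;
        while (!queue.isEmpty()) {
            Cursor top = queue.poll();
            merged[out++] = top.current();
            // Re-insert the cursor if its segment still has values left.
            if (++top.pos < top.values.length) {
                queue.add(top);
            }
        }
        return merged;
    }
}
```

With k segments and n documents in total, this does O(n log k) comparisons
instead of re-sorting all n documents from scratch.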

For now, we use TimSort both to compute the mapping between the old and
the new doc IDs and to sort all individual postings lists. Since TimSort
takes advantage of runs that are already sorted, this makes sorting a
SlowCompositeReaderWrapper which wraps several sorted segments faster
than sorting an arbitrary AtomicReader.
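
As a rough sketch of the doc ID mapping step (not the actual Lucene code),
the example below sorts doc IDs by a per-document sort value. The class
DocMapSketch and the sortValues array are hypothetical, and it assumes
that Arrays.sort on an object array is a TimSort, which is the case in
OpenJDK 7 and later; Lucene ships its own sorter implementation.

```java
import java.util.Arrays;

public class DocMapSketch {

    /**
     * Returns oldToNew, where oldToNew[oldDocId] is the doc ID that the
     * document gets once all documents are ordered by sortValues.
     */
    static int[] oldToNew(long[] sortValues) {
        // Start from the identity order: position i holds old doc ID i.
        Integer[] newToOld = new Integer[sortValues.length];
        for (int i = 0; i < newToOld.length; i++) {
            newToOld[i] = i;
        }

        // Stable sort of doc IDs by their sort value. When the input is a
        // concatenation of sorted segments, each segment is one long run,
        // so TimSort mostly merges runs instead of sorting from scratch.
        Arrays.sort(newToOld, (a, b) -> Long.compare(sortValues[a], sortValues[b]));

        // Invert the permutation to get the old -> new mapping.
        int[] oldToNew = new int[sortValues.length];
        for (int newId = 0; newId < newToOld.length; newId++) {
            oldToNew[newToOld[newId]] = newId;
        }
        return oldToNew;
    }
}
```

For example, passing the sort values of two sorted segments back to back,
{5, 7, 9} followed by {1, 6, 8}, gives the sorter exactly two runs to merge.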

--
Adrien
