lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Adrien Grand (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (LUCENE-4839) Sorter API: Use TimSort to sort doc IDs and postings lists
Date Sat, 16 Mar 2013 15:46:12 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Adrien Grand updated LUCENE-4839:
---------------------------------

    Attachment: LUCENE-4839.patch

bq. Nice! Why do we need the private inner class TimSort?

It's no needed but my first patch (not uploaded) did not use a helper class and was hard to
read, so I think this is better this way?

bq. I would be happy to also add the timSort algorithm to ArrayUtils and CollectionUtils.

Done in the patch.

bq. The bonus would be: The extensive random tests in TestArrayUtils and TestCollectionUtils
could be used for timSort, too (their existence is the reason why there is no TestSorterTemplate
class in current code).

Done.
                
> Sorter API: Use TimSort to sort doc IDs and postings lists
> ----------------------------------------------------------
>
>                 Key: LUCENE-4839
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4839
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-4839.patch, LUCENE-4839.patch
>
>
> TimSort (http://svn.python.org/projects/python/trunk/Objects/listsort.txt, used by python
and Java's Arrays.sort(Object[]) in particular) is a sorting algorithm that performs very
well on partially-sorted data. Indeed, with TimSort, sorting an array which is in reverse
order or a finite concatenation of sorted arrays is a linear operation (instead of O(n ln(n))).
> The sorter API could benefit from this algorithm when using Sorter.REVERSE_DOCS or merging
several sorted readers for example.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message