hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Arkady Borkovsky <ark...@yahoo-inc.com>
Subject Re: [jira] Commented: (HADOOP-587) SequenceFile sort should use quicksort instead of merge sort for sorting runs.
Date Mon, 09 Oct 2006 23:11:16 GMT
This need to be done with some caution as
quicksort may take quadratic time when there are too few keys with  
uneven distribution.
At least this used to be true for libc when I used it.
Merge sort is guaranteed N log(N).

-- ab

On Oct 9, 2006, at 11:06 AM, Benjamin Reed (JIRA) wrote:

>     [  
> http://issues.apache.org/jira/browse/HADOOP-587? 
> page=comments#action_12440951 ]
>
> Benjamin Reed commented on HADOOP-587:
> --------------------------------------
>
> This is a duplicate of HADOOP-287
>
>> SequenceFile sort should use quicksort instead of merge sort for  
>> sorting runs.
>> ---------------------------------------------------------------------- 
>> --------
>>
>>                 Key: HADOOP-587
>>                 URL: http://issues.apache.org/jira/browse/HADOOP-587
>>             Project: Hadoop
>>          Issue Type: Improvement
>>          Components: mapred
>>            Reporter: Runping Qi
>>            Priority: Minor
>>
>> I noticed that Hadoop uses mergesort for sorting the runs in the  
>> sequence file SortPass class.
>> I think quicksort should be faster and should be used instead.
>> I also noticed that Java's Arrays.sort() static functions for  
>> non-primitive element types also use merge sort.
>> Are there any subtle reasons why not to use quicksort?
>> If no, we should implement out quicksort. Otherwise, we should use  
>> Java's sort, unless we believe our mergesort is better.
>
> -- 
> This message is automatically generated by JIRA.
> -
> If you think it was sent incorrectly contact one of the  
> administrators:  
> http://issues.apache.org/jira/secure/Administrators.jspa
> -
> For more information on JIRA, see:  
> http://www.atlassian.com/software/jira
>
>


Mime
View raw message