    [ https://issues.apache.org/jira/browse/HADOOP-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12602823#action_12602823 ]

Chris Douglas commented on HADOOP-3442:
---------------------------------------

Analysis of the data (thanks to everyone who provided their test cases) led us to consider the following degenerate case:

Consider a partition:
{noformat}
a_n, a_1, a_2, ... , a_n-2, a_n-1
{noformat}
Where {{a_1 ... a_n-1}} are sorted. The median of three partitioning will consider {{a_n}}, {{a_n/2}}, and {{a_n-1}} and select {{a_n-1}} as the pivot. While the sort runs:
{noformat}
a_n-1, a_1, a_2, ... , a_n-2, a_n
{noformat}
The left index will run all the way to {{a_n}} and swap the pivot into place, yielding the following:
{noformat}
a_n-2, a_1, a_2, ... , a_n-3, a_n-1, a_n
{noformat}
So the next partition will get:
{noformat}
a_n-2, a_1, a_2, ... , a_n-4, a_n-3
{noformat}

So while sorted data will yield a series of optimal partitions, nearly sorted data like this can cause the sort to fall into a degenerate case. Among the suggestions to ameliorate this:

# Consider the median and two random offsets for the median-of-three partitioning (or three random offsets, etc.)
# Always pick a random pivot
# After swapping the pivot into place, swap what it replaced into a random position in the left partition

Randomizing the input data makes this case far less common and Introsort regards it as an inevitable, degenerate case; both are also sound additions.

> QuickSort may get into unbounded recursion
> ------------------------------------------
>
>                 Key: HADOOP-3442
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3442
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Chris Douglas
>         Attachments: 3442-0.patch, 3442-0v17.patch, CheckSortBuffer.java, HADOOP-3442.patch, overflow.zip, spillbuffers.patch
>
>

-- 
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
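
The degenerate case described in the comment can be reproduced with a small standalone sketch. This is not the Hadoop {{QuickSort}} code and not any of the attached patches; the class {{DegenerateQuickSort}}, its helper names, and the input size are all hypothetical, chosen only to illustrate the argument. It assumes a textbook partition that moves the median of the first, middle, and last elements to the front, scans inward from both ends, and then swaps the pivot into place. Passing a {{Random}} switches to the "always pick a random pivot" suggestion for comparison.

```java
import java.util.Random;

// Illustrative sketch only: a plain recursive quicksort that reports how deep
// the recursion went. Not the Hadoop implementation.
class DegenerateQuickSort {

    static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    // Index (one of i, j, k) holding the median of a[i], a[j], a[k].
    static int medianIndex(int[] a, int i, int j, int k) {
        if (a[i] < a[j]) {
            if (a[j] < a[k]) return j;
            return a[i] < a[k] ? k : i;
        }
        if (a[i] < a[k]) return i;
        return a[j] < a[k] ? k : j;
    }

    // Sorts a[lo..hi] (inclusive) and returns the recursion depth reached.
    // rnd == null: median-of-three pivot; otherwise: uniformly random pivot.
    static int sort(int[] a, int lo, int hi, Random rnd) {
        if (hi - lo < 1) return 0; // zero or one element
        int p = (rnd == null)
                ? medianIndex(a, lo, lo + (hi - lo) / 2, hi)
                : lo + rnd.nextInt(hi - lo + 1);
        swap(a, lo, p); // park the pivot at the front
        int pivot = a[lo];
        int i = lo + 1, j = hi;
        while (true) {
            while (i <= j && a[i] < pivot) i++; // left index runs right
            while (i <= j && a[j] > pivot) j--; // right index runs left
            if (i >= j) break;
            swap(a, i++, j--);
        }
        swap(a, lo, j); // pivot into place; a[j]'s old value lands at lo
        int left = sort(a, lo, j - 1, rnd);
        int right = sort(a, j + 1, hi, rnd);
        return 1 + Math.max(left, right);
    }

    // Builds a_n, a_1, a_2, ..., a_n-1 with a_k = k.
    static int[] degenerateInput(int n) {
        int[] a = new int[n];
        a[0] = n;
        for (int i = 1; i < n; i++) a[i] = i;
        return a;
    }

    static boolean isSorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int n = 2000; // kept small so the deep recursion fits the default stack
        int[] a = degenerateInput(n);
        int[] b = degenerateInput(n);
        System.out.println("median-of-three depth: " + sort(a, 0, n - 1, null));
        System.out.println("random-pivot depth:    " + sort(b, 0, n - 1, new Random(42)));
    }
}
```

With this partition scheme each level strips off only the pivot and {{a_n}}, so the recursion depth on the degenerate input grows linearly with {{n}}, whereas fully sorted input of the same size splits near the middle at every level. The random pivot is only one of the three suggestions listed; the other two would change the pivot selection or the post-swap step rather than the partition loop.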