incubator-stdcxx-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Joshua Lehrer (JIRA)" <j...@apache.org>
Subject [jira] Commented: (STDCXX-397) std::sort introsort implementation error
Date Sun, 29 Apr 2007 21:20:15 GMT

    [ https://issues.apache.org/jira/browse/STDCXX-397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12492602
] 

Joshua Lehrer commented on STDCXX-397:
--------------------------------------

Martin,

It is very difficult to come up with a test case given the complexity of the issue.  I have
been trying and have yet to be able to.  I will keep looking.

However, if you change the selection of the __cut to be 'first' instead of the median of three
then sort an already sorted array, you will see the number of comparisons explode and O(N)
recursive calls.

Looking at the code, you can clearly see that the recursion on the right half never has __max_depth
decreased by half, the /2 is only ever applied to the left half.  Therefore, the algorithm
will detect too many recursive calls to the left, but not to the right.

I will report back if I can come up with an example using the median-of-three pivot selector.

-josh

> std::sort introsort implementation error
> ----------------------------------------
>
>                 Key: STDCXX-397
>                 URL: https://issues.apache.org/jira/browse/STDCXX-397
>             Project: C++ Standard Library
>          Issue Type: Bug
>          Components: 25. Algorithms
>    Affects Versions: 4.1.3
>         Environment: Bug in algorithm.cc effects all platforms
>            Reporter: Joshua Lehrer
>
> introsort is designed to detect when an input set would push quicksort into its worst-case
scenario N^2 and fall back on a slower, yet still NLogN algorithm.
> The implementation in __introsort_loop has a bug, however, and it fails to catch all
of the scenarios.  While I can not supply an exact input set to demonstrate the bug, I can
explain the bug very easily.
> First, allow me to paste in the code:
> // David R. Musser's Introspective Sorting algorithm
> // O(N * log (N)) worst case complexity
> _EXPORT
> template <class _RandomAccessIter, class _Dist, class _Compare>
> void __introsort_loop (_RandomAccessIter __first, _RandomAccessIter __last,
>                        _Dist __max_depth, _Compare __comp)
> {
>     for (; __last - __first > __rw_threshold; __max_depth /= 2) {
>         if (0 == __max_depth) {
>             __partial_sort (__first, __last, __last,
>                             _RWSTD_VALUE_TYPE (_RandomAccessIter), __comp);
>             break;
>         }
>         _RandomAccessIter __cut =
>             __unguarded_partition (__first, __last,
>                                    __median (*__first,
>                                              *(__first + (__last - __first) /2),
>                                              *(__last - 1), __comp), __comp);
>         // limit the depth of the recursion tree to log2 (last - first)
>         // where first and last are the initial values passed in from sort()
>         __introsort_loop (__cut, __last, __max_depth, __comp);
>         __last = __cut;
>     }
> }
> the variable '__max_depth' is supposed to be cut in half on each subsequent "recursive"
call.  Once it reaches zero, LogN recurisve calls have been made, and the algorithm falls
back on a different sorting algorithm for the remainder.
> The algorithm, as implemented, uses real recursion and tail recursion.
> First, the pivot is selected, the pivot is done, and the algorithm has a left and a right
half, hopefully balanced.
> Consider what happens for the LEFT half, which is done using tail recursion.  '__last'
gets assigned '__cut', then the code goes to the top of the 'for' loop.  The test condition
of the loop is run, which divides '__max_depth' by two, bringing it closer to zero.
> Now consider what happens for the RIGHT half, which is done using real recursion.  The
function is called recurisvely on the right.  __max_depth is NOT cut in half.
> What would happen if a poor pivot was selected causing the right half to be large and
the left half to be small?  What if that happens again and again?  The real-recursion case
is failing to decrement __max_depth until it starts working on the left half.  You can see
how if the algorithm continually built right-halves that were relatively large that __max_depth
never gets decremented, and the algorithm never detects that it has made LogN recurisve calls.
> I believe the proper fix is as follows:
> // David R. Musser's Introspective Sorting algorithm
> // O(N * log (N)) worst case complexity
> _EXPORT
> template <class _RandomAccessIter, class _Dist, class _Compare>
> void __introsort_loop (_RandomAccessIter __first, _RandomAccessIter __last,
>                        _Dist __max_depth, _Compare __comp)
> {
>     for (; __last - __first > __rw_threshold; ) {
>         if (0 == __max_depth) {
>             __partial_sort (__first, __last, __last,
>                             _RWSTD_VALUE_TYPE (_RandomAccessIter), __comp);
>             break;
>         }
>         _RandomAccessIter __cut =
>             __unguarded_partition (__first, __last,
>                                    __median (*__first,
>                                              *(__first + (__last - __first) /2),
>                                              *(__last - 1), __comp), __comp);
>         // limit the depth of the recursion tree to log2 (last - first)
>         // where first and last are the initial values passed in from sort()
>         __max_depth /= 2;
>         __introsort_loop (__cut, __last, __max_depth, __comp);
>         __last = __cut;
>     }
> }
> "__max_depth/=2" is removed from the "for" loop and placed just above the two recursive
calls.
> This fixes the worst-case sample set that I have generated.
> I look forward to your response,
> joshua lehrer
> http://www.lehrerfamily.com/

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message