From stdcxx-dev-return-3233-apmail-incubator-stdcxx-dev-archive=incubator.apache.org@incubator.apache.org Wed Apr 25 21:30:37 2007 Return-Path: Delivered-To: apmail-incubator-stdcxx-dev-archive@www.apache.org Received: (qmail 11901 invoked from network); 25 Apr 2007 21:30:36 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 25 Apr 2007 21:30:36 -0000 Received: (qmail 43659 invoked by uid 500); 25 Apr 2007 21:30:43 -0000 Delivered-To: apmail-incubator-stdcxx-dev-archive@incubator.apache.org Received: (qmail 43625 invoked by uid 500); 25 Apr 2007 21:30:43 -0000 Mailing-List: contact stdcxx-dev-help@incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: stdcxx-dev@incubator.apache.org Delivered-To: mailing list stdcxx-dev@incubator.apache.org Received: (qmail 43609 invoked by uid 99); 25 Apr 2007 21:30:42 -0000 Received: from herse.apache.org (HELO herse.apache.org) (140.211.11.133) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 25 Apr 2007 14:30:42 -0700 X-ASF-Spam-Status: No, hits=-100.0 required=10.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.4] (HELO brutus.apache.org) (140.211.11.4) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 25 Apr 2007 14:30:35 -0700 Received: from brutus (localhost [127.0.0.1]) by brutus.apache.org (Postfix) with ESMTP id 52E27714076 for ; Wed, 25 Apr 2007 14:30:15 -0700 (PDT) Message-ID: <15985785.1177536615336.JavaMail.jira@brutus> Date: Wed, 25 Apr 2007 14:30:15 -0700 (PDT) From: "Joshua Lehrer (JIRA)" To: stdcxx-dev@incubator.apache.org Subject: [jira] Commented: (STDCXX-397) std::sort introsort implementation error In-Reply-To: <12581384.1177047075317.JavaMail.jira@brutus> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org [ https://issues.apache.org/jira/browse/STDCXX-397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12491788 ] Joshua Lehrer commented on STDCXX-397: -------------------------------------- By the way, a killer set can be generated by: 1] change the selection of __cut to be 'first' instead of median of three 2] pass in an already sorted array The algorithm will explode the stack. Use my corrected code (still with the selection of __cut being "first") and sort the same already sorted array, and the algorithm will complete in NLogN time. -josh > std::sort introsort implementation error > ---------------------------------------- > > Key: STDCXX-397 > URL: https://issues.apache.org/jira/browse/STDCXX-397 > Project: C++ Standard Library > Issue Type: Bug > Components: 25. Algorithms > Affects Versions: 4.1.3 > Environment: Bug in algorithm.cc effects all platforms > Reporter: Joshua Lehrer > > introsort is designed to detect when an input set would push quicksort into its worst-case scenario N^2 and fall back on a slower, yet still NLogN algorithm. > The implementation in __introsort_loop has a bug, however, and it fails to catch all of the scenarios. While I can not supply an exact input set to demonstrate the bug, I can explain the bug very easily. > First, allow me to paste in the code: > // David R. Musser's Introspective Sorting algorithm > // O(N * log (N)) worst case complexity > _EXPORT > template > void __introsort_loop (_RandomAccessIter __first, _RandomAccessIter __last, > _Dist __max_depth, _Compare __comp) > { > for (; __last - __first > __rw_threshold; __max_depth /= 2) { > if (0 == __max_depth) { > __partial_sort (__first, __last, __last, > _RWSTD_VALUE_TYPE (_RandomAccessIter), __comp); > break; > } > _RandomAccessIter __cut = > __unguarded_partition (__first, __last, > __median (*__first, > *(__first + (__last - __first) /2), > *(__last - 1), __comp), __comp); > // limit the depth of the recursion tree to log2 (last - first) > // where first and last are the initial values passed in from sort() > __introsort_loop (__cut, __last, __max_depth, __comp); > __last = __cut; > } > } > the variable '__max_depth' is supposed to be cut in half on each subsequent "recursive" call. Once it reaches zero, LogN recurisve calls have been made, and the algorithm falls back on a different sorting algorithm for the remainder. > The algorithm, as implemented, uses real recursion and tail recursion. > First, the pivot is selected, the pivot is done, and the algorithm has a left and a right half, hopefully balanced. > Consider what happens for the LEFT half, which is done using tail recursion. '__last' gets assigned '__cut', then the code goes to the top of the 'for' loop. The test condition of the loop is run, which divides '__max_depth' by two, bringing it closer to zero. > Now consider what happens for the RIGHT half, which is done using real recursion. The function is called recurisvely on the right. __max_depth is NOT cut in half. > What would happen if a poor pivot was selected causing the right half to be large and the left half to be small? What if that happens again and again? The real-recursion case is failing to decrement __max_depth until it starts working on the left half. You can see how if the algorithm continually built right-halves that were relatively large that __max_depth never gets decremented, and the algorithm never detects that it has made LogN recurisve calls. > I believe the proper fix is as follows: > // David R. Musser's Introspective Sorting algorithm > // O(N * log (N)) worst case complexity > _EXPORT > template > void __introsort_loop (_RandomAccessIter __first, _RandomAccessIter __last, > _Dist __max_depth, _Compare __comp) > { > for (; __last - __first > __rw_threshold; ) { > if (0 == __max_depth) { > __partial_sort (__first, __last, __last, > _RWSTD_VALUE_TYPE (_RandomAccessIter), __comp); > break; > } > _RandomAccessIter __cut = > __unguarded_partition (__first, __last, > __median (*__first, > *(__first + (__last - __first) /2), > *(__last - 1), __comp), __comp); > // limit the depth of the recursion tree to log2 (last - first) > // where first and last are the initial values passed in from sort() > __max_depth /= 2; > __introsort_loop (__cut, __last, __max_depth, __comp); > __last = __cut; > } > } > "__max_depth/=2" is removed from the "for" loop and placed just above the two recursive calls. > This fixes the worst-case sample set that I have generated. > I look forward to your response, > joshua lehrer > http://www.lehrerfamily.com/ -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.