Return-Path: Delivered-To: apmail-hadoop-core-dev-archive@www.apache.org Received: (qmail 42106 invoked from network); 10 Jun 2008 07:38:42 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 10 Jun 2008 07:38:42 -0000 Received: (qmail 63265 invoked by uid 500); 10 Jun 2008 07:38:39 -0000 Delivered-To: apmail-hadoop-core-dev-archive@hadoop.apache.org Received: (qmail 63185 invoked by uid 500); 10 Jun 2008 07:38:39 -0000 Mailing-List: contact core-dev-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: core-dev@hadoop.apache.org Delivered-To: mailing list core-dev@hadoop.apache.org Received: (qmail 63107 invoked by uid 99); 10 Jun 2008 07:38:38 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 10 Jun 2008 00:38:38 -0700 X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 10 Jun 2008 07:37:57 +0000 Received: from brutus (localhost [127.0.0.1]) by brutus.apache.org (Postfix) with ESMTP id 5C5E7234C13C for ; Tue, 10 Jun 2008 00:37:45 -0700 (PDT) Message-ID: <1041417723.1213083465376.JavaMail.jira@brutus> Date: Tue, 10 Jun 2008 00:37:45 -0700 (PDT) From: "Chris Douglas (JIRA)" To: core-dev@hadoop.apache.org Subject: [jira] Updated: (HADOOP-3442) QuickSort may get into unbounded recursion In-Reply-To: <1674516953.1211579516126.JavaMail.jira@brutus> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org [ https://issues.apache.org/jira/browse/HADOOP-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Douglas updated HADOOP-3442: ---------------------------------- Attachment: 3442-3.patch bq. QuickSort.getMaxDepth can be more aggressive: since the index are int, we have n <= 2^32 and lg n <= 32. So, we could safely let QuickSort.getMaxDepth be something around (lg n)^3 or it should depend on the maximum size of system stack. The 2*log\(n) heuristic was intended to bail out of a worst case, not to protect against the StackOverflowError. The recursion on the system stack is already limited to log\(n), but quicksort will always have O(n^2) cases that this needs to handle. I'm happy to discuss an appropriate value for k, but k*log\(n) seems appropriate to this purpose. bq. HeapSort is a singleton. Define a static variable HeapSort.INSTANCE. Then, we don't have to new HeapSort() in QuickSort. bq. There are quite a few sorting classes/interfaces. Consider create a new package for them. Of course, you're right; all the algorithms should be singletons, and further, having sort algorithms as objects at all is not the best design. Still, given that MapTask remains the only user, adding factories, moving these into a separate package, etc. seems like overkill at this point. Would it be sufficient to add a singleton instance of HeapSort to QuickSort? bq. public methods HeapSort.sort(IndexedSortable s, int p, int r) and QuickSort.sort(...) need javadoc. OK. bq. Remove BatcherSort in this patch. Create a new issue if it is useful. Fair enough. BatcherSort is quick for small sets of data (particularly when compared to QuickSort, which uses an insertion sort for small partitions), but small datasets aren't exactly a typical use case. :) The quicksort implementation might also benefit from some adaptive partitioning, but that should probably be left to a separate issue. > QuickSort may get into unbounded recursion > ------------------------------------------ > > Key: HADOOP-3442 > URL: https://issues.apache.org/jira/browse/HADOOP-3442 > Project: Hadoop Core > Issue Type: Bug > Components: mapred > Affects Versions: 0.17.0 > Reporter: Runping Qi > Assignee: Chris Douglas > Priority: Blocker > Fix For: 0.17.1, 0.18.0 > > Attachments: 3442-0.patch, 3442-0v17.patch, 3442-1.patch, 3442-2.patch, 3442-3.patch, CheckSortBuffer.java, HADOOP-3442.patch, overflow.zip, spillbuffers.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.