From: "Doug Cutting (JIRA)"
To: hadoop-dev@lucene.apache.org
Date: Thu, 14 Sep 2006 15:30:25 -0700 (PDT)
Subject: [jira] Commented: (HADOOP-531) Need to sort on more than the primary key

    [ http://issues.apache.org/jira/browse/HADOOP-531?page=comments#action_12434835 ]

Doug Cutting commented on HADOOP-531:
-------------------------------------

So do HADOOP-485 and HADOOP-531 solve this issue? If so, then we should mark it duplicate, or at least dependent.

> Need to sort on more than the primary key
> -----------------------------------------
>
>                 Key: HADOOP-531
>                 URL: http://issues.apache.org/jira/browse/HADOOP-531
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/streaming
>    Affects Versions: 0.5.0
>            Reporter: Richard Kasperski
>
> There are many tasks where I need finer control over the ordering in the reduce than a sort on a single key provides. Most of these situations arise when I merge two sources of data and attach a single instance of one source to multiple instances of a second source. I know that I can read all the records with a single key, but there might be many millions of them, making memory demands that cannot be satisfied.
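
To illustrate the request, here is a minimal sketch of a reduce-side join that relies on a secondary sort. It is not the Hadoop or streaming API: the `(key, tag, value)` record layout, the `"A"`/`"B"` source tags, the sample data, and the function name `reduce_side_join` are all hypothetical, and Python's `sorted` stands in for the framework's sort. The point is only that when records arrive ordered by key and then by source, the reducer needs to hold one record per key rather than all of them.

```python
from itertools import groupby
from operator import itemgetter


def reduce_side_join(records):
    """Attach the single 'A' record of each key to its many 'B' records.

    `records` must already be sorted by (key, tag), so that the 'A' record
    of a key arrives before that key's 'B' records. Only the current 'A'
    value is kept in memory; 'B' records are streamed through one by one.
    """
    for key, group in groupby(records, key=itemgetter(0)):
        a_value = None
        for _, tag, value in group:
            if tag == "A":
                a_value = value            # single instance from the first source
            elif a_value is not None:
                yield key, a_value, value  # emit a joined record immediately


# Hypothetical map output, in arbitrary order.
mapped = [
    ("k2", "B", "order-7"),
    ("k1", "B", "order-1"),
    ("k1", "A", "alice"),
    ("k2", "A", "bob"),
    ("k1", "B", "order-2"),
]

# Stand-in for the framework sort: primary key first, then the source tag.
shuffled = sorted(mapped, key=itemgetter(0, 1))
joined = list(reduce_side_join(shuffled))
```

With a sort on the primary key alone, the `"A"` record could arrive anywhere within its group, and the reducer would have to buffer every `"B"` record seen before it.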