From: "charles du"
To: core-user@hadoop.apache.org
Date: Tue, 26 Aug 2008 00:39:33 -0700
Subject: questions on sorting big files and sorting order
Message-ID: <710ef8220808260039l68b8ab03u5e3fd306c47e9458@mail.gmail.com>

Hi all,

I would like to sort a large number of records in a big file by a given field (the key). With a single reducer this works fine, because the reducer sorts all records by key. To improve sorting performance, I would like to run multiple reducers. How can I guarantee the overall order of the records once they have been partitioned across different reducers?

Also, the default order is ascending. How can I program my reducer to output records in descending order? My key could be IntWritable or Text.

Thanks.

-- tp
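To make the question concrete, here is a minimal sketch of the idea behind a globally ordered multi-reducer sort. It is plain Java and deliberately does not use Hadoop's API. The class name `RangePartitionSketch`, the methods `partitionFor` and `sortWithPartitions`, and the hard-coded split points are all hypothetical names chosen for illustration. The assumption is that keys are range-partitioned, so every key sent to reducer i is smaller than every key sent to reducer i+1; concatenating the per-reducer outputs in partition order then gives a total order. In Hadoop this logic would presumably live in a custom `Partitioner` (its `getPartition` method), and descending order would presumably come from a reversed key comparator registered with `JobConf.setOutputKeyComparatorClass`, but that mapping is an assumption and is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class RangePartitionSketch {

    // Returns the index of the (simulated) reducer that should receive `key`.
    // `splitPoints` must be sorted ascending and contain distinct values;
    // k split points define k + 1 partitions:
    //   partition 0:   key <  splitPoints[0]
    //   partition i:   splitPoints[i-1] <= key < splitPoints[i]
    //   partition k:   key >= splitPoints[k-1]
    static int partitionFor(int key, int[] splitPoints) {
        int pos = Arrays.binarySearch(splitPoints, key);
        // binarySearch returns -(insertionPoint) - 1 when the key is absent.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    // Simulates the job: route each key to a partition, sort each partition
    // independently (as each reducer would), then concatenate the partitions.
    static List<Integer> sortWithPartitions(List<Integer> keys,
                                            int[] splitPoints,
                                            boolean descending) {
        List<List<Integer>> partitions = new ArrayList<>();
        for (int i = 0; i <= splitPoints.length; i++) {
            partitions.add(new ArrayList<>());
        }
        for (int key : keys) {
            partitions.get(partitionFor(key, splitPoints)).add(key);
        }

        // Each partition is sorted on its own, with a reversed comparator
        // when descending order is requested.
        for (List<Integer> partition : partitions) {
            if (descending) {
                partition.sort(Collections.reverseOrder());
            } else {
                Collections.sort(partition);
            }
        }

        // For descending output, the partition holding the largest keys
        // must be read first.
        if (descending) {
            Collections.reverse(partitions);
        }

        List<Integer> result = new ArrayList<>();
        for (List<Integer> partition : partitions) {
            result.addAll(partition);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(42, 7, 19, 10, 25, 3, 20);
        int[] splitPoints = {10, 20}; // hypothetical; three partitions

        System.out.println(sortWithPartitions(keys, splitPoints, false));
        System.out.println(sortWithPartitions(keys, splitPoints, true));
    }
}
```

With the sample keys above, the first call yields [3, 7, 10, 19, 20, 25, 42] and the second yields [42, 25, 20, 19, 10, 7, 3]. The split points are fixed here for simplicity; in practice they would have to be chosen from a sample of the real keys so that the partitions come out roughly equal in size, otherwise one reducer ends up doing most of the work.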