Date: Mon, 16 Nov 2009 11:25:48 -0800
From: Something Something
To: general@hadoop.apache.org
Subject: Re: Map-reduce sorting on multiple keys
Message-ID: <1eabbac30911161125y32715fceub0d52fe98e68dd7e@mail.gmail.com>
In-Reply-To: <895334.52215.qm@web32203.mail.mud.yahoo.com>
References: <1b3669280911161108p4fbe9d20i12dd564a121aecac@mail.gmail.com> <895334.52215.qm@web32203.mail.mud.yahoo.com>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=0016e6d7e744148457047881fae3

--0016e6d7e744148457047881fae3
Content-Type: text/plain; charset=ISO-8859-1

Goutham,

Can you please take a look at my email titled.. "Custom Writable not
working"? (I just sent it a few minutes ago.) It's similar to this one.
The only difference is, instead of IntWritable I am using Text. But the
sort in Map is not working as expected. Can you tell me why?

Thanks.

On Mon, Nov 16, 2009 at 11:16 AM, Rajiv Maheshwari wrote:

> Thanks, appreciate it.
>
> Rajiv
>
> --- On Mon, 11/16/09, goutham patnaik wrote:
>
> From: goutham patnaik
> Subject: Re: Map-reduce sorting on multiple keys
> To: general@hadoop.apache.org
> Date: Monday, November 16, 2009, 11:08 AM
>
> Rajiv,
>
> You could write your own class which implements the WritableComparable
> interface and use this as your key class - all u need to do is implement
> the write, readFields and compareTo methods - the map will then sort your
> keys using this method :
>
> public class TupleKey implements WritableComparable {
>     IntWritable k1;
>     IntWritable k2;
>     .......
> }
>
> On Mon, Nov 16, 2009 at 9:13 AM, Rajiv Maheshwari wrote:
>
> > Hi everyone,
> >
> > I have a need to sort the output of map on 2 keys (key1, key2) - first on
> > key1, then on key2.
> >
> > Example:
> > key1  key2  values
> > -------------------------
> > 0001  0001  values...
> > 0001  0002  values...
> > 0002  0001  values...
> > 0002  0005  values...
> >
> > I am thinking of the following solution approach:
> >
> > Define KEY = key1, key2 /* concatenate keys */. Override default
> > HashPartitioner and use only key1 in hashCode computation.
> >
> > public class HashPartitioner implements Partitioner {
> >
> >     public void configure(JobConf job) {}
> >
> >     public int getPartition(K2 key, V2 value, int numPartitions) {
> >         return (key.getKey1().hashCode() & Integer.MAX_VALUE) % numPartitions;
> >     }
> > }
> >
> > Would this work?
> >
> > Does anyone have any better ideas?
> >
> > Thanks much,
> > Rajiv

--0016e6d7e744148457047881fae3--
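
For reference, here is a minimal sketch of the two ideas discussed in this thread: a composite key that compares on key1 first and key2 second, and a partition function that hashes key1 only. It is plain Java with no Hadoop dependency so that it runs standalone. In an actual job the class would presumably implement org.apache.hadoop.io.WritableComparable, and the partition logic would live in a Partitioner's getPartition method. The int fields and the names getKey2, partitionFor and roundTrip are assumptions made for illustration; they do not come from the messages above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative composite key. Assumption: both parts are plain ints; in a
// Hadoop job this class would implement WritableComparable instead.
class TupleKey implements Comparable<TupleKey> {
    private int k1;
    private int k2;

    TupleKey() {}  // no-arg constructor so readFields can fill an empty instance

    TupleKey(int k1, int k2) {
        this.k1 = k1;
        this.k2 = k2;
    }

    int getKey1() { return k1; }
    int getKey2() { return k2; }

    // Serialize both parts, in the same order that readFields reads them.
    public void write(DataOutput out) throws IOException {
        out.writeInt(k1);
        out.writeInt(k2);
    }

    public void readFields(DataInput in) throws IOException {
        k1 = in.readInt();
        k2 = in.readInt();
    }

    // Order on key1 first; key2 only breaks ties.
    @Override
    public int compareTo(TupleKey other) {
        int byKey1 = Integer.compare(k1, other.k1);
        return byKey1 != 0 ? byKey1 : Integer.compare(k2, other.k2);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof TupleKey && compareTo((TupleKey) o) == 0;
    }

    @Override
    public int hashCode() {
        return 31 * k1 + k2;
    }

    @Override
    public String toString() {
        return k1 + "\t" + k2;
    }

    // Same arithmetic as the getPartition method quoted above, applied to
    // key1 only, so every key sharing a key1 maps to the same partition.
    static int partitionFor(TupleKey key, int numPartitions) {
        return (Integer.hashCode(key.getKey1()) & Integer.MAX_VALUE) % numPartitions;
    }

    // Hypothetical helper: serialize and deserialize a key to check that
    // write and readFields are symmetric.
    static TupleKey roundTrip(TupleKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            TupleKey copy = new TupleKey();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<TupleKey> keys = new ArrayList<>();
        keys.add(new TupleKey(2, 5));
        keys.add(new TupleKey(1, 2));
        keys.add(new TupleKey(2, 1));
        keys.add(new TupleKey(1, 1));
        Collections.sort(keys);  // uses compareTo
        for (TupleKey key : keys) {
            System.out.println(key + "\tpartition " + partitionFor(key, 4));
        }
    }
}
```

Partitioning on key1 alone only decides which reducer a record goes to; the ordering within each partition still comes from compareTo, which is why the two pieces are kept separate here.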