hadoop-general mailing list archives

From Rajiv Maheshwari <rajiv...@yahoo.com>
Subject Re: Map-reduce sorting on multiple keys
Date Mon, 16 Nov 2009 20:17:07 GMT
On second thought, implementing WritableComparable alone will not help in my case. I forgot
to mention that I want to use multiple reducers. If records with the same key1 are split
across multiple reducers, the logic will break.
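
A small standalone simulation may make the concern concrete. It is only a sketch in plain Java, not Hadoop code, and the names CompositeKeyShuffleSketch, Key, partitionFor and shuffle are made up for illustration. It picks the reducer from key1 alone (the same expression as in the getPartition proposal quoted further down) and then sorts each partition on (key1, key2), so records sharing a key1 stay together while still being ordered by key2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java simulation of "partition on key1, sort on (key1, key2)".
// Nothing here uses the Hadoop API; all names are hypothetical.
final class CompositeKeyShuffleSketch {

    // Composite key holding both parts.
    static final class Key {
        final String key1;
        final String key2;

        Key(String key1, String key2) {
            this.key1 = key1;
            this.key2 = key2;
        }

        @Override
        public String toString() {
            return key1 + "," + key2;
        }
    }

    // Only key1 decides the partition, so equal key1 values share a reducer.
    static int partitionFor(Key key, int numPartitions) {
        return (key.key1.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Route every key to its partition, then sort each partition by key1, then key2.
    static List<List<Key>> shuffle(List<Key> mapOutput, int numPartitions) {
        List<List<Key>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (Key k : mapOutput) {
            partitions.get(partitionFor(k, numPartitions)).add(k);
        }
        Comparator<Key> order =
                Comparator.comparing((Key k) -> k.key1).thenComparing(k -> k.key2);
        for (List<Key> p : partitions) {
            p.sort(order);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<Key> mapOutput = Arrays.asList(
                new Key("0002", "0005"), new Key("0001", "0002"),
                new Key("0002", "0001"), new Key("0001", "0001"));
        List<List<Key>> parts = shuffle(mapOutput, 3);
        for (int i = 0; i < parts.size(); i++) {
            System.out.println("reducer " + i + ": " + parts.get(i));
        }
    }
}
```

Note that the output is sorted within each reducer only. Ordering across reducers would need a partitioner that assigns key1 ranges rather than hashes.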

If only one reducer is used, I guess sorting on multiple keys can be accomplished just
by concatenating the keys, unless one needs to change the compare method.
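
The single-reducer case can be sketched in a few lines. This is a minimal illustration in plain Java, not Hadoop code. ConcatKeySketch and concat are hypothetical names, and it assumes the keys are non-negative integers of at most four digits, as in the 0001/0002 example. What it shows is that concatenated keys only sort correctly as strings when each part is padded to a fixed width.

```java
import java.util.Arrays;
import java.util.Locale;

// Plain-Java sketch of sorting on two keys by concatenating them.
// Assumes both keys are in the range 0..9999.
final class ConcatKeySketch {

    // Zero-pad each part to four digits so string order matches numeric order.
    static String concat(int key1, int key2) {
        return String.format(Locale.ROOT, "%04d%04d", key1, key2);
    }

    public static void main(String[] args) {
        String[] keys = { concat(2, 5), concat(1, 2), concat(2, 1), concat(1, 1) };
        Arrays.sort(keys);
        // prints [00010001, 00010002, 00020001, 00020005]
        System.out.println(Arrays.toString(keys));

        // Without padding, "10" sorts before "9" as a string.
        // prints true
        System.out.println("10".compareTo("9") < 0);
    }
}
```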

Thanks,
Rajiv

--- On Mon, 11/16/09, goutham patnaik <goutham.patnaik@gmail.com> wrote:

From: goutham patnaik <goutham.patnaik@gmail.com>
Subject: Re: Map-reduce sorting on multiple keys
To: general@hadoop.apache.org
Date: Monday, November 16, 2009, 11:08 AM

Rajiv,

You could write your own class that implements the WritableComparable
interface and use it as your key class. All you need to do is implement
the write, readFields and compareTo methods. The map output keys will
then be sorted using your compareTo method:

public class TupleKey implements WritableComparable<TupleKey> {
    IntWritable k1;
    IntWritable k2;
    .......
}
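
For reference, the three methods mentioned above might look roughly like the following. This is a hedged, standalone sketch rather than the poster's actual class: TupleKeySketch and roundTrip are hypothetical names, the fields are plain ints instead of IntWritable, and the class implements only java.lang.Comparable so that it compiles without the Hadoop jars. In a real job it would declare "implements WritableComparable<TupleKeySketch>", whose write and readFields methods take the same java.io.DataOutput and DataInput arguments.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Standalone sketch of a two-part key; not tied to the Hadoop interfaces.
final class TupleKeySketch implements Comparable<TupleKeySketch> {
    private int k1;
    private int k2;

    // No-arg constructor so an empty instance can be filled by readFields.
    TupleKeySketch() {}

    TupleKeySketch(int k1, int k2) {
        this.k1 = k1;
        this.k2 = k2;
    }

    int getK1() { return k1; }
    int getK2() { return k2; }

    // Serialize both fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeInt(k1);
        out.writeInt(k2);
    }

    // Read the fields back in the order they were written.
    public void readFields(DataInput in) throws IOException {
        k1 = in.readInt();
        k2 = in.readInt();
    }

    // Order by k1 first, then by k2.
    @Override
    public int compareTo(TupleKeySketch other) {
        int c = Integer.compare(k1, other.k1);
        return c != 0 ? c : Integer.compare(k2, other.k2);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof TupleKeySketch)) return false;
        TupleKeySketch t = (TupleKeySketch) o;
        return k1 == t.k1 && k2 == t.k2;
    }

    @Override
    public int hashCode() { return 31 * k1 + k2; }

    @Override
    public String toString() { return k1 + "," + k2; }

    // Helper for the demo: serialize a key and read it back into a new instance.
    static TupleKeySketch roundTrip(TupleKeySketch key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            TupleKeySketch copy = new TupleKeySketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        TupleKeySketch[] keys = {
            new TupleKeySketch(2, 5), new TupleKeySketch(1, 2),
            new TupleKeySketch(2, 1), new TupleKeySketch(1, 1)
        };
        Arrays.sort(keys);  // uses compareTo
        for (TupleKeySketch k : keys) {
            System.out.println(roundTrip(k));
        }
    }
}
```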

On Mon, Nov 16, 2009 at 9:13 AM, Rajiv Maheshwari <rajivm01@yahoo.com> wrote:

> Hi everyone,
>
> I have a need to sort the output of map on 2 keys (key1, key2) - first on
> key1, then on key2.
>
> Example:
> key1  key2   values
> -------------------------
> 0001  0001  values...
> 0001  0002  values...
> 0002  0001  values...
> 0002  0005  values...
>
>
> I am thinking of the following solution approach:
>
> Define KEY = key1, key2   /* concatenate keys */. Override default
> HashPartitioner and use only key1 in hashCode computation.
>
>
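> import org.apache.hadoop.mapred.JobConf;
> import org.apache.hadoop.mapred.Partitioner;
>
> // CompositeKey is a hypothetical key class that holds key1 and key2.
> public class Key1Partitioner<V2> implements Partitioner<CompositeKey, V2> {
>
>     public void configure(JobConf job) {}
>
>     // Use only key1, so all records with the same key1 go to the same reducer.
>     public int getPartition(CompositeKey key, V2 value, int numPartitions) {
>         return (key.getKey1().hashCode() & Integer.MAX_VALUE) % numPartitions;
>     }
> }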
>
> Would this work?
>
> Does anyone have any better ideas?
>
> Thanks much,
> Rajiv
>
>
>
>
>



      