hadoop-general mailing list archives

From Eric Sammer <esam...@cloudera.com>
Subject Re: Hash Partitioner
Date Mon, 24 May 2010 22:06:57 GMT
Deepika:

That sounds very strange. Can you let us know what version of Hadoop
(e.g. Apache 0.20.x, CDH2, etc.) you're running and a bit more about
your hashCode() implementation? When this happens, do you see the same
values for the duplicate key? Did you also implement a grouping
comparator?

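The hash partitioner is extremely simple. It basically computes
(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks, and the result is
the partition number to which a key is assigned. As long as hashCode()
returns the same value for equal keys in every task JVM, a given key
always goes to the same reducer. If one incorrectly implements a
grouping comparator, it's possible you could see odd behavior, though.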
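
To make the arithmetic concrete, here is a minimal sketch in plain Java
with no Hadoop dependency. The class and key names (PartitionSketch,
GoodKey, BadKey) are hypothetical and only for illustration;
partitionFor mirrors the masking and modulo that HashPartitioner
applies. It is one possible explanation to check, not a diagnosis of
your job: a key that overrides equals() but inherits the identity-based
Object.hashCode() can send equal keys to different partitions.

```java
public class PartitionSketch {
    // Same arithmetic as HashPartitioner: clear the sign bit so the
    // result is never negative, then take the remainder.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Hypothetical key whose hashCode() depends only on its contents,
    // so equal keys hash identically in every JVM.
    static final class GoodKey {
        final String id;
        GoodKey(String id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof GoodKey && ((GoodKey) o).id.equals(id);
        }
        @Override public int hashCode() { return id.hashCode(); }
    }

    // Hypothetical key that overrides equals() but not hashCode().
    // It inherits Object.hashCode(), which is identity-based and can
    // differ between instances and between JVMs.
    static final class BadKey {
        final String id;
        BadKey(String id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id.equals(id);
        }
    }

    public static void main(String[] args) {
        int reducers = 4; // assumed reducer count for the demo

        // Equal GoodKeys always map to the same partition.
        System.out.println(partitionFor(new GoodKey("user-42"), reducers)
                == partitionFor(new GoodKey("user-42"), reducers));

        // Equal BadKeys may or may not map to the same partition.
        System.out.println(partitionFor(new BadKey("user-42"), reducers)
                == partitionFor(new BadKey("user-42"), reducers));
    }
}
```

If your hashCode() looks like BadKey's, or is derived from anything
other than the key's serialized fields, that would be the first thing
to check.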

On Mon, May 24, 2010 at 5:35 PM, Deepika Khera <Deepika.Khera@avg.com> wrote:
> Hi,
>
> I am using a HashPartitioner on my key for a MapReduce job. I am wondering how sometimes
> 2 reducers end up getting the same key? I have the hashCode method defined for my key.
>
> Also, I have speculative execution turned off for my jobs.
>
> Would appreciate any help.
>
> Thanks,
> Deepika
>



-- 
Eric Sammer
phone: +1-917-287-2675
twitter: esammer
data: www.cloudera.com
