hadoop-common-user mailing list archives

From Stuart White <stuart.whi...@gmail.com>
Subject Re: Confused about partitioning and reducers
Date Sat, 27 Jun 2009 15:30:17 GMT
Please disregard this question.  I think I'm mistaken.

On Sat, Jun 27, 2009 at 10:25 AM, Stuart White <stuart.white1@gmail.com> wrote:

> If I call HashPartitioner.getPartition(), passing a key of 4 and a
> numPartitions of 5, it returns a partition of 4.  (Which is what I would
> expect.)
>
> However, if I have a MapReduce job in which my mapper emits a record with
> key 4, the job is configured to use the HashPartitioner with 5 reducers,
> and I'm using the IdentityReducer, the record with key 4 gets handled by
> Reducer #0 (because it gets written out to part-00000).
>
> I would have expected a record with key 4 to be handled by reducer #4 (and
> therefore written to part-00004) because the HashPartitioner returns 4 for a
> key of 4 and a numPartitions of 5.
>
> Obviously I'm missing something here.  What is the logic for deciding which
> partition of records is handled by which reducer instance?
>
> It can't be random; otherwise a map-side join wouldn't work.
>
> Thanks.
>
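
For reference, the arithmetic behind HashPartitioner.getPartition() can be sketched in plain Java with no Hadoop dependency. The class name PartitionSketch and the sample keys are hypothetical, chosen only for illustration. The formula masks off the sign bit of the key's hashCode() and takes the remainder modulo the number of partitions. As far as I know, partition i is then handled by reduce task i, whose output goes to part-0000i. The sketch also shows that the result depends on the key's type, not only its printed value: an integer 4 and the string "4" have different hash codes.

```java
// Plain-Java sketch of the HashPartitioner arithmetic (no Hadoop classes).
// PartitionSketch is a hypothetical name used only for this illustration.
final class PartitionSketch {

    // Same formula as HashPartitioner: clear the sign bit so the value is
    // non-negative, then take the remainder modulo the partition count.
    static int getPartition(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        // Integer.hashCode() is the int value itself (IntWritable behaves
        // the same way), so key 4 with 5 partitions lands in partition 4.
        System.out.println(getPartition(Integer.valueOf(4), 5));

        // The string "4" hashes to the character code 52, so it lands in
        // partition 52 % 5 = 2, not 4.
        System.out.println(getPartition("4", 5));

        // A negative hash code still yields a valid partition because the
        // sign bit is masked off before the modulo.
        System.out.println(getPartition(Integer.valueOf(-1), 5));
    }
}
```

If a record ends up in an unexpected part file, printing the key's class and hashCode() inside the mapper is a quick way to check which of these cases applies.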
