hadoop-common-user mailing list archives

From Keith Wiley <kwi...@keithwiley.com>
Subject Re: Force job to use all reducers evenly?
Date Fri, 25 Mar 2011 14:18:53 GMT
On Mar 25, 2011, at 2:02 AM, Harsh J wrote:

> On Fri, Mar 25, 2011 at 12:48 PM, Keith Wiley <kwiley@keithwiley.com> wrote:
>> Say my mappers produce at most (or precisely) 4 output keys.  Say I designate the
>> job to have at least (or precisely) 4 reducers.  I have noticed that it is not guaranteed
>> that all four reducers will be used, one per key.  Rather, it is entirely likely that one
>> reducer won't be used at all and another will receive two sets of keys, first receiving all
>> values of one key, then all values of the other key.
> This is merely the side effect of using a hash-based partitioner when

Ah, of course.  I know how the hash partitioners work, I just didn't realize (ok, remember)
that the default partitioner works that way.
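
A minimal sketch of the effect, in plain Java with no Hadoop dependency.  The
hashPartition method uses the same arithmetic as Hadoop's default HashPartitioner
(hash code masked to non-negative, modulo the number of reducers).  Everything else
is an assumption for illustration: the four keys are made-up String values (a real
job would more likely use Text keys, whose hash codes differ), and indexPartition is
a hypothetical alternative that assigns each known key its own reducer.

```java
import java.util.Arrays;
import java.util.List;

public class PartitionSketch {

    // Same arithmetic as the default hash-based partitioner:
    // mask off the sign bit, then take the remainder by the reducer count.
    static int hashPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Hypothetical alternative: when the full key set is known in advance,
    // use each key's position in that list as its partition.
    // Unknown keys fall back to hash partitioning.
    static int indexPartition(List<String> knownKeys, String key, int numReduceTasks) {
        int idx = knownKeys.indexOf(key);
        if (idx < 0) {
            return hashPartition(key, numReduceTasks);
        }
        return idx % numReduceTasks;
    }

    public static void main(String[] args) {
        // Illustrative keys only; chosen so that two of them collide.
        List<String> keys = Arrays.asList("a", "e", "b", "c");
        int reducers = 4;

        for (String key : keys) {
            System.out.println(key
                + "  hash -> reducer " + hashPartition(key, reducers)
                + "  index -> reducer " + indexPartition(keys, key, reducers));
        }
    }
}
```

With these keys, "a" and "e" both hash to reducer 1 and reducer 0 receives nothing,
while the index-based version gives each key its own reducer.  In an actual job the
same idea would presumably go into a subclass of Partitioner that overrides
getPartition and is registered with setPartitionerClass on the job.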

Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com

"I used to be with it, but then they changed what it was.  Now, what I'm with
isn't it, and what's it seems weird and scary to me."
                                           --  Abe (Grandpa) Simpson
