hadoop-mapreduce-user mailing list archives

From Steve Lewis <lordjoe2...@gmail.com>
Subject Re: question for understanding partitioning
Date Tue, 18 Jan 2011 20:33:15 GMT
1) You need not have 26 reducers, but you do want 26 partitions. Taking
character to be the first character of the key, you might send each key to
    int reducer = character % Math.min(26, nreducers);
This ensures that all items with A go to a single reducer, but with fewer
than 26 reducers some reducers will also get other characters.
   The scheme is inefficient: some letters are much more common than others,
so some reducers will get a lot more data. You also cannot make use of more
than 26 reducers with this scheme.
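
As a rough illustration of the scheme above, here is a minimal sketch in plain
Java. The class FirstLetterPartition and the method partitionFor are
hypothetical names made up for this example. Unlike the one-liner above, the
sketch lowercases the first character and subtracts 'a' so that letters map
to 0..25, and it sends empty or non-letter keys to reducer 0; both choices are
assumptions, not part of the original suggestion. In a real job the same logic
would sit inside an override of Partitioner.getPartition(key, value,
numPartitions).

```java
public class FirstLetterPartition {

    // Hypothetical helper: maps a key to a reducer index in [0, numReducers).
    // Assumes numReducers >= 1.
    static int partitionFor(String key, int numReducers) {
        if (key == null || key.isEmpty()) {
            return 0; // assumption: empty keys go to reducer 0
        }
        char c = Character.toLowerCase(key.charAt(0));
        if (c < 'a' || c > 'z') {
            return 0; // assumption: non-letter keys go to reducer 0
        }
        int letterIndex = c - 'a'; // 'a' -> 0, ..., 'z' -> 25
        // Same modulo as in the message: never more than 26 distinct targets.
        return letterIndex % Math.min(26, numReducers);
    }

    public static void main(String[] args) {
        String[] keys = {"apple", "avocado", "banana", "echo", "zebra"};
        int numReducers = 4;
        for (String key : keys) {
            System.out.println(key + " -> reducer " + partitionFor(key, numReducers));
        }
    }
}
```

With 4 reducers, keys starting with 'a' and 'e' land on the same reducer,
which is the sharing described above. With 100 reducers the result never
exceeds 25, so the remaining reducers would receive no data.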

   It is a good idea to think carefully about why you want all items with A
to go to a single reducer.

On Tue, Jan 18, 2011 at 12:09 PM, Mapred Learn <mapred.learn@gmail.com> wrote:

> hi,
> I have a basic question: how does partitioning work?
> The following is a scenario I created to frame my question.
> i) A partition function is defined as partitioning map output based on
> alphabetical sorting of the key, i.e. a partition for keys starting with 'a',
> a partition for keys starting with 'b', ... a partition for keys starting
> with 'z'. So does it mean each map may have at most 26 partitions?
> ii) What input will a Reducer get? Will a Reducer get the first partition
> (the partition starting with 'a') of all the maps as its input? Does it mean
> we will need 26 reduce tasks?
> Any inputs/documents/examples on this are appreciated. I am a bit confused
> by this.
> Thanks in advance

Steven M. Lewis PhD
4221 105th Ave Ne
Kirkland, WA 98033
206-384-1340 (cell)
Institute for Systems Biology
Seattle WA
