hadoop-mapreduce-user mailing list archives

From Steve Sonnenberg <steveis...@gmail.com>
Subject Re: All reducers are not being utilized
Date Thu, 02 Aug 2012 17:11:19 GMT
Harsh,
This may be similar to the case I observed, where no tasks were created
on my second node. It was the simple 'grep' example with only 6 input files.
-steve

On Thu, Aug 2, 2012 at 7:13 AM, Harsh J <harsh@cloudera.com> wrote:

> If you are speaking of reduce tasks, then schedulers do try to spread them
> across nodes (by attempting to schedule only 1 reduce task per
> heartbeat/round).
>
> If you are speaking of how evenly data is distributed, that depends on your
> data (remember that reducers process key groups, and keys can be skewed
> depending on the input data in practice) and on your partitioner
> implementation/partitioning needs.
>
>
> On Thu, Aug 2, 2012 at 4:35 PM, Saurabh Bajaj <Saurabh.Bajaj@mu-sigma.com> wrote:
>
>> Okay. One more question.
>>
>> Can we somehow make sure that all the reducers are equally split across
>> all the nodes?
>>
>> Saurabh
>>
>> From: Harsh J [mailto:harsh@cloudera.com]
>> Sent: Thursday, August 02, 2012 4:05 PM
>> To: mapreduce-user@hadoop.apache.org
>> Subject: Re: All reducers are not being utilized
>>
>> Saurabh,
>>
>> I do not see you mention a custom Partitioner that could guarantee such
>> perfect key distribution. The default partitioner is the HashPartitioner,
>> which assigns each key by its hash code, so the spread depends on the key
>> data and is only roughly even over many keys. Hence, your test here with
>> just 3 keys is not really a good way to test key distribution to reducers
>> with a HashPartitioner. Try it out with a large data set to see for real.
>>
>> On Thu, Aug 2, 2012 at 3:55 PM, Saurabh Bajaj <Saurabh.Bajaj@mu-sigma.com>
>> wrote:
>>
>> Hi everyone,
>>
>> I was running an MR job in Java and this scenario happened:
>>
>> Case 1:
>>
>> Number of distinct output keys from mapper = 3
>> Expected # of reducers = 3
>> Defined set # of reducers to be called = 2
>>
>> Expected outcome:
>>
>> # of reducers spawned = 2
>> # of keys processed under first reducer = 1
>> # of keys processed under second reducer = 2
>>
>> Observed outcome:
>>
>> # of keys processed under first reducer = 3
>> # of keys processed under second reducer = 0
>>
>> Case 2:
>>
>> Number of distinct output keys from mapper = 3
>> Expected # of reducers = 3
>> Defined set # of reducers to be called = 3
>>
>> Expected outcome:
>>
>> # of reducers spawned = 3
>> # of keys processed under first reducer = 1
>> # of keys processed under second reducer = 1
>> # of keys processed under third reducer = 1
>>
>> Observed outcome:
>>
>> # of reducers spawned = 3
>> # of keys processed under first reducer = 2
>> # of keys processed under second reducer = 0
>> # of keys processed under third reducer = 1
>>
>> Any idea why all the reducers are not utilized?
>>
>> Saurabh Bajaj | Senior Business Analyst | +91 9986588089 | www.mu-sigma.com
>>
>>  ------------------------------
>>
>> This email message may contain proprietary, private and confidential
>> information. The information transmitted is intended only for the person(s)
>> or entities to which it is addressed. Any review, retransmission,
>> dissemination or other use of, or taking of any action in reliance upon,
>> this information by persons or entities other than the intended recipient
>> is prohibited and may be illegal. If you received this in error, please
>> contact the sender and delete the message from your system.
>>
>> Mu Sigma takes all reasonable steps to ensure that its electronic
>> communications are free from viruses. However, given Internet
>> accessibility, the Company cannot accept liability for any virus introduced
>> by this e-mail or any attachment and you are advised to use up-to-date
>> virus checking software.
>>
>> --
>> Harsh J
>>
>
>
>
> --
> Harsh J
>



-- 
Steve Sonnenberg
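
Note on Harsh's HashPartitioner point in the quoted thread: the following is a minimal sketch, not code from the thread, of why a test with only three keys can leave a reducer idle. It assumes plain Java String keys (a job using Hadoop's Text keys would hash differently) and the made-up keys "a", "c" and "g". It reproduces the arithmetic HashPartitioner uses, (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks. The class name PartitionSketch and the explicitPartition helper are hypothetical; the helper only illustrates the kind of mapping a custom Partitioner could apply when the key set is small and known in advance.

```java
import java.util.Arrays;
import java.util.List;

public class PartitionSketch {

    // Same arithmetic as Hadoop's HashPartitioner: clear the sign bit of the
    // hash code, then take the remainder modulo the number of reduce tasks.
    public static int hashPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Hypothetical alternative for a small, known key set: hand out
    // partitions round-robin by the key's position in the list.
    // Keys that are not in the list fall back to hashing.
    public static int explicitPartition(String key, List<String> knownKeys,
                                        int numReduceTasks) {
        int index = knownKeys.indexOf(key);
        return index < 0 ? hashPartition(key, numReduceTasks)
                         : index % numReduceTasks;
    }

    // Counts how many of the given distinct keys each reducer would receive.
    public static int[] keysPerReducer(List<String> keys, int numReduceTasks,
                                       boolean explicit) {
        int[] counts = new int[numReduceTasks];
        for (String key : keys) {
            int partition = explicit
                    ? explicitPartition(key, keys, numReduceTasks)
                    : hashPartition(key, numReduceTasks);
            counts[partition]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up keys, chosen so that their hash codes collide.
        List<String> keys = Arrays.asList("a", "c", "g");
        for (int reducers = 2; reducers <= 3; reducers++) {
            System.out.println(reducers + " reducers, hash:     "
                    + Arrays.toString(keysPerReducer(keys, reducers, false)));
            System.out.println(reducers + " reducers, explicit: "
                    + Arrays.toString(keysPerReducer(keys, reducers, true)));
        }
    }
}
```

With these keys the hash variant leaves one reducer idle in both runs ([0, 3] with two reducers, [1, 2, 0] with three), much like the observed outcomes above, while the explicit mapping yields [2, 1] and [1, 1, 1]. In a real job such a mapping would belong in a Partitioner subclass registered on the job, for example through Job.setPartitionerClass in the newer MapReduce API.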
