hadoop-hdfs-user mailing list archives

From Rahul Bhattacharjee <rahul.rec....@gmail.com>
Subject Re: Re: How to balance reduce job
Date Sun, 21 Apr 2013 14:41:56 GMT
You can use an input sampler, and you have to plug in a custom partitioner
that ensures all reducers get a near-equal number of pairs to process. The
input sampler goes over a sample of the input data before the job starts
executing. I also had some doubts about this, but got no response.
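
To make the idea concrete, here is a minimal sketch of sampling-based range partitioning in plain Python. It is not Hadoop code and does not use the Hadoop API; the function names (sample_keys, split_points, partition) are hypothetical, and it assumes keys are sortable and the sample is non-empty. The sample is sorted, evenly spaced quantiles become the partition boundaries, and each key is sent to the reducer whose range contains it.

```python
import bisect
import random


def sample_keys(keys, sample_size, seed=0):
    """Draw a random sample of keys, standing in for what an input sampler does."""
    rng = random.Random(seed)
    keys = list(keys)
    return rng.sample(keys, min(sample_size, len(keys)))


def split_points(sample, num_reducers):
    """Pick num_reducers - 1 boundaries at evenly spaced quantiles of the sorted sample.

    Assumes the sample is non-empty.
    """
    ordered = sorted(sample)
    return [ordered[len(ordered) * i // num_reducers] for i in range(1, num_reducers)]


def partition(key, boundaries):
    """Return the reducer index for key: the number of boundaries that are <= key."""
    return bisect.bisect_right(boundaries, key)
```

In use, one would call split_points(sample_keys(all_keys, 1000), 8) once before the job and then call partition(key, boundaries) for every map output key. How evenly the reducers end up loaded depends on how well the sample reflects the real key distribution.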

Thanks,
Rahul


On Wed, Apr 17, 2013 at 12:17 PM, rauljin <liujin666jin@sina.com> wrote:

>      <property>
>         <name>mapred.tasktracker.map.tasks.maximum</name>
>         <value>4</value>
>     </property>
>
>     <property>
>         <name>mapred.tasktracker.reduce.tasks.maximum</name>
>         <value>4</value>
>     </property>
>
>    I am not clear about the number of reduce slots in each Task Tracker.
> Is it defined in the configuration?
>
>
>
>
>
> ------------------------------
> rauljin
>
>  *From:* bejoy.hadoop <bejoy.hadoop@gmail.com>
> *Date:* 2013-04-17 13:09
> *To:* user <user@hadoop.apache.org>; liujin666jin <liujin666jin@sina.com>
> *Subject:* Re: How to balance reduce job
>  Hi Rauljin
>
> A few things to check here.
> What is the number of reduce slots in each Task Tracker? What is the
> number of reduce tasks for your job?
> Reduce tasks are scheduled on TTs based on the availability of slots.
>
> You can do the following:
> - Set the number of reduce tasks to 8 or more.
> - Adjust the number of slots (tweaking this at the job level is not very
>   advisable).
>
> The reducers are scheduled purely based on slot availability, so it
> won't be that easy to ensure that all TTs are evenly loaded with the same
> number of reducers.
> Regards
> Bejoy KS
>
> Sent from remote device, Please excuse typos
> ------------------------------
> *From: *rauljin <liujin666jin@sina.com>
> *Date: *Wed, 17 Apr 2013 12:53:37 +0800
> *To: *user@hadoop.apache.org<user@hadoop.apache.org>
> *ReplyTo: *user@hadoop.apache.org
> *Subject: *How to balance reduce job
>
> There are 8 datanodes in my Hadoop cluster, but when running a reduce job,
> only 2 datanodes run the job.
>
> I want to use all 8 datanodes to run the reduce job, so I can balance the
> I/O pressure.
>
> Any ideas?
>
> Thanks.
>
> ------------------------------
> rauljin
>
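
As a rough check on the slot discussion quoted above, the arithmetic can be sketched in Python. This is only an illustration under stated assumptions: one Task Tracker per datanode (8 in total), 4 reduce slots per Task Tracker as in the quoted configuration, and a hypothetical job configured with only 2 reduce tasks, which the thread does not confirm. The function names are made up for this example.

```python
def reduce_capacity(num_tasktrackers, reduce_slots_per_tt):
    """Total reduce tasks the cluster can run at the same time."""
    return num_tasktrackers * reduce_slots_per_tt


def max_nodes_used(num_reduce_tasks, num_tasktrackers):
    """Upper bound on how many Task Trackers can be running a reducer of one job.

    The scheduler may still place several reducers on the same Task Tracker,
    so the real number can be lower.
    """
    return min(num_reduce_tasks, num_tasktrackers)


print(reduce_capacity(8, 4))   # 8 Task Trackers with 4 reduce slots each
print(max_nodes_used(2, 8))    # hypothetical job with 2 reduce tasks
print(max_nodes_used(8, 8))    # same job with 8 reduce tasks
```

Under these assumptions the cluster has far more reduce slots than the job uses, so the limit on the number of busy nodes comes from the number of reduce tasks rather than from the slot configuration.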
