hadoop-common-user mailing list archives

From java8964 <java8...@hotmail.com>
Subject RE: How to get the max number of reducers in Yarn
Date Sun, 05 Oct 2014 14:03:57 GMT
You should call setNumReduceTasks in your job; there is no longer a maximum reducer count in
YARN.
Setting the reducer count is more art than science.
I think there is only one rule: don't set the number of reducers higher than the reducer
input group count.
Setting it higher than the reducer input group count leaves some reducers with
no data to process, which just wastes resources. Other than that, it depends only on how fast you
want your job to finish versus how many other jobs are running in the cluster at the same time.
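
To illustrate why extra reducers sit idle, here is a minimal sketch in plain Java (no Hadoop dependency). It assumes the job uses the default hash partitioning, which assigns a key to partition (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks. The class and method names are hypothetical and only serve the illustration.

```java
import java.util.Arrays;
import java.util.List;

final class EmptyReducerDemo {
    // Same arithmetic as Hadoop's default HashPartitioner (assumed to be in use).
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts the reducers that would receive no input group at all.
    static int emptyReducers(List<String> distinctKeys, int numReduceTasks) {
        boolean[] used = new boolean[numReduceTasks];
        for (String key : distinctKeys) {
            used[partitionFor(key, numReduceTasks)] = true;
        }
        int empty = 0;
        for (boolean u : used) {
            if (!u) {
                empty++;
            }
        }
        return empty;
    }

    public static void main(String[] args) {
        // Three input groups spread over ten reducers.
        List<String> keys = Arrays.asList("a", "b", "c");
        System.out.println(emptyReducers(keys, 10)); // prints 7
    }
}
```

With three groups and ten reducers, at least seven reducers start, find nothing to do, and exit. Hash collisions can make the number of idle reducers even higher.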
In my experience, in a multi-tenant cluster, setting it as close as possible to the reducer input
group count is a good choice.
This makes your job ask for as many reducers as possible, but not more than the input
group count. Each reducer then processes as few input groups as possible, so
it finishes faster and other jobs get a fair chance at the reducer slots.
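
The rule of thumb above could be sketched as follows. This is plain Java with hypothetical names. It assumes you already have an estimate of the reducer input group count (for example from the "Reduce input groups" counter of an earlier run) and an upper bound you are willing to request. The result would then be passed to job.setNumReduceTasks.

```java
final class ReducerCountSketch {
    // Hypothetical helper: never ask for more reducers than there are input groups.
    // Falls back to a single reducer when either argument is not positive.
    static int chooseReducers(long estimatedInputGroups, int upperBound) {
        if (estimatedInputGroups <= 0 || upperBound <= 0) {
            return 1;
        }
        return (int) Math.min(estimatedInputGroups, (long) upperBound);
    }

    // Best-case number of groups per reducer, assuming a perfectly even spread.
    static long bestCaseGroupsPerReducer(long estimatedInputGroups, int reducers) {
        return (estimatedInputGroups + reducers - 1) / reducers; // ceiling division
    }

    public static void main(String[] args) {
        int reducers = chooseReducers(1000L, 64);
        System.out.println(reducers);                                  // prints 64
        System.out.println(bestCaseGroupsPerReducer(1000L, reducers)); // prints 16
        System.out.println(chooseReducers(10L, 64));                   // prints 10
    }
}
```

The even spread is an assumption. Skewed keys can leave one reducer with far more work than the others.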
One problem with this setting is that your job could take all the reducers at the start.
If your job is really big, its initial reducer tasks could take a very long time to finish, making
your job look bad in the cluster :-)
Yong

> Date: Sun, 5 Oct 2014 15:05:05 +0200
> From: gortiz@pragsis.com
> To: user@hadoop.apache.org
> Subject: Re: How to get the max number of reducers in Yarn
> 
> 
> Thanks for all your answers. 
> 
> So, if I don't ask for a specific number of reducers and I don't call setNumReduceTasks,
> how many reducers would I get? The default value?
> 
> If I want to get the maximum number of reducers available at any given time, should I just set
> the number to the maximum integer so that the RM gives me the maximum available at that time? I understand
> that if I ask for one million reducers and it can only give me 16, it doesn't produce an error.
> 
> 
> 
> ----- Original message -----
> From: "Ramil Malikov" <vivalalabor@gmail.com>
> To: user@hadoop.apache.org
> Sent: Friday, 3 October 2014 15:36:48
> Subject: Re: How to get the max number of reducers in Yarn
> 
> Hi.
> 
> (For Hadoop 2.2.0)
> 
> Nope. The number of mappers depends on the number of splits (line 392 of 
> JobSubmitter).
> The number of reducers depends on the property mapreduce.job.reduces.
> 
> So, you can setup like this:
> 
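> // NUMBER_OF_REDUCERS is a placeholder for an integer string such as "16";
> // the third argument only records the source of the setting.
> final Configuration configuration = new Configuration();
> configuration.set("mapreduce.job.reduces", "NUMBER_OF_REDUCERS", "COMMENT");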
> 
> Make sure that you don't overwrite this configuration during job setup.
> 
> On 10/03/2014 04:29 PM, gortiz wrote:
> > I have been working with MapReduce1 (JobTracker and TaskTrackers).
> > For some of my jobs I want to set the number of reducers to the maximum 
> > capacity of my cluster.
> >
> > I did it with this:
> > int max = new JobClient(new 
> > JobConf(jConf)).getClusterStatus().getMaxReduceTasks();
> > Job job = new Job(jConf, this.getClass().getName());
> > job.setNumReduceTasks(max);
> >
> > Now I want to work with YARN, and it seems that this doesn't work. I 
> > think that YARN manages the number of reducers in real time depending 
> > on the resources it has available. The method getMaxReduceTasks 
> > returns just two.
> > I don't know if there's another way to set the number of reducers to 
> > the real capacity of the cluster, or what I'm doing wrong. I guess that 
> > if I don't use setNumReduceTasks, I'll get one, because that is the 
> > default value.
> > CONFIDENTIALITY WARNING.
> > This message and the information contained in or attached to it are 
> > private and confidential and intended exclusively for the addressee. 
> > Pragsis informs anyone who may receive it in error that it contains 
> > privileged information and that its use, copying, reproduction or 
> > distribution is prohibited. If you are not an intended recipient of 
> > this e-mail, please notify the sender, delete it and do not read, act 
> > upon, print, disclose, copy, retain or redistribute any portion of 
> > this e-mail.