Mailing-List: contact common-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: common-user@hadoop.apache.org
Received-SPF: pass (nike.apache.org: domain of harsh@cloudera.com designates
 209.85.216.176 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAOF-KfgzCqp9JTkXYodYqRtAt+dH-zVDAni3cH7k892J8K8iHg@mail.gmail.com>
References: 
 <CAOF-KfgzCqp9JTkXYodYqRtAt+dH-zVDAni3cH7k892J8K8iHg@mail.gmail.com>
From: Harsh J <harsh@cloudera.com>
Date: Fri, 11 May 2012 16:38:31 +0530
Message-ID: 
 <CAOcnVr3-kNx8c-BF4QfTvept+LEpyugUytEaatNe11UDq8NaZA@mail.gmail.com>
Subject: Re: freeze a mapreduce job
To: common-user@hadoop.apache.org
Content-Type: text/plain; charset=ISO-8859-1

I do not know of a way to control slots per host on the fly (that is
most likely not supported, at least not yet, and it arguably feels
like the wrong thing to do), but the rest of what you need is doable
if you use schedulers and queues/pools.

If you use FairScheduler (FS), ensure that this job always goes to a
special pool, and when you want to freeze the pool, simply set the
pool's maxMaps and maxReduces to 0. Likewise, to constrict the job
instead of freezing it, set them to whatever maximum number of
simultaneous tasks you want. When you change the FairScheduler
configs, you do not need to restart the JT; simply wait a few seconds
for FairScheduler to reload its own configs.

More on FS at http://hadoop.apache.org/common/docs/current/fair_scheduler.html
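
As a rough sketch of the pool edit described above (not a tested
recipe), the following Python rewrites an allocations file so that one
pool's maxMaps and maxReduces are set to a chosen value. The pool name
"longjob", the sample limits, and the helper name set_pool_limits are
made up for illustration; the element names follow the FS allocation
file format. Something like this could be run from cron to throttle
during the day and open up at night.

```python
import xml.etree.ElementTree as ET


def set_pool_limits(allocations_xml, pool, max_maps, max_reduces):
    """Return the allocations XML with maxMaps/maxReduces set for one pool.

    Hypothetical helper: it only edits the XML text. Writing the result
    back to the file FairScheduler watches is left to the caller.
    """
    root = ET.fromstring(allocations_xml)
    # Find the named pool, or create it if the file does not have one yet.
    node = next((p for p in root.findall("pool") if p.get("name") == pool), None)
    if node is None:
        node = ET.SubElement(root, "pool", {"name": pool})
    for tag, value in (("maxMaps", max_maps), ("maxReduces", max_reduces)):
        child = node.find(tag)
        if child is None:
            child = ET.SubElement(node, tag)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Assumed starting point: a pool named "longjob" with made-up limits.
EXAMPLE = """<allocations>
  <pool name="longjob">
    <maxMaps>40</maxMaps>
    <maxReduces>20</maxReduces>
  </pool>
</allocations>"""

frozen = set_pool_limits(EXAMPLE, "longjob", 0, 0)      # freeze the pool
throttled = set_pool_limits(EXAMPLE, "longjob", 10, 5)  # constrict it
```

Once the updated text is written to the allocations file, FS should
pick it up on its next reload, as noted above.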

If you use CapacityScheduler (CS), then I believe you can do this by
again making sure the job goes to a specific queue. When you need to
freeze it, simply set the queue's maximum-capacity to 0 (it is a
percentage); to constrict it, choose a lower, positive percentage
value as you need. To make CS pick up the config changes, refresh the
queues via mradmin (hadoop mradmin -refreshQueues).

More on CS at http://hadoop.apache.org/common/docs/current/capacity_scheduler.html
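
The same idea for CS, again only as a sketch: the Python below sets a
queue's maximum-capacity property in a Hadoop-style configuration XML
string. The queue name "longjob" and the helper name set_property are
made up, and the property key is assumed to follow the
mapred.capacity-scheduler.queue.<queue-name>.maximum-capacity pattern
used in capacity-scheduler.xml. The script does not run the refresh
itself; it only records the command to run afterwards.

```python
import xml.etree.ElementTree as ET


def set_property(conf_xml, name, value):
    """Return the configuration XML with property `name` set to `value`.

    Hypothetical helper: updates the property if present, appends it
    otherwise. It does not touch any file or running daemon.
    """
    root = ET.fromstring(conf_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_node = prop.find("value")
            if value_node is None:
                value_node = ET.SubElement(prop, "value")
            value_node.text = str(value)
            break
    else:
        # No matching property was found, so add a new one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


QUEUE = "longjob"  # assumed queue name
KEY = "mapred.capacity-scheduler.queue.%s.maximum-capacity" % QUEUE

# Assumed starting point: the queue is currently capped at 50 percent.
EXAMPLE = """<configuration>
  <property>
    <name>%s</name>
    <value>50</value>
  </property>
</configuration>""" % KEY

frozen = set_property(EXAMPLE, KEY, 0)       # freeze the queue
throttled = set_property(EXAMPLE, KEY, 10)   # constrict it to 10 percent

# Run this after saving the file so CS reloads its queue configuration.
REFRESH_COMMAND = ["hadoop", "mradmin", "-refreshQueues"]
```

Whether CS accepts 0 as a maximum-capacity may depend on the queue's
configured capacity, so try it on a test queue first.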

Neither approach will freeze/constrict the job immediately, but both
should certainly prevent it from progressing further. That is, tasks
already running when you change the scheduler config will continue to
run to completion, but any further task scheduling for those jobs
will be subject to the new limits.

P.S. A better solution would be to make your job not take as many
days, somehow? :-)

On Fri, May 11, 2012 at 4:13 PM, Rita <rmorgan466@gmail.com> wrote:
> I have a rather large map reduce job which takes few days. I was wondering
> if its possible for me to freeze the job or make the job less intensive. Is
> it possible to reduce the number of slots per host and then I can increase
> them overnight?
>
>
> tia
>
> --
> --- Get your facts first, then you can distort them as you please.--


-- 
Harsh J