hadoop-common-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: force minimum number of nodes to run a job?
Date Tue, 03 Jul 2012 02:07:55 GMT
If you're talking in per-machine-slot terms, this is possible with the
Capacity Scheduler: set a memory requirement for your job worth 4
slots, and the CS will reserve 4 slots on a single TaskTracker to run
each task.
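A minimal sketch of what that configuration could look like, assuming MRv1 with the Capacity Scheduler's memory-based scheduling enabled and a cluster-defined map-slot size of 2048 MB (the property names are from the Hadoop 1.x Capacity Scheduler; the 2048/8192 values are illustrative, not from the original message):

```xml
<!-- mapred-site.xml fragment (Hadoop 1.x / MRv1, Capacity Scheduler
     with memory-based scheduling). Values below are illustrative. -->

<!-- Cluster-wide: the amount of memory one map slot represents -->
<property>
  <name>mapred.cluster.map.memory.mb</name>
  <value>2048</value>
</property>

<!-- Per-job: request 4x the slot size, so the scheduler reserves
     4 slots on one TaskTracker for each map task of this job -->
<property>
  <name>mapred.job.map.memory.mb</name>
  <value>8192</value>
</property>
```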

If you are instead asking for a way to have the tasks either all run
in parallel (across machines) or not run at all, that isn't directly
possible via the MR framework. However, you can make each task block
on a condition of your own and only begin work once every task has
entered the running state; ZooKeeper is well suited to coordinating
this. Alternatively, consider the YARN framework, which gives you more
granular control over the execution flow of tasks if you need that.
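The barrier idea above can be sketched in-process. In a real job each task would create an ephemeral znode in ZooKeeper and wait until the expected number of children exist before doing any work (ZooKeeper's "double barrier" recipe); here a `CountDownLatch` stands in for ZooKeeper so the pattern runs locally. The class and constant names are hypothetical, not from the original thread:

```java
import java.util.concurrent.CountDownLatch;

// Sketch of the "start together or not at all" barrier pattern.
// A real implementation would replace the latch with ZooKeeper:
// each task registers an ephemeral znode under a barrier path and
// blocks until all EXPECTED_TASKS children are present.
public class StartBarrierSketch {
    static final int EXPECTED_TASKS = 4;  // hypothetical parallelism
    static final CountDownLatch barrier = new CountDownLatch(EXPECTED_TASKS);

    static void runTask(int id) {
        barrier.countDown();              // announce "I am running"
        try {
            barrier.await();              // block until all tasks arrive
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return;
        }
        System.out.println("task " + id + " released");
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] tasks = new Thread[EXPECTED_TASKS];
        for (int i = 0; i < EXPECTED_TASKS; i++) {
            final int id = i;
            tasks[i] = new Thread(() -> runTask(id));
            tasks[i].start();
        }
        for (Thread t : tasks) {
            t.join();
        }
        System.out.println("all " + EXPECTED_TASKS + " tasks released");
    }
}
```

With fewer than 4 arrivals the latch never opens and no task proceeds, which is exactly the all-or-nothing behavior being asked for.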

On Tue, Jul 3, 2012 at 5:48 AM, Yang <teddyyyy123@gmail.com> wrote:
> let's say my job can run on 4 mapper slots, but if there is only 1 slot
> available, I don't want the tasks to run one by one; I'd rather wait
> until at least 4 slots are available.
> is it possible to force hadoop to do this?
> thanks!
> yang

Harsh J
