hadoop-common-user mailing list archives

From Alex Loddengaard <a...@cloudera.com>
Subject Re: Customizing machines to use for different jobs
Date Thu, 04 Jun 2009 18:06:03 GMT
Hi Raakhi,

Unfortunately, there is no built-in way of doing this.  You'd have to
run two entirely separate Hadoop clusters to accomplish what you're
trying to do, which isn't uncommon.

I'm not sure why you're hoping to have this behavior, but the fair share
scheduler might be helpful to you.  It lets you essentially divvy up your
cluster into queues, where each queue has its own "chunk" of the cluster.
When resources are available outside of the "chunk," then jobs can span into
other queues' space.
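
To make the "chunk" idea concrete, here is a toy model in Python.  It is
only a sketch of the behavior described above, not the scheduler's actual
algorithm: the function name, the pool names, and the slot counts are all
made up for illustration, and it assumes the guaranteed chunks fit within
the cluster.

```python
def allocate_slots(total_slots, pools):
    """Toy model of queues with guaranteed chunks that can spill over.

    pools maps a (hypothetical) pool name to (guaranteed_share, demand),
    both measured in task slots.
    """
    guaranteed = sum(share for share, _ in pools.values())
    if guaranteed > total_slots:
        raise ValueError("guaranteed shares exceed cluster capacity")

    # Step 1: each pool gets its own chunk, capped at what it actually needs.
    alloc = {name: min(share, demand) for name, (share, demand) in pools.items()}
    spare = total_slots - sum(alloc.values())

    # Step 2: hand spare slots, one at a time, to the pool that still has
    # unmet demand and currently holds the fewest slots.
    while spare > 0:
        hungry = [name for name, (_, demand) in pools.items() if alloc[name] < demand]
        if not hungry:
            break
        neediest = min(hungry, key=lambda name: alloc[name])
        alloc[neediest] += 1
        spare -= 1
    return alloc


# job2 only needs 2 of its 5 guaranteed slots, so job1 spans into the rest.
print(allocate_slots(10, {"job1": (5, 8), "job2": (5, 2)}))
```

When both pools are busy, each stays inside its own chunk; the spill-over
only happens while the other pool's slots would otherwise sit idle.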

Cloudera's Distribution for Hadoop (<http://www.cloudera.com/hadoop>)
includes the fair share scheduler.  I recommend using our distribution;
otherwise, here is the fair share JIRA:


Hope this helps,


On Wed, Jun 3, 2009 at 11:34 PM, Rakhi Khatwani <rakhi.khatwani@gmail.com> wrote:

> Hi,
> Can we specify which subset of machines to use for different jobs? E.g. We
> set machine A as namenode, and B, C, D as datanodes. Then for job 1, we have
> a mapreduce that runs on B & C and for job 2, the map-reduce runs on C & D.
> Regards,
> Raakhi
