hadoop-hdfs-user mailing list archives

From Ravi Prakash <ravi...@ymail.com>
Subject Re: dynamically resizing the Hadoop cluster?
Date Thu, 24 Oct 2013 18:04:10 GMT
Hi Nan!

Usually nodes are decommissioned gradually, over some period of time, so as not to disrupt
running jobs. When a node is decommissioned, the NameNode must re-replicate all of the blocks
that become under-replicated. Rather than suddenly removing half the nodes, you might want to
take a few nodes offline at a time. Hadoop should be able to handle rescheduling tasks from
nodes that are no longer available (even without speculative execution; speculative execution
serves a different purpose).
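The gradual decommission flow described above can be sketched roughly like this. This is a minimal illustration, assuming a stock Apache Hadoop install where `dfs.hosts.exclude` in hdfs-site.xml already points at the exclude file; the hostnames and the `/tmp/dfs.exclude` path are made up for the example:

```shell
# Sketch: decommission a small batch of DataNodes instead of terminating
# the EC2 instances outright.
EXCLUDE_FILE=/tmp/dfs.exclude   # must match dfs.hosts.exclude in hdfs-site.xml

# Add the first batch of hosts to the exclude file (hostnames are examples).
printf '%s\n' node07.example.com node08.example.com > "$EXCLUDE_FILE"

# Tell the NameNode to re-read the exclude list and begin decommissioning.
# (Commented out here; this requires a running cluster.)
# hdfs dfsadmin -refreshNodes

# Watch "hdfs dfsadmin -report" until the batch shows as "Decommissioned",
# then it is safe to terminate those instances and repeat with the next batch.
cat "$EXCLUDE_FILE"
```

The point of batching is that the NameNode only ever has a small number of under-replicated blocks to re-replicate at once, so the cluster stays healthy throughout.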


On Wednesday, October 23, 2013 10:26 PM, Nan Zhu <zhunansjtu@gmail.com> wrote:
Hi, all

I’m running a Hadoop cluster on AWS EC2.

I would like to dynamically resize the cluster so as to reduce cost. Is there any solution
to achieve this?

E.g., I would like to cut the cluster size in half. Is it safe to just shut down the instances?
(If some tasks are currently running on them, can I rely on speculative execution to re-run
them on the other nodes?)

I cannot use EMR, since I’m running a customized version of Hadoop.


Nan Zhu
School of Computer Science,
McGill University