hadoop-common-user mailing list archives

From Ravi Prakash <ravi...@ymail.com>
Subject Re: dynamically resizing the Hadoop cluster?
Date Thu, 24 Oct 2013 18:04:10 GMT
Hi Nan!

Usually nodes are decommissioned gradually over a period of time so as not to disrupt
running jobs. When a node is decommissioned, the NameNode must re-replicate all the blocks
that become under-replicated. Rather than suddenly removing half the nodes, you might want
to take a few nodes offline at a time. Hadoop should be able to reschedule tasks that were
running on nodes which are no longer available (even without speculative execution;
speculative execution serves a different purpose, namely launching backup copies of
slow-running tasks).
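For reference, graceful decommissioning is typically driven by an exclude file plus a refresh of the host lists. A rough sketch (the file paths and the hostname here are illustrative assumptions, and exact property names vary between Hadoop versions):

```shell
# Assumed settings in hdfs-site.xml / mapred-site.xml (illustrative paths):
#   dfs.hosts.exclude    -> /etc/hadoop/conf/dfs.exclude
#   mapred.hosts.exclude -> /etc/hadoop/conf/mapred.exclude

# 1. Add the hostnames to decommission (a few at a time) to the exclude files.
echo "worker-007.example.internal" >> /etc/hadoop/conf/dfs.exclude
echo "worker-007.example.internal" >> /etc/hadoop/conf/mapred.exclude

# 2. Tell the NameNode and JobTracker to re-read the host lists.
hadoop dfsadmin -refreshNodes
hadoop mradmin -refreshNodes

# 3. Wait until the node is reported as "Decommissioned" before
#    terminating the EC2 instance, so no blocks are lost.
hadoop dfsadmin -report
```

Only once the report shows the node as Decommissioned (meaning its blocks have been re-replicated elsewhere) is it safe to shut the instance down.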


HTH
Ravi




On Wednesday, October 23, 2013 10:26 PM, Nan Zhu <zhunansjtu@gmail.com> wrote:
 
Hi, all

I’m running a Hadoop cluster on AWS EC2, 

I would like to dynamically resize the cluster so as to reduce the cost. Is there any
solution to achieve this?

E.g. I would like to cut the cluster size in half. Is it safe to just shut down the instances?
(If some tasks are running on them, can I rely on speculative execution to re-run them
on the other nodes?)

I cannot use EMR, since I’m running a customized version of Hadoop 

Best,

-- 
Nan Zhu
School of Computer Science,
McGill University