From: Ravi Prakash
To: user@hadoop.apache.org
Date: Thu, 24 Oct 2013 11:04:10 -0700 (PDT)
Subject: Re: dynamically resizing the Hadoop cluster?

Hi Nan!

Usually nodes are decommissioned slowly over some period of time so as not to disrupt the running jobs. When a node is decommissioned, the NameNode must re-replicate all under-replicated blocks. Rather than suddenly removing half the nodes, you might want to take a few nodes offline at a time. Hadoop should be able to reschedule tasks that were running on nodes that are no longer available, even without speculative execution (speculative execution is for something else: it runs backup attempts of slow tasks).
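
The "a few nodes at a time" approach can be sketched in Python. This is only an illustration and not part of the original reply: the hostnames, the batch size, and the function name decommission_batches are hypothetical. It assumes the usual HDFS decommissioning mechanism, where the hosts to retire are listed in the file named by the dfs.hosts.exclude property and the NameNode is told to re-read that file with "hdfs dfsadmin -refreshNodes" ("hadoop dfsadmin -refreshNodes" on older releases). The sketch only computes what the exclude file would contain after each batch; it does not touch a cluster.

```python
from typing import Iterator, List


def decommission_batches(nodes: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield the cumulative exclude list after each batch of nodes is added.

    Each yielded list is what the exclude file would contain at that step,
    so nodes from earlier batches stay excluded while later ones are added.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    excluded: List[str] = []
    for start in range(0, len(nodes), batch_size):
        excluded.extend(nodes[start:start + batch_size])
        yield list(excluded)  # copy, so callers cannot mutate our state


if __name__ == "__main__":
    # Hypothetical hostnames; substitute the instances you plan to shut down.
    to_remove = [f"worker{i:02d}.example.internal" for i in range(1, 6)]
    for step, exclude_list in enumerate(decommission_batches(to_remove, 2), start=1):
        print(f"step {step}: exclude file would list {len(exclude_list)} node(s)")
        print("\n".join(exclude_list))
        # Assumed manual procedure at this point (not performed by this script):
        #   1. write exclude_list to the file named by dfs.hosts.exclude
        #   2. run: hdfs dfsadmin -refreshNodes
        #   3. wait until the nodes are reported as decommissioned
        #   4. only then shut the instances down and move to the next batch
```

Waiting for each batch to finish decommissioning before starting the next keeps the amount of re-replication in flight small, which is the point of not removing half the cluster at once.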

HTH
Ravi


On Wednesday, October 23, 2013 10:26 PM, Nan Zhu <zhunansjtu@gmail.com> wrote:
Hi, all

I’m running a Hadoop cluster on AWS EC2.

I would like to dynamically resize the cluster to reduce the cost. Is there any way to achieve this?

E.g., I would like to cut the cluster size in half. Is it safe to just shut down the instances? (If some tasks are running on them, can I rely on speculative execution to re-run them on the other nodes?)

I cannot use EMR, since I’m running a customized version of Hadoop.

Best,

-- 
Nan Zhu
School of Computer Science,
McGill University

