Date: Wed, 23 Oct 2013 22:06:54 -0700 (PDT)
From: Jun Ping Du
To: user@hadoop.apache.org
Cc: zhunansjtu@gmail.com
Subject: Re: dynamically resizing Hadoop cluster on AWS?

Moving to the @user alias.

----- Original Message -----
From: "Jun Ping Du"
To: general@hadoop.apache.org
Sent: Wednesday, October 23, 2013 10:03:27 PM
Subject: Re: dynamically resizing Hadoop cluster on AWS?

If your instances run only compute daemons (TaskTracker or NodeManager), then decommissioning the nodes and shutting down the related EC2 instances should be fine, although some finished or running tasks may need to be re-run automatically. In the future we plan to support graceful decommission (tracked by YARN-914 and MAPREDUCE-5381), so that no tasks would need to be re-run in this case (at the cost of waiting a while).

Thanks,

Junping

----- Original Message -----
From: "Nan Zhu"
To: general@hadoop.apache.org
Sent: Wednesday, October 23, 2013 8:15:51 PM
Subject: Re: dynamically resizing Hadoop cluster on AWS?

Oh, I'm not running HDFS on the instances; I use S3 to store the data.

--
Nan Zhu
School of Computer Science, McGill University

On Wednesday, October 23, 2013 at 11:11 PM, Nan Zhu wrote:
> Hi, all
>
> I'm running a Hadoop cluster on AWS EC2.
>
> I would like to dynamically resize the cluster so as to reduce cost. Is there any solution to achieve this?
>
> E.g. I would like to cut the cluster size in half. Is it safe to just shut down the instances? (If some tasks are running on them, can I rely on speculative execution to re-run them on other nodes?)
>
> I cannot use EMR, since I'm running a customized version of Hadoop.
>
> Best,
>
> --
> Nan Zhu
> School of Computer Science,
> McGill University
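For readers of the archive: the decommission step Junping describes follows YARN's standard exclude-file pattern. A minimal sketch is below; the exclude-file path, hostnames, and instance IDs are placeholders for illustration (they do not come from this thread), and the cluster/AWS commands are shown commented out since they require a live cluster.

```shell
#!/bin/sh
# Sketch: retire two compute-only (NodeManager) hosts before terminating
# their EC2 instances. Assumes yarn.resourcemanager.nodes.exclude-path in
# yarn-site.xml points at the exclude file written below.

EXCLUDE_FILE=./yarn.exclude   # placeholder; use your configured path

# 1. List the NodeManager hosts to retire, one hostname per line.
printf '%s\n' \
    ip-10-0-1-11.ec2.internal \
    ip-10-0-1-12.ec2.internal > "$EXCLUDE_FILE"

# 2. Tell the ResourceManager to re-read its include/exclude lists
#    (run on a host with cluster config; commented out in this sketch):
# yarn rmadmin -refreshNodes

# 3. Once the nodes show up as DECOMMISSIONED, terminate the instances:
# yarn node -list -all
# aws ec2 terminate-instances --instance-ids i-xxxxxxxx i-yyyyyyyy

echo "staged $(wc -l < "$EXCLUDE_FILE") hosts for decommission"
```

On a Hadoop 1.x (TaskTracker) cluster the equivalent knob is `mapred.hosts.exclude` with `hadoop mradmin -refreshNodes`. Note this only drains compute work; since Nan stores data in S3 rather than HDFS, there is no block re-replication to wait for.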