Date: Wed, 23 Oct 2013 22:06:54 -0700 (PDT)
From: Jun Ping Du
To: user@hadoop.apache.org
Cc: zhunansjtu@gmail.com
Subject: Re: dynamically resizing Hadoop cluster on AWS?

Moving to the @user alias.

----- Original Message -----
From: "Jun Ping Du"
To: general@hadoop.apache.org
Sent: Wednesday, October 23, 2013 10:03:27 PM
Subject: Re: dynamically resizing Hadoop cluster on AWS?

If your instances run only compute daemons (TaskTracker or NodeManager), then decommissioning the nodes and shutting down the related EC2 instances should be fine, although some finished or running tasks may need to be re-run automatically. In the future we plan to support graceful decommission (tracked by YARN-914 and MAPREDUCE-5381), so that no tasks would need to be re-run in this case (at the cost of waiting a while).

Thanks,

Junping

----- Original Message -----
From: "Nan Zhu"
To: general@hadoop.apache.org
Sent: Wednesday, October 23, 2013 8:15:51 PM
Subject: Re: dynamically resizing Hadoop cluster on AWS?

Oh, I'm not running HDFS on the instances; I use S3 to store the data.

--
Nan Zhu
School of Computer Science, McGill University

On Wednesday, October 23, 2013 at 11:11 PM, Nan Zhu wrote:
> Hi, all
>
> I'm running a Hadoop cluster on AWS EC2.
>
> I would like to dynamically resize the cluster so as to reduce cost. Is there any solution to achieve this?
>
> E.g. I would like to cut the cluster size in half. Is it safe to just shut down the instances? (If some tasks are running on them, can I rely on speculative execution to re-run them on other nodes?)
>
> I cannot use EMR, since I'm running a customized version of Hadoop.
>
> Best,
>
> --
> Nan Zhu
> School of Computer Science,
> McGill University
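For readers of the archive: the decommission step Junping describes follows YARN's standard exclude-file pattern. A minimal sketch is below; the exclude-file path, hostnames, and instance IDs are placeholders for illustration (they do not come from this thread), and the cluster/AWS commands are shown commented out since they require a live cluster.

```shell
#!/bin/sh
# Sketch: retire two compute-only (NodeManager) hosts before terminating
# their EC2 instances. Assumes yarn.resourcemanager.nodes.exclude-path in
# yarn-site.xml points at the exclude file written below.

EXCLUDE_FILE=./yarn.exclude   # placeholder; use your configured path

# 1. List the NodeManager hosts to retire, one hostname per line.
printf '%s\n' \
    ip-10-0-1-11.ec2.internal \
    ip-10-0-1-12.ec2.internal > "$EXCLUDE_FILE"

# 2. Tell the ResourceManager to re-read its include/exclude lists
#    (run on a host with cluster config; commented out in this sketch):
# yarn rmadmin -refreshNodes

# 3. Once the nodes show up as DECOMMISSIONED, terminate the instances:
# yarn node -list -all
# aws ec2 terminate-instances --instance-ids i-xxxxxxxx i-yyyyyyyy

echo "staged $(wc -l < "$EXCLUDE_FILE") hosts for decommission"
```

On a Hadoop 1.x (TaskTracker) cluster the equivalent knob is `mapred.hosts.exclude` with `hadoop mradmin -refreshNodes`. Note this only drains compute work; since Nan stores data in S3 rather than HDFS, there is no block re-replication to wait for.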