From: Junping Du
Date: Thu, 15 Sep 2011 02:14:19 -0700 (PDT)
Subject: Re: Adding Elasticity to Hadoop MapReduce
To: common-dev@hadoop.apache.org
Cc: acm@hortonworks.com
Hello Arun and all,

I think current Hadoop has a good capability to scale out, but is not so good at scaling in. Since it was designed for dedicated clusters and machines, the "scale in" capability received little attention for a long time. However, I have noticed more and more users deploying Hadoop clusters in the cloud (EC2, Eucalyptus, etc.) or on shared infrastructure (VMware, Xen), where a "scale in" capability frees resources for other clusters and applications. The current "scale in" solution (as you proposed in your previous mail) has some significant drawbacks:

1. It is not a formal way to handle the scale-in case, but rather a temporary workaround based on a disaster-recovery mechanism.

2. It is not convenient: the Hadoop admin has to kill DataNodes manually, at most N - 1 at a time (where N is the replication factor, to avoid possible data loss), and then wait for the replicas to be restored. To shrink a cluster from 1000 nodes to 500 nodes, how much time and effort would that take?

3. It is not efficient, because it is not well planned. Say nodes A, B and C should all be eliminated from the cluster. If A and B are eliminated first (suppose N = 3), it is possible that C receives some of the replicas of blocks from A and B, only to be eliminated in turn. This problem is serious when the shrink is large.

Thus, I think it is worth a good discussion to give Hadoop this cool "elastic" feature. Let me volunteer one possible solution here, and better solutions are welcome:

1. We can break the assumption that a DataNode and a TaskTracker always coexist on one machine, and let some machines run only a task node. I think network traffic inside a rack is not so expensive, though you may object that this wastes local I/O resources on machines that run only a task node.
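The batching constraint in drawback 2 can be sketched as follows. This is a hypothetical illustration, not Hadoop code: the node names and the `plan_shrink_batches` helper are invented for the example.

```python
def plan_shrink_batches(nodes_to_remove, replicas=3):
    """Split a shrink into batches of at most replicas - 1 nodes,
    so at least one copy of every block survives each batch
    (assuming HDFS finishes re-replication between batches)."""
    batch_size = replicas - 1
    return [nodes_to_remove[i:i + batch_size]
            for i in range(0, len(nodes_to_remove), batch_size)]

# Shrinking by 500 nodes with replication factor 3 needs 250 batches,
# each followed by a wait for re-replication -- hence the complaint above.
batches = plan_shrink_batches([f"dn{i}" for i in range(500)], replicas=3)
print(len(batches))  # 250
```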
But don't look at these machines as resources dedicated to this Hadoop cluster: they can be used by other clusters and applications (which is why they should be removable at some point). To this cluster, these machines are better than nothing, right?

2. The percentage of machines running only a task node is an "elastic factor" for the cluster. For example, if the cluster should scale between 500 and 1000 nodes, the elastic factor could be 1/2: 500 normal machines with both data and task nodes, and another 500 machines with a task node only.

3. The elastic factor can be configured by the Hadoop admin, and the non-dedicated machines in the cluster can be marked through a script, much as is done for rack awareness.

4. A single command is provided so the admin can shrink the cluster directly to a target size. A policy can be applied here to decide whether to wait for running tasks to complete.
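The elastic-factor arithmetic in point 2 can be written out explicitly; the `split_cluster` function name is made up for illustration:

```python
def split_cluster(max_nodes, min_nodes):
    """Given a cluster that should scale between min_nodes and max_nodes,
    derive the elastic factor and the node split described above."""
    elastic = (max_nodes - min_nodes) / max_nodes  # fraction of task-only nodes
    full = min_nodes                               # DataNode + TaskTracker
    task_only = max_nodes - min_nodes              # removable, TaskTracker only
    return elastic, full, task_only

# The 500-1000 example from the mail: elastic factor 1/2, 500 + 500 split.
print(split_cluster(1000, 500))  # (0.5, 500, 500)
```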
If the target size is smaller than elastic factor * current size, some DataNodes will have to be killed too, but in a well-planned way.

My 2 cents.

Thanks,

Junping

________________________________
From: Arun C Murthy
To: common-dev@hadoop.apache.org
Sent: Thursday, September 15, 2011 5:24 AM
Subject: Re: Adding Elasticity to Hadoop MapReduce

On Sep 14, 2011, at 1:27 PM, Bharath Ravi wrote:

> Hi all,
>
> I'm a newcomer to Hadoop development, and I'm planning to work on an idea
> that I wanted to run by the dev community.
>
> My apologies if this is not the right place to post this.
>
> Amazon has an "Elastic MapReduce" service (
> http://aws.amazon.com/elasticmapreduce/) that runs on Hadoop.
> The service allows dynamic/runtime changes in resource allocation: more
> specifically, varying the number of compute nodes that a job is running on.
>
> I was wondering if such a facility could be added to the publicly available
> Hadoop MapReduce.

For a long while you have been able to bring up either DataNodes or TaskTrackers, point them (via config) at the NameNode/JobTracker, and they become part of the cluster.

Similarly, you can just kill a DataNode or TaskTracker and the respective masters will deal with its loss.

Arun
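The well-planned shrink described in point 4 of the proposal above might look like this sketch (all node names and the `plan_shrink` helper are hypothetical). The key difference from the ad-hoc approach is that every node scheduled for removal is decided up front, so no block is re-replicated onto a node that is itself about to be removed:

```python
def plan_shrink(task_only, datanodes, target_size, replicas=3):
    """Return (nodes to kill immediately, DataNode decommission batches).
    Task-only nodes go first since they hold no HDFS data; any remaining
    cut comes from DataNodes, decommissioned in batches of replicas - 1,
    all chosen up front so none serves as a re-replication target."""
    current = len(task_only) + len(datanodes)
    to_cut = current - target_size
    kill_now = task_only[:to_cut]            # free wins: no data to move
    remaining = to_cut - len(kill_now)
    doomed = datanodes[:remaining]           # planned set, excluded up front
    step = replicas - 1
    batches = [doomed[i:i + step] for i in range(0, len(doomed), step)]
    return kill_now, batches

# Shrink a 6-node cluster (2 task-only + 4 DataNodes) to 3 nodes:
kill, batches = plan_shrink(["t1", "t2"], ["d1", "d2", "d3", "d4"], target_size=3)
print(kill, batches)  # ['t1', 't2'] [['d1']]
```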