Mailing-List: contact common-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: common-user@hadoop.apache.org
MIME-Version: 1.0
In-Reply-To: <201010081339309682566@corp.kaixin001.com>
References: <201009281609168599140@corp.kaixin001.com>
	<AANLkTikHFai9AhfqgFhdhDNty7Gjd1DffzaSh-L1nzs8@mail.gmail.com>
	<201010081339309682566@corp.kaixin001.com>
Date: Fri, 8 Oct 2010 14:49:19 +0900
Message-ID: <AANLkTik3MSRSRW7hHbmpApM+Q-YEZP0NGCSui3eF+aAY@mail.gmail.com>
Subject: Re: Re: how to make hadoop balance automatically
From: Taeho Kang <tkang1@gmail.com>
To: common-user@hadoop.apache.org
Content-Type: text/plain; charset=GB2312
Content-Transfer-Encoding: quoted-printable

Run the DFS upload from a machine that is not running a datanode, and your
blocks will be distributed randomly among the active datanodes.
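
To see why the uploading machine matters, here is a minimal Python sketch of
the placement behavior discussed in this thread. It is a simplified,
rack-unaware model, not the actual HDFS placement code: the function names
(place_block, simulate), the block count, and the replication factor of 3 are
all assumptions made for illustration.

```python
import random
from collections import Counter


def place_block(nodes, writer, replication, rng):
    """Pick the datanodes that receive one block's replicas.

    Simplified assumption: if the writer is itself a datanode, the first
    replica lands on it; otherwise the first replica goes to a random
    datanode. The remaining replicas go to distinct random other nodes.
    """
    first = writer if writer in nodes else rng.choice(nodes)
    others = [n for n in nodes if n != first]
    return [first] + rng.sample(others, replication - 1)


def simulate(num_nodes=30, num_blocks=3000, replication=3,
             writer_is_datanode=True, seed=0):
    """Return a Counter mapping each datanode to its replica count."""
    rng = random.Random(seed)
    nodes = [f"node{i:02d}" for i in range(num_nodes)]
    # "client" is a hypothetical machine outside the set of datanodes.
    writer = nodes[0] if writer_is_datanode else "client"
    counts = Counter({n: 0 for n in nodes})
    for _ in range(num_blocks):
        counts.update(place_block(nodes, writer, replication, rng))
    return counts


if __name__ == "__main__":
    for on_datanode in (True, False):
        counts = simulate(writer_is_datanode=on_datanode)
        print("writer on a datanode:", on_datanode,
              "| max replicas:", max(counts.values()),
              "| min replicas:", min(counts.values()))
```

In this model, when the writer is a datanode it ends up holding a replica of
every block it uploads, while the other nodes share the remaining replicas.
When the writer is outside the cluster, the per-node counts stay close to each
other.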


On Fri, Oct 8, 2010 at 2:39 PM, shangan <shangan@corp.kaixin001.com> wrote:
> Is there any way to change the default storage policy? For example: don't
> store the first copy of a block on the local node, but distribute the
> copies randomly instead.
>
>
> 2010-10-08
>
>
>
>
> From: Raj V
> Sent: 2010-09-28 22:28:12
> To: common-user
> Cc:
> Subject: Re: how to make hadoop balance automatically
>
> The first copy of a block is always stored on the local node when the
> client writing it runs on a datanode. If you want a balanced distribution,
> load the data from the name node, and don't make the name node a data node.
> Raj
> ________________________________
> From: Neil Xu <neil.xuxf@gmail.com>
> To: common-user@hadoop.apache.org
> Sent: Tue, September 28, 2010 3:13:01 AM
> Subject: Re: how to make hadoop balance automatically
> Hi, Shangan
> you can find something useful at
> https://issues.apache.org/jira/browse/HADOOP-1652
> and the document
> https://issues.apache.org/jira/secure/attachment/12370966/BalancerUserGuide2.pdf
> shows how to rebalance.
> I think you can try setting more mappers (many more than the number of
> nodes) and see whether the distribution improves.
> Neil
> On September 28, 2010 at 4:09 PM, shangan <shangan@corp.kaixin001.com> wrote:
>> I have a cluster of 30 nodes, and I put data into the cluster from one
>> node, which I call "NodeA" here. As a result, this node always stores
>> more data than the other nodes: for example, the other nodes store 10G to
>> 15G, while NodeA stores 50G to 60G.
>>
>> Does anyone know what causes this and how to avoid it?
>> BTW: I know there is a balancer tool that can rebalance the cluster.
>>
>> 2010-09-28
>>
>>
>>
>> shangan
>>