Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (nike.apache.org: domain of divs.sheth@gmail.com designates
 209.85.214.169 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAOcnVr1W6EkuPhi3uCrOyj9d7vQLFxthvs5UYTG+=PmMfacnKA@mail.gmail.com>
References: 
 <CAHpih6wf3wyY-MqNB6W0kLg3LGWePvj7vffxRSMvz_fzfWsaVw@mail.gmail.com>
	<CAOcnVr1W6EkuPhi3uCrOyj9d7vQLFxthvs5UYTG+=PmMfacnKA@mail.gmail.com>
Date: Wed, 5 Mar 2014 13:17:38 +0530
Message-ID: 
 <CAHpih6ydWa2sxHXVUm8qFYkKbw4Obe68B28_Pz8GtcWhE17LSg@mail.gmail.com>
Subject: Re: Question on DFS Balancing
From: divye sheth <divs.sheth@gmail.com>
To: user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=001a11c2cba412cd2304f3d73c2a

--001a11c2cba412cd2304f3d73c2a
Content-Type: text/plain; charset=ISO-8859-1

Thanks Harsh. The JIRA is fixed in version 2.1.0, whereas I am using Hadoop
0.20.2 (we are in the process of upgrading). Is there a short-term workaround
to balance the disk utilization? And if I apply the patch from the JIRA to
the version I am using, will it break anything?
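
To make the symptom concrete, here is a toy model of what I think is
happening. It is not Hadoop code, and every name in it is made up for
illustration. It assumes the datanode hands out new blocks to its disks in
strict rotation without looking at how much space each disk has left, and
it compares that with a simplified "most free space first" rule (which is
only a stand-in, not the actual policy from the HDFS-1804 patch).

```python
# Toy model only: NOT Hadoop code. All names here are hypothetical.

def round_robin(free, block, n_blocks):
    """Place blocks on volumes in turn, skipping a volume only when it is full.

    `free` is the free space per volume, `block` the size of one block,
    both in the same (arbitrary) unit. Returns the free space afterwards.
    """
    free = list(free)
    nxt = 0
    for _ in range(n_blocks):
        for _ in range(len(free)):
            vol = nxt
            nxt = (nxt + 1) % len(free)
            if free[vol] >= block:
                free[vol] -= block
                break
        else:
            raise RuntimeError("no volume has room for the block")
    return free


def most_free_first(free, block, n_blocks):
    """Always place the next block on the volume with the most free space."""
    free = list(free)
    for _ in range(n_blocks):
        vol = max(range(len(free)), key=lambda i: free[i])
        if free[vol] < block:
            raise RuntimeError("no volume has room for the block")
        free[vol] -= block
    return free


if __name__ == "__main__":
    start = [10, 100]  # old disk nearly full, new disk mostly empty
    print("round robin:    ", round_robin(start, 1, 20))
    print("most free first:", most_free_first(start, 1, 20))
```

In this model the strict rotation uses up the old disk completely while the
new disk still has plenty of room, which matches the 100% utilization I saw
on the first disk. The free-space-aware rule leaves the old disk alone
until the two are level.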

Thanks
Divye Sheth


On Wed, Mar 5, 2014 at 11:28 AM, Harsh J <harsh@cloudera.com> wrote:

> You're probably looking for
> https://issues.apache.org/jira/browse/HDFS-1804
>
> On Tue, Mar 4, 2014 at 5:54 AM, divye sheth <divs.sheth@gmail.com> wrote:
> > Hi,
> >
> > I am new to the mailing list.
> >
> > I am using Hadoop 0.20.2 with an append r1056497 version. The question I
> > have is related to balancing. I have a 5 datanode cluster and each node has
> > 2 disks attached to it. The second disk was added when the first disk was
> > reaching its capacity.
> >
> > Now the scenario that I am facing is this: when the new disk was added,
> > Hadoop automatically moved some data over to the new disk. But over time I
> > have noticed that data is no longer being written to the second disk. I have
> > also faced an issue on the datanode where the first disk had 100% utilization.
> >
> > How can I overcome such a scenario? Is it not Hadoop's job to balance the
> > disk utilization between multiple disks on a single datanode?
> >
> > Thanks
> > Divye Sheth
>
>
>
> --
> Harsh J
>

--001a11c2cba412cd2304f3d73c2a--