Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (athena.apache.org: domain of tapas.sarangi@gmail.com
 designates 209.85.223.169 as permitted sender)
From: Tapas Sarangi <tapas.sarangi@gmail.com>
Content-Type: multipart/alternative;
 boundary="Apple-Mail=_20446F56-4F8F-4937-8E0B-6243D06EBCD6"
Message-Id: <C37D3D52-0479-4BBC-A83E-6478870CE981@gmail.com>
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\))
Subject: Re: disk used percentage is not symmetric on datanodes (balancer)
Date: Mon, 18 Mar 2013 20:26:48 -0500
References: <522E52B1-497C-4D8D-9014-0182E8B9AABB@gmail.com>
 <CAO6W-2ew+j7NbVpd=vSEKp9zA+PnGbzVQvSKqxudmz9NUjTXRg@mail.gmail.com>
 <7ED0F250-9815-4262-BFD9-C743AE30F32E@gmail.com>
 <CAO6W-2cSQ-RBJe9zmfsxoYZEVBn9ccs6BWDq8FUxzihUisrKQQ@mail.gmail.com>
 <BLU0-SMTP271D6DBA8B373382531C419DAE90@phx.gbl>
To: user@hadoop.apache.org
In-Reply-To: <BLU0-SMTP271D6DBA8B373382531C419DAE90@phx.gbl>


--Apple-Mail=_20446F56-4F8F-4937-8E0B-6243D06EBCD6
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
	charset=utf-8

Hi,

On Mar 18, 2013, at 8:21 PM, =E6=9D=8E=E6=B4=AA=E5=BF=A0 =
<lhztop@hotmail.com> wrote:

> Maybe you need to modify the rack-awareness script to balance the =
racks, i.e., make all racks the same size: for example, one rack of 6 =
small nodes and one rack of 1 large node.
> P.S.
> You need to restart the cluster after modifying the rack-awareness =
script.

As I mentioned earlier in my reply to Bertrand, we haven't considered =
rack awareness for the cluster; currently it is treated as just one =
rack. Could that be the problem? I don't know=E2=80=A6

-Tapas


>  =20
> On 2013/3/19 7:17, Bertrand Dechoux wrote:
>> And by active, do you mean that it actually stops by itself? =
Otherwise, the throttling/limit might be an issue with regard to the =
data volume or velocity.
>>=20
>> What threshold is used?
>>=20
>> About the small and big datanodes, how are they distributed with =
regard to racks?
>> About files, what replication factor(s) and block size(s) are used?
>>=20
>> Surely trivial questions again.
>>=20
>> Bertrand
>>=20
>> On Mon, Mar 18, 2013 at 10:46 PM, Tapas Sarangi =
<tapas.sarangi@gmail.com> wrote:
>> Hi,
>>=20
>> Sorry about that; I had written it, but thought it was obvious.
>> Yes, balancer is active and running on the namenode.
>>=20
>> -Tapas
>>=20
>> On Mar 18, 2013, at 4:43 PM, Bertrand Dechoux <dechouxb@gmail.com> =
wrote:
>>=20
>>> Hi,
>>>=20
>>> It is not explicitly said but did you use the balancer?
>>> http://hadoop.apache.org/docs/r1.0.4/commands_manual.html#balancer
>>>=20
>>> Regards
>>>=20
>>> Bertrand
>>>=20
>>> On Mon, Mar 18, 2013 at 10:01 PM, Tapas Sarangi =
<tapas.sarangi@gmail.com> wrote:
>>> Hello,
>>>=20
>>> I am using an old legacy version (0.20) of Hadoop for our cluster. =
We have scheduled an upgrade to a newer version within a couple of =
months, but I would like to understand a couple of things before moving =
ahead with the upgrade plan.
>>>=20
>>> We have about 200 datanodes, and some of them have larger storage =
than others. Storage per datanode ranges from 12 TB to 72 TB.
>>>=20
>>> We found that the disk-used percentage is not uniform across all =
the datanodes. On the larger storage nodes, the percentage of disk space =
used is much lower than on nodes with smaller storage. On the larger =
nodes the percentage of used disk space varies, but averages about =
30-50%. On the smaller storage nodes this number is as high as 99.9%. =
Is this expected? If so, then we are not using a lot of the disk space =
effectively. Is this solved in a later release?
>>>=20
>>> If not, I would like to know whether there are any checks or =
debugging steps that could improve this with the current version, or =
whether upgrading Hadoop should solve the problem.
>>>=20
>>> I am happy to provide additional information if needed.
>>>=20
>>> Thanks for any help.
>>>=20
>>> -Tapas
>>>=20
>>=20
>>=20
>>=20
>>=20
>> --=20
>> Bertrand Dechoux
>=20


--Apple-Mail=_20446F56-4F8F-4937-8E0B-6243D06EBCD6--