From: "Brian C. Huffman" <bhuffman@etinternational.com>
To: user@hadoop.apache.org
Date: Wed, 08 Oct 2014 07:39:03 -0400
Subject: Re: Datanode volume full, but not moving to free volume
Hmmm..  It seems that there's only one block pool per disk.  So that won't help me.  :-(

Also, I see the blockpool directory names are all the same.  Is that expected?  So even if I put a larger disk in, I couldn't consolidate the smaller disk's blockpool directories?
[hadoop@thor1 current]$ ls /data/data1/hadoop/yarn_data/hdfs/datanode/current
BP-1408773897-172.17.1.1-1400769841207  VERSION
[hadoop@thor1 current]$ ls /data/data2/hadoop/yarn_data/hdfs/datanode/current
BP-1408773897-172.17.1.1-1400769841207  VERSION
[hadoop@thor1 current]$ ls /data/data3/hadoop/yarn_data/hdfs/datanode/current
BP-1408773897-172.17.1.1-1400769841207  VERSION
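
(Aside, for the archives: identical BP-* names are expected. The block pool ID is assigned cluster-wide when the NameNode is formatted, so every data disk on every datanode carries the same directory name. A quick way to confirm they all match, using the paths from this setup, is:)

```shell
# Print the block pool directory name found on each data disk, then
# deduplicate; a single line of output means all disks agree on the ID.
for d in /data/data1 /data/data2 /data/data3; do
  ls "$d/hadoop/yarn_data/hdfs/datanode/current" | grep '^BP-'
done | sort -u
```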

Regards,
Brian

On 10/8/14, 7:14 AM, Aitor Cedres wrote:

Hi Brian,

I would try to move the block pool directories (BP-1408773897-172.17.1.1-1400769841207). You must shut down your DataNode process before doing this operation.

Regards,

Aitor Cedrés

On 8 October 2014 11:46, Brian C. Huffman <bhuffman@etinternational.com> wrote:
Can I move a whole subdir?  Or does it have to be individual block files / metadata?

For example, I see this:
[hadoop@thor1 finalized]$ pwd
/data/data2/hadoop/yarn_data/hdfs/datanode/current/BP-1408773897-172.17.1.1-1400769841207/current/finalized
[hadoop@thor1 finalized]$ du -sh subdir10/
80G    subdir10/

So could I move subdir10 to the same location under /data/data3?
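
(In case it helps others reading this later: yes, a whole finalized subdir can be relocated, since the block files and their .meta files move together, but only with the DataNode stopped. A hypothetical sketch, assuming the paths from this thread and the stock `hadoop-daemon.sh` start/stop script:)

```shell
# Relocate one finalized subdir from a full disk to one with free space.
# The DataNode must be stopped first, or it may report stale block locations.
BP=BP-1408773897-172.17.1.1-1400769841207
SRC=/data/data2/hadoop/yarn_data/hdfs/datanode/current/$BP/current/finalized
DST=/data/data3/hadoop/yarn_data/hdfs/datanode/current/$BP/current/finalized

hadoop-daemon.sh stop datanode   # assumes the standard sbin script is on PATH
mkdir -p "$DST"
mv "$SRC/subdir10" "$DST/"       # block files and .meta files move as a unit
hadoop-daemon.sh start datanode  # the DataNode rescans its volumes on startup
```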

Thanks,
Brian


Brian C. Huffman
System Administrator
ET International, Inc.

On 10/8/14, 4:44 AM, Aitor Cedres wrote:

Hi Brian,

Hadoop does not balance the disks within a DataNode. If you ran out of space and then added additional disks, you should shut down the DataNode and manually move a few files to the new disk.

Regards,

Aitor Cedrés

On 6 October 2014 14:46, Brian C. Huffman <bhuffman@etinternational.com> wrote:
All,

I have a small Hadoop cluster (2.5.0) with 4 datanodes and 3 data disks per node.  Lately some of the volumes have been filling up, but instead of moving to other configured volumes that *have* free space, it's giving errors in the datanode logs:
2014-10-03 11:52:44,989 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: thor2.xmen.eti:50010:DataXceiver error processing WRITE_BLOCK
 operation  src: /172.17.1.3:35412 dst: /172.17.1.2:50010
java.io.IOException: No space left on device
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:345)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:592)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:734)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:741)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:124)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:71)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:234)
    at java.lang.Thread.run(Thread.java:745)

Unfortunately it keeps trying to write, and when it fails, it passes the exception on to the client.

I did a restart and then it seemed to figure out that it should move to the next volume.

Any suggestions to keep this from happening in the future?

Also - could it be an issue that I have a small amount of non-HDFS data on those volumes?
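
(For anyone hitting this in the archives: two hdfs-site.xml settings are relevant here. The volume choosing policy can be switched from the default round-robin to one that prefers volumes with more free space, and `dfs.datanode.du.reserved` sets aside per-volume headroom so non-HDFS data doesn't push a disk to ENOSPC. A sketch, with the reserved size being an illustrative value to tune for your disks:)

```xml
<!-- hdfs-site.xml on the DataNodes; restart the DataNode after changing. -->
<property>
  <!-- Prefer volumes with more free space instead of pure round-robin
       (available since Hadoop 2.1, HDFS-1804). -->
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
<property>
  <!-- Reserve space per volume for non-HDFS use (bytes; 10 GB here). -->
  <name>dfs.datanode.du.reserved</name>
  <value>10737418240</value>
</property>
```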

Thanks,
Brian




