hadoop-mapreduce-user mailing list archives

From "Adaryl \"Bob\" Wakefield, MBA" <adaryl.wakefi...@hotmail.com>
Subject Re: hadoop not using whole disk for HDFS
Date Sat, 07 Nov 2015 22:56:06 GMT
dfs.datanode.data.dir  = /hadoop/hdfs/data,/hdfs/data
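
(For reference, that value maps to a property block roughly like the following in hdfs-site.xml; on HDP the file is managed by Ambari, so this is only a sketch of what it contains:)

    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/hadoop/hdfs/data,/hdfs/data</value>
      <!-- comma-separated list; each entry resides on whichever partition holds that path -->
    </property>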

Data node 1:
      Filesystem Size Used Avail Use% Mounted on 
      /dev/mapper/centos-root 50G 12G 39G 23% / 
      devtmpfs 16G 0 16G 0% /dev 
      tmpfs 16G 0 16G 0% /dev/shm 
      tmpfs 16G 1.4G 15G 9% /run 
      tmpfs 16G 0 16G 0% /sys/fs/cgroup 
      /dev/sda2 494M 123M 372M 25% /boot 
      /dev/mapper/centos-home 2.7T 33M 2.7T 1% /home 


Data node 2:
      Filesystem Size Used Avail Use% Mounted on 
      /dev/mapper/centos-root 50G 24G 27G 48% / 
      devtmpfs 16G 0 16G 0% /dev 
      tmpfs 16G 24K 16G 1% /dev/shm 
      tmpfs 16G 97M 16G 1% /run 
      tmpfs 16G 0 16G 0% /sys/fs/cgroup 
      /dev/sda2 494M 124M 370M 26% /boot 
      /dev/mapper/centos-home 2.7T 33M 2.7T 1% /home 



Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData

From: iain wright 
Sent: Thursday, November 05, 2015 7:56 PM
To: user@hadoop.apache.org 
Subject: Re: hadoop not using whole disk for HDFS

Please post:  
- output of df -h from every datanode in your cluster
- what dfs.datanode.data.dir is currently set to
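
(A quick way to pull both on each node, assuming the hdfs client is on the PATH:)

    df -h
    hdfs getconf -confKey dfs.datanode.data.dir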

-- 

Iain Wright



This email message is confidential, intended only for the recipient(s) named above, and may
contain information that is privileged or exempt from disclosure under applicable law. If you
are not the intended recipient, do not disclose or disseminate the message to anyone except
the intended recipient. If you have received this message in error, or are not the named recipient(s),
please immediately notify the sender by return email and delete all copies of this message.


On Thu, Nov 5, 2015 at 5:24 PM, Adaryl "Bob" Wakefield, MBA <adaryl.wakefield@hotmail.com>
wrote:

  Is there a maximum amount of disk space that HDFS will use? Is 100GB the max? When we’re
supposed to be dealing with “big data”, why is the amount of data that can be held on any one
box such a small number when you’ve got terabytes available?

  Adaryl "Bob" Wakefield, MBA
  Principal
  Mass Street Analytics, LLC
  913.938.6685
  www.linkedin.com/in/bobwakefieldmba
  Twitter: @BobLovesData

  From: Adaryl "Bob" Wakefield, MBA 
  Sent: Wednesday, November 04, 2015 4:38 PM
  To: user@hadoop.apache.org 
  Subject: Re: hadoop not using whole disk for HDFS

  This is an experimental cluster and there isn’t anything I can’t lose. I ran into some
issues. I’m running the Hortonworks distro and am managing things through Ambari. 

  1. I wasn’t able to set the config to /home/hdfs/data. I got an error that told me I’m
not allowed to set that config to the /home directory. So I made it /hdfs/data.
  2. When I restarted, the space available increased by a whopping 100GB.



  Adaryl "Bob" Wakefield, MBA
  Principal
  Mass Street Analytics, LLC
  913.938.6685
  www.linkedin.com/in/bobwakefieldmba
  Twitter: @BobLovesData

  From: Naganarasimha G R (Naga) 
  Sent: Wednesday, November 04, 2015 4:26 PM
  To: user@hadoop.apache.org 
  Subject: RE: hadoop not using whole disk for HDFS

  If the data is comparatively small, it would be better to stop the daemons, copy the data
from /hadoop/hdfs/data to /home/hdfs/data, reconfigure dfs.datanode.data.dir to /home/hdfs/data,
and then start the daemons again.

  Ensure you have a backup if you have any critical data!
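
  (A rough sketch of those steps on one node, using the paths from this thread; stop the DataNode first, e.g. through Ambari, and verify the ownership your install expects:)

      mkdir -p /home/hdfs/data
      cp -a /hadoop/hdfs/data/. /home/hdfs/data/   # copy block data, preserving permissions
      chown -R hdfs:hadoop /home/hdfs/data         # typical HDP ownership; check your nodes
      # then point dfs.datanode.data.dir at /home/hdfs/data and start the DataNode again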



  Regards,

  + Naga


------------------------------------------------------------------------------

  From: Adaryl "Bob" Wakefield, MBA [adaryl.wakefield@hotmail.com]
  Sent: Thursday, November 05, 2015 03:40
  To: user@hadoop.apache.org
  Subject: Re: hadoop not using whole disk for HDFS


  So I can just create a new folder in the home directory, like:
  /home/hdfs/data
  and then set dfs.datanode.data.dir to:
  /hadoop/hdfs/data,/home/hdfs/data

  Restart the node and that should do it, correct?

  Adaryl "Bob" Wakefield, MBA
  Principal
  Mass Street Analytics, LLC
  913.938.6685
  www.linkedin.com/in/bobwakefieldmba
  Twitter: @BobLovesData

  From: Naganarasimha G R (Naga) 
  Sent: Wednesday, November 04, 2015 3:59 PM
  To: user@hadoop.apache.org 
  Subject: RE: hadoop not using whole disk for HDFS

  Hi Bob,



  Seems like you have configured the data dir to be a folder other than one in /home. If so,
try creating another folder and adding it to "dfs.datanode.data.dir", separated by a comma,
instead of trying to reset the default.

  It is also advisable not to configure the root partition "/" as an HDFS data dir; if the
dir usage hits the maximum, the OS might fail to function properly.
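
  (As an illustration, if a new folder were created on the large /home mount and added alongside the existing dir, the property would end up looking something like this; the exact paths are just an assumption:)

      <property>
        <name>dfs.datanode.data.dir</name>
        <value>/hadoop/hdfs/data,/home/hdfs/data</value>
      </property>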



  Regards,

  + Naga


------------------------------------------------------------------------------

  From: P lva [ruvikal@gmail.com]
  Sent: Thursday, November 05, 2015 03:11
  To: user@hadoop.apache.org
  Subject: Re: hadoop not using whole disk for HDFS


  What does your dfs.datanode.data.dir point to ?



  On Wed, Nov 4, 2015 at 4:14 PM, Adaryl "Bob" Wakefield, MBA <adaryl.wakefield@hotmail.com>
wrote:

          Filesystem Size Used Avail Use% Mounted on 
          /dev/mapper/centos-root 50G 12G 39G 23% / 
          devtmpfs 16G 0 16G 0% /dev 
          tmpfs 16G 0 16G 0% /dev/shm 
          tmpfs 16G 1.4G 15G 9% /run 
          tmpfs 16G 0 16G 0% /sys/fs/cgroup 
          /dev/sda2 494M 123M 372M 25% /boot 
          /dev/mapper/centos-home 2.7T 33M 2.7T 1% /home 


    That’s from one datanode. The second one is nearly identical. I discovered that 50GB
is actually a default. That seems really weird. Disk space is cheap. Why would you not just
use most of the disk and why is it so hard to reset the default?

    Adaryl "Bob" Wakefield, MBA
    Principal
    Mass Street Analytics, LLC
    913.938.6685
    www.linkedin.com/in/bobwakefieldmba
    Twitter: @BobLovesData

    From: Chris Nauroth 
    Sent: Wednesday, November 04, 2015 12:16 PM
    To: user@hadoop.apache.org 
    Subject: Re: hadoop not using whole disk for HDFS

    How are those drives partitioned?  Is it possible that the directories pointed to by the
dfs.datanode.data.dir property in hdfs-site.xml reside on partitions that are sized to only
100 GB?  Running commands like df would be a good way to check this at the OS level, independently
of Hadoop.
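
    (For example, pointing df at the configured directory shows which partition backs it; this assumes the data dir is /hadoop/hdfs/data, as mentioned elsewhere in the thread:)

        df -h /hadoop/hdfs/data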

    --Chris Nauroth

    From: MBA <adaryl.wakefield@hotmail.com>
    Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
    Date: Tuesday, November 3, 2015 at 11:16 AM
    To: "user@hadoop.apache.org" <user@hadoop.apache.org>
    Subject: Re: hadoop not using whole disk for HDFS


    Yeah. It has the current value of 1073741824 which is like 1.07 gig.

    B.
    From: Chris Nauroth 
    Sent: Tuesday, November 03, 2015 11:57 AM
    To: user@hadoop.apache.org 
    Subject: Re: hadoop not using whole disk for HDFS

    Hi Bob,

    Does the hdfs-site.xml configuration file contain the property dfs.datanode.du.reserved?
 If this is defined, then the DataNode intentionally will not use this space for storage of
replicas.

    <property>
      <name>dfs.datanode.du.reserved</name>
      <value>0</value>
      <description>Reserved space in bytes per volume. Always leave this much space
free for non dfs use.
      </description>
    </property>

    --Chris Nauroth

    From: MBA <adaryl.wakefield@hotmail.com>
    Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
    Date: Tuesday, November 3, 2015 at 10:51 AM
    To: "user@hadoop.apache.org" <user@hadoop.apache.org>
    Subject: hadoop not using whole disk for HDFS


    I’ve got the Hortonworks distro running on a three node cluster. For some reason the
disk available for HDFS is MUCH less than the total disk space. Both of my data nodes have
3TB hard drives. Only 100GB of that is being used for HDFS. Is it possible that I have a setting
wrong somewhere? 
    B.

