From: Adaryl "Bob" Wakefield, MBA <adaryl.wakefield@hotmail.com>
To: user@hadoop.apache.org
Date: Sat, 7 Nov 2015 16:47:54 -0600
Subject: Re: hadoop not using whole disk for HDFS
I think it might help if I had a better understanding of what I'm looking at:
/dev/mapper/centos-home 2.7T 33M 2.7T 1% /home
 
So /dev/mapper/centos-home is the file system and /home is where it is mounted. I'm not sure I even know what that means. Are you saying that /hdfs/data, even though it's under root, is still somehow pointing to /home? So confused. It's the part about mounting a drive to another folder... on the same disk. Is it kind of like how on Windows you can have more than one "drive" on a disk?
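One way to see which filesystem actually backs a path is to ask df directly; it resolves any path to the mount that holds it. A minimal sketch using standard coreutils (the /hdfs/data and /home/hdfs/data paths are the ones discussed in this thread):

```shell
# -P forces POSIX one-line-per-filesystem output so awk can parse it reliably;
# field 6 of the data line is the mount point backing the given path.
df -P / | awk 'NR==2 {print $6}'
# On the cluster nodes you would compare, for example:
#   df -h /hdfs/data        # resolves to /dev/mapper/centos-root (50G, mounted at /)
#   df -h /home/hdfs/data   # resolves to /dev/mapper/centos-home (2.7T, mounted at /home)
```

Any directory not under /home lives on whatever mount its nearest ancestor is on, which for /hdfs/data is the 50G root volume.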
 
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
 
From: Naganarasimha G R (Naga)
Sent: Thursday, November 05, 2015 7:50 PM
To: user@hadoop.apache.org
Subject: RE: hadoop not using whole disk for HDFS
 

Hi Bob,

 

1. I wasn't able to set the config to /home/hdfs/data. I got an error that told me I'm not allowed to set that config to the /home directory. So I made it /hdfs/data.

Naga : I am not sure about the HDP distro, but if you make it point to /hdfs/data, it will still be pointing to the root mount itself, i.e.

    /dev/mapper/centos-root 50G 12G 39G 23% /

Another alternative is to mount the drive at some folder other than /home and then try.
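For illustration, mounting the large volume at a dedicated mount point could be done with an /etc/fstab entry like the one below. The device name comes from the df output in this thread; the /data mount point and the xfs filesystem type (the CentOS 7 default) are assumptions to adapt to your system:

```
# /etc/fstab -- illustrative only; unmount /home first and migrate its contents
/dev/mapper/centos-home  /data  xfs  defaults  0 0
```

After remounting, dfs.datanode.data.dir can point at a folder under /data (e.g. /data/hdfs/data) and the DataNode will see the 2.7T volume.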

 

2. When I restarted, the space available increased by a whopping 100GB.

Naga : I am not particularly sure how this happened; maybe you can recheck. If you enter the command "df -h <path of the NM data dir configured>" you will find out how much disk space is available on the related mount for which the path is configured.

 

Regards,

+ Naga

 

 

 


From: Adaryl "Bob" Wakefield, MBA [adaryl.wakefield@hotmail.com]
Sent: Friday, November 06, 2015 06:54
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS

Is there a maximum amount of disk space that HDFS will use? Is 100GB the max? When we're supposed to be dealing with "big data," why is the amount of data to be held on any one box such a small number when you've got terabytes available?
 
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
 
From: Adaryl "Bob" Wakefield, MBA
Sent: Wednesday, November 04, 2015 4:38 PM
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS
 
This is an experimental cluster and there isn't anything I can't lose. I ran into some issues. I'm running the Hortonworks distro and am managing things through Ambari.
 
1. I wasn't able to set the config to /home/hdfs/data. I got an error that told me I'm not allowed to set that config to the /home directory. So I made it /hdfs/data.
2. When I restarted, the space available increased by a whopping 100GB.
 
 
 
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
 
From: Naganarasimha G R (Naga)
Sent: Wednesday, November 04, 2015 4:26 PM
To: user@hadoop.apache.org
Subject: RE: hadoop not using whole disk for HDFS
 

Better would be to stop the daemons, copy the data from /hadoop/hdfs/data to /home/hdfs/data, reconfigure dfs.datanode.data.dir to /home/hdfs/data, and then start the daemons, if the data is comparatively small!

Ensure you have a backup if you have any critical data!
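The copy-and-repoint move described above can be sketched as follows. This is a minimal simulation under a throwaway directory so it is safe to run; the real paths from this thread appear in comments, and the real procedure must happen with the DataNode stopped and a backup taken:

```shell
# Simulated under mktemp; in reality src/dst would be /hadoop/hdfs/data and
# /home/hdfs/data, and the DataNode daemons must be stopped before copying.
demo=$(mktemp -d)
src="$demo/hadoop/hdfs/data"   # stands in for /hadoop/hdfs/data
dst="$demo/home/hdfs/data"     # stands in for /home/hdfs/data
mkdir -p "$src/current" "$dst"
echo "blk" > "$src/current/blk_0001"   # pretend block file
cp -a "$src/." "$dst/"                 # -a preserves ownership/perms/timestamps
ls "$dst/current"                      # the copied block data
# Next: point dfs.datanode.data.dir at the new path and start the daemons.
```

Preserving ownership matters because the DataNode verifies that it owns its storage directories at startup.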

 

Regards,

+ Naga


From: Adaryl "Bob" Wakefield, MBA [adaryl.wakefield@hotmail.com]
Sent: Thursday, November 05, 2015 03:40
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS

So like I can just create a new folder in the home directory like:
/home/hdfs/data
and then set dfs.datanode.data.dir to:
/hadoop/hdfs/data,/home/hdfs/data
 
Restart the node and that should do it, correct?
 
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
 
From: Naganarasimha G R (Naga)
Sent: Wednesday, November 04, 2015 3:59 PM
To: user@hadoop.apache.org
Subject: RE: hadoop not using whole disk for HDFS
 

Hi Bob,

 

Seems like you have configured the data dir to be something other than a folder in /home; if so, try creating another folder and adding it to "dfs.datanode.data.dir", separated by a comma, instead of trying to reset the default.
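For illustration, the comma-separated form described here would look like this in hdfs-site.xml (the first path is the existing dir from this thread; the second is an assumed folder on the large /home mount, to adjust to your layout):

```xml
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/hadoop/hdfs/data,/home/hdfs/data</value>
</property>
```

The DataNode treats each comma-separated entry as an independent storage volume and spreads block replicas across them.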

And it's also advised not to use the root partition "/" for the HDFS data dir; if the dir usage hits the maximum, the OS might fail to function properly.

 

Regards,

+ Naga


From: P lva [ruvikal@gmail.com]
Sent: Thursday, November 05, 2015 03:11
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS

What does your dfs.datanode.data.dir point to?
 
 
On Wed, Nov 4, 2015 at 4:14 PM, Adaryl "Bob" Wakefield, MBA <adaryl.wakefield@hotmail.com> wrote:
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   50G   12G   39G  23% /
devtmpfs                  16G     0   16G   0% /dev
tmpfs                     16G     0   16G   0% /dev/shm
tmpfs                     16G  1.4G   15G   9% /run
tmpfs                     16G     0   16G   0% /sys/fs/cgroup
/dev/sda2                494M  123M  372M  25% /boot
/dev/mapper/centos-home  2.7T   33M  2.7T   1% /home
 
That's from one datanode. The second one is nearly identical. I discovered that 50GB is actually a default. That seems really weird. Disk space is cheap. Why would you not just use most of the disk, and why is it so hard to reset the default?
 
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
 
From: Chris Nauroth
Sent: Wednesday, November 04, 2015 12:16 PM
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS
 
How are those drives partitioned? Is it possible that the directories pointed to by the dfs.datanode.data.dir property in hdfs-site.xml reside on partitions that are sized to only 100 GB? Running commands like df would be a good way to check this at the OS level, independently of Hadoop.
 
--Chris Nauroth
 
From: MBA <adaryl.wakefield@hotmail.com>
Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Date: Tuesday, November 3, 2015 at 11:16 AM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: Re: hadoop not using whole disk for HDFS
 
Yeah. It has the current value of 1073741824, which is like 1.07 gig.
 
B.
From: Chris Nauroth
Sent: Tuesday, November 03, 2015 11:57 AM
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS
 
Hi Bob,
 
Does the hdfs-site.xml configuration file contain the property dfs.datanode.du.reserved? If this is defined, then the DataNode intentionally will not use this space for storage of replicas.
 
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>0</value>
  <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.
  </description>
</property>
 
--Chris Nauroth
 
From: MBA <adaryl.wakefield@hotmail.com>
Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Date: Tuesday, November 3, 2015 at 10:51 AM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: hadoop not using whole disk for HDFS
 
I've got the Hortonworks distro running on a three node cluster. For some reason the disk available for HDFS is MUCH less than the total disk space. Both of my data nodes have 3TB hard drives. Only 100GB of that is being used for HDFS. Is it possible that I have a setting wrong somewhere?
 
B.
 