From: Naganarasimha G R (Naga) [garlanaganarasimha@huawei.com]
Sent: Friday, November 06, 2015 12:04 PM
To: user@hadoop.apache.org
Subject: RE: hadoop not using whole disk for HDFS
Thanks Brahma, didn't realize he might have configured both directories; I was assuming Bob had configured a single new directory "/hdfs/data".
So virtually it is showing additional space.
Manually try to add a data dir in /home for your use case and restart the datanodes.
Not sure about the impacts in Ambari, but worth a try! A more permanent solution would be to remount:
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-home 2.7T 33M 2.7T 1% /home
From: Brahma Reddy Battula [brahmareddy.battula@huawei.com]
Sent: Friday, November 06, 2015 08:19
To: user@hadoop.apache.org
Subject: RE: hadoop not using whole disk for HDFS


For each configured dfs.datanode.data.dir, HDFS thinks it is a separate partition and counts the capacity separately. So when another dir /hdfs/data was added, HDFS thought a new partition was added, and it increased the capacity by 50GB per node, i.e. 100GB for the 2 nodes.

Not allowing the /home directory to be configured for data.dir might be Ambari's constraint; instead you can manually try to add a data dir in /home for your use case and restart the datanodes.



Thanks & Regards

Brahma Reddy Battula

 



From: Naganarasimha G R (Naga) [garlanaganarasimha@huawei.com]
Sent: Friday, November 06, 2015 7:20 AM
To: user@hadoop.apache.org
Subject: RE: hadoop not using whole disk for HDFS

Hi Bob,

 

1. I wasn't able to set the config to /home/hdfs/data. I got an error that told me I'm not allowed to set that config to the /home directory. So I made it /hdfs/data.

Naga: I am not sure about the HDP distro, but if you make it point to /hdfs/data, it will still be pointing to the root mount itself, i.e.

    /dev/mapper/centos-root 50G 12G 39G 23% /

Another alternative is to mount the drive to some folder other than /home and then try.

 

2. When I restarted, the space available increased by a whopping 100GB.

Naga: I am not particularly sure how this happened; maybe you can recheck. If you enter the command "df -h <path of the DataNode data dir configured>", you will find out how much disk space is available on the mount on which the configured path resides.

 

Regards,

+ Naga

 

 

 


From: Adaryl "Bob" Wakefield, MBA [adaryl.wakefield@hotmail.com]
Sent: Friday, November 06, 2015 06:54
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS

Is there a maximum amount of disk space that HDFS will use? Is 100GB that max? When we're supposed to be dealing with "big data", why is the amount of data to be held on any one box such a small number when you've got terabytes available?
 
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
 
From: Adaryl "Bob" Wakefield, MBA
Sent: Wednesday, November 04, 2015 4:38 PM
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS
 
This is an experimental cluster and there isn't anything I can't lose. I ran into some issues. I'm running the Hortonworks distro and am managing things through Ambari.
 
1. I wasn't able to set the config to /home/hdfs/data. I got an error that told me I'm not allowed to set that config to the /home directory. So I made it /hdfs/data.
2. When I restarted, the space available increased by a whopping 100GB.
 
 
 
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
 
From: Naganarasimha G R (Naga)
Sent: Wednesday, November 04, 2015 4:26 PM
To: user@hadoop.apache.org
Subject: RE: hadoop not using whole disk for HDFS
 

Better would be to stop the daemons, copy the data from /hadoop/hdfs/data to /home/hdfs/data, reconfigure dfs.datanode.data.dir to /home/hdfs/data, and then start the daemons, if the data is comparatively less!

Ensure you have a backup if you have any critical data!
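As a rough sketch (not from the thread) of what that reconfiguration might look like in hdfs-site.xml once the data has been copied to the /home mount; on an Ambari-managed cluster the same value would normally be set through Ambari's HDFS configuration rather than by editing the file directly:

<property>
  <name>dfs.datanode.data.dir</name>
  <value>/home/hdfs/data</value>
  <!-- assumption: the old /hadoop/hdfs/data contents have already been copied here -->
</property>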

 

Regards,

+ Naga


From: Adaryl "Bob" Wakefield, MBA [adaryl.wakefield@hotmail.com]
Sent: Thursday, November 05, 2015 03:40
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS

So like I can just create a new folder in the home directory like:
home/hdfs/data
and then set dfs.datanode.data.dir to:
/hadoop/hdfs/data,home/hdfs/data
 
Restart the node and that should do it correct?
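What Bob describes would look roughly like the following in hdfs-site.xml; note that each entry should be an absolute path, so the second directory would be /home/hdfs/data with a leading slash (illustrative sketch only, paths depend on the actual mounts):

<property>
  <name>dfs.datanode.data.dir</name>
  <value>/hadoop/hdfs/data,/home/hdfs/data</value>
  <!-- comma-separated list; the DataNode stores blocks in every listed directory and reports the capacity of each underlying volume -->
</property>

The DataNodes then need a restart for the new directory to be picked up.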
 
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
 
From: Naganarasimha G R (Naga)
Sent: Wednesday, November 04, 2015 3:59 PM
To: user@hadoop.apache.org
Subject: RE: hadoop not using whole disk for HDFS
 

Hi Bob,

 

Seems like you have configured the data dir to be something other than a folder in /home; if so, try creating another folder and adding it to "dfs.datanode.data.dir", separated by a comma, instead of trying to reset the default.

And it is also advised not to configure the root partition "/" for the HDFS data dir; if the dir usage hits the maximum, then the OS might fail to function properly.

 

Regards,

+ Naga


From: P lva [ruvikal@gmail.com]
Sent: Thursday, November 05, 2015 03:11
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS

What does your dfs.datanode.data.dir point to?
 
 
On Wed, Nov 4, 2015 at 4:14 PM, Adaryl "Bob" Wakefield, MBA <adaryl.wakefield@hotmail.com> wrote:
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root 50G 12G 39G 23% /
devtmpfs 16G 0 16G 0% /dev
tmpfs 16G 0 16G 0% /dev/shm
tmpfs 16G 1.4G 15G 9% /run
tmpfs 16G 0 16G 0% /sys/fs/cgroup
/dev/sda2 494M 123M 372M 25% /boot
/dev/mapper/centos-home 2.7T 33M 2.7T 1% /home
 
That's from one datanode. The second one is nearly identical. I discovered that 50GB is actually a default. That seems really weird. Disk space is cheap. Why would you not just use most of the disk, and why is it so hard to reset the default?
 
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
 
From: Chris Nauroth
Sent: Wednesday, November 04, 2015 12:16 PM
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS
 
How are those drives partitioned? Is it possible that the directories pointed to by the dfs.datanode.data.dir property in hdfs-site.xml reside on partitions that are sized to only 100 GB? Running commands like df would be a good way to check this at the OS level, independently of Hadoop.
 
--Chris Nauroth
 
From: MBA <adaryl.wakefield@hotmail.com>
Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Date: Tuesday, November 3, 2015 at 11:16 AM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: Re: hadoop not using whole disk for HDFS
 
Yeah. It has the current value of 1073741824 which is like 1.07 gig.
 
B.
From: Chris Nauroth
Sent: Tuesday, November 03, 2015 11:57 AM
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS
 
Hi Bob,
 
Does the hdfs-site.xml configuration file contain the property dfs.datanode.du.reserved? If this is defined, then the DataNode intentionally will not use this space for storage of replicas.
 
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>0</value>
  <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.
  </description>
</property>
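For illustration only, a sketch of how a deliberate non-zero reservation would be expressed; the 10 GB figure (10737418240 bytes) is an arbitrary example, and HDFS subtracts the reserved amount from each volume's reported capacity:

<property>
  <name>dfs.datanode.du.reserved</name>
  <value>10737418240</value>
  <!-- example only: reserve 10 GB per volume for non-DFS use -->
</property>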
 
--Chris Nauroth
 
From: MBA <adaryl.wakefield@hotmail.com>
Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Date: Tuesday, November 3, 2015 at 10:51 AM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: hadoop not using whole disk for HDFS
 
I've got the Hortonworks distro running on a three node cluster. For some reason the disk available for HDFS is MUCH less than the total disk space. Both of my data nodes have 3TB hard drives. Only 100GB of that is being used for HDFS. Is it possible that I have a setting wrong somewhere?
 
B.
 