From: "Adaryl \"Bob\" Wakefield, MBA" <adaryl.wakefield@hotmail.com>
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS
Date: Wed, 4 Nov 2015 16:38:54 -0600
Adaryl "Bob" Wakefield, MBA Principal Mass Street Analytics, LLC 913.938.6685 www.linkedin.com/in/bobwakefieldmba Twitter: @BobLovesData From: Naganarasimha G R (Naga)=20 Sent: Wednesday, November 04, 2015 4:26 PM To: user@hadoop.apache.org=20 Subject: RE: hadoop not using whole disk for HDFS Better would be to stop the daemons and copy the data from = /hadoop/hdfs/data to /home/hdfs/data , reconfigure dfs.datanode.data.dir = to /home/hdfs/data and then start the daemons. If the data is = comparitively less ! Ensure you have the backup if have any critical data ! Regards, + Naga -------------------------------------------------------------------------= ------- From: Adaryl "Bob" Wakefield, MBA [adaryl.wakefield@hotmail.com] Sent: Thursday, November 05, 2015 03:40 To: user@hadoop.apache.org Subject: Re: hadoop not using whole disk for HDFS So like I can just create a new folder in the home directory like: home/hdfs/data and then set dfs.datanode.data.dir to: /hadoop/hdfs/data,home/hdfs/data Restart the node and that should do it correct? Adaryl "Bob" Wakefield, MBA Principal Mass Street Analytics, LLC 913.938.6685 www.linkedin.com/in/bobwakefieldmba Twitter: @BobLovesData From: Naganarasimha G R (Naga)=20 Sent: Wednesday, November 04, 2015 3:59 PM To: user@hadoop.apache.org=20 Subject: RE: hadoop not using whole disk for HDFS Hi Bob, Seems like you have configured to disk dir to be other than an folder in = /home, if so try creating another folder and add to = "dfs.datanode.data.dir" seperated by comma instead of trying to reset = the default. And its also advised not to use the root partition "/" to be configured = for HDFS data dir, if the Dir usage hits the maximum then OS might fail = to function properly. Regards, + Naga -------------------------------------------------------------------------= ------- From: P lva [ruvikal@gmail.com] Sent: Thursday, November 05, 2015 03:11 To: user@hadoop.apache.org Subject: Re: hadoop not using whole disk for HDFS What does your dfs.datanode.data.dir point to ? On Wed, Nov 4, 2015 at 4:14 PM, Adaryl "Bob" Wakefield, MBA = wrote: Filesystem Size Used Avail Use% Mounted on=20 /dev/mapper/centos-root 50G 12G 39G 23% /=20 devtmpfs 16G 0 16G 0% /dev=20 tmpfs 16G 0 16G 0% /dev/shm=20 tmpfs 16G 1.4G 15G 9% /run=20 tmpfs 16G 0 16G 0% /sys/fs/cgroup=20 /dev/sda2 494M 123M 372M 25% /boot=20 /dev/mapper/centos-home 2.7T 33M 2.7T 1% /home=20 That=92s from one datanode. The second one is nearly identical. I = discovered that 50GB is actually a default. That seems really weird. = Disk space is cheap. Why would you not just use most of the disk and why = is it so hard to reset the default? Adaryl "Bob" Wakefield, MBA Principal Mass Street Analytics, LLC 913.938.6685 www.linkedin.com/in/bobwakefieldmba Twitter: @BobLovesData From: Chris Nauroth=20 Sent: Wednesday, November 04, 2015 12:16 PM To: user@hadoop.apache.org=20 Subject: Re: hadoop not using whole disk for HDFS How are those drives partitioned? Is it possible that the directories = pointed to by the dfs.datanode.data.dir property in hdfs-site.xml reside = on partitions that are sized to only 100 GB? Running commands like df = would be a good way to check this at the OS level, independently of = Hadoop. --Chris Nauroth From: MBA Reply-To: "user@hadoop.apache.org" Date: Tuesday, November 3, 2015 at 11:16 AM To: "user@hadoop.apache.org" Subject: Re: hadoop not using whole disk for HDFS Yeah. It has the current value of 1073741824 which is like 1.07 gig. B. 
This is an experimental cluster and there isn't anything I can't lose. I ran into some issues. I'm running the Hortonworks distro and am managing things through Ambari.
 
1. I wasn't able to set the config to /home/hdfs/data. I got an error that told me I'm not allowed to set that config to the /home directory. So I made it /hdfs/data.
2. When I restarted, the space available increased by a whopping 100GB.
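A quick way to sanity-check the result (a sketch; /hdfs/data is the directory described above, and the commands assume shell access to a datanode):

    # Which partition /hdfs/data actually lives on:
    df -h /hdfs/data
    # Capacity the DataNodes now report to HDFS:
    hdfs dfsadmin -report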
 
 
 
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
 
From: Naganarasimha G R (Naga)
Sent: Wednesday, November 04, 2015 4:26 PM
To: user@hadoop.apache.org
Subject: RE: hadoop not using whole disk for HDFS
 

Better would be to stop the daemons, copy the data from /hadoop/hdfs/data to /home/hdfs/data, reconfigure dfs.datanode.data.dir to /home/hdfs/data, and then start the daemons, if the data is comparatively small.

Ensure you have a backup if you have any critical data!
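A rough sketch of those steps on one datanode (it assumes the paths from this thread and an hdfs:hadoop owner for the data dir; on an Ambari-managed HDP cluster you would normally stop/start the DataNode and change dfs.datanode.data.dir through Ambari rather than by hand):

    # Stop the DataNode (manual route on Hadoop 2.x; Ambari can do this too)
    su - hdfs -c "hadoop-daemon.sh stop datanode"
    # Copy the existing block data to the new location, preserving permissions
    mkdir -p /home/hdfs/data
    cp -a /hadoop/hdfs/data/. /home/hdfs/data/
    chown -R hdfs:hadoop /home/hdfs/data   # owner/group assumed; match the old dir
    # Point dfs.datanode.data.dir at /home/hdfs/data (Ambari or hdfs-site.xml), then:
    su - hdfs -c "hadoop-daemon.sh start datanode"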

 

Regards,

+ Naga


From: Adaryl "Bob" Wakefield, MBA [adaryl.wakefield@hotmail.com]
Sent: Thursday, November 05, 2015 03:40
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS

So like I can just create a new folder in the home directory like:
/home/hdfs/data
and then set dfs.datanode.data.dir to:
/hadoop/hdfs/data,/home/hdfs/data
 
Restart the node and that should do it, correct?
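For reference, a sketch of what that would look like (directory names are the ones from this thread; the new folder has to exist and be owned by the datanode user before the restart):

    # Hypothetical value for dfs.datanode.data.dir (comma-separated, no spaces):
    #   /hadoop/hdfs/data,/home/hdfs/data
    mkdir -p /home/hdfs/data
    chown -R hdfs:hadoop /home/hdfs/data   # hdfs:hadoop assumed; match the existing data dir
    # After the DataNode restart, both directories should show up as volumes in:
    hdfs dfsadmin -report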
 
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
 
From: Naganarasimha G R (Naga)
Sent: Wednesday, November 04, 2015 3:59 PM
To: user@hadoop.apache.org
Subject: RE: hadoop not using whole disk for HDFS
 

Hi Bob,

 

Seems like you have configured the data dir to be something other than a folder in /home. If so, try creating another folder and adding it to "dfs.datanode.data.dir", separated by a comma, instead of trying to reset the default.

And it's also advised not to use the root partition "/" for the HDFS data dir; if the dir usage hits the maximum, the OS might fail to function properly.
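A quick way to see which partition each candidate directory sits on (a sketch using the paths from this thread):

    df -h /hadoop/hdfs/data /home
    # If the current data dir falls under the small root volume and /home under the
    # large one, that would explain the capacity gap.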

 

Regards,

+ Naga


From: P lva [ruvikal@gmail.com]
Sent: Thursday, November 05, 2015 03:11
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS

What does your dfs.datanode.data.dir point to ?
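If it helps, one way to pull that straight from the running configuration (a sketch; run it on a datanode):

    hdfs getconf -confKey dfs.datanode.data.dir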
 
 
On Wed, Nov 4, 2015 at 4:14 PM, Adaryl "Bob" Wakefield, MBA <adaryl.wakefield@hotmail.com> wrote:
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   50G   12G   39G  23% /
devtmpfs                  16G     0   16G   0% /dev
tmpfs                     16G     0   16G   0% /dev/shm
tmpfs                     16G  1.4G   15G   9% /run
tmpfs                     16G     0   16G   0% /sys/fs/cgroup
/dev/sda2                494M  123M  372M  25% /boot
/dev/mapper/centos-home  2.7T   33M  2.7T   1% /home
 
That's from one datanode. The second one is nearly identical. I discovered that 50GB is actually a default. That seems really weird. Disk space is cheap. Why would you not just use most of the disk, and why is it so hard to reset the default?
 
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
 
From: Chris Nauroth
Sent: Wednesday, November 04, 2015 12:16 PM
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS
 
How are those drives partitioned? Is it possible that the directories pointed to by the dfs.datanode.data.dir property in hdfs-site.xml reside on partitions that are sized to only 100 GB? Running commands like df would be a good way to check this at the OS level, independently of Hadoop.
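For example (a sketch that assumes the configured directories exist on the datanode where it is run):

    # Feed the configured data directories straight into df:
    df -h $(hdfs getconf -confKey dfs.datanode.data.dir | tr ',' ' ')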
 
--Chris Nauroth
 
From: MBA <adaryl.wakefield@hotmail.com>
Reply-To: "user@hadoop.apache.org"=20 <user@hadoop.apache.org>
Date: Tuesday, November 3, 2015 at = 11:16=20 AM
To: "user@hadoop.apache.org"=20 <user@hadoop.apache.org>
Subject: Re: hadoop not using whole = disk for=20 HDFS
 
Yeah. It has the current value of 1073741824, which is like 1.07 gig.
 
B.
From: Chris Nauroth
Sent: Tuesday, November 03, 2015 11:57 AM
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS
 
Hi Bob,
 
Does the hdfs-site.xml configuration file contain the property dfs.datanode.du.reserved? If this is defined, then the DataNode intentionally will not use this space for storage of replicas.
 
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>0</value>
  <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.
  </description>
</property>
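The DataNode subtracts this reservation from every volume, so a value of, say, 1073741824 would hide 1 GiB per volume from HDFS. One hedged way to check the effective value on a node:

    hdfs getconf -confKey dfs.datanode.du.reserved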
 
--Chris Nauroth
 
From: MBA <adaryl.wakefield@hotmail.com>
Reply-To: "user@hadoop.apache.org"=20 <user@hadoop.apache.org>
Date: Tuesday, November 3, 2015 at = 10:51=20 AM
To: "user@hadoop.apache.org"=20 <user@hadoop.apache.org>
Subject: hadoop not using whole = disk for=20 HDFS
 
I've got the Hortonworks distro running on a three-node cluster. For some reason the disk available for HDFS is MUCH less than the total disk space. Both of my data nodes have 3TB hard drives. Only 100GB of that is being used for HDFS. Is it possible that I have a setting wrong somewhere?
 
B.
 