hadoop-mapreduce-user mailing list archives

From ViSolve Hadoop Support <hadoop.supp...@visolve.com>
Subject Re: No space when running a hadoop job
Date Fri, 03 Oct 2014 06:29:52 GMT
Hello,

If you want to use only the drive /dev/xvda4, add a directory located on 
/dev/xvda4 to "dfs.datanode.data.dir" and remove the entry that lives on 
/dev/xvda2.

After the changes, restart the Hadoop services and check the available 
space with the command below:
      # hadoop fs -df -h
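
For example, assuming /dev/xvda4 is mounted at /mnt, the property would 
end up looking something like the sketch below. The subdirectory 
/mnt/hdfs/dn is only an illustration; any directory on that mount owned 
by the HDFS user will work:

      <property>
        <name>dfs.datanode.data.dir</name>
        <!-- example path on the /dev/xvda4 mount; adjust to your layout -->
        <value>/mnt/hdfs/dn</value>
      </property>

The restart can be done, for instance, with:

      # stop-dfs.sh && start-dfs.sh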

Regards,
ViSolve Hadoop Team

On 10/3/2014 4:36 AM, Abdul Navaz wrote:
> Hello,
>
> As you suggested, I have changed the hdfs-site.xml file on the datanodes 
> and the namenode as below, and formatted the namenode.
>
> <property>
>   <name>dfs.datanode.data.dir</name>
>   <value>/mnt</value>
>   <description>Comma separated list of paths. Use the list of
>   directories from $DFS_DATA_DIR. For example,
>   /grid/hadoop/hdfs/dn,/grid1/hadoop/hdfs/dn.</description>
> </property>
>
>
>
> hduser@dn1:~$ df -h
> Filesystem                                       Size  Used Avail Use% Mounted on
> /dev/xvda2                                       5.9G  5.3G  258M  96% /
> udev                                              98M  4.0K   98M   1% /dev
> tmpfs                                             48M  196K   48M   1% /run
> none                                             5.0M     0  5.0M   0% /run/lock
> none                                             120M     0  120M   0% /run/shm
> 172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  113G   70G  62% /groups/ch-geni-net/Hadoop-NET
> 172.17.253.254:/q/proj/ch-geni-net               198G  113G   70G  62% /proj/ch-geni-net
> /dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
> hduser@dn1:~$
>
>
>
> Even after doing so, files are written only to /dev/xvda2 instead of 
> /dev/xvda4.
>
> Once /dev/xvda2 is full, I get the error message below.
>
> hduser@nn:~$ hadoop fs -put file.txtac /user/hduser/getty/file12.txt
>
> Warning: $HADOOP_HOME is deprecated.
>
>
> 14/10/02 16:52:52 WARN hdfs.DFSClient: DataStreamer Exception:
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File
> /user/hduser/getty/file12.txt could only be replicated to 0 nodes,
> instead of 1
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1639)
>
>
>
>
> To put it another way: I don't want to use /dev/xvda2, since it has a 
> capacity of only 5.9 GB; I want to use only /dev/xvda4. How can I do this?
>
>
>
>
> Thanks & Regards,
>
> Abdul Navaz
> Research Assistant
> University of Houston Main Campus, Houston TX
> Ph: 281-685-0388
>
>
> From: Abdul Navaz <navaz.enc@gmail.com>
> Date: Monday, September 29, 2014 at 1:53 PM
> To: <user@hadoop.apache.org>
> Subject: Re: No space when running a hadoop job
>
> Dear All,
>
> I am not doing load balancing here. I am just copying a file, and it 
> throws a "no space left on device" error.
>
>
> hduser@dn1:~$ df -h
> Filesystem                                       Size  Used Avail Use% Mounted on
> /dev/xvda2                                       5.9G  5.1G  533M  91% /
> udev                                              98M  4.0K   98M   1% /dev
> tmpfs                                             48M  196K   48M   1% /run
> none                                             5.0M     0  5.0M   0% /run/lock
> none                                             120M     0  120M   0% /run/shm
> 172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  116G   67G  64% /groups/ch-geni-net/Hadoop-NET
> 172.17.253.254:/q/proj/ch-geni-net               198G  116G   67G  64% /proj/ch-geni-net
> /dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
>
> hduser@dn1:~$ cp data2.txt data3.txt
> cp: writing `data3.txt': No space left on device
> cp: failed to extend `data3.txt': No space left on device
> hduser@dn1:~$
>
>
> I guess it is copying to the default location. Why am I getting this 
> error? How can I fix this?
>
>
> Thanks & Regards,
>
> Abdul Navaz
> Research Assistant
> University of Houston Main Campus, Houston TX
> Ph: 281-685-0388
>
>
> From: Aitor Cedres <acedres@pivotal.io>
> Reply-To: <user@hadoop.apache.org>
> Date: Monday, September 29, 2014 at 7:53 AM
> To: <user@hadoop.apache.org>
> Subject: Re: No space when running a hadoop job
>
>
> I think the way it works is that when HDFS has a list 
> in dfs.datanode.data.dir, it basically round-robins between the disks. 
> And yes, the result may not be perfectly balanced because of differing 
> file sizes.
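>
> As an illustration (a sketch only; the paths are made up), a DataNode 
> with two data disks would carry something like this in hdfs-site.xml, 
> and new blocks would then be spread across both directories:
>
>     <property>
>       <name>dfs.datanode.data.dir</name>
>       <value>/disk1/hdfs/dn,/disk2/hdfs/dn</value>
>     </property>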
>
>
> On 29 September 2014 13:15, Susheel Kumar Gadalay <skgadalay@gmail.com> wrote:
>
>     Thanks, Aitor.
>
>     That is my observation too.
>
>     I added a new disk location and manually moved some files.
>
>     But if two locations are given for dfs.datanode.data.dir from the
>     beginning, will Hadoop balance the disk usage, even if not perfectly
>     because file sizes may differ?
>
>     On 9/29/14, Aitor Cedres <acedres@pivotal.io> wrote:
>     > Hi Susheel,
>     >
>     > Adding a new directory to "dfs.datanode.data.dir" will not balance
>     > your disks straight away. Eventually, through HDFS activity
>     > (deleting/invalidating some blocks, writing new ones), the disks
>     > will become balanced. If you want to balance them right after
>     > adding the new disk and changing the "dfs.datanode.data.dir" value,
>     > you have to shut down the DN and manually move (mv) some files from
>     > the old directory to the new one.
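>     >
>     > As a sketch (assuming a Hadoop 1.x layout where the block files
>     > live under <data.dir>/current, and with example paths only), the
>     > manual move would look like:
>     >
>     >     # on the datanode host
>     >     $ hadoop-daemon.sh stop datanode
>     >     $ mv /old/dn/current/subdir10 /new/dn/current/
>     >     $ hadoop-daemon.sh start datanode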
>     >
>     > The balancer will try to balance the usage between HDFS nodes, but
>     > it won't care about the utilization of the disks within a node. For
>     > your particular case, the balancer won't fix the issue.
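>     >
>     > (For reference, that inter-node balancer is the one started with
>     > the start-balancer.sh script, e.g.
>     >
>     >     $ start-balancer.sh -threshold 10
>     >
>     > where the threshold is the tolerated deviation, in percent, from
>     > the average cluster utilization.)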
>     >
>     > Hope it helps,
>     > Aitor
>     >
>     > On 29 September 2014 05:53, Susheel Kumar Gadalay
>     > <skgadalay@gmail.com> wrote:
>     >
>     >> You mean that if multiple directory locations are given, Hadoop
>     >> will balance the distribution of files across these different
>     >> directories.
>     >>
>     >> But normally we start with one directory location, and once it is
>     >> reaching its maximum we add a new one.
>     >>
>     >> In this case, how can we balance the distribution of files?
>     >>
>     >> One way is to list the files and move them.
>     >>
>     >> Will the start-balancer script work?
>     >>
>     >> On 9/27/14, Alexander Pivovarov <apivovarov@gmail.com> wrote:
>     >> > It can read/write to all drives in parallel. More HDDs mean more
>     >> > I/O speed.
>     >> > On Sep 27, 2014 7:28 AM, "Susheel Kumar Gadalay"
>     >> > <skgadalay@gmail.com> wrote:
>     >> >
>     >> >> Correct me if I am wrong.
>     >> >>
>     >> >> Adding multiple directories will not balance the distribution
>     >> >> of files across these locations.
>     >> >>
>     >> >> Hadoop will exhaust the first directory and then start using
>     >> >> the next, and the next...
>     >> >>
>     >> >> How can I tell Hadoop to balance evenly across these
>     >> >> directories?
>     >> >>
>     >> >> On 9/26/14, Matt Narrell <matt.narrell@gmail.com> wrote:
>     >> >> > You can add a comma-separated list of paths to the
>     >> >> > "dfs.datanode.data.dir" property in your hdfs-site.xml.
>     >> >> >
>     >> >> > mn
>     >> >> >
>     >> >> > On Sep 26, 2014, at 8:37 AM, Abdul Navaz <navaz.enc@gmail.com>
>     >> >> > wrote:
>     >> >> >
>     >> >> >> Hi
>     >> >> >>
>     >> >> >> I am facing a space issue when saving files into HDFS and/or
>     >> >> >> running a MapReduce job.
>     >> >> >>
>     >> >> >> root@nn:~# df -h
>     >> >> >> Filesystem                                       Size  Used Avail Use% Mounted on
>     >> >> >> /dev/xvda2                                       5.9G  5.9G     0 100% /
>     >> >> >> udev                                              98M  4.0K   98M   1% /dev
>     >> >> >> tmpfs                                             48M  192K   48M   1% /run
>     >> >> >> none                                             5.0M     0  5.0M   0% /run/lock
>     >> >> >> none                                             120M     0  120M   0% /run/shm
>     >> >> >> overflow                                         1.0M  4.0K 1020K   1% /tmp
>     >> >> >> /dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
>     >> >> >> 172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  108G   75G  59% /groups/ch-geni-net/Hadoop-NET
>     >> >> >> 172.17.253.254:/q/proj/ch-geni-net               198G  108G   75G  59% /proj/ch-geni-net
>     >> >> >> root@nn:~#
>     >> >> >>
>     >> >> >>
>     >> >> >> I can see there is no space left on /dev/xvda2.
>     >> >> >>
>     >> >> >> How can I make Hadoop see the newly mounted /dev/xvda4? Or
>     >> >> >> do I need to move the files manually from /dev/xvda2 to
>     >> >> >> xvda4?
>     >> >> >>
>     >> >> >>
>     >> >> >>
>     >> >> >> Thanks & Regards,
>     >> >> >>
>     >> >> >> Abdul Navaz
>     >> >> >> Research Assistant
>     >> >> >> University of Houston Main Campus, Houston TX
>     >> >> >> Ph: 281-685-0388
>     >> >> >>
>     >> >> >
>     >> >> >
>     >> >>
>     >> >
>     >>
>     >
>
>

