hadoop-common-user mailing list archives

From Aitor Cedres <aced...@pivotal.io>
Subject Re: No space when running a hadoop job
Date Mon, 29 Sep 2014 12:53:43 GMT
I think the way it works when HDFS has a list in dfs.datanode.data.dir is
basically round robin between disks. And yes, it may not be perfectly
balanced because of different file sizes.
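
As a rough illustration of that last point, here is a minimal Python sketch of plain round-robin placement. It is a toy model, not HDFS code: the function name, the directory paths, and the block sizes are all made up for the example.

```python
from itertools import cycle


def round_robin_place(block_sizes, volumes):
    """Assign each block to the next volume in turn; return MB used per volume."""
    used = {v: 0 for v in volumes}
    next_volume = cycle(volumes)
    for size in block_sizes:
        used[next(next_volume)] += size
    return used


# Hypothetical data directories and block sizes in MB (illustrative values only).
volumes = ["/data1/dfs/dn", "/data2/dfs/dn"]
block_sizes = [128, 12, 128, 40, 128, 3]

usage = round_robin_place(block_sizes, volumes)
for volume, mb in usage.items():
    print(f"{volume}: {mb} MB")
```

Each directory receives the same number of blocks, but the bytes stored differ because the blocks are not all the same size.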


On 29 September 2014 13:15, Susheel Kumar Gadalay <skgadalay@gmail.com>
wrote:

> Thanks, Aitor.
>
> That is my observation too.
>
> I added a new disk location and manually moved some files.
>
> But if two locations are given for dfs.datanode.data.dir from the
> beginning, will Hadoop balance disk usage across them, even if not
> perfectly because file sizes may differ?
>
> On 9/29/14, Aitor Cedres <acedres@pivotal.io> wrote:
> > Hi Susheel,
> >
> > Adding a new directory to “dfs.datanode.data.dir” will not balance your
> > disks right away. Eventually, through HDFS activity (deleting or
> > invalidating some blocks, writing new ones), the disks will become
> > balanced. If you want to balance them right after adding the new disk
> > and changing the “dfs.datanode.data.dir” value, you have to shut down
> > the DN and manually move (mv) some files from the old directory to the
> > new one.
> >
> > The balancer will try to balance usage between HDFS nodes, but it does
> > not care about disk utilization within a node. For your particular case,
> > the balancer won't fix your issue.
> >
> > Hope it helps,
> > Aitor
> >
> > On 29 September 2014 05:53, Susheel Kumar Gadalay <skgadalay@gmail.com>
> > wrote:
> >
> >> You mean that if multiple directory locations are given, Hadoop will
> >> balance the distribution of files across these directories?
> >>
> >> But normally we start with one directory location and, once it
> >> approaches its maximum, we add a new directory.
> >>
> >> In this case, how can we balance the distribution of files?
> >>
> >> One way is to list the files and move them.
> >>
> >> Will running the start-balancer script work?
> >>
> >> On 9/27/14, Alexander Pivovarov <apivovarov@gmail.com> wrote:
> >> > It can read/write in parallel to all drives. More HDDs, more I/O speed.
> >> > On Sep 27, 2014 7:28 AM, "Susheel Kumar Gadalay" <skgadalay@gmail.com>
> >> > wrote:
> >> >
> >> >> Correct me if I am wrong.
> >> >>
> >> >> Adding multiple directories will not balance the file distribution
> >> >> across these locations.
> >> >>
> >> >> Hadoop will exhaust the first directory and then start using the
> >> >> next, and so on.
> >> >>
> >> >> How can I tell Hadoop to balance evenly across these directories?
> >> >>
> >> >> On 9/26/14, Matt Narrell <matt.narrell@gmail.com> wrote:
> >> >> > You can add a comma-separated list of paths to the
> >> >> > “dfs.datanode.data.dir” property in your hdfs-site.xml
> >> >> >
> >> >> > mn
> >> >> >
> >> >> > On Sep 26, 2014, at 8:37 AM, Abdul Navaz <navaz.enc@gmail.com>
> >> >> > wrote:
> >> >> >
> >> >> >> Hi
> >> >> >>
> >> >> >> I am facing a space issue when saving files into HDFS and/or
> >> >> >> running a MapReduce job.
> >> >> >>
> >> >> >> root@nn:~# df -h
> >> >> >> Filesystem                                       Size  Used Avail Use% Mounted on
> >> >> >> /dev/xvda2                                       5.9G  5.9G     0 100% /
> >> >> >> udev                                              98M  4.0K   98M   1% /dev
> >> >> >> tmpfs                                             48M  192K   48M   1% /run
> >> >> >> none                                             5.0M     0  5.0M   0% /run/lock
> >> >> >> none                                             120M     0  120M   0% /run/shm
> >> >> >> overflow                                         1.0M  4.0K 1020K   1% /tmp
> >> >> >> /dev/xvda4                                       7.9G  147M  7.4G   2% /mnt
> >> >> >> 172.17.253.254:/q/groups/ch-geni-net/Hadoop-NET  198G  108G   75G  59% /groups/ch-geni-net/Hadoop-NET
> >> >> >> 172.17.253.254:/q/proj/ch-geni-net               198G  108G   75G  59% /proj/ch-geni-net
> >> >> >> root@nn:~#
> >> >> >>
> >> >> >>
> >> >> >> I can see there is no space left on /dev/xvda2.
> >> >> >>
> >> >> >> How can I make Hadoop see the newly mounted /dev/xvda4? Or do I
> >> >> >> need to move the files manually from /dev/xvda2 to /dev/xvda4?
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> Thanks & Regards,
> >> >> >>
> >> >> >> Abdul Navaz
> >> >> >> Research Assistant
> >> >> >> University of Houston Main Campus, Houston TX
> >> >> >> Ph: 281-685-0388
> >> >> >>
> >> >> >
> >> >> >
> >> >>
> >> >
> >>
> >
>
