hadoop-hdfs-user mailing list archives

From Allen Wittenauer <awittena...@linkedin.com>
Subject Re: full disk woes
Date Wed, 21 Jul 2010 21:01:33 GMT

On Jul 21, 2010, at 12:45 PM, Travis Crawford wrote:
> Does anyone else run into machines with overfull disks?

It was a common problem when I was at Yahoo!.  As the drives get more full, the NN gets
slower and slower, since it has a harder and harder time finding valid block placements.

> Any tips on how to avoid getting into this situation?

What we started to do was two-fold:

a) During every maintenance window, we'd blow away the mapred temp dirs.  The TaskTracker does
a very bad job of cleaning up after jobs, so there is usually a lot of cruft.  If you have a
'flat' disk/fs layout where MR temp space and HDFS share the same filesystems, this is a huge problem.

b) Blowing away /tmp on a regular basis.  Here at LI, I've got a perl script that reads
the output of ls /tmp, finds files/dirs older than 3 days, and removes them.  Since
pig is a little piggy and leaves a ton of useless data in /tmp, I often see 15TB or more disappear
just by doing this.
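The perl script itself isn't in the post; a rough Python equivalent of the idea might look like the following.  It parses `hadoop fs -ls /tmp` output and emits removal commands for entries older than three days.  The ls column layout and the sample paths are assumptions, not taken from the original script.

```python
# Sketch (not the actual perl script from the post): find /tmp entries
# older than three days from `hadoop fs -ls /tmp` output.
from datetime import datetime, timedelta

def stale_entries(ls_lines, now, max_age_days=3):
    """Return paths whose modification time is older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for line in ls_lines:
        fields = line.split()
        if len(fields) < 8:          # skip the "Found N items" header line
            continue
        # fields[5]/[6] are the date and time columns, fields[7] is the path
        mtime = datetime.strptime(fields[5] + " " + fields[6], "%Y-%m-%d %H:%M")
        if mtime < cutoff:
            stale.append(fields[7])
    return stale

# Example: print the delete commands rather than running them blindly.
sample = [
    "Found 2 items",
    "drwxr-xr-x   - pig supergroup          0 2010-07-10 12:00 /tmp/temp-123",
    "drwxr-xr-x   - pig supergroup          0 2010-07-20 12:00 /tmp/temp-456",
]
for path in stale_entries(sample, datetime(2010, 7, 21, 12, 0)):
    print("hadoop fs -rmr " + path)  # -rmr: recursive delete in 0.20-era CLIs
```

Printing the commands instead of executing them lets you eyeball the list before anything is deleted.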

> /dev/cciss/c0d0       275G  217G   45G  83% /data/disk000

The bigger problem is that Hadoop just doesn't work well with such small filesystems.
 You might also want to check your filesystem's reserved block percentage; you may be able
to squeeze out a bit more space that way.
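As a back-of-the-envelope for that suggestion: ext2/ext3 reserve 5% of blocks for root by default, and `tune2fs -m` can lower that.  The calculation below uses the 275G filesystems from the df output; the 1% target is just an example value.

```python
# Space reclaimable by lowering the ext2/ext3 reserved-block percentage,
# e.g. with `tune2fs -m 1 /dev/cciss/c0d0` (example device, example target).
fs_size_gb = 275          # per-disk filesystem size from the df output
default_reserve = 0.05    # ext2/ext3 default reserved-block fraction
lowered_reserve = 0.01    # hypothetical new setting
freed_per_disk = fs_size_gb * (default_reserve - lowered_reserve)
print("freed per disk: %.1f GB" % freed_per_disk)
```

Across a node with a dozen-plus of these disks, that adds up to well over 100 GB.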

> /dev/cciss/c0d14      275G  248G   14G  95% /data/disk014

I'd probably shutdown this data node and manually move blocks off of this drive onto ...

> /dev/cciss/c1d1p1     275G  184G   78G  71% /data/disk025
> /dev/cciss/c1d2p1     275G  176G   86G  68% /data/disk026
> /dev/cciss/c1d3p1     275G  178G   84G  68% /data/disk027
> /dev/cciss/c1d4p1     275G  177G   85G  68% /data/disk028
> /dev/cciss/c1d5p1     275G  179G   83G  69% /data/disk029
> /dev/cciss/c1d6p1     275G  181G   81G  70% /data/disk030

... one of these.
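A hedged sketch of that manual rebalance: with the datanode stopped, block files (blk_NNN plus the matching blk_NNN_*.meta) can be moved from a full data dir to an emptier one on the same node, since the datanode rescans its data dirs on startup.  The directory layout and the `move_blocks` helper are assumptions; adjust for your dfs.data.dir settings, and make sure each block's .meta file travels with it.

```python
# Sketch: move block files between two data dirs on the same datanode.
# Only run this with the datanode stopped.
import os
import shutil

def move_blocks(src_dir, dst_dir, max_files):
    """Move up to max_files blk_* files (data and .meta) from src to dst."""
    moved = []
    for name in sorted(os.listdir(src_dir)):
        if len(moved) >= max_files:
            break
        if name.startswith("blk_"):  # block data or its checksum .meta file
            shutil.move(os.path.join(src_dir, name),
                        os.path.join(dst_dir, name))
            moved.append(name)
    return moved
```

Sorting the listing keeps a block and its .meta file adjacent, so picking an even max_files tends to move them together; a more careful script would pair them explicitly.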