hadoop-hdfs-user mailing list archives

From "Kester, Scott" <SKes...@weather.com>
Subject Re: Rapid growth in Non DFS Used disk space
Date Fri, 13 May 2011 20:41:21 GMT
We have a job that cleans up the mapred.local directory, so that's not it.
I have looked further at disk usage on the datanodes, and 99% of the space
used is under the dfs.data.dir/current directory. What would be under
'current' that isn't part of HDFS?
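
One way to narrow this down is to total file sizes per subdirectory of a data directory and compare the result with the DFS Used figure reported for that datanode. The following is a minimal sketch in Python, not a Hadoop tool: the function names are illustrative, the path you pass in is assumed to be one dfs.data.dir entry, and the sizes are apparent file sizes (st_size), which can differ from the blocks actually allocated on disk.

```python
import os


def dir_usage(root):
    """Return the total apparent size in bytes of regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Skip symlinks so linked files are not counted twice.
                if not os.path.islink(path):
                    total += os.lstat(path).st_size
            except OSError:
                # The file disappeared or is unreadable; ignore it.
                pass
    return total


def usage_by_child(root):
    """Return (name, bytes) for each immediate child of root, largest first."""
    sizes = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            sizes[entry.name] = dir_usage(entry.path)
        elif entry.is_file(follow_symlinks=False):
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)


def report(root):
    """Print one line per immediate child of root with its size in bytes."""
    for name, size in usage_by_child(root):
        print(f"{size:>15,d}  {name}")
```

Calling report() on the data directory, and then on its 'current' subdirectory, shows which children account for the space.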

On 5/13/11 3:12 PM, "Allen Wittenauer" <aw@apache.org> wrote:

>
>On May 13, 2011, at 10:48 AM, Todd Lipcon wrote:
>> 
>> 
>>> 2) Any ideas on what is driving the growth in Non DFS Used space? I
>>> looked for things like growing log files on the datanodes but didn't
>>> find anything.
>>> 
>> 
>> Logs are one possible culprit. Another is to look for old files that
>> might be orphaned in your mapred.local.dir - there have been bugs in the
>> past where we've leaked files. If you shut down the TaskTrackers, you can
>> safely delete everything from within mapred.local.dirs.
>
>	Part of our S.O.P. during Hadoop bounces is to wipe mapred.local out.
>The TT doesn't properly clean up after itself.

