hadoop-common-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: How do I remove "Non DFS Used"?
Date Wed, 07 Jul 2010 14:02:36 GMT
On Wed, Jul 7, 2010 at 9:48 AM, Michael Segel <michael_segel@hotmail.com> wrote:
>
> Non DFS used tends to be logging or some other information on the disk.
>
> So you can't use hadoop commands to remove the files from the disk.
>
>
>
>> Date: Wed, 7 Jul 2010 17:11:38 +0900
>> Subject: How do I remove "Non DFS Used"?
>> From: mp2893@gmail.com
>> To: common-user@hadoop.apache.org
>>
>> I was looking at the web interface and found that some of my nodes have
>> an enormous amount of "Non DFS Used".
>>
>> There is even a node with 800GB of "Non DFS Used" which is just ridiculous.
>>
>> I tried to remove them by doing:
>>
>> "hadoop namenode -format"
>>
>> and I also tried deleting "hadoop.tmp.dir" (in my case, which is
>> /home/hadoop/hadoop_storage/tmp/).
>>
>> But when I start my cluster again, there it is again with thousands of
>> gigabytes of "Non DFS Used".
>>
>> Can anyone tell me what "Non DFS Used" is and how to remove them forever?
>>
>> Thanks in advance.
>

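I always suggest running tune2fs -m 2 on the data disks, which lowers the
ext2/ext3 reserved-block percentage to 2% (the default is 5%).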

http://old.nabble.com/Optimal-Filesystem-(and-Settings)-for-HDFS-td23600272.html

On a 1TB disk you can free up about 30 GB.
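
The arithmetic behind that figure, as a rough sketch: it assumes the filesystem still has the usual 5% default reserve and treats 1 TB as 1000 GB. The helper name freed_gb and the device /dev/sdX1 are placeholders for illustration, not anything from a real cluster.

```shell
# Space freed (in whole GB) by lowering the reserved-block percentage.
# Usage: freed_gb <disk_size_gb> <old_percent> <new_percent>
freed_gb() {
    echo $(( $1 * ($2 - $3) / 100 ))
}

# 1 TB disk (taken as 1000 GB), reserve lowered from 5% to 2%
freed_gb 1000 5 2

# The change itself needs root and a real device; /dev/sdX1 is a placeholder:
#   tune2fs -l /dev/sdX1 | grep -i 'reserved block count'   # check current reserve
#   tune2fs -m 2 /dev/sdX1                                   # set reserve to 2%
```

Checking the current reserve with tune2fs -l first is worthwhile, since a disk that was already formatted with a smaller reserve will gain less.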

If you have been running for a while, another thing you can do is check
your TaskTracker directories for relics. I regularly find distributed cache
jars and task attempts that were never cleaned up (with my version, all the
time), so I use find -mtime +7 to locate the old files and remove them.
