hadoop-common-user mailing list archives

From Niels Basjes <Ni...@basjes.nl>
Subject Re: Using df instead of du to calculate datanode space
Date Sat, 21 May 2011 20:41:01 GMT
Hi,

Although I like the idea of doing things smarter, I'm never ever
going to change core Unix/Linux applications for the sake of a
specific application. Linux handles scripts and binaries completely
differently with regard to security. So how do you know for sure (I
mean 100% sure, not just 99.99999999% sure) that you haven't broken
any other functionality needed to keep your system sane?

Why don't you issue a feature request so this "needless disk io" can
be fixed as part of the base code of Hadoop (instead of breaking the
underlying OS)?

Niels

2011/5/21 Edward Capriolo <edlinuxguru@gmail.com>:
> Good job. I brought this up in another thread, but was told it was not a
> problem. Good thing I'm not crazy.
>
> On Sat, May 21, 2011 at 12:42 AM, Joe Stein
> <charmalloc@allthingshadoop.com> wrote:
>
>> I came up with a nice little hack to trick hadoop into calculating disk
>> usage with df instead of du
>>
>>
>> http://allthingshadoop.com/2011/05/20/faster-datanodes-with-less-wait-io-using-df-instead-of-du/
>>
>> I am running this in production, works like a charm and already
>> seeing benefit, woot!
>>
>> I hope it works well for others too.
>>
>> /*
>> Joe Stein
>> http://www.twitter.com/allthingshadoop
>> */
>>
>



-- 
Met vriendelijke groeten,

Niels Basjes
