hadoop-common-user mailing list archives

From sridhar basam <...@basam.org>
Subject Re: How is hadoop going to handle the next generation disks?
Date Fri, 08 Apr 2011 16:24:13 GMT
BTW this is on systems which have a lot of RAM and aren't under high load.

If you find that your system is evicting dentries/inodes from its cache, you
might want to experiment with lowering vm.vfs_cache_pressure from its default
(100) so that they are retained in preference to the pagecache. At the extreme,
setting it to 0 means they are never evicted.


On Fri, Apr 8, 2011 at 11:37 AM, sridhar basam <sri@basam.org> wrote:

> How many files do you have per node? What I find is that most of my
> inodes/dentries are almost always cached, so even on a host with hundreds of
> thousands of files, 'du -sk' generally causes high I/O for only a couple of
> seconds. I am using 2TB disks too.
>  Sridhar
> On Fri, Apr 8, 2011 at 12:15 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
>> I have a 0.20.2 cluster. I notice that our nodes with 2 TB disks waste
>> tons of disk I/O doing a 'du -sk' of each data directory. Instead of
>> 'du -sk', why not just do this with java.io.File? How is this going to
>> work with 4 TB or 8 TB disks and up? It seems like calculating used and
>> free disk space could be done a better way.
>> Edward
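
For reference, a minimal sketch of what the java.io.File approach raised above might look like, assuming Java 6 or later. The class name DiskSpaceSketch and the method partitionUsage are hypothetical names for illustration, not anything from Hadoop. Note that these calls report totals for the whole partition holding the directory, not the bytes under the directory itself, so they are not a drop-in replacement for 'du -sk' when other data shares the same disk.

```java
import java.io.File;

public class DiskSpaceSketch {

    /**
     * Returns {total, free, used} in bytes for the partition that holds dir.
     * "used" is derived as total minus free; it covers the entire partition,
     * not just the files under dir.
     */
    static long[] partitionUsage(File dir) {
        long total = dir.getTotalSpace(); // size of the partition
        long free = dir.getFreeSpace();   // unallocated bytes on the partition
        return new long[] { total, free, total - free };
    }

    public static void main(String[] args) {
        // Hypothetical usage: pass a data directory path, defaults to ".".
        File dir = new File(args.length > 0 ? args[0] : ".");
        long[] u = partitionUsage(dir);
        System.out.printf("total=%d free=%d used=%d%n", u[0], u[1], u[2]);
    }
}
```

These calls query the filesystem for partition-level counters rather than walking the directory tree, so they should not touch per-file inodes the way 'du' does. The trade-off is the coarser, partition-wide answer.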
