hadoop-common-user mailing list archives

From Anthony Urso <anthony.u...@gmail.com>
Subject Re: Cluster size for Linux file system
Date Wed, 09 Sep 2009 05:06:19 GMT
There is nothing really preventing you from filling your HDFS with a
lot of very small files*, so it would depend on your use case;
however, typical Hadoop usage calls for as large a block size as
possible, in order to stream very large files off the disk.

* Except namenode heap space
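For context, the HDFS block size (distinct from the underlying filesystem's block size) can be raised cluster-wide in hdfs-site.xml. A minimal sketch, assuming a Hadoop 0.20-era deployment where the property is named dfs.block.size (the 128 MB value is only an illustration, not a recommendation for any particular workload):

    <!-- hdfs-site.xml: set the default HDFS block size to 128 MB -->
    <property>
      <name>dfs.block.size</name>
      <value>134217728</value>
    </property>

This setting only applies to files written after the change; it can also be overridden per-file at creation time.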


On Tue, Sep 8, 2009 at 10:31 AM, CubicDesign<cubicdesign@gmail.com> wrote:
> Which is the best disk cluster size for a Linux partition (let's say ext3)
> when using Hadoop on top of it?
> The default size is 4KB. Will Hadoop get an advantage if I format the
> disk with a cluster size of 8 or 16KB, or even more?
