hadoop-hdfs-user mailing list archives

From elton sky <eltonsky9...@gmail.com>
Subject Re: Block Size
Date Sat, 18 Jun 2011 01:56:08 GMT
This is a tradition carried over from native file systems, to avoid wasting disk
space. In Linux, each data block is typically 4 KB. A file is sliced into data blocks
and stored on disk; if the tail block holds less than 4 KB of data, the rest of
that block's space is wasted. So if all your file sizes are multiples of 4 KB, you
get 100% usage of the allocated disk space. Otherwise you waste some part of it.
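A quick illustration of that tail-block slack (just a sketch in Java, assuming the
common 4 KB ext3/ext4 block size):

// Sketch: bytes wasted in the last, partially filled 4 KB block of a file.
public class TailBlockWaste {
    static final long FS_BLOCK = 4 * 1024; // assumed native FS block size

    static long wastedBytes(long fileSize) {
        long remainder = fileSize % FS_BLOCK;
        return remainder == 0 ? 0 : FS_BLOCK - remainder;
    }

    public static void main(String[] args) {
        System.out.println(wastedBytes(8192));  // 0    (exact multiple of 4 KB)
        System.out.println(wastedBytes(10000)); // 2288 (tail block mostly empty)
    }
}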
In HDFS, each block is actually a file on your native file system, say ext4, and the
size of that file is ideally a multiple of 4 KB.
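For completeness, the HDFS block size itself is configurable (dfs.block.size in this
era of Hadoop, a power of two like 64 MB by default), and it can also be chosen per
file at create time. A rough sketch against the Hadoop Java API; the path and values
here are just examples:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockSizeDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Cluster-wide default, normally set in hdfs-site.xml.
        conf.setLong("dfs.block.size", 64L * 1024 * 1024); // 64 MB, a power of two

        FileSystem fs = FileSystem.get(conf);
        Path p = new Path("/tmp/blocksize-demo.dat");

        // Block size can also be set per file (last argument of create()).
        FSDataOutputStream out = fs.create(p, true, 4096, (short) 3, 128L * 1024 * 1024);
        out.close();

        System.out.println("block size: " + fs.getFileStatus(p).getBlockSize());
    }
}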

On Sat, Jun 18, 2011 at 5:55 AM, snedix <snxrsch@yahoo.com> wrote:

> Hi all,
> I wanna ask question,
> Is there a reason why block size should be set to some 2^N, for some integer N?
> Does it help with block defragmentation etc.?
> Thanks in advance..
