hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Why block sizes shown by 'fsck' and '-stat' are inconsistent?
Date Sat, 05 Apr 2014 17:03:38 GMT
The block size is a metadata attribute of the file. If you append to the
file later, HDFS still needs to know where to split it into further
blocks, so it keeps that value purely as metadata that tells it where
the write boundaries fall.
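
To make the distinction concrete, here is a minimal Python sketch (not HDFS
code) of how a file's length and its block size attribute relate. It uses the
numbers from this thread; the helper names block_lengths and avg_block_size
are hypothetical, and the average is assumed to be total bytes divided by the
number of blocks, rounded down.

```python
def block_lengths(file_len, block_size):
    """Split a file of file_len bytes into per-block lengths.

    Every block is full except possibly the last one, which only
    holds the remaining bytes.
    """
    full, rest = divmod(file_len, block_size)
    return [block_size] * full + ([rest] if rest else [])


def avg_block_size(file_len, block_size):
    """Average block length (assumed: total bytes // number of blocks)."""
    blocks = block_lengths(file_len, block_size)
    return file_len // len(blocks) if blocks else 0


BLOCK_SIZE = 134217728  # %o from -stat: the block size attribute (128 MB)
FILE_LEN = 2673375      # %b from -stat, and len= in the fsck output

# One partial block, so the average is simply the file length.
print(block_lengths(FILE_LEN, BLOCK_SIZE))   # [2673375]
print(avg_block_size(FILE_LEN, BLOCK_SIZE))  # 2673375

# If the file grows by 128 MB, the attribute decides where the split falls.
grown = FILE_LEN + BLOCK_SIZE
print(block_lengths(grown, BLOCK_SIZE))      # [134217728, 2673375]
```

The attribute only matters once the data crosses a block boundary; until
then the single block is exactly as long as the file.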

On Sat, Apr 5, 2014 at 7:35 PM, sam liu <samliuhadoop@gmail.com> wrote:
> Thanks for your comments!
>
> As I mentioned, HDFS uses only what it needs on the local file system. For
> example, a 16 KB HDFS file uses only 16 KB of local file system storage, not
> 64 MB (its HDFS block size). In this case, what is the use of the block
> size (64 MB) for the 16 KB file?
>
>
> 2014-04-05 17:12 GMT+08:00 Harsh J <harsh@cloudera.com>:
>
>> fsck is showing you an "average block size", not the block size
>> metadata attribute of the file that stat shows. In this specific case,
>> the average is just the length of your file, which is less than one
>> whole block.
>>
>> On Sat, Apr 5, 2014 at 8:21 AM, sam liu <samliuhadoop@gmail.com> wrote:
>> > Hi Experts,
>> >
>> > First, I believe there is no doubt that HDFS uses only what it needs
>> > on the local file system. For example, if we store a 12 KB file in
>> > HDFS, it uses only 12 KB on the local file system, not 64 MB (the
>> > block size).
>> >
>> > However, I found the block sizes shown by 'fsck' and '-stat' are
>> > inconsistent:
>> >
>> > 1) hadoop fsck /user/user1/filesize/derby.jar -files -blocks -locations:
>> > output:
>> > ...
>> > BP-1600629425-9.30.122.112-1395627917492:blk_1073743264_2443 len=2673375
>> > ...
>> > Total blocks (validated):      1 (avg. block size 2673375 B)
>> > ...
>> > conclusion:
>> > The block size shown by fsck is 2673375 B.
>> >
>> > 2) hadoop dfs -stat "%b %n %o %r %Y" /user/user1/filesize/derby.jar:
>> > output:
>> > 2673375 derby.jar 134217728 2 1396662626191
>> > conclusion:
>> > The block size shown by stat is 134217728 B.
>> >
>> > Also, if I browse this file from http://namenode:50070, the file size
>> > of /user/user1/filesize/derby.jar is 2.5 MB (2673375 B), but the
>> > block size is 128 MB (134217728 B).
>> >
>> > Why are the block sizes shown by 'fsck' and '-stat' inconsistent?
>> >
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>
>



-- 
Harsh J
