hadoop-hdfs-dev mailing list archives

From Steve Loughran <ste...@hortonworks.com>
Subject Theory question: good values for FileStatus.getBlockSize()
Date Mon, 16 Feb 2015 17:44:15 GMT

HADOOP-11601 tightens up the filesystem spec by saying "if len(file) > 0, getFileStatus().getBlockSize() > 0".

This is to stop filesystems (most recently s3a) returning 0 as a block size, which then kills
any analytics work that tries to partition the workload by block size.
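
To illustrate why a zero block size is fatal for that kind of partitioning, here is a minimal sketch in plain Java. It is not Hadoop code: `SplitSketch` and `splits` are hypothetical names, and the two long arguments stand in for the values a client would read from FileStatus.getLen() and FileStatus.getBlockSize(). Without the guard at the top, a block size of 0 would leave the loop offset stuck at 0 and the loop would never terminate.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of block-size-based partitioning; not Hadoop code. */
final class SplitSketch {

    /**
     * Returns one {offset, length} pair per block-sized split of the file.
     * An empty file yields no splits, whatever the block size.
     */
    static List<long[]> splits(long fileLen, long blockSize) {
        // Guard: a non-positive step would keep the offset below from advancing.
        if (fileLen > 0 && blockSize <= 0) {
            throw new IllegalArgumentException(
                "block size must be > 0 for a non-empty file, got " + blockSize);
        }
        List<long[]> result = new ArrayList<>();
        for (long offset = 0; offset < fileLen; offset += blockSize) {
            // The last split may be shorter than a full block.
            long length = Math.min(blockSize, fileLen - offset);
            result.add(new long[] {offset, length});
        }
        return result;
    }
}
```

For example, `SplitSketch.splits(10, 4)` produces three splits, the last one covering offset 8 with length 2, while `SplitSketch.splits(10, 0)` is rejected up front instead of hanging.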

I'm currently changing the markdown text to say:

MUST be >0 for a file size >0
MAY be 0 for a file of size==0.

plus the relevant tests to check this.
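
The rule itself is small enough to write down as a predicate. This is only a sketch in plain Java, not the actual contract test: `BlockSizeRule` and `satisfiesSpec` are hypothetical names, and treating a negative block size as invalid for an empty file is my assumption, since the wording above only says 0 is allowed there.

```java
/** Hypothetical encoding of the proposed block size rule; not the real contract test. */
final class BlockSizeRule {

    /**
     * @param fileLen   file length, as FileStatus.getLen() would report it
     * @param blockSize block size, as FileStatus.getBlockSize() would report it
     * @return true if the pair is allowed by the proposed wording
     */
    static boolean satisfiesSpec(long fileLen, long blockSize) {
        if (fileLen > 0) {
            // MUST be > 0 for a file size > 0
            return blockSize > 0;
        }
        // MAY be 0 for a file of size == 0; negative values rejected (assumption)
        return blockSize >= 0;
    }
}
```

A contract test could then create an empty file and a non-empty file on the filesystem under test and assert this predicate on the length and block size each one reports.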

There's one thing I do want to understand from HDFS first: what about small files? That is:
what does HDFS return as a block size if a file is smaller than its block size?
