hadoop-hdfs-dev mailing list archives

From "Pranav Prakash (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-10529) Df reports incorrect usage when appending less than block size
Date Wed, 15 Jun 2016 05:15:10 GMT
Pranav Prakash created HDFS-10529:
-------------------------------------

             Summary: Df reports incorrect usage when appending less than block size
                 Key: HDFS-10529
                 URL: https://issues.apache.org/jira/browse/HDFS-10529
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 2.7.2, 3.0.0-alpha1
            Reporter: Pranav Prakash
            Priority: Minor


Steps to recreate issue:

1. Create a 100MB file on HDFS cluster with 128MB blocksize and replication factor 3
2. Append 100MB to the file
3. Df reports around 900MB even though it should only be around 600MB.

Looking at the blocks confirms that df is incorrect, as there exist only two blocks on each
DN -- a 128MB block and a 72MB block.

This issue seems to arise because BlockPoolSlice does not account for the delta increase in
dfsUsage when an append happens to a partially-filled block, and instead naively adds the
total block size. For instance, in the example scenario, when the block is "filled" from 100MB
to 128MB, addFinalizedBlock() in BlockPoolSlice adds the size of the newly created block to
the total instead of accounting for the difference/delta in block size between old and new.
This has the effect of double-counting the old partially-filled block: it is counted once
when it is first created (in the example scenario, when the 100MB file is created) and again
when it becomes part of the filled block (in the example scenario, when the 128MB block is
formed from the initial 100MB block). Thus the perceived size becomes 100MB + 128MB + 72MB =
300MB for each DN, or 900MB across the cluster, whereas the blocks actually on disk total
128MB + 72MB = 200MB for each DN, or 600MB across the cluster.
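
The arithmetic above can be replayed with a small sketch. This is not the actual
BlockPoolSlice code: the class DfsUsageSketch, its fields, and finalizeBlock() are
hypothetical names made up for illustration, and the sketch assumes usage is only updated
when a block is finalized. It keeps two counters side by side, one that adds the full block
length on every finalize (the suspected behavior) and one that adds only the growth since
the block was last finalized.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for per-DataNode usage accounting; NOT the real BlockPoolSlice. */
class DfsUsageSketch {
    static final long MB = 1024L * 1024L;

    // Last finalized length of each block, keyed by a made-up block id.
    private final Map<Long, Long> finalizedLengths = new HashMap<>();
    private long naiveUsed; // adds the full block length on every finalize
    private long deltaUsed; // adds only the growth since the previous finalize

    /** Records that a block was finalized at newLength bytes. */
    void finalizeBlock(long blockId, long newLength) {
        Long previous = finalizedLengths.put(blockId, newLength);
        long oldLength = (previous == null) ? 0L : previous;
        naiveUsed += newLength;             // suspected behavior: old length counted again
        deltaUsed += newLength - oldLength; // delta accounting
    }

    long naiveUsedMb() { return naiveUsed / MB; }
    long deltaUsedMb() { return deltaUsed / MB; }

    /** Replays the scenario from this report on a single DataNode. */
    static DfsUsageSketch replayScenario() {
        DfsUsageSketch dn = new DfsUsageSketch();
        dn.finalizeBlock(1L, 100 * MB); // initial 100MB file
        dn.finalizeBlock(1L, 128 * MB); // append fills the first block to 128MB
        dn.finalizeBlock(2L, 72 * MB);  // remainder of the append goes to a new block
        return dn;
    }

    public static void main(String[] args) {
        DfsUsageSketch dn = replayScenario();
        int replication = 3;
        System.out.println("naive: " + dn.naiveUsedMb() * replication + "MB across the cluster");
        System.out.println("delta: " + dn.deltaUsedMb() * replication + "MB across the cluster");
    }
}
```

With replication factor 3, the naive counter comes to 900MB across the cluster and the delta
counter to 600MB, matching the reported and expected df figures respectively.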



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org

