hadoop-hdfs-issues mailing list archives

From "stanley shi (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-6489) DFS Used space is not correct if there're many append operations
Date Thu, 05 Jun 2014 07:33:01 GMT
stanley shi created HDFS-6489:

             Summary: DFS Used space is not correct if there're many append operations
                 Key: HDFS-6489
                 URL: https://issues.apache.org/jira/browse/HDFS-6489
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode
    Affects Versions: 2.2.0
            Reporter: stanley shi

The current Datanode implementation increases the DFS used space on every block write
operation. This is correct in most scenarios (creating a new file), but it is incorrect in
some cases (appending a small amount of data to a large block).
For example, I have a file with only one block (say, 60M). I then append to it very
frequently, but each append writes only 10 bytes.
On each append, DFS used is increased by the length of the block (60M), not the
actual data length (10 bytes).
Consider a scenario where many clients append concurrently to a large number of files
(1000+). Assuming the block size is 32M (half of the default value), DFS used will be
increased by 1000*32M = 32G on each round of appends to the files, although only about
10K bytes (1000 * 10 bytes) are actually written.
This causes the datanode to report insufficient disk space on data writes:
{quote}2014-06-04 15:27:34,719 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1649188734- received exception org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: Insufficient space for appending to FinalizedReplica, blk_1073742834_45306, FINALIZED{quote}
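
The arithmetic in the scenario above can be sketched as follows. This is only an illustrative model of the accounting described in this report, not the Datanode's actual code; the class and method names (DfsUsedSketch, reportedIncrease, actualIncrease) are hypothetical, and "M" is taken to mean 1024 * 1024 bytes.

```java
// Hypothetical model of the accounting described in this report.
// It does not use or reproduce any Hadoop classes.
public class DfsUsedSketch {
    static final long MB = 1024L * 1024L;

    /** Reported behaviour: each append adds the whole block length to DFS used. */
    static long reportedIncrease(int files, long blockLenBytes) {
        return files * blockLenBytes;
    }

    /** Expected behaviour: each append adds only the bytes actually written. */
    static long actualIncrease(int files, long appendLenBytes) {
        return files * appendLenBytes;
    }

    public static void main(String[] args) {
        int files = 1000;          // number of files appended to in one round
        long blockLen = 32 * MB;   // block size assumed in the report
        long appendLen = 10;       // bytes written per append

        long reported = reportedIncrease(files, blockLen);
        long actual = actualIncrease(files, appendLen);

        System.out.println("DFS used increase as reported: " + reported + " bytes");
        System.out.println("Bytes actually written:        " + actual + " bytes");
        System.out.println("Over-count per round:          " + (reported - actual) + " bytes");
    }
}
```

Under these assumptions a single round of 10-byte appends to 1000 files adds roughly 32G to the DFS used figure while only 10,000 bytes reach disk, so repeated rounds would exhaust the reported free space quickly.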
