hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-3297) Update free space in the DataBlockScanner rather than using du
Date Wed, 18 Apr 2012 20:14:40 GMT
Update free space in the DataBlockScanner rather than using du

                 Key: HDFS-3297
                 URL: https://issues.apache.org/jira/browse/HDFS-3297
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: data-node
    Affects Versions: 0.23.0
            Reporter: Colin Patrick McCabe
            Assignee: Colin Patrick McCabe
            Priority: Minor

As the DataNode adds new blocks to a BlockPool, it keeps a running count of how much space
that block pool consumes.  This count gets sent to the NameNode so that it can track usage
statistics.

Periodically, we check what's actually on the disk to make sure that the counts we are keeping
are accurate.  The DataNode currently kicks off a "du -s" process through the shell every
few minutes and takes the result as the new used space number.

We should compute the used space in the DataBlockScanner, rather than in a separate du process.
The main reason is to avoid causing a lot of extra random I/O operations on the disk.  Since
du has to visit every file in the BlockPool, it essentially repeats the traversal that the
block scanner already performs.
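
The idea can be sketched roughly as follows. This is only an illustration under
assumptions, not the actual DataBlockScanner code: the class name ScanningUsageTracker
and the methods blockAdded, scanOnce and verify are hypothetical, and verify is a
placeholder for the real checksum verification. The sketch sums file lengths during the
scan pass and publishes the total at the end, so no separate traversal is needed. Note
that it counts file lengths rather than allocated disk blocks, so the result may differ
slightly from what du reports.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Stream;

/** Hypothetical sketch: track block pool usage as a side effect of scanning. */
class ScanningUsageTracker {
    // Used-space figure in bytes, i.e. the number that would be reported upstream.
    private final AtomicLong usedBytes = new AtomicLong();

    /** Incremental update applied when a new block is written between scans. */
    void blockAdded(long length) {
        usedBytes.addAndGet(length);
    }

    long getUsedBytes() {
        return usedBytes.get();
    }

    /**
     * One full scan pass over the block pool directory. Every regular file is
     * verified and its length is added to a running total; the total then
     * replaces the incrementally maintained count.
     * Simplification: blocks added while the scan is running are not accounted for.
     */
    long scanOnce(Path blockPoolDir) throws IOException {
        long total = 0;
        try (Stream<Path> files = Files.walk(blockPoolDir)) {
            for (Path p : (Iterable<Path>) files::iterator) {
                if (Files.isRegularFile(p)) {
                    verify(p);               // the scanner visits the file anyway
                    total += Files.size(p);  // so the size comes almost for free
                }
            }
        }
        usedBytes.set(total);
        return total;
    }

    /** Placeholder for the checksum verification a real scanner would perform. */
    void verify(Path blockFile) {
        // intentionally empty in this sketch
    }
}
```

A caller would invoke scanOnce on the scanner's normal schedule and read getUsedBytes
whenever the used space needs to be reported, in place of the periodic "du -s" result.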
