hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-3290) Use a better local directory layout for the datanode
Date Tue, 17 Apr 2012 20:23:17 GMT
Use a better local directory layout for the datanode

                 Key: HDFS-3290
                 URL: https://issues.apache.org/jira/browse/HDFS-3290
             Project: Hadoop HDFS
          Issue Type: Improvement
    Affects Versions: 0.23.0
            Reporter: Colin Patrick McCabe
            Assignee: Colin Patrick McCabe
            Priority: Minor

When the HDFS DataNode stores chunks in a local directory, it currently puts all of the chunk
files into one big directory.  This scales poorly as the number of files increases, because
local filesystems are not optimized for the case where there are hundreds of thousands
of files in the same directory.  It also makes inspecting directories with standard UNIX tools
difficult.

As the git version control system does for its object store, HDFS should create a set of
top-level directories keyed off a few bits of the chunk ID.  Git uses 8 bits, which gives 256
directories.  This substantially cuts down on the number of chunk files in any one directory
and should improve performance.
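
A minimal sketch of the idea, not the actual DataNode code: the class name ChunkLayout, the
"chunk_" file prefix, the example root path, and the choice of the low-order 8 bits of the ID
are all assumptions made for illustration.  The issue only says "a few bits" and does not say
which ones.

```java
import java.io.File;

// Hypothetical helper: maps a chunk ID to a file under one of 2^DIR_BITS subdirectories.
final class ChunkLayout {
    // Assumed value; 8 bits gives 256 subdirectories, the same fan-out git uses.
    static final int DIR_BITS = 8;

    // Name of the subdirectory for a chunk: the low-order DIR_BITS bits as two hex digits.
    public static String subdirName(long chunkId) {
        long bucket = chunkId & ((1L << DIR_BITS) - 1);
        return String.format("%02x", bucket);
    }

    // Full path of the chunk file: <root>/<subdir>/chunk_<id> (file prefix is illustrative).
    public static File chunkFile(File root, long chunkId) {
        File subdir = new File(root, subdirName(chunkId));
        return new File(subdir, "chunk_" + chunkId);
    }

    public static void main(String[] args) {
        File root = new File("/data/dn/current"); // example path, not a real default
        System.out.println(chunkFile(root, 0x1234L));
    }
}
```

With 8 bits, a volume holding 500,000 chunk files would average about 2,000 files per
directory, assuming the chosen bits are evenly distributed across chunk IDs.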


