hadoop-hdfs-dev mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HDFS-3290) Use a better local directory layout for the datanode
Date Fri, 15 Jan 2016 23:08:40 GMT

     [ https://issues.apache.org/jira/browse/HDFS-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe resolved HDFS-3290.
----------------------------------------
    Resolution: Duplicate

> Use a better local directory layout for the datanode
> ----------------------------------------------------
>
>                 Key: HDFS-3290
>                 URL: https://issues.apache.org/jira/browse/HDFS-3290
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 0.23.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>
> When the HDFS DataNode stores chunks in a local directory, it currently puts all of the
> chunk files into either one big directory or a collection of directories.  However, the
> directory a given block ends up in cannot be determined from its ID.  This does not scale
> well as the number of files increases.
> Similar to the Git version control system, HDFS should create a set of top-level
> directories keyed off a few bits of the chunk ID.  Git uses 8 bits.  This substantially
> cuts down on the number of chunk files in the same directory and improves performance,
> while preserving O(1) lookup of chunks.
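
The proposal above can be illustrated with a minimal sketch. This is not the DataNode's actual code: the class name `BlockDirLayout`, its methods, the choice of the low-order 8 bits, and the two-hex-digit directory names are all assumptions made for illustration. The point it shows is that the directory is computed directly from the ID, so lookup stays O(1) while files are spread across up to 256 directories.

```java
import java.io.File;

// Hypothetical helper for illustration only; not part of Hadoop.
final class BlockDirLayout {
    // Number of ID bits used to pick a directory (8, as in the Git comparison).
    // Using the low-order bits is an assumption of this sketch.
    static final int DIR_BITS = 8;

    private BlockDirLayout() {}

    /** Returns the subdirectory name ("00" through "ff") for the given ID. */
    static String subdirName(long blockId) {
        // Mask off the low DIR_BITS bits; the result is always in [0, 255].
        int bucket = (int) (blockId & ((1L << DIR_BITS) - 1));
        return String.format("%02x", bucket);
    }

    /** Returns the file location under root, computed from the ID alone. */
    static File blockFile(File root, long blockId) {
        // The "blk_" prefix mirrors the usual block file naming; treat it as illustrative.
        return new File(new File(root, subdirName(blockId)), "blk_" + blockId);
    }
}
```

For example, `BlockDirLayout.blockFile(new File("data"), 0x1234L)` resolves to a file inside the `34` subdirectory of `data`, with no directory scan needed.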



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)