hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6482) Use block ID-based block layout on datanodes
Date Mon, 07 Jul 2014 20:10:36 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054141#comment-14054141 ]

Suresh Srinivas commented on HDFS-6482:

bq. My understanding of this was that it was an optimization for the cases where the datanode
layout hadn't changed significantly (which was most upgrades).
First, one of the key requirements of rolling upgrades was to keep datanode upgrade time as short
as possible. Second, as I mentioned already, current rolling upgrades do not create hardlinks.
Hence, if the assumption is that hardlinks will be made, their cost needs to be factored in.

bq. It should not be interpreted as a hard limitation that prevents us from making any changes
for the datanode layout in the future.
Not all datanode layout changes need massive changes to the underlying directory structure. One
solution is to support both directory structures; as blocks get deleted and re-added,
they will naturally migrate to the new scheme.
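
To make the dual-layout idea concrete, here is a minimal sketch of how a lookup could work, assuming new replicas are always written to an ID-based directory while existing replicas stay wherever the old scheme put them. The class, the method names, and the two-level 256-way fan-out are hypothetical and not taken from any patch on this issue. The recursive directory scan is also only for illustration; a real datanode would more likely consult its in-memory replica map.

```java
import java.io.File;
import java.util.Optional;

/** Hypothetical sketch: serve blocks from either the old or the new directory layout. */
final class DualLayoutLookupSketch {
    private DualLayoutLookupSketch() {}

    /** Assumed ID-based directory: two levels, each chosen by 8 bits of the block ID. */
    static File idBasedDir(File root, long blockId) {
        long d1 = (blockId >>> 16) & 0xFF;
        long d2 = (blockId >>> 8) & 0xFF;
        return new File(new File(root, "subdir" + d1), "subdir" + d2);
    }

    /** New replicas always go to the ID-based location, so the old tree only shrinks. */
    static File locationForNewBlock(File root, long blockId) {
        return new File(idBasedDir(root, blockId), "blk_" + blockId);
    }

    /** Look in the ID-based location first, then fall back to scanning the old tree. */
    static Optional<File> findBlock(File root, long blockId) {
        File candidate = locationForNewBlock(root, blockId);
        if (candidate.isFile()) {
            return Optional.of(candidate);
        }
        return searchLegacy(root, candidate.getName());
    }

    /** Depth-first search for a file with the given name under dir. */
    private static Optional<File> searchLegacy(File dir, String name) {
        File[] children = dir.listFiles();
        if (children == null) {
            return Optional.empty(); // not a directory, or unreadable
        }
        for (File child : children) {
            if (child.isFile() && child.getName().equals(name)) {
                return Optional.of(child);
            }
            if (child.isDirectory()) {
                Optional<File> hit = searchLegacy(child, name);
                if (hit.isPresent()) {
                    return hit;
                }
            }
        }
        return Optional.empty();
    }
}
```

Under this scheme no block is moved at upgrade time. A replica leaves the old tree only when it is deleted, and any block added afterwards lands in the ID-based location.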

> Use block ID-based block layout on datanodes
> --------------------------------------------
>                 Key: HDFS-6482
>                 URL: https://issues.apache.org/jira/browse/HDFS-6482
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.5.0
>            Reporter: James Thomas
>            Assignee: James Thomas
>         Attachments: HDFS-6482.1.patch, HDFS-6482.2.patch, HDFS-6482.3.patch, HDFS-6482.4.patch,
HDFS-6482.5.patch, HDFS-6482.6.patch, HDFS-6482.7.patch, HDFS-6482.patch
> Right now blocks are placed into directories that are split into many subdirectories
when capacity is reached. Instead we can use a block's ID to determine the path it should
go in. This eliminates the need for the LDir data structure that facilitates the splitting
of directories when they reach capacity as well as fields in ReplicaInfo that keep track of
a replica's location.
> An extension of the work in HDFS-3290.
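
As an illustration of the description above, the following sketch derives a replica's directory purely from its block ID, so no per-replica location needs to be stored. The bit positions, the two-level fan-out of 256 subdirectories, and the class and method names are assumptions made for this example, not details quoted from the attached patches.

```java
import java.io.File;

/** Hypothetical sketch: compute a block's directory from its ID alone. */
final class BlockIdLayoutSketch {
    // Assumed fan-out: 8 bits per level, i.e. 256 subdirectories at each of two levels.
    private static final int BITS_PER_LEVEL = 8;
    private static final long MASK = (1L << BITS_PER_LEVEL) - 1;

    private BlockIdLayoutSketch() {}

    /** Maps a block ID to root/subdir<d1>/subdir<d2>. */
    static File idToBlockDir(File root, long blockId) {
        long d1 = (blockId >>> (2 * BITS_PER_LEVEL)) & MASK;
        long d2 = (blockId >>> BITS_PER_LEVEL) & MASK;
        return new File(new File(root, "subdir" + d1), "subdir" + d2);
    }

    /** Full path of the block file inside its ID-derived directory. */
    static File blockFile(File root, long blockId) {
        return new File(idToBlockDir(root, blockId), "blk_" + blockId);
    }
}
```

Because the path is a pure function of the ID, a lookup is a constant-time computation, and the directory tree never has to be split when a directory fills up.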

This message was sent by Atlassian JIRA
