hadoop-hdfs-issues mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HDFS-385) Design a pluggable interface to place replicas of blocks in HDFS
Date Wed, 02 Sep 2009 00:03:32 GMT

     [ https://issues.apache.org/jira/browse/HDFS-385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HDFS-385:

    Attachment: BlockPlacementPluggable6.txt

1. FsInodeName#getLocalName should return the full path name of the file, not the name of the
last component of the path. Better to rename the method getFullPathName. The implementation
of this interface in INode should return the full path name as well. Also, to allow FsInodeName
to return more information about a file in the future, better to rename the interface
FsInodeInfo.
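A minimal sketch of the suggested rename, assuming toy stand-ins for HDFS internals (the Inode class below is illustrative, not the real org.apache.hadoop.hdfs INode; only the names FsInodeInfo and getFullPathName come from the comment above):

```java
// Proposed shape: the interface exposes the full path, not the last component.
interface FsInodeInfo {
    /** Full path of the file, e.g. "/user/data.txt", not just "data.txt". */
    String getFullPathName();
}

// Toy inode that builds the full path by walking up to the root.
class Inode implements FsInodeInfo {
    private final Inode parent;
    private final String name;

    Inode(Inode parent, String name) {
        this.parent = parent;
        this.name = name;
    }

    @Override
    public String getFullPathName() {
        if (parent == null) {
            return "/";  // root inode
        }
        String prefix = parent.getFullPathName();
        return prefix.equals("/") ? "/" + name : prefix + "/" + name;
    }
}
```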

2. FsInodeName should also be a parameter to verifyBlockPlacement and isValidMove. This keeps
the interface consistent and makes it easier to get the file name for a placement policy like
3. Would it improve readability if the class BlockPlacementPolicyDefault were renamed
FirstLocalTwoRemoteBlockPlacementPolicy?
I left it as BlockPlacementPolicy and BlockPlacementPolicyDefault because, if we decide
to change the default policy later on, we would not want to rename the file itself. Please
let me know if this is acceptable to you.
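A sketch of why the generic name helps, assuming a reflection-based loader (the getInstance helper and its wiring are illustrative; the real patch wires this through the NameNode's configuration):

```java
// Keeping the class name BlockPlacementPolicyDefault means a future change of
// default needs only a different configured class name, not a file rename.
abstract class BlockPlacementPolicy {

    // Hypothetical loader: instantiate whatever policy class is configured.
    static BlockPlacementPolicy getInstance(String className) {
        try {
            return (BlockPlacementPolicy) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("cannot instantiate policy " + className, e);
        }
    }
}

// The default implementation stays under its generic name, as argued above.
class BlockPlacementPolicyDefault extends BlockPlacementPolicy {
}
```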

4. For the default policy, isValidDelete should not be empty. It should implement the deletion
policy currently in trunk, so that other placement policies can override the default deletion policy.
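A simplified sketch of a non-empty default deletion policy. The types below are toy stand-ins, and the heuristic (delete the excess replica from the node with the least remaining space) is one simplified reading of trunk's behavior, not the full implementation:

```java
import java.util.Comparator;
import java.util.List;

// Illustrative stand-in for a datanode descriptor.
class DatanodeSketch {
    final String name;
    final long remainingBytes;

    DatanodeSketch(String name, long remainingBytes) {
        this.name = name;
        this.remainingBytes = remainingBytes;
    }
}

class DefaultDeletionPolicySketch {
    /** Pick the excess replica to delete: the one on the node with least free space. */
    DatanodeSketch chooseReplicaToDelete(List<DatanodeSketch> replicas) {
        return replicas.stream()
                .min(Comparator.comparingLong((DatanodeSketch d) -> d.remainingBytes))
                .orElseThrow(() -> new IllegalArgumentException("no replicas"));
    }
}
```

A subclass implementing a different placement policy can then override this choice while inheriting the default behavior otherwise.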

5. Even with the isValidMove API, the default balancer does not work with the colocation placement
policy because it moves one block at a time. I propose removing the isValidMove API until
we can figure out a general balancer implementation. For this round, the default balancer
could check with the NameNode at startup whether the NameNode runs the default block placement
policy; if not, it stops running.
Done. I made the Balancer check that the configured policy is BlockPlacementPolicyDefault.
I stayed away from adding an API to the NN for now, especially because this API is experimental
in nature.
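The guard described above might look roughly like this; the class and method names here are illustrative stand-ins, not the actual Balancer code:

```java
// Before moving blocks one at a time, check that the configured policy is the
// default one, since single-block moves can violate a colocation policy's
// invariants. This reads the policy from configuration, avoiding a new NN API.
class BalancerGuard {

    // Returns true only when the configured policy class is the default policy.
    static boolean canRun(String configuredPolicyClassName) {
        int dot = configuredPolicyClassName.lastIndexOf('.');
        String simpleName = dot < 0
                ? configuredPolicyClassName
                : configuredPolicyClassName.substring(dot + 1);
        return "BlockPlacementPolicyDefault".equals(simpleName);
    }
}
```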

I would appreciate it if you could review it one more time. Thanks.

> Design a pluggable interface to place replicas of blocks in HDFS
> ----------------------------------------------------------------
>                 Key: HDFS-385
>                 URL: https://issues.apache.org/jira/browse/HDFS-385
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.21.0
>         Attachments: BlockPlacementPluggable.txt, BlockPlacementPluggable2.txt, BlockPlacementPluggable3.txt,
BlockPlacementPluggable4.txt, BlockPlacementPluggable4.txt, BlockPlacementPluggable5.txt,
> The current HDFS code typically places one replica on the local rack, the second replica
on a random remote rack, and the third replica on a random node of that remote rack. This
algorithm is baked into the NameNode's code. It would be nice to make the block placement
algorithm a pluggable interface. This would allow experimentation with different placement
algorithms based on workloads, availability guarantees, and failure models.
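The default placement the description refers to can be sketched as follows; the types and names are toy stand-ins, not the real NameNode data structures:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Toy sketch of the default placement: first replica on the writer's local
// rack, second on a random remote rack, third on a different node of that
// same remote rack.
class DefaultPlacementSketch {

    static List<String> chooseTargets(String localRack,
                                      Map<String, List<String>> nodesByRack,
                                      Random rnd) {
        List<String> targets = new ArrayList<>();
        // 1st replica: a node on the writer's local rack.
        targets.add(pick(nodesByRack.get(localRack), rnd));
        // 2nd replica: a node on a randomly chosen remote rack.
        List<String> remoteRacks = new ArrayList<>(nodesByRack.keySet());
        remoteRacks.remove(localRack);
        String remoteRack = remoteRacks.get(rnd.nextInt(remoteRacks.size()));
        List<String> remoteNodes = new ArrayList<>(nodesByRack.get(remoteRack));
        String second = pick(remoteNodes, rnd);
        targets.add(second);
        // 3rd replica: a different node on the same remote rack.
        remoteNodes.remove(second);
        targets.add(pick(remoteNodes, rnd));
        return targets;
    }

    private static String pick(List<String> nodes, Random rnd) {
        return nodes.get(rnd.nextInt(nodes.size()));
    }
}
```

A pluggable interface would let this hard-coded strategy be swapped for one tuned to a different workload or failure model.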

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
