hadoop-common-dev mailing list archives

From "Jingkei Ly (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3799) Design a pluggable interface to place replicas of blocks in HDFS
Date Tue, 12 May 2009 15:17:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708460#action_12708460 ]

Jingkei Ly commented on HADOOP-3799:

> However, in this patch, I would like to expose the name of the file via the BlockPlacement
> policy interface. Any ideas here?

I think having two versions of BlockPlacementInterface#chooseTarget() would be the most efficient:
one that accepts the filename (to be called from FSNamesystem#getAdditionalBlock()) and
another that accepts the INode (to be called from FSNamesystem#computeReplicationWorkForBlock()).
As you said, though, it does make the interface rather inelegant.
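
To make the two-overload idea concrete, here is a minimal sketch. Everything in it is hypothetical: INodeStub, PlacementPolicySketch and FirstNodesPolicy are stand-ins invented for illustration, datanodes are plain strings, and the signatures are not those of the attached patch. It only shows one way the filename variant and the INode variant could sit side by side.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an HDFS INode; not the real class.
final class INodeStub {
    private final String path;
    INodeStub(String path) { this.path = path; }
    String getFullPathName() { return path; }
}

// Hypothetical policy interface with the two variants discussed above.
interface PlacementPolicySketch {
    // Variant for the block allocation path, where the caller already has the filename.
    List<String> chooseTarget(String srcPath, int numReplicas, List<String> liveNodes);

    // Variant for the re-replication path, where the caller holds the INode.
    // By default it resolves the name and forwards; a policy that does not need
    // the name could override this and skip the resolution entirely.
    default List<String> chooseTarget(INodeStub inode, int numReplicas, List<String> liveNodes) {
        return chooseTarget(inode.getFullPathName(), numReplicas, liveNodes);
    }
}

// Trivial example policy: take the first numReplicas live nodes and ignore the path.
final class FirstNodesPolicy implements PlacementPolicySketch {
    @Override
    public List<String> chooseTarget(String srcPath, int numReplicas, List<String> liveNodes) {
        int n = Math.min(numReplicas, liveNodes.size());
        return new ArrayList<>(liveNodes.subList(0, n));
    }
}
```

The default method keeps simple plugins down to a single method, at the price of the slightly awkward doubled interface mentioned above.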

An alternative is to pass the Block object to chooseTarget() and let the plugin code look
up the INode itself in the FSNamesystem map. That is not particularly efficient, but the plugin
code could cache the mapping from INodes to filenames to mitigate the cost a bit.
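
As a rough illustration of the caching idea, a plugin could wrap the expensive lookup roughly as follows. This is a sketch under assumptions: BlockFileNameCache is a made-up class, blocks are identified by a plain long id, and the lookup function passed to the constructor stands in for whatever FSNamesystem map access the plugin would really use.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical plugin-side cache from block id to filename.
final class BlockFileNameCache {
    // Assumed expensive lookup (block id -> full path), e.g. backed by the namesystem map.
    private final LongFunction<String> lookup;
    private final Map<Long, String> cache = new HashMap<>();
    private int misses = 0;

    BlockFileNameCache(LongFunction<String> lookup) {
        this.lookup = lookup;
    }

    // Returns the cached filename, falling back to the expensive lookup on a miss.
    String fileNameFor(long blockId) {
        String cached = cache.get(blockId);
        if (cached != null) {
            return cached;
        }
        misses++;
        String path = lookup.apply(blockId);
        if (path != null) {
            cache.put(blockId, path); // unknown blocks are not cached
        }
        return path;
    }

    // Must be called when a file is renamed or deleted, otherwise entries go stale.
    void invalidate(long blockId) {
        cache.remove(blockId);
    }

    // Number of times the expensive lookup was actually performed.
    int misses() {
        return misses;
    }
}
```

The sketch leaves invalidation to the caller and is not thread-safe; a real plugin would have to deal with renames, deletes and concurrent access, which is part of why this alternative is less attractive.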

> Design a pluggable interface to place replicas of blocks in HDFS
> ----------------------------------------------------------------
>                 Key: HADOOP-3799
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3799
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: BlockPlacementPluggable.txt
> The current HDFS code typically places one replica on the local rack, the second replica
> on a random remote rack and the third replica on a random node of that remote rack. This algorithm
> is baked into the NameNode's code. It would be nice to make the block placement algorithm a
> pluggable interface. This will allow experimentation with different placement algorithms based
> on workloads, availability guarantees and failure models.
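
For reference, the placement rule described in the issue can be sketched as below. This is an illustration only, not the NameNode's code: DefaultPlacementSketch and the rack-to-nodes map are hypothetical, every rack is assumed to have at least one node, and real concerns such as node load, free space and excluded nodes are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch of the rule: local rack, random remote rack, same remote rack.
final class DefaultPlacementSketch {
    // rackToNodes is an assumed topology: rack name -> datanode names (non-empty lists).
    static List<String> chooseTargets(String localRack,
                                      Map<String, List<String>> rackToNodes,
                                      Random rnd) {
        List<String> targets = new ArrayList<>();

        // Replica 1: a node on the writer's local rack.
        List<String> local = rackToNodes.get(localRack);
        targets.add(local.get(rnd.nextInt(local.size())));

        // Replica 2: a random node on a randomly chosen remote rack.
        List<String> remoteRacks = new ArrayList<>(rackToNodes.keySet());
        remoteRacks.remove(localRack);
        if (remoteRacks.isEmpty()) {
            return targets; // single-rack cluster: nothing more to choose in this sketch
        }
        String remoteRack = remoteRacks.get(rnd.nextInt(remoteRacks.size()));
        List<String> remote = new ArrayList<>(rackToNodes.get(remoteRack));
        targets.add(remote.remove(rnd.nextInt(remote.size())));

        // Replica 3: a different random node on that same remote rack, if one exists.
        if (!remote.isEmpty()) {
            targets.add(remote.get(rnd.nextInt(remote.size())));
        }
        return targets;
    }
}
```

A pluggable interface would let a rule like this be one implementation among several rather than code fixed inside the NameNode.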

