hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-972) Improve the rack-aware replica placement performance
Date Fri, 16 Feb 2007 20:10:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473815 ]

dhruba borthakur commented on HADOOP-972:

+1, looks good.

1. It may be possible to further optimize getLeave() by making it non-recursive. But in the
current case, the network topology map is only two levels deep and this optimization might
not give us any immediate performance gain.
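
For illustration, a minimal sketch of what a non-recursive lookup could look like,
assuming each inner node keeps a child list and a cached leaf count as the issue
describes. TopoNode and leafAt are hypothetical names, not taken from the patch,
and a node without children is treated as a single leaf.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a topology node; not the class from the patch.
class TopoNode {
    final String name;
    final List<TopoNode> children = new ArrayList<>();
    int numberOfLeaves; // cached leaf count of this subtree (1 for a leaf)

    TopoNode(String name) {
        this.name = name;
        this.numberOfLeaves = 1; // a node with no children counts as one leaf
    }

    boolean isLeaf() {
        return children.isEmpty();
    }

    // Adds a child subtree and refreshes this node's cached count.
    // Assumes the tree is built bottom-up, so the child's count is final.
    void add(TopoNode child) {
        if (children.isEmpty()) {
            numberOfLeaves = 0; // this node stops being a leaf itself
        }
        children.add(child);
        numberOfLeaves += child.numberOfLeaves;
    }

    // Iterative lookup of the leaf at position leafIndex (0-based, left to right).
    // Returns null when the index is out of range.
    static TopoNode leafAt(TopoNode root, int leafIndex) {
        if (leafIndex < 0 || leafIndex >= root.numberOfLeaves) {
            return null;
        }
        TopoNode current = root;
        while (!current.isLeaf()) {
            for (TopoNode child : current.children) {
                if (leafIndex < child.numberOfLeaves) {
                    current = child; // the wanted leaf is inside this child
                    break;
                }
                leafIndex -= child.numberOfLeaves; // skip this whole subtree
            }
        }
        return current;
    }
}
```

The loop replaces one stack frame per tree level with a single pointer, which is
why the gain is small for a two-level map.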

2. In this implementation, with a large number of racks, the time that chooseRandom()
takes to pick a node grows as the selected node index moves towards the end of the range
of datanode indices. Again, this will probably have a material impact only when the topology
tree is deep and there are thousands of racks.
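
One possible mitigation, sketched here as an assumption and not as something the
patch does: keep cumulative leaf counts per rack and binary-search them, so the
lookup cost no longer depends on where the index falls. RackIndex and rackOf are
hypothetical names.

```java
// Hypothetical helper, not part of the patch: maps a global datanode index
// to a rack position using cumulative leaf counts and a binary search.
class RackIndex {
    private final int[] cumulative; // cumulative[i] = leaves in racks 0..i

    RackIndex(int[] leavesPerRack) {
        cumulative = new int[leavesPerRack.length];
        int sum = 0;
        for (int i = 0; i < leavesPerRack.length; i++) {
            sum += leavesPerRack[i];
            cumulative[i] = sum;
        }
    }

    // Returns the position of the rack holding the leaf at leafIndex,
    // or -1 if the index is out of range.
    int rackOf(int leafIndex) {
        int n = cumulative.length;
        if (n == 0 || leafIndex < 0 || leafIndex >= cumulative[n - 1]) {
            return -1;
        }
        // Find the first rack whose cumulative count exceeds leafIndex.
        // Empty racks repeat the previous count and are skipped naturally.
        int lo = 0;
        int hi = n - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cumulative[mid] > leafIndex) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo;
    }
}
```

The trade-off is that the cumulative array has to be rebuilt or patched whenever
a datanode joins or leaves, so this only pays off if lookups dominate updates.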

> Improve the rack-aware replica placement performance
> ----------------------------------------------------
>                 Key: HADOOP-972
>                 URL: https://issues.apache.org/jira/browse/HADOOP-972
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.11.0
>            Reporter: Hairong Kuang
>         Assigned To: Hairong Kuang
>             Fix For: 0.12.0
>         Attachments: rack_performance.patch
> This issue aims to improve the rack-aware replica placement performance. A major idea
> is to avoid constructing lists of possible targets for random selection in chooseTarget, which
> currently requires iterating over all DatanodeDescriptors. I plan to change the NetworkTopology data
> structure as follows:
> 1. each InnerNode stores its children as a list;
> 2. each InnerNode adds a new field, numberOfLeaves, holding the total number of leaves (i.e. data
> nodes) in its subtree.
> NetworkTopology will support two new methods:
> 1. DatanodeDescriptor chooseRandom(String scope): it randomly chooses one leaf from scope
> 2. DatanodeDescriptor chooseRandomExclude(String excludedScope): it randomly chooses one
> leaf from ~scope
> In addition, Issue 971 will also help improve the performance of the rack-aware DFS patch.
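
As a rough illustration of the two methods described in the issue, here is a
sketch over a two-level topology (root, racks, datanodes). It assumes a scope is
simply a rack name and uses plain strings for datanodes; SimpleTopology and its
members are hypothetical and do not reflect the actual patch. The point is that
a random index plus cached counts selects a node without first building a list
of candidates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical two-level topology; names are illustrative, not from the patch.
class SimpleTopology {
    static class Rack {
        final String name;
        final List<String> datanodes = new ArrayList<>();

        Rack(String name) {
            this.name = name;
        }
    }

    private final List<Rack> racks = new ArrayList<>();
    private final Random random;
    private int numberOfLeaves = 0; // total datanodes, updated on every add

    SimpleTopology(Random random) {
        this.random = random;
    }

    void add(String rackName, String datanode) {
        Rack rack = find(rackName);
        if (rack == null) {
            rack = new Rack(rackName);
            racks.add(rack);
        }
        rack.datanodes.add(datanode);
        numberOfLeaves++;
    }

    private Rack find(String rackName) {
        for (Rack rack : racks) {
            if (rack.name.equals(rackName)) {
                return rack;
            }
        }
        return null;
    }

    // Picks a random datanode inside the given rack, or null if there is none.
    String chooseRandom(String scope) {
        Rack rack = find(scope);
        if (rack == null || rack.datanodes.isEmpty()) {
            return null;
        }
        return rack.datanodes.get(random.nextInt(rack.datanodes.size()));
    }

    // Picks a random datanode outside the given rack, or null if there is none.
    String chooseRandomExclude(String excludedScope) {
        Rack excluded = find(excludedScope);
        int excludedCount = (excluded == null) ? 0 : excluded.datanodes.size();
        int candidates = numberOfLeaves - excludedCount;
        if (candidates <= 0) {
            return null;
        }
        int index = random.nextInt(candidates);
        for (Rack rack : racks) {
            if (rack == excluded) {
                continue; // skip the excluded subtree entirely
            }
            if (index < rack.datanodes.size()) {
                return rack.datanodes.get(index);
            }
            index -= rack.datanodes.size();
        }
        return null; // not reached while the cached count is consistent
    }
}
```

Every datanode outside the excluded rack is equally likely to be returned,
because the index is drawn uniformly over the remaining leaf count.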

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
