Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hadoop-dev@lucene.apache.org
Message-ID: <17640797.1163113838310.JavaMail.jira@brutus>
Date: Thu, 9 Nov 2006 15:10:38 -0800 (PST)
From: "dhruba borthakur (JIRA)" <jira@apache.org>
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-692) Rack-aware Replica Placement
In-Reply-To: <5324842.1163011853487.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ http://issues.apache.org/jira/browse/HADOOP-692?page=comments#action_12448606 ] 
            
dhruba borthakur commented on HADOOP-692:
-----------------------------------------

I have two questions:

1. Is the rackId associated with a datanode or is it associated with the data directories in a datanode? 
2. Suppose that all three replicas were allocated in the same rack because there wasn't any space available on other racks. Now, if more racks are added to the system, will HDFS automatically rebalance the replicas to conform to your rack-aware logic?


> Rack-aware Replica Placement
> ----------------------------
>
>                 Key: HADOOP-692
>                 URL: http://issues.apache.org/jira/browse/HADOOP-692
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>         Assigned To: Hairong Kuang
>             Fix For: 0.9.0
>
>
> This issue assumes that HDFS runs on a cluster of computers that are spread across many racks. Communication between two nodes on different racks needs to go through switches. Bandwidth in/out of a rack may be less than the total bandwidth of machines in the rack. The purpose of rack-aware replica placement is to improve data reliability, availability, and network bandwidth utilization. The basic idea is that each data node determines which rack it belongs to at startup and notifies the name node of its rack id upon registration. The name node maintains a rackid-to-datanode map and tries to place replicas across racks.
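
The rackid-to-datanode map described in the issue could look roughly like the sketch below. This is only an illustration under assumptions: the class name RackMapSketch, the methods register and chooseTargets, and the round-robin selection are all hypothetical and are not taken from the Hadoop code or from the proposed patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from the Hadoop code base.
class RackMapSketch {
    // rack id -> data nodes that registered with that rack id
    private final Map<String, List<String>> rackToNodes = new LinkedHashMap<>();

    // Called when a data node registers and reports its rack id.
    void register(String rackId, String dataNode) {
        rackToNodes.computeIfAbsent(rackId, k -> new ArrayList<>()).add(dataNode);
    }

    // Pick up to `replicas` targets, taking one node per rack per round so
    // that replicas land on distinct racks whenever enough racks exist.
    List<String> chooseTargets(int replicas) {
        List<String> targets = new ArrayList<>();
        int round = 0;
        boolean added = true;
        while (targets.size() < replicas && added) {
            added = false;
            for (List<String> nodes : rackToNodes.values()) {
                if (round < nodes.size() && targets.size() < replicas) {
                    targets.add(nodes.get(round));
                    added = true;
                }
            }
            round++;
        }
        return targets;
    }

    public static void main(String[] args) {
        RackMapSketch nameNode = new RackMapSketch();
        nameNode.register("rack1", "dn1");
        nameNode.register("rack1", "dn2");
        nameNode.register("rack2", "dn3");
        // Two racks, three replicas: the third replica reuses rack1.
        System.out.println(nameNode.chooseTargets(3));
    }
}
```

With only one rack registered, this sketch places every replica on that rack, which is the situation the second question above asks about. It ignores free space, load, and writer locality, all of which a real placement policy would presumably need to consider.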
