From: "Doug Cutting (JIRA)"
To: hadoop-dev@lucene.apache.org
Date: Fri, 10 Nov 2006 15:26:42 -0800 (PST)
Subject: [jira] Commented: (HADOOP-692) Rack-aware Replica Placement
Message-ID: <29486817.1163201202745.JavaMail.jira@brutus>
In-Reply-To: <5324842.1163011853487.JavaMail.jira@brutus>

    [ http://issues.apache.org/jira/browse/HADOOP-692?page=comments#action_12448893 ]

Doug Cutting commented on HADOOP-692:
-------------------------------------

How about a topology interface?

    public interface NetworkTopology {
      HubDistance[] getHubDistances(String host);
    }

    public interface HubDistance {
      public String getHubName();
      public int getHops();
    }

Depending on the implementation, the namenode could use this to refresh the topology dynamically, or the topology could be read statically from a config file. The implementation class can be specified in the config. The default could just chop up hostnames, so that foo.bar.co.uk yields hops , , , . Someone could implement this with DNS too.

To compute the distance between two nodes, you find the hub they have in common and sum the hops from each node to it. To find nearby nodes, look for nodes sharing a nearby hub. Etc.

> Rack-aware Replica Placement
> ----------------------------
>
>                 Key: HADOOP-692
>                 URL: http://issues.apache.org/jira/browse/HADOOP-692
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.8.0
>            Reporter: Hairong Kuang
>         Assigned To: Hairong Kuang
>             Fix For: 0.9.0
>
>
> This issue assumes that HDFS runs on a cluster of computers that is spread across many racks. Communication between two nodes on different racks needs to go through switches. Bandwidth in/out of a rack may be less than the total bandwidth of machines in the rack. The purpose of rack-aware replica placement is to improve data reliability, availability, and network bandwidth utilization. The basic idea is that each data node determines which rack it belongs to at startup and notifies the name node of its rack id upon registration. The name node maintains a rackid-to-datanode map and tries to place replicas across racks.
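
As an illustration of the topology interface proposed in the comment above, here is a minimal sketch of a hostname-chopping default together with the "find the common hub and sum the hops" distance rule. The class names `HostnameTopology` and `TopologySketch`, the `distance` helper, and the hop counts (1 for the nearest parent domain, one more per level) are assumptions made for this example; the comment does not specify them, and this is not the code that was eventually committed to Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

// The two interfaces from the comment, declared package-private so that
// everything fits in one file.
interface HubDistance {
    String getHubName();
    int getHops();
}

interface NetworkTopology {
    HubDistance[] getHubDistances(String host);
}

// Hypothetical default: every parent domain of the hostname is a hub.
// ASSUMPTION: the nearest parent is 1 hop away, each further level adds 1.
class HostnameTopology implements NetworkTopology {
    @Override
    public HubDistance[] getHubDistances(String host) {
        List<HubDistance> hubs = new ArrayList<>();
        int hops = 0;
        int dot = host.indexOf('.');
        while (dot >= 0) {
            final String hubName = host.substring(dot + 1);
            final int hubHops = ++hops;
            hubs.add(new HubDistance() {
                public String getHubName() { return hubName; }
                public int getHops() { return hubHops; }
            });
            dot = host.indexOf('.', dot + 1);
        }
        return hubs.toArray(new HubDistance[0]);
    }
}

class TopologySketch {
    // Smallest sum of hops through any hub both hosts share,
    // or -1 if they share no hub at all.
    static int distance(NetworkTopology topology, String hostA, String hostB) {
        int best = -1;
        for (HubDistance a : topology.getHubDistances(hostA)) {
            for (HubDistance b : topology.getHubDistances(hostB)) {
                if (a.getHubName().equals(b.getHubName())) {
                    int total = a.getHops() + b.getHops();
                    if (best < 0 || total < best) {
                        best = total;
                    }
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NetworkTopology topology = new HostnameTopology();
        // Shared nearest hub bar.co.uk: 1 + 1 hops.
        System.out.println(distance(topology, "foo.bar.co.uk", "baz.bar.co.uk"));   // prints 2
        // Nearest shared hub is co.uk: 2 + 2 hops.
        System.out.println(distance(topology, "foo.bar.co.uk", "qux.other.co.uk")); // prints 4
    }
}
```

Under these assumptions, "nearby nodes" are simply the hosts with the smallest `distance` value, so a replica placement policy could sort candidate datanodes by it. A DNS-backed or config-file-backed `NetworkTopology` could replace `HostnameTopology` without changing `distance`.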