hadoop-common-user mailing list archives

From Serge Blazhievsky <hadoop...@gmail.com>
Subject Re: Configure Rack Numbers
Date Mon, 17 Nov 2014 05:24:04 GMT
An effective way to fix block distribution after changing the rack-awareness configuration is to
increase the replication factor and then decrease it back to its original value.
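
A minimal sketch of that raise-and-restore cycle, assuming the stock `hdfs dfs -setrep` shell command and a normal replication factor of 3. The helper names `setrep_cycle` and `run_cycle`, the path `/data`, and the temporary factor are illustrative only and do not come from this thread. The first function only builds the command lines, so they can be inspected before anything touches a cluster.

```python
import subprocess

def setrep_cycle(path, normal=3, temporary=None):
    """Build the two `hdfs dfs -setrep` commands: raise, then restore."""
    if temporary is None:
        temporary = normal + 1  # assumption: one extra replica is enough
    if temporary <= normal:
        raise ValueError("temporary factor must exceed the normal one")
    # -w makes the shell wait until the higher replication is reached,
    # so the extra replicas exist before the factor is lowered again.
    raise_cmd = ["hdfs", "dfs", "-setrep", "-w", str(temporary), path]
    restore_cmd = ["hdfs", "dfs", "-setrep", str(normal), path]
    return [raise_cmd, restore_cmd]

def run_cycle(path, normal=3, temporary=None):
    """Run both commands in order; requires the `hdfs` client on PATH."""
    for cmd in setrep_cycle(path, normal, temporary):
        subprocess.check_call(cmd)
```

Calling `run_cycle("/data")` would run both steps in sequence. On a large directory tree the first step can take a long time, so it may be worth trying it on a small path first.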

Regards,
Serge

> On Nov 16, 2014, at 21:10, Brahma Reddy Battula <brahmareddy.battula@huawei.com>
wrote:
> 
> Hi Navaz,
> 
> You have to configure the following two properties on the NameNode, then restart the NameNode.
> 
>  <property>
>   <name>topology.node.switch.mapping.impl</name>
>   <value>org.apache.hadoop.net.ScriptBasedMapping</value>
>   <description> The default implementation of the DNSToSwitchMapping. It
>     invokes a script specified in topology.script.file.name to resolve
>     node names. If the value for topology.script.file.name is not set, the
>     default value of DEFAULT_RACK is returned for all node names.
>   </description>
> </property>
> 
> <property>
>   <name>topology.script.file.name</name>
>   <value>/path/to/topo.sh</value>
>   <description> The script name that should be invoked to resolve DNS names to
>     NetworkTopology names. Example: the script would take host.foo.bar as an
>     argument, and return /rack1 as the output.
>   </description>
> </property>
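
A small sketch for checking that both properties are actually present before restarting the NameNode. It assumes they sit inside the usual `<configuration>` root element of a Hadoop site file (most likely core-site.xml, but that is an assumption here). The names `read_properties` and `missing_topology_keys` are made up for this example.

```python
import xml.etree.ElementTree as ET

# Sample content mirroring the two properties quoted above.
SAMPLE = """<configuration>
  <property>
    <name>topology.node.switch.mapping.impl</name>
    <value>org.apache.hadoop.net.ScriptBasedMapping</value>
  </property>
  <property>
    <name>topology.script.file.name</name>
    <value>/path/to/topo.sh</value>
  </property>
</configuration>"""

REQUIRED = ("topology.node.switch.mapping.impl", "topology.script.file.name")

def read_properties(xml_text):
    """Return a {name: value} dict for every <property> in the document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props

def missing_topology_keys(props):
    """List the required rack-awareness keys that are absent or empty."""
    return [key for key in REQUIRED if not props.get(key)]
```

To check a real file, read its text and pass it to `read_properties`. An empty list from `missing_topology_keys` means both keys are set.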
> 
> 
> Example script file.
> 
> 
> topo.sh
> =======
> 
> #!/bin/bash
> 
> python <TOPOLOGY_SCRIPT_HOME>/topology.py "$@"
> 
> 
> topology.py 
> ===========
> import sys
>
> # Rack returned for any host or IP address that is not listed in RACK_MAP.
> DEFAULT_RACK = '/default/rack0'
>
> # Maps each DataNode IP address to its rack path.
> RACK_MAP = { '208.94.2.10' : '/datacenter1/rack0',
>              '1.2.3.4' : '/datacenter1/rack1',
>              '1.2.3.5' : '/datacenter1/rack1',
>              '1.2.3.6' : '/datacenter1/rack1',
>
>              '10.2.3.4' : '/datacenter1/rack2'
>     }
>
> # Hadoop passes one or more hosts as arguments and expects one rack per argument.
> if len(sys.argv) == 1:
>     print(DEFAULT_RACK)
> else:
>     print(" ".join([RACK_MAP.get(i, DEFAULT_RACK) for i in sys.argv[1:]]))
> 
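
As a variation on the script above, the mapping can be kept in a separate text file so that adding a node does not mean editing code. This is only a sketch: the file format (one `host rack` pair per line, `#` for comments) and the names `load_rack_map` and `resolve` are assumptions made for illustration, not something Hadoop prescribes.

```python
DEFAULT_RACK = "/default/rack0"

def load_rack_map(lines):
    """Parse 'host rack' pairs, skipping blank lines and # comments."""
    rack_map = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError("expected 'host rack', got: %r" % raw)
        host, rack = parts
        rack_map[host] = rack
    return rack_map

def resolve(hosts, rack_map):
    """Return one rack per host, space separated, as the script prints it."""
    if not hosts:
        return DEFAULT_RACK
    return " ".join(rack_map.get(h, DEFAULT_RACK) for h in hosts)
```

A script built on this would open the data file, pass the file object to `load_rack_map`, and print `resolve(sys.argv[1:], rack_map)`.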
> 
> Please check the following link for more details.
> 
> 
> https://issues.apache.org/jira/secure/attachment/12345251/Rack_aware_HDFS_proposal.pdf
> 
> 
> 
> Thanks & Regards
> 
>  Brahma Reddy Battula
> 
> 
> From: Abdul Navaz [navaz.enc@gmail.com]
> Sent: Monday, November 17, 2014 4:48 AM
> To: user@hadoop.apache.org
> Subject: Configure Rack Numbers
> 
> Hello,
> 
> I have a Hadoop cluster with 9 nodes. All of them belong to the /default rack, but I want a
> setup similar to this:
> 
> (All are in same subnets)
> 
>  Rack 0: DataNode1,Datanode2,DataNode3 and top of rack switch1.
>  Rack 1: DataNode4,Datanode5,DataNode6 and top of rack switch2.
>  Rack 3: DataNode7,Datanode8,DataNode9 and top of rack switch3.
> I am trying to examine Hadoop rack awareness, in particular how it places one copy of a block
> in one rack and the replicas in another rack. I want to analyse network performance based on
> this.
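
For reference, a rough simulation of how the default placement policy is usually described for a replication factor of 3 when the writer is itself a DataNode: first replica on the writer's node, second on a node in a different rack, third on another node in that same remote rack. The node names `dn1` to `dn9`, the rack paths, and the function names are hypothetical, and the real NameNode also weighs load and free space, which this sketch ignores.

```python
import random

# Hypothetical three-rack layout with three DataNodes per rack.
TOPOLOGY = {
    "/rack0": ["dn1", "dn2", "dn3"],
    "/rack1": ["dn4", "dn5", "dn6"],
    "/rack2": ["dn7", "dn8", "dn9"],
}

def rack_of(node, topology):
    """Return the rack path that contains the given node."""
    for rack, nodes in topology.items():
        if node in nodes:
            return rack
    raise KeyError(node)

def place_replicas(writer, topology, rng=random):
    """Pick three nodes: local, remote rack, same remote rack."""
    local_rack = rack_of(writer, topology)
    # Only racks with two or more nodes can host replicas two and three.
    remote_racks = sorted(r for r in topology
                          if r != local_rack and len(topology[r]) >= 2)
    if not remote_racks:
        raise ValueError("need a remote rack with at least two nodes")
    remote = rng.choice(remote_racks)
    second, third = rng.sample(topology[remote], 2)
    return [writer, second, third]
```

Running `place_replicas("dn1", TOPOLOGY)` a few times shows the pattern to expect in a network trace: one transfer crossing racks, then one staying inside the remote rack.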
> 
> So how can we separate these DNs by rack number? Where can we configure the rack numbers and
> specify which rack each DN belongs to?
> 
> 
> 
> Thanks & Regards,
> 
> Abdul Navaz
> 
