hadoop-hdfs-dev mailing list archives

From "Eric Payne (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-1962) Enhance MiniDFSCluster to improve testing of network topology distance related issues.
Date Thu, 19 May 2011 15:40:47 GMT
Enhance MiniDFSCluster to improve testing of network topology distance related issues.
--------------------------------------------------------------------------------------

                 Key: HDFS-1962
                 URL: https://issues.apache.org/jira/browse/HDFS-1962
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: test
    Affects Versions: 0.22.0
            Reporter: Eric Payne
             Fix For: 0.23.0


In Jira HDFS-1875, Tanping Wang added the following comment. In order to keep the scope of
HDFS-1875 small, I have created this Jira to capture this need.

-------------------------------------------------
It would be really useful if we could have multiple simulated data nodes bound to different
hosts and the DFS client bound to a particular host. And further down the road, some of the
simulated data nodes could be on different hosts but in the same rack. We can use this to test
network topology distance related issues.

One of the related problems that I ran into was that the order of data nodes in the LocatedBlock
returned by the name node is determined by NetworkTopology#pseudoSortByDistance(). In the current
MiniDFSCluster, there is no way to bind the client to a host or to bind a simulated data node
to a particular host/rack. It would be nice if MiniDFSCluster could make this possible, so
that the network topology distance from the client to each data node is fixed. The order
of data nodes returned within a LocatedBlock on MiniDFSCluster would then be fixed as well.
Currently the order of data nodes in a LocatedBlock is random, which means NetworkTopology does
not see the DFSClient and the simulated data nodes as being on different hosts and different racks.

Also, the Mini DFS client currently provides the -racks option when starting data nodes.
But we cannot bind multiple simulated data nodes to one rack, so it is not really that
useful.
-------------------------------------------------
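
To illustrate why fixed host/rack bindings would make the data node order deterministic,
here is a minimal, self-contained sketch. It is not NetworkTopology#pseudoSortByDistance()
itself and does not use any Hadoop class. The class name TopologyDistanceSketch, both method
names, and the "/rack/host" path strings are hypothetical and exist only for this example.
It assumes that distance is counted as the number of hops from each node up to their closest
common ancestor, and it does a plain full sort by that distance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not Hadoop code: nodes are plain "/rack/host" strings.
public class TopologyDistanceSketch {

    // Hops from each node up to the closest common ancestor, summed.
    // Same host gives 0, same rack gives 2, different racks give 4.
    public static int distance(String a, String b) {
        String[] pa = a.substring(1).split("/");
        String[] pb = b.substring(1).split("/");
        int common = 0;
        while (common < pa.length && common < pb.length
                && pa[common].equals(pb[common])) {
            common++;
        }
        return (pa.length - common) + (pb.length - common);
    }

    // Returns a copy of the nodes ordered by distance from the client.
    // List.sort is stable, so equally distant nodes keep their input order.
    public static List<String> sortByDistance(String client, List<String> nodes) {
        List<String> sorted = new ArrayList<>(nodes);
        sorted.sort(Comparator.comparingInt((String n) -> distance(client, n)));
        return sorted;
    }

    public static void main(String[] args) {
        String client = "/rack1/host1";
        List<String> dataNodes =
                Arrays.asList("/rack2/host3", "/rack1/host2", "/rack1/host1");
        // With the client and every data node pinned to a known location,
        // the resulting order is the same on every run.
        System.out.println(sortByDistance(client, dataNodes));
    }
}
```

If every node reported the same location, all distances would be equal and the sort would
carry no information. That is the situation the comment above describes for the current
MiniDFSCluster.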


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
