hadoop-hdfs-issues mailing list archives

From "Xi Fang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-5001) Branch-1-Win TestAzureBlockPlacementPolicy and TestReplicationPolicyWithNodeGroup failed
Date Tue, 16 Jul 2013 22:30:48 GMT

     [ https://issues.apache.org/jira/browse/HDFS-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xi Fang updated HDFS-5001:
--------------------------

    Description: 
After the backport patch for HDFS-4975 was committed, TestAzureBlockPlacementPolicy and TestReplicationPolicyWithNodeGroup
began to fail.
TestReplicationPolicyWithNodeGroup fails because part of the HDFS-3941 patch
is missing. Our patch for HADOOP-495 causes methods in the superclass to be called
incorrectly. More specifically, HDFS-4975 backported HDFS-4350, HDFS-4351, and HDFS-3912 to
add the method parameter "boolean avoidStaleNodes" and updated the APIs in BlockPlacementPolicyDefault accordingly.
However, the overriding methods in ReplicationPolicyWithNodeGroup were not updated to match,
so calls that use the new signatures reach the superclass implementations instead.

The cause for the failure of TestAzureBlockPlacementPolicy is similar.
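
The signature drift described above can be illustrated with a minimal sketch. The class and method names below (OverrideDrift, DefaultPolicy, NodeGroupPolicy, FixedNodeGroupPolicy, chooseTarget) are hypothetical stand-ins, not the actual Hadoop classes or signatures. The sketch only shows how a subclass method written against an old signature silently becomes an overload once the superclass gains a parameter.

```java
// Hypothetical, simplified stand-ins; these are NOT the real Hadoop classes.
public class OverrideDrift {

    static class DefaultPolicy {
        // New API: an extra boolean parameter was added to the method.
        String chooseTarget(int replicas, boolean avoidStaleNodes) {
            return "default";
        }
    }

    static class NodeGroupPolicy extends DefaultPolicy {
        // Still written against the old signature. Without @Override the
        // compiler accepts this as a new overload, so it no longer
        // replaces the superclass method.
        String chooseTarget(int replicas) {
            return "nodegroup";
        }
    }

    static class FixedNodeGroupPolicy extends DefaultPolicy {
        // Updated to the new signature; @Override makes the compiler
        // reject the method if the signatures ever drift apart again.
        @Override
        String chooseTarget(int replicas, boolean avoidStaleNodes) {
            return "nodegroup";
        }
    }

    public static void main(String[] args) {
        DefaultPolicy stale = new NodeGroupPolicy();
        DefaultPolicy fixed = new FixedNodeGroupPolicy();
        System.out.println(stale.chooseTarget(3, true)); // prints "default"
        System.out.println(fixed.chooseTarget(3, true)); // prints "nodegroup"
    }
}
```

Annotating the subclass methods with @Override is one way to turn this kind of mismatch into a compile-time error rather than a test failure.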

In addition, TestAzureBlockPlacementPolicy reports the following error:

Testcase: testPolicyWithDefaultRacks took 0.005 sec
Caused an ERROR
Invalid network topology. You cannot have a rack and a non-rack node at the same level of the network topology.
org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Invalid network topology. You cannot have a rack and a non-rack node at the same level of the network topology.
    at org.apache.hadoop.net.NetworkTopology.add(NetworkTopology.java:396)
    at org.apache.hadoop.hdfs.server.namenode.TestAzureBlockPlacementPolicy.testPolicyWithDefaultRacks(TestAzureBlockPlacementPolicy.java:779)

The error is caused by the following check in NetworkTopology#add(Node node):
{code}
if (depthOfAllLeaves != node.getLevel()) {
  LOG.error("Error: can't add leaf node at depth " +
      node.getLevel() + " to topology:\n" + oldTopoStr);
  throw new InvalidTopologyException("Invalid network topology. " +
      "You cannot have a rack and a non-rack node at the same " +
      "level of the network topology.");
}
{code}

The problem with this check is that removing a node from the cluster with NetworkTopology#remove(Node node)
does not change depthOfAllLeaves. As a result, we cannot reset
NetworkTopology#depthOfAllLeaves for the old topology of a cluster simply by removing all
of its DataNodes, as TestAzureBlockPlacementPolicy#testPolicyWithDefaultRacks() attempts to do:
// clear the old topology
for (Node node : dataNodes) {
  cluster.remove(node);
}
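
To make the stale-depth behavior concrete, here is a minimal sketch. TopologySketch, its string-path representation of nodes, the -1 sentinel, and removeAndReset are all assumptions made for illustration; this is not the actual NetworkTopology implementation, and the reset shown is only one possible remedy, not necessarily the fix adopted for this issue.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model; this is NOT the real NetworkTopology class.
public class TopologySketch {
    private final Set<String> leaves = new HashSet<>();
    private int depthOfAllLeaves = -1; // assumption: -1 means "no leaf added yet"

    // Depth of a path such as "/rack1/node1" (2) or "/dc1/rack1/node1" (3).
    static int level(String path) {
        return path.split("/").length - 1;
    }

    public void add(String path) {
        int level = level(path);
        if (depthOfAllLeaves == -1) {
            depthOfAllLeaves = level; // first leaf fixes the expected depth
        } else if (depthOfAllLeaves != level) {
            throw new IllegalStateException("Invalid network topology.");
        }
        leaves.add(path);
    }

    // Mirrors the reported behavior: the recorded depth is never reset.
    public void remove(String path) {
        leaves.remove(path);
    }

    // One possible remedy (an assumption): forget the depth once the
    // topology has no leaves left.
    public void removeAndReset(String path) {
        leaves.remove(path);
        if (leaves.isEmpty()) {
            depthOfAllLeaves = -1;
        }
    }

    public static void main(String[] args) {
        TopologySketch stale = new TopologySketch();
        stale.add("/rack1/node1");
        stale.remove("/rack1/node1");
        try {
            stale.add("/dc1/rack1/node1"); // depth 3 vs. remembered depth 2
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        TopologySketch reset = new TopologySketch();
        reset.add("/rack1/node1");
        reset.removeAndReset("/rack1/node1");
        reset.add("/dc1/rack1/node1"); // accepted after the reset
        System.out.println("accepted after reset");
    }
}
```

In the sketch, emptying the topology with remove() leaves the old depth in place, so a test that rebuilds the cluster with a different rack depth is rejected even though no leaves remain.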



  was:
After the backport patch for HDFS-4975 was committed, TestAzureBlockPlacementPolicy and TestReplicationPolicyWithNodeGroup
began to fail.
TestReplicationPolicyWithNodeGroup fails because part of the HDFS-3941 patch
is missing. Our patch for HADOOP-495 causes methods in the superclass to be called
incorrectly. More specifically, HDFS-4975 backported HDFS-4350, HDFS-4351, and HDFS-3912 to
add the method parameter "boolean avoidStaleNodes" and updated the APIs in BlockPlacementPolicyDefault accordingly.
However, the overriding methods in AzureBlockPlacementPolicy and ReplicationPolicyWithNodeGroup
were not updated to match.

The cause for the failure of TestAzureBlockPlacementPolicy is similar.

In addition, TestAzureBlockPlacementPolicy reports the following error:

Testcase: testPolicyWithDefaultRacks took 0.005 sec
Caused an ERROR
Invalid network topology. You cannot have a rack and a non-rack node at the same level of the network topology.
org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Invalid network topology. You cannot have a rack and a non-rack node at the same level of the network topology.
    at org.apache.hadoop.net.NetworkTopology.add(NetworkTopology.java:396)
    at org.apache.hadoop.hdfs.server.namenode.TestAzureBlockPlacementPolicy.testPolicyWithDefaultRacks(TestAzureBlockPlacementPolicy.java:779)

The error is caused by a check in NetworkTopology#add(Node node)
{code}
if (depthOfAllLeaves != node.getLevel()) {
  LOG.error("Error: can't add leaf node at depth " +
      node.getLevel() + " to topology:\n" + oldTopoStr);
  throw new InvalidTopologyException("Invalid network topology. " +
      "You cannot have a rack and a non-rack node at the same " +
      "level of the network topology.");
}
{code}

The problem with this check is that removing a node from the cluster with NetworkTopology#remove(Node node)
does not change depthOfAllLeaves. As a result, we cannot reset
NetworkTopology#depthOfAllLeaves for the old topology of a cluster simply by removing all
of its DataNodes, as TestAzureBlockPlacementPolicy#testPolicyWithDefaultRacks() attempts to do:
// clear the old topology
for (Node node : dataNodes) {
  cluster.remove(node);
}



    
> Branch-1-Win TestAzureBlockPlacementPolicy and TestReplicationPolicyWithNodeGroup failed
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-5001
>                 URL: https://issues.apache.org/jira/browse/HDFS-5001
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 1-win
>            Reporter: Xi Fang
>             Fix For: 1-win
>

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
