[ https://issues.apache.org/jira/browse/HDFS-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089757#comment-13089757 ]
Hadoop QA commented on HDFS-1480:
---------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12491018/hdfs-1480.txt
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these core unit tests:
+1 contrib tests. The patch passed contrib unit tests.
+1 system test framework. The patch passed system test framework compile.
Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/1148//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/1148//artifact/trunk/target/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1148//console
This message is automatically generated.
> All replicas for a block with repl=2 end up in same rack
> --------------------------------------------------------
>
> Key: HDFS-1480
> URL: https://issues.apache.org/jira/browse/HDFS-1480
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.20.2
> Reporter: T Meyarivan
> Assignee: Todd Lipcon
> Fix For: 0.23.0
>
> Attachments: hdfs-1480-test.txt, hdfs-1480.txt, hdfs-1480.txt, hdfs-1480.txt
>
>
> It appears that all replicas of a block can end up in the same rack. The likelihood of this seems to be directly related to the decommissioning of nodes.
> After a rolling OS upgrade of a running cluster (decommission 3-10% of nodes, re-install, etc., then add them back), all replicas of about 0.16% of blocks ended up in the same rack.
> The Hadoop Namenode UI etc. doesn't seem to know about such incorrectly replicated blocks. "hadoop fsck .." does report that the blocks must be replicated on additional racks.
> Looking at ReplicationTargetChooser.java, the following snippets seem suspect:
> snippet-01:
> {code}
> int maxNodesPerRack =
>   (totalNumOfReplicas-1)/clusterMap.getNumOfRacks()+2;
> {code}
> snippet-02:
> {code}
> case 2:
>   if (clusterMap.isOnSameRack(results.get(0), results.get(1))) {
>     chooseRemoteRack(1, results.get(0), excludedNodes,
>                      blocksize, maxNodesPerRack, results);
>   } else if (newBlock) {
>     chooseLocalRack(results.get(1), excludedNodes, blocksize,
>                     maxNodesPerRack, results);
>   } else {
>     chooseLocalRack(writer, excludedNodes, blocksize,
>                     maxNodesPerRack, results);
>   }
>   if (--numOfReplicas == 0) {
>     break;
>   }
> {code}
> snippet-03:
> {code}
> do {
>   DatanodeDescriptor[] selectedNodes =
>     chooseRandom(1, nodes, excludedNodes);
>   if (selectedNodes.length == 0) {
>     throw new NotEnoughReplicasException(
>       "Not able to place enough replicas");
>   }
>   result = (DatanodeDescriptor)(selectedNodes[0]);
> } while (!isGoodTarget(result, blocksize, maxNodesPerRack, results));
> {code}
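To make the arithmetic in snippet-01 concrete, here is a minimal standalone sketch that evaluates the same integer expression for repl=2 across a few rack counts. The class name, method name, and the sample rack counts are illustrative assumptions and do not come from ReplicationTargetChooser.java; only the formula is taken from the snippet. Whether this limit is the actual cause of the reported behaviour is not established by the description above.

```java
// Hypothetical sketch: reproduces only the integer arithmetic of snippet-01.
final class MaxNodesPerRackSketch {

    // Same expression as snippet-01; numOfRacks stands in for
    // clusterMap.getNumOfRacks() (parameter names are illustrative).
    static int maxNodesPerRack(int totalNumOfReplicas, int numOfRacks) {
        return (totalNumOfReplicas - 1) / numOfRacks + 2;
    }

    public static void main(String[] args) {
        int replication = 2;
        int[] rackCounts = {1, 2, 10, 100}; // sample values, assumed for illustration
        for (int racks : rackCounts) {
            int limit = maxNodesPerRack(replication, racks);
            // When limit >= replication, the per-rack cap by itself does not
            // force the replicas onto different racks.
            System.out.println("repl=" + replication + ", racks=" + racks
                + " -> maxNodesPerRack=" + limit
                + ", cap forces spread: " + (limit < replication));
        }
    }
}
```

With repl=2 the expression never drops below 2, because integer division makes (2-1)/racks equal to 0 for any cluster with two or more racks. The cap therefore equals the replication factor, so a check against maxNodesPerRack alone would not reject a second replica on the same rack.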
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira