hadoop-hdfs-issues mailing list archives

From "Oleg Danilov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5970) callers of NetworkTopology's chooseRandom method to expect null return value
Date Wed, 17 Feb 2016 13:22:18 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15150485#comment-15150485 ]

Oleg Danilov commented on HDFS-5970:
------------------------------------

We just "reproduced" this issue accidentally using Hadoop 2.3.0:

...
2016-02-16 11:21:37,217 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default-rack/10.5.68.40:1004
2016-02-16 11:21:37,217 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* removeDeadDatanode: lost heartbeat from 10.5.68.45:1004
2016-02-16 11:21:37,217 INFO org.apache.hadoop.net.NetworkTopology: Removing a node: /default-rack/10.5.68.45:1004
2016-02-16 11:21:37,218 FATAL org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: ReplicationMonitor thread received Runtime exception.
java.lang.NullPointerException
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:507)
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:455)
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:278)
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:212)
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:117)
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.chooseTargets(BlockManager.java:3309)
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationWork.access$200(BlockManager.java:3277)
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1283)
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1190)
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3250)
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3204)
	at java.lang.Thread.run(Thread.java:745)
2016-02-16 11:21:37,246 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2016-02-16 11:21:37,260 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:


Unfortunately, the exception kills the ReplicationMonitor thread and the NameNode shuts down (exit status 1).
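
For illustration only, here is a minimal sketch of the kind of caller-side guard the issue title asks for. ToyTopology and NullSafeCaller are hypothetical names invented for this example; they are not Hadoop classes, and this is not the actual code or fix in BlockPlacementPolicyDefault. The sketch assumes only what the issue states: chooseRandom may return null when no node is available.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Random;

// Hypothetical stand-in for a topology; NOT org.apache.hadoop.net.NetworkTopology.
class ToyTopology {
    private final List<String> nodes = new ArrayList<>();
    private final Random random = new Random();

    void add(String node) { nodes.add(node); }

    void remove(String node) { nodes.remove(node); }

    // Mirrors the contract described in the issue: returns null
    // when no node is left under the given scope.
    String chooseRandom(String scope) {
        List<String> candidates = new ArrayList<>();
        for (String n : nodes) {
            if (n.startsWith(scope)) {
                candidates.add(n);
            }
        }
        if (candidates.isEmpty()) {
            return null;
        }
        return candidates.get(random.nextInt(candidates.size()));
    }
}

// Hypothetical caller that expects a possible null instead of dereferencing it.
class NullSafeCaller {
    static Optional<String> chooseTarget(ToyTopology topology, String scope) {
        // Wrap the nullable result so the caller must handle the empty case.
        return Optional.ofNullable(topology.chooseRandom(scope));
    }

    public static void main(String[] args) {
        ToyTopology topology = new ToyTopology();
        topology.add("/default-rack/10.5.68.40:1004");
        System.out.println(chooseTarget(topology, "/default-rack").isPresent());

        // After the last node is removed, the caller sees "empty", not an NPE.
        topology.remove("/default-rack/10.5.68.40:1004");
        System.out.println(chooseTarget(topology, "/default-rack").isPresent());
    }
}
```

With a guard like this, a topology that has just lost its last eligible node yields "no target found" for that round rather than an unchecked exception in the calling thread.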

> callers of NetworkTopology's chooseRandom method to expect null return value
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-5970
>                 URL: https://issues.apache.org/jira/browse/HDFS-5970
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.0.0
>            Reporter: Yongjun Zhang
>            Priority: Minor
>
> Class NetworkTopology's method
>    public Node chooseRandom(String scope) 
> calls 
>    private Node chooseRandom(String scope, String excludedScope)
> which may return a null value.
> Callers of this method, such as BlockPlacementPolicyDefault, need to be aware of that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
