hadoop-hdfs-issues mailing list archives

From "Xiao Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8770) ReplicationMonitor thread received Runtime exception: NullPointerException when BlockManager.chooseExcessReplicates
Date Thu, 12 Nov 2015 18:39:11 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15002599#comment-15002599 ]

Xiao Chen commented on HDFS-8770:
---------------------------------

Hi [~aderen],
Thanks for reporting the issue and providing a patch. The fix makes sense to me.
Could you add a unit test to reproduce the scenario that you're trying to fix?

> ReplicationMonitor thread received Runtime exception: NullPointerException when BlockManager.chooseExcessReplicates
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8770
>                 URL: https://issues.apache.org/jira/browse/HDFS-8770
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.6.0, 2.7.0
>            Reporter: ade
>            Assignee: ade
>            Priority: Critical
>         Attachments: HDFS-8770_v1.patch
>
>
> Namenode shutdown when ReplicationMonitor thread received Runtime exception:
> {quote}
> 2015-07-08 16:43:55,167 ERROR org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: ReplicationMonitor thread received Runtime exception.
> java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.adjustSetsWithChosenReplica(BlockPlacementPolicy.java:189)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseExcessReplicates(BlockManager.java:2911)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processOverReplicatedBlock(BlockManager.java:2849)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processMisReplicatedBlock(BlockManager.java:2780)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.rescanPostponedMisreplicatedBlocks(BlockManager.java:1931)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3628)
>         at java.lang.Thread.run(Thread.java:744)
> {quote}
> We use hadoop-2.6.0 configured with heterogeneous storage, with the One_SSD storage
> policy set on some paths via setStoragePolicy. When a block has excess replicas, for
> example 2 SSD replicas on different racks (the exactlyOne set) and 2 DISK replicas on
> the same rack (the moreThanOne set), BlockPlacementPolicyDefault.chooseReplicaToDelete
> returns null, because only the moreThanOne set is searched for an SSD replica.
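
The scenario described above can be modelled in a few lines. This is only a rough sketch, not the Hadoop code: the class ExcessReplicaSketch, the Replica type, the rack names and both method names are hypothetical, and the selection rule (search moreThanOne whenever it is non-empty, otherwise exactlyOne) is an assumption taken from the description. The second method shows one possible fallback to exactlyOne; it is not claimed to be what HDFS-8770_v1.patch does.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the scenario; none of these names are Hadoop APIs.
class ExcessReplicaSketch {
    public enum StorageType { SSD, DISK }

    public static final class Replica {
        public final String rack;
        public final StorageType type;

        public Replica(String rack, StorageType type) {
            this.rack = rack;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + "@" + rack;
        }
    }

    // Returns the first replica whose storage type is in excessTypes, or null if none matches.
    static Replica firstOfType(List<Replica> candidates, Set<StorageType> excessTypes) {
        for (Replica r : candidates) {
            if (excessTypes.contains(r.type)) {
                return r;
            }
        }
        return null;
    }

    // Assumed behaviour from the report: only moreThanOne is searched when it is non-empty.
    public static Replica chooseFromOneSetOnly(List<Replica> moreThanOne,
                                               List<Replica> exactlyOne,
                                               Set<StorageType> excessTypes) {
        List<Replica> candidates = moreThanOne.isEmpty() ? exactlyOne : moreThanOne;
        return firstOfType(candidates, excessTypes);
    }

    // One possible mitigation (an assumption, not the attached patch): fall back to exactlyOne.
    public static Replica chooseWithFallback(List<Replica> moreThanOne,
                                             List<Replica> exactlyOne,
                                             Set<StorageType> excessTypes) {
        Replica chosen = firstOfType(moreThanOne, excessTypes);
        return chosen != null ? chosen : firstOfType(exactlyOne, excessTypes);
    }

    public static void main(String[] args) {
        // 2 DISK replicas on the same rack: each shares its rack with another replica.
        List<Replica> moreThanOne = Arrays.asList(
                new Replica("rackC", StorageType.DISK),
                new Replica("rackC", StorageType.DISK));
        // 2 SSD replicas, each alone on its rack.
        List<Replica> exactlyOne = Arrays.asList(
                new Replica("rackA", StorageType.SSD),
                new Replica("rackB", StorageType.SSD));
        // With One_SSD and 2 SSD + 2 DISK replicas, the excess storage type is SSD.
        Set<StorageType> excessTypes = EnumSet.of(StorageType.SSD);

        System.out.println("one set only:  "
                + chooseFromOneSetOnly(moreThanOne, exactlyOne, excessTypes));
        System.out.println("with fallback: "
                + chooseWithFallback(moreThanOne, exactlyOne, excessTypes));
    }
}
```

In this model the first call finds no SSD replica in moreThanOne and returns null, which is the kind of value that would lead to the NullPointerException in adjustSetsWithChosenReplica if the caller does not check it. A unit test for the real fix would presumably set up the same replica layout.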



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
