hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10512) VolumeScanner may terminate due to NPE in DataNode.reportBadBlocks
Date Fri, 10 Jun 2016 14:47:21 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15324552#comment-15324552 ]

Wei-Chiu Chuang commented on HDFS-10512:
----------------------------------------

Thanks [~linyiqun]. The check in the patch looks good to me.
I think that if the volume is null for some reason and the bad block can't be reported to the NN,
the method should throw an IOException so that the failure is not ignored. At this point, I am not sure
whether the null volume comes from a race condition or from a bug somewhere else.
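
To make the suggestion concrete, here is a minimal, self-contained sketch of the behavior I have in mind. It is not the attached patch and not the real DataNode code: {{Volume}}, {{VOLUME_BY_BLOCK}} and {{lastReport}} are hypothetical stand-ins for {{FsVolumeSpi}}, the {{getFSDataset().getVolume(block)}} lookup and the report sent to the NN, and blocks are plain strings.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class ReportBadBlocksSketch {

  /** Hypothetical stand-in for FsVolumeSpi; only the getters used here. */
  interface Volume {
    String getStorageID();
    String getStorageType();
  }

  /** Hypothetical stand-in for the dataset's block-to-volume lookup. */
  static final Map<String, Volume> VOLUME_BY_BLOCK = new HashMap<>();

  /** Records what would have been reported to the NameNode. */
  static String lastReport = null;

  /**
   * Reports a bad block, but fails with an IOException (instead of an
   * unchecked NullPointerException) when no volume is found for the block.
   */
  static void reportBadBlocks(String block) throws IOException {
    Volume volume = VOLUME_BY_BLOCK.get(block);
    if (volume == null) {
      // Surface the problem to the caller rather than silently skipping it.
      throw new IOException(
          "Cannot report bad block " + block + ": no volume found for it");
    }
    lastReport =
        block + "@" + volume.getStorageID() + "/" + volume.getStorageType();
  }

  public static void main(String[] args) {
    // A scanner-like caller can catch the checked exception and keep running.
    try {
      reportBadBlocks("blk_unknown");
    } catch (IOException e) {
      System.out.println("Scanner continues after: " + e.getMessage());
    }
  }
}
```

The point of the checked exception is that a caller in the scan loop is forced to handle the failure and can log it and continue, whereas the unchecked NPE in the stack trace below terminates the scanner thread.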

> VolumeScanner may terminate due to NPE in DataNode.reportBadBlocks
> ------------------------------------------------------------------
>
>                 Key: HDFS-10512
>                 URL: https://issues.apache.org/jira/browse/HDFS-10512
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Wei-Chiu Chuang
>            Assignee: Yiqun Lin
>         Attachments: HDFS-10512.001.patch
>
>
> VolumeScanner may terminate due to an unexpected NullPointerException thrown in {{DataNode.reportBadBlocks()}}. This is different from HDFS-8850/HDFS-9190.
> I observed this bug in a production CDH 5.5.1 cluster, and the same bug still persists in upstream trunk.
> {noformat}
> 2016-04-07 20:30:53,830 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-1800173197-10.204.68.5-1444425156296:blk_1170134484_96468685 on /dfs/dn
> 2016-04-07 20:30:53,831 ERROR org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/dfs/dn, DS-89b72832-2a8c-48f3-8235-48e6c5eb5ab3) exiting because of exception
> java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.reportBadBlocks(DataNode.java:1018)
>         at org.apache.hadoop.hdfs.server.datanode.VolumeScanner$ScanResultHandler.handle(VolumeScanner.java:287)
>         at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:443)
>         at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:547)
>         at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:621)
> 2016-04-07 20:30:53,832 INFO org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/dfs/dn, DS-89b72832-2a8c-48f3-8235-48e6c5eb5ab3) exiting.
> {noformat}
> I think the NPE comes from the {{volume}} variable in the following code snippet. Somehow the volume scanner knows the volume, but the datanode cannot look up the volume using the block.
> {code}
> public void reportBadBlocks(ExtendedBlock block) throws IOException {
>   BPOfferService bpos = getBPOSForBlock(block);
>   FsVolumeSpi volume = getFSDataset().getVolume(block);
>   bpos.reportBadBlocks(
>       block, volume.getStorageID(), volume.getStorageType());
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

