hadoop-hdfs-issues mailing list archives

From "Rushabh S Shah (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7916) 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infinite loop
Date Mon, 12 Oct 2015 19:58:06 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953640#comment-14953640 ]

Rushabh S Shah commented on HDFS-7916:

{quote}We deployed this fix to one of our clusters and unfortunately the datanodes were still
spamming the namenode with the same stack trace as before.
We debugged the issue and found that the datanodes were receiving a StandbyException wrapped
in a RemoteException.
The patch was checking for StandbyException, not RemoteException.
Initially we were catching StandbyException specifically. At that time we decided not to
catch StandbyException in ErrorReportAction.
But then we discovered that the namenode was throwing StandbyException wrapped in RemoteException.
So we chose to ignore all RemoteExceptions in both classes and just log them at WARN.
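
To illustrate why the narrower catch never fired, here is a minimal sketch. It is not the actual patch: the StandbyException and RemoteException classes below are hypothetical stand-ins defined locally so the example compiles without Hadoop on the classpath, and reportBadBlock(), narrowCatch() and wrapperCatch() are made-up names. The only point it shows is that a wrapper exception carrying the server-side class name as a string is not matched by a catch clause for the wrapped type.

```java
import java.io.IOException;

public class WrappedStandbyDemo {

    // Hypothetical stand-in for the server-side exception type.
    static class StandbyException extends IOException {
        StandbyException(String msg) { super(msg); }
    }

    // Hypothetical stand-in for the RPC wrapper: it records the server-side
    // exception's class name as a string and is NOT a subclass of it.
    static class RemoteException extends IOException {
        private final String className;
        RemoteException(String className, String msg) {
            super(msg);
            this.className = className;
        }
        String getClassName() { return className; }
    }

    // Simulates a call to a standby namenode: the caller only ever sees
    // the wrapper. The message text is illustrative.
    static void reportBadBlock() throws IOException {
        throw new RemoteException(StandbyException.class.getName(),
            "namenode is in standby state");
    }

    // Catching StandbyException directly: the clause never matches the
    // wrapper, so the generic IOException branch (retry) is taken.
    public static String narrowCatch() {
        try {
            reportBadBlock();
            return "reported";
        } catch (StandbyException e) {
            return "ignored standby";
        } catch (IOException e) {
            return "retry";
        }
    }

    // Catching the wrapper instead: log a warning and drop the report
    // rather than re-queueing it.
    public static String wrapperCatch() {
        try {
            reportBadBlock();
            return "reported";
        } catch (RemoteException e) {
            System.out.println("WARN: " + e.getClassName() + ": " + e.getMessage());
            return "logged and dropped";
        } catch (IOException e) {
            return "retry";
        }
    }

    public static void main(String[] args) {
        System.out.println(narrowCatch());   // takes the retry branch
        System.out.println(wrapperCatch());  // takes the wrapper branch
    }
}
```

In this sketch the "retry" branch stands for the re-queueing that produced the repeated stack traces, and the wrapper branch stands for the log-and-drop behaviour described above.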

Hope this helps.

> 'reportBadBlocks' from datanodes to standby Node BPServiceActor goes for infinite loop
> --------------------------------------------------------------------------------------
>                 Key: HDFS-7916
>                 URL: https://issues.apache.org/jira/browse/HDFS-7916
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.7.0
>            Reporter: Vinayakumar B
>            Assignee: Rushabh S Shah
>            Priority: Critical
>             Fix For: 2.7.1
>         Attachments: HDFS-7916-01.patch, HDFS-7916-1.patch
> If any bad block is found, the BPSA for the standby node keeps retrying the report indefinitely.
> {noformat}2015-03-11 19:43:41,528 WARN org.apache.hadoop.hdfs.server.datanode.DataNode:
Failed to report bad block BP-1384821822-
to namenode: stobdtserver3/
> org.apache.hadoop.hdfs.server.datanode.BPServiceActorActionException: Failed to report
bad block BP-1384821822- to namenode:
>         at org.apache.hadoop.hdfs.server.datanode.ReportBadBlockAction.reportTo(ReportBadBlockAction.java:63)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processQueueMessages(BPServiceActor.java:1020)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:762)
>         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:856)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}

This message was sent by Atlassian JIRA