hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10627) Volume Scanner mark a block as "suspect" even if the block sender encounters 'Broken pipe' or 'Connection reset by peer' exception
Date Fri, 15 Jul 2016 20:30:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15380056#comment-15380056 ]

Wei-Chiu Chuang commented on HDFS-10627:

Hi Daryn, I totally get that.

Out of curiosity, why isn't a packet responder instantiated for block transfer operations?
Looking at the code, a packet responder is only instantiated for writing a pipeline.

I was somewhat concerned about removing it, because [~yzhangal] and I have been diagnosing
a block corruption bug very similar to HDFS-4660 and HDFS-9220, and in these cases it is
useful to have the volume scanner called up to scan a suspect block.

> Volume Scanner mark a block as "suspect" even if the block sender encounters 'Broken pipe' or 'Connection reset by peer' exception
> ----------------------------------------------------------------------------------------------------------------------------------
>                 Key: HDFS-10627
>                 URL: https://issues.apache.org/jira/browse/HDFS-10627
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 2.7.0
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>         Attachments: HDFS-10627.patch
> In the BlockSender code,
> {code:title=BlockSender.java|borderStyle=solid}
>         if (!ioem.startsWith("Broken pipe") && !ioem.startsWith("Connection reset")) {
>           LOG.error("BlockSender.sendChunks() exception: ", e);
>         }
>         datanode.getBlockScanner().markSuspectBlock(
>               volumeRef.getVolume().getStorageID(),
>               block);
> {code}
> Before HDFS-7686, the block was marked as suspect only if the exception message didn't
> start with Broken pipe or Connection reset.
> But after HDFS-7686, the block is marked as suspect irrespective of the exception message.
> On one of our datanodes, it took approximately a whole day (22 hours) to go through all
> the suspect blocks to scan one corrupt block.
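For illustration only, here is a minimal sketch of the pre-HDFS-7686 behavior described in the issue, where a block is flagged as suspect only when the exception message does not look like a client disconnect. The class SuspectBlockPolicy and the method shouldMarkSuspect are hypothetical names, not part of Hadoop, and this is not the attached patch. Treating a null message as suspect is also an assumption.

```java
import java.io.IOException;

/** Hypothetical helper, not part of Hadoop: decides whether a send failure should flag the block. */
final class SuspectBlockPolicy {
  private SuspectBlockPolicy() {}

  /**
   * Returns true when the failure might point at a bad replica on disk, and
   * false when the message looks like the remote peer simply went away.
   */
  static boolean shouldMarkSuspect(IOException e) {
    String ioem = e.getMessage();
    if (ioem == null) {
      // Assumption: with no message to inspect, stay conservative and flag the block.
      return true;
    }
    // Same prefix checks as the quoted BlockSender snippet.
    return !ioem.startsWith("Broken pipe") && !ioem.startsWith("Connection reset");
  }

  public static void main(String[] args) {
    System.out.println(shouldMarkSuspect(new IOException("Broken pipe")));
    System.out.println(shouldMarkSuspect(new IOException("Connection reset by peer")));
    System.out.println(shouldMarkSuspect(new IOException("Input/output error")));
  }
}
```

In a caller shaped like the quoted snippet, the markSuspectBlock call would then sit behind a check such as if (SuspectBlockPolicy.shouldMarkSuspect(e)), so that disconnects do not add blocks to the suspect queue.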
