Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Date: Tue, 9 Aug 2016 17:01:20 +0000 (UTC)
From: "Wei-Chiu Chuang (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12823150.1429732274000.263135.1470762080674@Atlassian.JIRA>
In-Reply-To: <JIRA.12823150.1429732274000@Atlassian.JIRA>
References: <JIRA.12823150.1429732274000@Atlassian.JIRA> <JIRA.12823150.1429732274963@arcas>
Subject: [jira] [Commented] (HDFS-8224) Any IOException in
 DataTransfer#run() will run diskError thread even if it is not disk error
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Tue, 09 Aug 2016 17:01:22 -0000


    [ https://issues.apache.org/jira/browse/HDFS-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413845#comment-15413845 ] 

Wei-Chiu Chuang commented on HDFS-8224:
---------------------------------------

Sure I'll review soon. Thx for the patch.

> Any IOException in DataTransfer#run() will run diskError thread even if it is not disk error
> --------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8224
>                 URL: https://issues.apache.org/jira/browse/HDFS-8224
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>             Fix For: 2.8.0
>
>         Attachments: HDFS-8224-trunk-1.patch, HDFS-8224-trunk.patch
>
>
> This happened in our 2.6 cluster.
> One of the block and its metadata file were corrupted.
> The disk was healthy in this case.
> Only the block was corrupt.
> Namenode tried to copy that block to another datanode but failed with the following stack trace:
> 2015-04-20 01:04:04,421 [org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@11319bc4] WARN datanode.DataNode: DatanodeRegistration(a.b.c.d, datanodeUuid=e8c5135c-9b9f-4d05-a59d-e5525518aca7, infoPort=1006, infoSecurePort=0, ipcPort=8020, storageInfo=lv=-56;cid=CID-e7f736ac-158e-446e-9091-7e66f3cddf3c;nsid=358250775;c=1428471998571):Failed to transfer BP-xxx-1351096255769:blk_2697560713_1107108863999 to a1.b1.c1.d1:1004 got 
> java.io.IOException: Could not create DataChecksum of type 0 with bytesPerChecksum 0
>         at org.apache.hadoop.util.DataChecksum.newDataChecksum(DataChecksum.java:125)
>         at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:175)
>         at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readHeader(BlockMetadataHeader.java:140)
>         at org.apache.hadoop.hdfs.server.datanode.BlockMetadataHeader.readDataChecksum(BlockMetadataHeader.java:102)
>         at org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:287)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1989)
>         at java.lang.Thread.run(Thread.java:722)
> The following catch block in DataTransfer#run method will treat every IOException as disk error fault and run disk errror
> {noformat}
> catch (IOException ie) {
>         LOG.warn(bpReg + ":Failed to transfer " + b + " to " +
>             targets[0] + " got ", ie);
>         // check if there are any disk problem
>         checkDiskErrorAsync();
>       } 
> {noformat}
> This block was never scanned by BlockPoolSliceScanner otherwise it would have reported as corrupt block.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org