From: "Brian Bockelman (JIRA)"
To: core-dev@hadoop.apache.org
Reply-To: core-dev@hadoop.apache.org
Date: Mon, 13 Oct 2008 08:32:44 -0700 (PDT)
Message-ID: <1054064215.1223911964164.JavaMail.jira@brutus>
Subject: [jira] Created: (HADOOP-4402) Namenode is unaware of FS corruption

Namenode is unaware of FS corruption
------------------------------------

                 Key: HADOOP-4402
                 URL: https://issues.apache.org/jira/browse/HADOOP-4402
             Project: Hadoop Core
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.18.1
            Reporter: Brian Bockelman
             Fix For: 0.18.2


I think the name node is not told when there is block corruption. I found a huge number of files corrupted when I restarted my namenode. Digging through the datanode logs, I saw the following:

2008-10-13 03:30:44,266 INFO org.apache.hadoop.dfs.DataBlockScanner: Reporting bad block blk_-54103619973430645_3038 to namenode.
2008-10-13 03:57:11,447 WARN org.apache.hadoop.dfs.DataBlockScanner: First Verification failed for blk_7657563767222456337_3165. Exception : java.io.IOException: Block blk_7657563767222456337_3165 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getBlockFile(FSDataset.java:716)
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:704)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1678)
        at org.apache.hadoop.dfs.DataBlockScanner.verifyBlock(DataBlockScanner.java:408)
        at org.apache.hadoop.dfs.DataBlockScanner.verifyFirstBlock(DataBlockScanner.java:474)
        at org.apache.hadoop.dfs.DataBlockScanner.run(DataBlockScanner.java:565)
        at java.lang.Thread.run(Thread.java:595)
2008-10-13 03:57:11,448 WARN org.apache.hadoop.dfs.DataBlockScanner: Second Verification failed for blk_7657563767222456337_3165. Exception : java.io.IOException: Block blk_7657563767222456337_3165 is not valid.
        at org.apache.hadoop.dfs.FSDataset.getBlockFile(FSDataset.java:716)
        at org.apache.hadoop.dfs.FSDataset.getLength(FSDataset.java:704)
        at org.apache.hadoop.dfs.DataNode$BlockSender.<init>(DataNode.java:1678)
        at org.apache.hadoop.dfs.DataBlockScanner.verifyBlock(DataBlockScanner.java:408)
        at org.apache.hadoop.dfs.DataBlockScanner.verifyFirstBlock(DataBlockScanner.java:474)
        at org.apache.hadoop.dfs.DataBlockScanner.run(DataBlockScanner.java:565)
        at java.lang.Thread.run(Thread.java:595)
2008-10-13 03:57:11,448 INFO org.apache.hadoop.dfs.DataBlockScanner: Reporting bad block blk_7657563767222456337_3165 to namenode.
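As an aside, every report of this kind can be pulled out of the datanode logs with a grep along the lines below. The datanode logs live on each datanode and the file name here just mirrors our namenode log naming, so treat the path as a guess for other setups:

  grep 'Reporting bad block' /scratch/hadoop/logs/hadoop-root-datanode-*.log*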
So, node099 found a bad block. However, if I grep the namenode logs for that block, this is everything I find:

/scratch/hadoop/logs/hadoop-root-namenode-hadoop-name.log.2008-10-10:2008-10-10 20:21:20,002 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.allocateBlock: /user/uscms01/LoadTestDownload/LoadTest07_FNAL_01_MyQXiu5a22TJQlcB_508. blk_7657563767222456337_3165
/scratch/hadoop/logs/hadoop-root-namenode-hadoop-name.log.2008-10-10:2008-10-10 20:21:32,150 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 172.16.1.110:50010 is added to blk_7657563767222456337_3165 size 67108864
/scratch/hadoop/logs/hadoop-root-namenode-hadoop-name.log.2008-10-10:2008-10-10 20:21:32,151 INFO org.apache.hadoop.dfs.StateChange: BLOCK* NameSystem.addStoredBlock: blockMap updated: 172.16.1.99:50010 is added to blk_7657563767222456337_3165 size 67108864
/scratch/hadoop/logs/hadoop-root-namenode-hadoop-name.log.2008-10-12:2008-10-12 05:05:26,898 INFO org.apache.hadoop.dfs.StateChange: BLOCK* ask 172.16.1.99:50010 to replicate blk_7657563767222456337_3165 to datanode(s) 172.16.1.18:50010
/scratch/hadoop/logs/hadoop-root-namenode-hadoop-name.log.2008-10-12:2008-10-12 05:05:40,742 INFO org.apache.hadoop.dfs.NameNode: Error report from 172.16.1.99:50010: Can't send invalid block blk_7657563767222456337_3165
/scratch/hadoop/logs/hadoop-root-namenode-hadoop-name.log.2008-10-12:2008-10-12 05:12:43,759 WARN org.apache.hadoop.fs.FSNamesystem: PendingReplicationMonitor timed out block blk_7657563767222456337_3165

To summarize:
- The block is allocated and written successfully to node110, then replicated to node099.
- The namenode asks node099 to replicate the block to node018.
- The namenode is told that node099 can't send the invalid block to node018. A few minutes later, the PendingReplicationMonitor times out.
- No new replications are launched!
- The block is found to be corrupted on node099 a few days later. The datanode claims to inform the namenode of this, but nothing is listed in the namenode logs.
- The block is suspiciously missing on node110 as well.

Perhaps there are a few bugs here:
1) The namenode doesn't get notified of corrupted blocks, even though the datanode claims to report them.
2) On replication failure, no new replicas are created.
3) Corruption events are much, much too common. We have a specific dataset which the namenode now claims is mostly corrupted (100 of 167 files); before I restarted the namenode, we had jobs running against it continuously for the entire weekend. The jobs were all successful, and the binary data format does internal integrity checks as it reads the files. If the corruption were real, the jobs would have failed. I'm concerned that the corruption-detection systems of Hadoop are seriously busted for me.
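For anyone who wants to reproduce the comparison on their own cluster, fsck will dump the namenode's view of a dataset file by file, including which blocks it considers missing or corrupt. The path below is just the directory from the log excerpt above; substitute whichever dataset looks suspect:

  bin/hadoop fsck /user/uscms01/LoadTestDownload -files -blocks -locations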