From: "Aaron T. Myers (Commented) (JIRA)"
To: hdfs-issues@hadoop.apache.org
Date: Thu, 15 Mar 2012 20:24:37 +0000 (UTC)
Message-ID: <687961548.20977.1331843077405.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <163560687.41129.1331248197160.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

    [ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230514#comment-13230514 ]

Aaron T. Myers commented on HDFS-3067:
--------------------------------------

Looks to me like the TestDatanodeBlockScanner failure was indeed unrelated.

+1, the latest patch looks good to me. I'm going to commit this momentarily.

> NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
> -----------------------------------------------------------------------
>
>                 Key: HDFS-3067
>                 URL: https://issues.apache.org/jira/browse/HDFS-3067
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.24.0
>            Reporter: Henry Robinson
>            Assignee: Henry Robinson
>         Attachments: HDFS-3067.1.patch, HDFS-3607.patch
>
>
> With a singly-replicated block that's corrupted, issuing a read against it twice in succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException.
> Here's the body of a test that reproduces the problem:
> {code}
> final short REPL_FACTOR = 1;
> final long FILE_LENGTH = 512L;
> cluster.waitActive();
> FileSystem fs = cluster.getFileSystem();
> Path path = new Path("/corrupted");
>
> DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L);
> DFSTestUtil.waitReplication(fs, path, REPL_FACTOR);
>
> ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path);
> int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block);
> assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted);
>
> InetSocketAddress nnAddr =
>     new InetSocketAddress("localhost", cluster.getNameNodePort());
> DFSClient client = new DFSClient(nnAddr, conf);
> DFSInputStream dis = client.open(path.toString());
> byte[] arr = new byte[(int)FILE_LENGTH];
> boolean sawException = false;
> try {
>   dis.read(arr, 0, (int)FILE_LENGTH);
> } catch (ChecksumException ex) {
>   sawException = true;
> }
>
> assertTrue(sawException);
> sawException = false;
> try {
>   dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here
> } catch (ChecksumException ex) {
>   sawException = true;
> }
> {code}
> The stack:
> {code}
> java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492)
>         at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545)
>         [snip test stack]
> {code}
> The problem is that currentNode is null: it is left null after the first read fails, and is never refreshed because the condition in read() that guards blockSeekTo() only fires when the current position is outside the block's range.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
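The failure mode described above can be sketched outside Hadoop. The model below is purely illustrative (the class and field names are hypothetical, not the real org.apache.hadoop.hdfs.DFSInputStream API): a reader caches a "current node", clears it when a checksum error is caught, and its retry guard only re-selects a node when the position leaves the block, so a second read at the same position dereferences null. Extending the guard to also fire when the node is null (one plausible shape of a fix; the actual HDFS-3067 patch may differ) avoids the NPE.

```java
// Simplified, hypothetical model of the HDFS-3067 bug; not the real
// Hadoop DFSInputStream API.
public class Main {
    static class ChecksumException extends RuntimeException {}

    // A "datanode" whose only replica of the block is corrupted:
    // every read against it fails checksum verification.
    static class CorruptNode {
        void read() { throw new ChecksumException(); }
    }

    static class Stream {
        CorruptNode currentNode = new CorruptNode();
        long pos = 0;              // stays inside the block in this model
        final long blockEnd = 511;
        final boolean nullGuard;   // true = also re-select when node is null

        Stream(boolean nullGuard) { this.nullGuard = nullGuard; }

        void read() {
            // Analogue of the guard around blockSeekTo(): the buggy form
            // only re-selects a node when pos leaves the block's range.
            if (pos > blockEnd || (nullGuard && currentNode == null)) {
                currentNode = new CorruptNode();
            }
            try {
                currentNode.read(); // NPE here if currentNode stayed null
            } catch (ChecksumException e) {
                currentNode = null; // corrupted replica abandoned
                throw e;
            }
        }
    }

    static String outcome(Stream s) {
        try { s.read(); return "ok"; }
        catch (ChecksumException e) { return "checksum"; }
        catch (NullPointerException e) { return "npe"; }
    }

    public static void main(String[] args) {
        Stream buggy = new Stream(false);
        System.out.println(outcome(buggy)); // first read: checksum failure
        System.out.println(outcome(buggy)); // retry: NPE, node never refreshed

        Stream fixed = new Stream(true);
        System.out.println(outcome(fixed)); // first read: checksum failure
        System.out.println(outcome(fixed)); // retry: checksum again, no NPE
    }
}
```

With the null check in place, the retry re-enters node selection and surfaces a second ChecksumException, matching what the quoted test expects from the patched client.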