hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3067) NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
Date Sat, 10 Mar 2012 02:24:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226724#comment-13226724 ]

Aaron T. Myers commented on HDFS-3067:
--------------------------------------

Looks pretty good to me, Hank. Just a few small nits. +1 once these are addressed.

# A few lines are over 80 chars.
# Indent continuation lines by 4 spaces, rather than 2, when wrapping lines that go over 80 chars.
# Rather than use the "sawException" boolean, add an explicit call to "fail()" after the dis.read(), and call GenericTestUtils.assertExceptionContains(...) in the catch clause, as sketched below.
# Put some whitespace around "=" and "<" in the for loop.
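
For reference, a minimal sketch of the pattern suggested in nit 3, reusing the dis/arr/FILE_LENGTH names from the repro below; the message fragment passed to assertExceptionContains is only illustrative:

{code}
try {
  dis.read(arr, 0, (int)FILE_LENGTH);
  fail("Expected a ChecksumException reading a corrupted block");
} catch (ChecksumException ex) {
  // Substring is illustrative; assert on whatever the real exception message contains.
  GenericTestUtils.assertExceptionContains("Checksum error", ex);
}
{code}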
                
> NPE in DFSInputStream.readBuffer if read is repeated on corrupted block
> -----------------------------------------------------------------------
>
>                 Key: HDFS-3067
>                 URL: https://issues.apache.org/jira/browse/HDFS-3067
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.24.0
>            Reporter: Henry Robinson
>            Assignee: Henry Robinson
>         Attachments: HDFS-3067.patch
>
>
> With a singly-replicated block that's corrupted, issuing a read against it twice in
> succession (e.g. if ChecksumException is caught by the client) gives a NullPointerException.
> Here's the body of a test that reproduces the problem:
> {code}
>     final short REPL_FACTOR = 1;
>     final long FILE_LENGTH = 512L;
>     cluster.waitActive();
>     FileSystem fs = cluster.getFileSystem();
>     Path path = new Path("/corrupted");
>     DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L);
>     DFSTestUtil.waitReplication(fs, path, REPL_FACTOR);
>     ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path);
>     int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block);
>     assertEquals("All replicas not corrupted", REPL_FACTOR, blockFilesCorrupted);
>     InetSocketAddress nnAddr =
>         new InetSocketAddress("localhost", cluster.getNameNodePort());
>     DFSClient client = new DFSClient(nnAddr, conf);
>     DFSInputStream dis = client.open(path.toString());
>     byte[] arr = new byte[(int)FILE_LENGTH];
>     boolean sawException = false;
>     try {
>       dis.read(arr, 0, (int)FILE_LENGTH);
>     } catch (ChecksumException ex) {
>       sawException = true;
>     }
>
>     assertTrue(sawException);
>     sawException = false;
>     try {
>       dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here
>     } catch (ChecksumException ex) {
>       sawException = true;
>     }
> {code}
> The stack:
> {code}
> java.lang.NullPointerException
> 	at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492)
> 	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545)
>         [snip test stack]
> {code}
> and the problem is that currentNode is null. It's left at null after the first read,
> which fails, and then is never refreshed because the condition in read that protects
> blockSeekTo is only triggered if the current position is outside the block's range.
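
To make the failure mode concrete, here is a rough paraphrase of the read path described above (not a quote of the actual DFSInputStream source; currentNode, blockSeekTo and readBuffer appear in the stack and prose, while pos, blockEnd, buf, off and realLen are assumed names):

{code}
// blockSeekTo() is only re-run when the position has moved past the current block:
if (pos > blockEnd) {
  currentNode = blockSeekTo(pos);
}
// After the first read fails with ChecksumException, pos is still inside the
// block, so the guard never fires, currentNode stays null, and the next call
// dereferences it -> the NPE at DFSInputStream.readBuffer().
int result = readBuffer(buf, off, realLen);
{code}

Under that reading, one plausible fix is to widen the guard so blockSeekTo() also runs when currentNode == null.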

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
