hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5995) TestFSEditLogLoader#testValidateEditLogWithCorruptBody gets OutOfMemoryError and dumps heap.
Date Tue, 04 Mar 2014 00:59:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918851#comment-13918851 ]

Colin Patrick McCabe commented on HDFS-5995:

bq. A potentially interesting enhancement would be to serialize each op as <checksum><length><payload>.

I'm not sure that would fully solve this kind of issue.  The problem we always had in the
past was people reading a value from the stream and using it to size giant arrays that killed
the JVM, e.g.
myFunArray = new int[stream.readInt()];

I think we actually had this exact pattern in a few places, which also led to puzzling
error messages when -1 was returned.

This would happen just the same way if {{stream.readInt()}} were {{buffer.readInt()}}.  I don't
think there's any substitute for just fixing these kinds of issues as they occur.  We also
have an edit log fuzzing test, {{TestEditLog#testFuzzSequences}}, that helped to flush out a
lot of these issues in the past.
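
As a rough illustration of fixing one such read site (a minimal sketch, not the actual HDFS code): validate the length before allocating, so a corrupt value produces a descriptive {{IOException}} rather than an {{OutOfMemoryError}} or a {{NegativeArraySizeException}}. The class name {{BoundedArrayRead}}, the method {{readIntArray}}, and the {{maxElements}} parameter are hypothetical names chosen for this example; a sensible cap would depend on the op being decoded.

```java
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper for illustration only; not part of Hadoop.
public class BoundedArrayRead {

    /**
     * Reads a length-prefixed int array, rejecting lengths that are negative
     * or larger than the caller-supplied cap before any allocation happens.
     */
    public static int[] readIntArray(DataInputStream in, int maxElements)
            throws IOException {
        int length = in.readInt();
        // A corrupt stream can yield any 32-bit value here, including -1
        // or something close to Integer.MAX_VALUE.
        if (length < 0 || length > maxElements) {
            throw new IOException("Invalid array length " + length
                + " (expected 0.." + maxElements + ")");
        }
        int[] result = new int[length];
        for (int i = 0; i < length; i++) {
            // readInt() throws EOFException if the stream ends early.
            result[i] = in.readInt();
        }
        return result;
    }
}
```

A caller would then write something like {{BoundedArrayRead.readIntArray(stream, 1024)}} in place of the unchecked allocation, and a corrupt length would surface as an ordinary I/O error naming the bad value.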

> TestFSEditLogLoader#testValidateEditLogWithCorruptBody gets OutOfMemoryError and dumps heap.
> --------------------------------------------------------------------------------------------
>                 Key: HDFS-5995
>                 URL: https://issues.apache.org/jira/browse/HDFS-5995
>             Project: Hadoop HDFS
>          Issue Type: Test
>          Components: namenode, test
>    Affects Versions: 3.0.0
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>            Priority: Minor
>         Attachments: HDFS-5995.1.patch
> {{TestFSEditLogLoader#testValidateEditLogWithCorruptBody}} is experiencing {{OutOfMemoryError}}
and dumping heap since the merge of HDFS-4685.  This doesn't actually cause the test to fail,
because it's a failure test that corrupts an edit log intentionally.  Still, this might cause
confusion if someone reviews the build logs and thinks this is a more serious problem.

This message was sent by Atlassian JIRA
