hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2433) TestFileAppend4 fails intermittently
Date Tue, 11 Oct 2011 16:25:15 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125162#comment-13125162 ]

Todd Lipcon commented on HDFS-2433:

My hunch is that this is related to HDFS-1172.
> TestFileAppend4 fails intermittently
> ------------------------------------
>                 Key: HDFS-2433
>                 URL: https://issues.apache.org/jira/browse/HDFS-2433
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, name-node, test
>    Affects Versions:,
>            Reporter: Robert Joseph Evans
>            Priority: Critical
>         Attachments: failed.tar.bz2
> A Jenkins build we have running failed twice in a row with issues from TestFileAppend4.testAppendSyncReplication1.
In an attempt to reproduce the error, I ran TestFileAppend4 in a loop overnight, saving the
results away. (No clean was done in between test runs.)
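A loop of this kind could be sketched as follows. This is only an illustration, not the script used for the report: the function name `run_in_loop`, the log file naming, and the assumption that the test command exits with a non-zero status on failure are all hypothetical, and the actual test command is left as a parameter.

```python
import subprocess
from pathlib import Path


def run_in_loop(cmd, runs, out_dir):
    """Run `cmd` `runs` times, save each run's output, and count failures.

    Assumption: the command signals a test failure with a non-zero exit status.
    No clean step is performed between runs, matching the report.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    failures = 0
    for i in range(1, runs + 1):
        # Capture stdout and stderr together so each log is complete.
        result = subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
        (out / f"run-{i:03d}.log").write_text(result.stdout)
        if result.returncode != 0:
            failures += 1
    return failures
```

Calling `run_in_loop(test_command, 130, "results")`, where `test_command` is whatever list of arguments launches TestFileAppend4 in the local build, would leave one log per run in `results/` and return the number of failing runs.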
> When TestFileAppend4 is run in a loop, the testAppendSyncReplication[012] tests fail about
10% of the time (14 times out of 130 tries). They all fail with something like the following.
Often only one of the tests fails, but I have seen as many as two fail in one run.
> {noformat}
> Testcase: testAppendSyncReplication2 took 32.198 sec
>         FAILED
> Should have 2 replicas for that block, not 1
> junit.framework.AssertionFailedError: Should have 2 replicas for that block, not 1
>         at org.apache.hadoop.hdfs.TestFileAppend4.replicationTest(TestFileAppend4.java:477)
>         at org.apache.hadoop.hdfs.TestFileAppend4.testAppendSyncReplication2(TestFileAppend4.java:425)
> {noformat}
> I also saw several other tests that are part of TestFileAppend4 fail during this experiment.
They may all be related to one another, so I am filing them in the same JIRA. If it turns
out that they are not related, they can be split up later.
> testAppendSyncBlockPlusBbw failed 6 out of the 130 times or about 5% of the time
> {noformat}
> Testcase: testAppendSyncBlockPlusBbw took 1.633 sec
>         FAILED
> unexpected file size! received=0 , expected=1024
> junit.framework.AssertionFailedError: unexpected file size! received=0 , expected=1024
>         at org.apache.hadoop.hdfs.TestFileAppend4.assertFileSize(TestFileAppend4.java:136)
>         at org.apache.hadoop.hdfs.TestFileAppend4.testAppendSyncBlockPlusBbw(TestFileAppend4.java:401)
> {noformat}
> testAppendSyncChecksum[012] failed 2 out of the 130 times or about 1.5% of the time
> {noformat}
> Testcase: testAppendSyncChecksum1 took 32.385 sec
>         FAILED
> Should have 1 replica for that block, not 2
> junit.framework.AssertionFailedError: Should have 1 replica for that block, not 2
>         at org.apache.hadoop.hdfs.TestFileAppend4.checksumTest(TestFileAppend4.java:556)
>         at org.apache.hadoop.hdfs.TestFileAppend4.testAppendSyncChecksum1(TestFileAppend4.java:500)
> {noformat}
> I will attach logs for all of the failures. Be aware that I did change some of the logging
messages in this test so I could better see when testAppendSyncReplication started and ended.
Other than that, the code is stock 0.20.205 RC2.
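
For reference, the failure rates quoted in the description can be recomputed from the reported counts. The snippet below is a small sketch that uses only the figures stated above (130 runs; 14, 6, and 2 failures); the helper name `failure_rate` is introduced here for illustration.

```python
RUNS = 130  # total loop iterations reported in the description

# Failure counts as reported for each test group.
FAILURES = {
    "testAppendSyncReplication[012]": 14,
    "testAppendSyncBlockPlusBbw": 6,
    "testAppendSyncChecksum[012]": 2,
}


def failure_rate(failed, runs):
    """Return the failure rate as a percentage."""
    return 100.0 * failed / runs


for name, failed in FAILURES.items():
    print(f"{name}: {failed}/{RUNS} = {failure_rate(failed, RUNS):.1f}%")
```

The computed rates come out at roughly 10.8%, 4.6%, and 1.5%, in line with the rounded figures in the report.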

