hadoop-hdfs-dev mailing list archives

From "Robert Joseph Evans (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-2433) TestFileAppend4 fails intermittently
Date Tue, 11 Oct 2011 14:07:12 GMT
TestFileAppend4 fails intermittently

                 Key: HDFS-2433
                 URL: https://issues.apache.org/jira/browse/HDFS-2433
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: data-node, name-node, test
    Affects Versions:,
            Reporter: Robert Joseph Evans
            Priority: Critical

A Jenkins build we have running failed twice in a row with issues from TestFileAppend4.testAppendSyncReplication1.
In an attempt to reproduce the error, I ran TestFileAppend4 in a loop overnight, saving the
results away.  (No clean was done between test runs.)
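
For anyone wanting to repeat the experiment, a loop like the one described above could be sketched roughly as follows. This is not the script that was actually used. The class name LoopRunner, the loop-logs directory, and the run-N.log file names are all hypothetical, and the sketch assumes the test command exits with a nonzero status when a test fails, which depends on how the build is configured.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntUnaryOperator;

class LoopRunner {
    // Runs the command once, saving stdout and stderr to run-<attempt>.log.
    // Returns the process exit code, or -1 if the process could not be run.
    static int runOnce(List<String> command, File logDir, int attempt) {
        try {
            File log = new File(logDir, "run-" + attempt + ".log");
            Process p = new ProcessBuilder(command)
                    .redirectErrorStream(true)   // merge stderr into stdout
                    .redirectOutput(log)         // keep the results for later inspection
                    .start();
            return p.waitFor();
        } catch (IOException e) {
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }

    // Calls attempt.applyAsInt(i) for i = 1..attempts and counts nonzero results.
    static int countFailures(int attempts, IntUnaryOperator attempt) {
        int failures = 0;
        for (int i = 1; i <= attempts; i++) {
            if (attempt.applyAsInt(i) != 0) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: LoopRunner <attempts> <command> [args...]");
            return;
        }
        int attempts = Integer.parseInt(args[0]);
        List<String> command = Arrays.asList(args).subList(1, args.length);
        File logDir = new File("loop-logs");   // hypothetical output directory
        logDir.mkdirs();
        // No clean step between runs, matching the experiment described above.
        int failures = countFailures(attempts, i -> runOnce(command, logDir, i));
        System.out.printf("%d of %d runs failed%n", failures, attempts);
    }
}
```

It would be invoked as "java LoopRunner 130" followed by whatever command runs TestFileAppend4 in the local build. Keeping one log file per run makes it possible to go back and see which test case failed in which iteration.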

When TestFileAppend4 is run in a loop, the testAppendSyncReplication[012] tests fail about
10% of the time (14 times out of 130 tries).  They all fail with something like the following.
Often only one of the tests fails, but I have seen as many as two fail in one run.

Testcase: testAppendSyncReplication2 took 32.198 sec
Should have 2 replicas for that block, not 1
junit.framework.AssertionFailedError: Should have 2 replicas for that block, not 1
        at org.apache.hadoop.hdfs.TestFileAppend4.replicationTest(TestFileAppend4.java:477)
        at org.apache.hadoop.hdfs.TestFileAppend4.testAppendSyncReplication2(TestFileAppend4.java:425)

I also saw several other tests that are part of TestFileAppend4 fail during this experiment.
They may all be related to one another, so I am filing them in the same JIRA.  If it turns
out that they are not related, they can be split up later.

testAppendSyncBlockPlusBbw failed 6 out of the 130 times, or about 5% of the time.

Testcase: testAppendSyncBlockPlusBbw took 1.633 sec
unexpected file size! received=0 , expected=1024
junit.framework.AssertionFailedError: unexpected file size! received=0 , expected=1024
        at org.apache.hadoop.hdfs.TestFileAppend4.assertFileSize(TestFileAppend4.java:136)
        at org.apache.hadoop.hdfs.TestFileAppend4.testAppendSyncBlockPlusBbw(TestFileAppend4.java:401)

testAppendSyncChecksum[012] failed 2 out of the 130 times, or about 1.5% of the time.

Testcase: testAppendSyncChecksum1 took 32.385 sec
Should have 1 replica for that block, not 2
junit.framework.AssertionFailedError: Should have 1 replica for that block, not 2
        at org.apache.hadoop.hdfs.TestFileAppend4.checksumTest(TestFileAppend4.java:556)
        at org.apache.hadoop.hdfs.TestFileAppend4.testAppendSyncChecksum1(TestFileAppend4.java:500)
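
The rounded failure rates quoted above can be rechecked from the raw counts. The following is a small sketch of that arithmetic only. The class and method names are made up for illustration, and the counts are the ones reported in this issue.

```java
import java.util.Locale;

class FailureRates {
    // Failure rate as a percentage of all runs.
    static double percent(int failures, int runs) {
        return 100.0 * failures / runs;
    }

    // One summary line per test group, with the rate rounded to one decimal place.
    static String describe(String test, int failures, int runs) {
        return String.format(Locale.ROOT, "%s: %d/%d = %.1f%%",
                test, failures, runs, percent(failures, runs));
    }

    public static void main(String[] args) {
        int runs = 130; // total loop iterations reported above
        System.out.println(describe("testAppendSyncReplication[012]", 14, runs)); // 10.8%
        System.out.println(describe("testAppendSyncBlockPlusBbw", 6, runs));      // 4.6%
        System.out.println(describe("testAppendSyncChecksum[012]", 2, runs));     // 1.5%
    }
}
```

Taken together that is 22 failure reports across 130 runs, although some runs had more than one failing test, so the number of distinct failing runs may be lower.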

I will attach logs for all of the failures.  Be aware that I did change some of the logging
messages in this test so I could better see when testAppendSyncReplication started and ended.
Other than that, the code is stock 0.20.205 RC2.


