hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5561) org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl testcase failing on trunk
Date Tue, 22 Oct 2013 14:45:42 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801874#comment-13801874 ]

Jason Lowe commented on MAPREDUCE-5561:
---------------------------------------

Yes, the test is definitely racy.  There's no guarantee the job will still be in the FAIL_ABORT
state when we look at it asynchronously.  A couple of approaches to fixing this:

# As [~kkambatl] points out, we can skip the FAIL_ABORT check.  The real purpose of this test
is to verify we eventually get to the FAILED state without hanging when tasks fail.
# A more deterministic, explicit test for FAIL_ABORT would be to use an output committer with
a barrier, similar to TestingOutputCommitter but with the barrier in the abortJob method,
so we can guarantee the job will pause in the FAIL_ABORT state.  Then we can release the committer
from the barrier and verify the job proceeds to FAILED.
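
A rough sketch of what the second approach could look like.  This is not Hadoop code: BarrierCommitter,
ToyJob and the State enum are hypothetical stand-ins for the real committer, JobImpl and job states,
and only java.util.concurrent is used.  The committer waits on a two-party barrier twice inside abortJob,
once to signal that the abort has started and once to wait for the test to release it.

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class AbortBarrierSketch {
    // Hypothetical, simplified job states (the real job has many more).
    enum State { RUNNING, FAIL_ABORT, FAILED }

    // Hypothetical stand-in for an output committer that blocks in abortJob.
    static class BarrierCommitter {
        final CyclicBarrier barrier = new CyclicBarrier(2);

        void abortJob() throws Exception {
            // First rendezvous: tells the test that the abort has started.
            barrier.await(10, TimeUnit.SECONDS);
            // Second rendezvous: blocks until the test releases the committer.
            barrier.await(10, TimeUnit.SECONDS);
        }
    }

    // Hypothetical stand-in for the job: enters FAIL_ABORT, aborts, then FAILED.
    static class ToyJob {
        final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);
        final BarrierCommitter committer;

        ToyJob(BarrierCommitter committer) {
            this.committer = committer;
        }

        Thread failAsync() {
            Thread t = new Thread(() -> {
                state.set(State.FAIL_ABORT);
                try {
                    committer.abortJob();
                } catch (Exception e) {
                    // A broken barrier or timeout still ends in FAILED.
                }
                state.set(State.FAILED);
            });
            t.start();
            return t;
        }
    }

    // Returns the state seen while the committer is held, then the final state.
    static State[] run() throws Exception {
        BarrierCommitter committer = new BarrierCommitter();
        ToyJob job = new ToyJob(committer);
        Thread worker = job.failAsync();

        committer.barrier.await(10, TimeUnit.SECONDS); // job is now inside abortJob
        State during = job.state.get();                // cannot have advanced yet

        committer.barrier.await(10, TimeUnit.SECONDS); // release the committer
        worker.join(10_000);
        State after = job.state.get();
        return new State[] { during, after };
    }

    public static void main(String[] args) throws Exception {
        State[] observed = run();
        System.out.println(observed[0] + " -> " + observed[1]);
    }
}
```

Because the job thread cannot get past the second await until the test thread arrives, the intermediate
check no longer depends on timing, unlike the current asynchronous look at the job state.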



> org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl testcase failing on trunk
> ---------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5561
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5561
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Cindy Li
>            Assignee: Karthik Kambatla
>            Priority: Critical
>         Attachments: org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl-output.txt
>
>
> Running org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl
> Tests run: 15, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.029 sec <<< FAILURE! - in org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl
> testFailAbortDoesntHang(org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl)  Time elapsed: 5.507 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<FAIL_ABORT> but was:<FAILED>
> at org.junit.Assert.fail(Assert.java:93)
> at org.junit.Assert.failNotEquals(Assert.java:647)
> at org.junit.Assert.assertEquals(Assert.java:128)
> at org.junit.Assert.assertEquals(Assert.java:147)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.assertJobState(TestJobImpl.java:817)
> at org.apache.hadoop.mapreduce.v2.app.job.impl.TestJobImpl.testFailAbortDoesntHang(TestJobImpl.java:418)



--
This message was sent by Atlassian JIRA
(v6.1#6144)
