tez-issues mailing list archives

From "Bikas Saha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TEZ-3102) Fetch failure of a speculated task causes job hang
Date Tue, 23 Feb 2016 23:10:18 GMT

    [ https://issues.apache.org/jira/browse/TEZ-3102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15159835#comment-15159835 ]

Bikas Saha commented on TEZ-3102:
---------------------------------

+1.

I think testTaskSucceedAndRetroActiveFailure() should be covering the new code changes in
the success attempt code path. In the small chance that it's not, would you please update the
test? Thanks!

> Fetch failure of a speculated task causes job hang
> --------------------------------------------------
>
>                 Key: TEZ-3102
>                 URL: https://issues.apache.org/jira/browse/TEZ-3102
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Critical
>         Attachments: TEZ-3102.001.patch, TEZ-3102.002.patch
>
>
> If a task speculates then succeeds, one task will be marked successful and the other
> killed. Then if the task retroactively fails due to fetch failures the Tez AM will fail to
> reschedule another task. This results in a hung job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
