hadoop-common-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-1472) Timed-out tasks are marked as 'KILLED' rather than as 'FAILED' which means the framework doesn't fail a TIP with 4 or more timed-out attempts
Date Thu, 07 Jun 2007 19:37:27 GMT
Timed-out tasks are marked as 'KILLED' rather than as 'FAILED' which means the framework doesn't
fail a TIP with 4 or more timed-out attempts
---------------------------------------------------------------------------------------------------------------------------------------------

                 Key: HADOOP-1472
                 URL: https://issues.apache.org/jira/browse/HADOOP-1472
             Project: Hadoop
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.13.0
            Reporter: Arun C Murthy
            Assignee: Arun C Murthy
            Priority: Blocker
             Fix For: 0.13.0


Timed-out tasks (and also tasks which fail with {{FSError}}) are marked as {{KILLED}} rather
than as {{FAILED}}. The major issue with this is that, post HADOOP-1050, only {{FAILED}} task-attempts
are counted when deciding whether the {{TIP}} has failed. Hence there is a corner case where
a {{TIP}} with 4 timed-out attempts isn't marked as {{FAILED}}, and thus the job keeps running.
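
To make the corner case concrete, here is a minimal sketch of the counting rule described above. It is not Hadoop's actual code: the class {{TipFailureSketch}}, the enum {{AttemptState}}, and every method name are hypothetical, and the limit of 4 is taken from the issue title rather than from any configuration lookup.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical model of the rule in this report; not Hadoop's real classes. */
public class TipFailureSketch {

    /** Simplified final states of a task attempt. */
    public enum AttemptState { SUCCEEDED, FAILED, KILLED }

    /** Assumed attempt limit, taken from the "4 or more" in the issue title. */
    public static final int MAX_FAILED_ATTEMPTS = 4;

    /** Post-HADOOP-1050 rule as described: only FAILED attempts are counted. */
    public static boolean tipFailed(List<AttemptState> attempts) {
        int failed = 0;
        for (AttemptState state : attempts) {
            if (state == AttemptState.FAILED) {
                failed++;
            }
        }
        return failed >= MAX_FAILED_ATTEMPTS;
    }

    /**
     * State recorded for a timed-out attempt: KILLED models the behaviour
     * reported here, FAILED models the behaviour the report asks for.
     */
    public static AttemptState recordTimeout(boolean markTimeoutAsFailed) {
        return markTimeoutAsFailed ? AttemptState.FAILED : AttemptState.KILLED;
    }

    /** Would a TIP whose attempts all timed out be considered failed? */
    public static boolean tipFailedAfterTimeouts(int timeouts, boolean markTimeoutAsFailed) {
        List<AttemptState> attempts =
            Collections.nCopies(timeouts, recordTimeout(markTimeoutAsFailed));
        return tipFailed(attempts);
    }
}
```

In this model, {{tipFailedAfterTimeouts(4, false)}} is false because KILLED attempts are never counted, so the TIP (and with it the job) stays alive, whereas {{tipFailedAfterTimeouts(4, true)}} is true.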

Considering this is a corner case and is going to entail fairly significant changes to
{{TaskTracker}}'s control-flow (ugly as it is right now), I'm proposing to fix this either
in 0.13.1 (if need be) or, better, in 0.14.

Thoughts?


