hadoop-mapreduce-issues mailing list archives

From "Chang Li (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-5003) AM recovery should recreate records for attempts that were incomplete
Date Sat, 31 Oct 2015 05:10:27 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chang Li updated MAPREDUCE-5003:
--------------------------------
    Attachment: MAPREDUCE-5003.10.patch

The .10 patch fixes some checkstyle issues.
The TestJobHistoryEventHandler failure is not related to my change. It may be transient,
since the test passes on my local machine with my patch applied. The TestRecovery failure also
appears to be transient, because it passes on my local machine with my patch applied. I updated
testMultipleCrashes in TestRecovery to improve its stability.
testDetermineCacheVisibilities in TestClientDistributedCacheManager fails even without
my patch applied. I will file a JIRA for that broken test.

> AM recovery should recreate records for attempts that were incomplete
> ---------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5003
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5003
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am
>            Reporter: Jason Lowe
>            Assignee: Chang Li
>         Attachments: MAPREDUCE-5003.1.patch, MAPREDUCE-5003.10.patch, MAPREDUCE-5003.2.patch,
> MAPREDUCE-5003.3.patch, MAPREDUCE-5003.4.patch, MAPREDUCE-5003.5.patch, MAPREDUCE-5003.5.patch,
> MAPREDUCE-5003.6.patch, MAPREDUCE-5003.7.patch, MAPREDUCE-5003.8.patch, MAPREDUCE-5003.9.patch,
> MAPREDUCE-5003.9.patch
>
>
> As discussed in MAPREDUCE-4992, it would be nice if the AM recovered task attempt entries
> for *all* task attempts launched by the prior app attempt, even if those task attempts did
> not complete.  The attempts would have to be marked as killed or something similar to indicate
> that they are no longer running.  Having records for the task attempts enables the user to see
> which nodes were associated with the attempts and potentially access their logs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
