hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3245) Provide ability to persist running jobs (extend HADOOP-1876)
Date Wed, 10 Sep 2008 15:54:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629838#action_12629838 ]

Amar Kamat commented on HADOOP-3245:

bq. 1) Make the check for status as RUNNING explicitly in JobTracker.RecoveryManager.JobRecoveryListener.checkAndInit
bq. Rename the variable 'cause' in JobHistory.Task.LogFailed failedDueToAttempt
bq. Call JobInProgressListener.jobUpdated after the job recovery
bq. ReduceTask need not check for copied maps upon restart as copyOutput already does it.
bq. Overall, this patch should be tested thoroughly under various conditions, such as maps
partially complete and reduces partially complete.
Updated the test case to make sure that the reducer sees roughly 50% of the map events
before the jobtracker is killed. This now tests the rollback logic, because the test case
checks that the order in which the events are available at the tasktracker is the same as
the order in which they are available at the (restarted) jobtracker.
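
The ordering check described above can be illustrated with a minimal sketch. This is not
code from the patch: the class name EventOrderCheck, the method isSameOrder, and the use of
plain strings as event identifiers are all hypothetical. It also assumes that "same order"
means the events seen at the tasktracker before the kill form a prefix of the list served
by the restarted jobtracker.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of the HADOOP-3245 patch.
public class EventOrderCheck {

    // Returns true if every event the tasktracker saw before the restart appears
    // at the same index in the list served by the restarted jobtracker.
    public static boolean isSameOrder(List<String> seenAtTaskTracker,
                                      List<String> fromRestartedJobTracker) {
        // The restarted jobtracker must know at least as many events as were seen.
        if (seenAtTaskTracker.size() > fromRestartedJobTracker.size()) {
            return false;
        }
        for (int i = 0; i < seenAtTaskTracker.size(); i++) {
            if (!seenAtTaskTracker.get(i).equals(fromRestartedJobTracker.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Events the reducer's tasktracker had fetched (about half) before the kill.
        List<String> seen = Arrays.asList("map_0", "map_1");
        // Events rebuilt from history by the restarted jobtracker (made-up values).
        List<String> recovered = Arrays.asList("map_0", "map_1", "map_2", "map_3");
        System.out.println(isSameOrder(seen, recovered));
    }
}
```

If the recovered list started with map_1 instead of map_0, the check would fail, which is
the kind of mismatch the rollback logic has to handle.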

I will manually test the patch to verify that the job-level counters, which are reconstructed
from the history, match across restarts. I am waiting for HADOOP-4112, as it is a blocker for this.

> Provide ability to persist running jobs (extend HADOOP-1876)
> ------------------------------------------------------------
>                 Key: HADOOP-3245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3245
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Amar Kamat
>         Attachments: HADOOP-3245-v2.5.patch, HADOOP-3245-v2.6.5.patch, HADOOP-3245-v2.6.9.patch,
HADOOP-3245-v4.1.patch, HADOOP-3245-v5.13.patch, HADOOP-3245-v5.14.patch, HADOOP-3245-v5.26.patch,
HADOOP-3245-v5.30-nolog.patch, HADOOP-3245-v5.31.3-nolog.patch, HADOOP-3245-v5.33.1.patch,
> This could probably extend the work done in HADOOP-1876. This feature can be applied for things like jobs being able to survive jobtracker restarts.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
