hadoop-mapreduce-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-5505) Clients should be notified job finished only after job successfully unregistered
Date Tue, 24 Sep 2013 03:40:05 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhijie Shen updated MAPREDUCE-5505:
-----------------------------------

    Attachment: MAPREDUCE-5505.4.patch

Thanks [~bikassaha] for reviewing the patch. I've updated it accordingly.

bq. Typo - already ...

Fixed

bq. Let's use a true value to be back-compatible in case it gets used.

Fixed

bq. Typo - mock

Fixed

bq. Please put a comment for this non-obvious code. We need it because no one is calling shutdown
job, right? Alternatively, can MRApp.createJob() be changed to call MRAppMaster.shutdown()
or set the boolean value to true? This would be closer to ideal than the current approach.

Agreed, the code is non-obvious. Instead of setting safeToReportTerminationToUser in
MRApp.createJob(), I set it in the constructor of MRApp, because safeToReportTerminationToUser
is a per-MRAppMaster variable.

bq. Can we have a test that verifies the main straight-line case: the job succeeds and returns
running until the boolean is set?

Added TestMRApp#testJobSuccess

bq. How can we be sure that the previous state == RUNNING

It's a general issue. To solve it for all the transitions of JobImpl, I added code to remember
the last non-final state. Then, whenever safeToReportTerminationToUser is false, JobImpl returns
the stored previous state instead of the final state, i.e., SUCCEEDED, FAILED, KILLED, or
ERROR.
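
A minimal sketch of that idea, not the code in the patch: SketchJob and SketchJobState are
hypothetical stand-ins for JobImpl and its state enum, and the flag is passed in as a
BooleanSupplier to mimic reading a per-MRAppMaster variable. Only the name
safeToReportTerminationToUser and the four final states come from the comment above.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.function.BooleanSupplier;

// Hypothetical, simplified state enum; the real job has more states.
enum SketchJobState { NEW, RUNNING, SUCCEEDED, FAILED, KILLED, ERROR }

// Hypothetical stand-in for JobImpl, reduced to the state-masking logic.
class SketchJob {
    private static final Set<SketchJobState> FINAL_STATES = EnumSet.of(
        SketchJobState.SUCCEEDED, SketchJobState.FAILED,
        SketchJobState.KILLED, SketchJobState.ERROR);

    // Assumption: the app master exposes its flag through this supplier.
    private final BooleanSupplier safeToReportTerminationToUser;
    private SketchJobState current = SketchJobState.NEW;
    private SketchJobState lastNonFinal = SketchJobState.NEW;

    SketchJob(BooleanSupplier safeToReportTerminationToUser) {
        this.safeToReportTerminationToUser = safeToReportTerminationToUser;
    }

    // Called on every transition; remembers the most recent non-final state.
    synchronized void transitionTo(SketchJobState next) {
        current = next;
        if (!FINAL_STATES.contains(next)) {
            lastNonFinal = next;
        }
    }

    // What a client would see: a final state is hidden until the flag is set.
    synchronized SketchJobState getState() {
        if (FINAL_STATES.contains(current)
                && !safeToReportTerminationToUser.getAsBoolean()) {
            return lastNonFinal;
        }
        return current;
    }
}
```

With an AtomicBoolean as the flag, a job moved to RUNNING and then SUCCEEDED keeps reporting
RUNNING from getState() until the flag is set to true, after which it reports SUCCEEDED.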

bq. Has this been tested on a single node cluster with a real job?

Tested locally. The job client saw RUNNING until the AM was unregistered.
                
> Clients should be notified job finished only after job successfully unregistered 
> ---------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5505
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5505
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Jian He
>            Assignee: Zhijie Shen
>         Attachments: MAPREDUCE-5505.1.patch, MAPREDUCE-5505.1.patch, MAPREDUCE-5505.3.patch, MAPREDUCE-5505.4.patch
>
>
> This is to make sure the user is notified that the job has finished only after the job is really done. This does
> increase client latency, but it can reduce some races during unregister, such as YARN-540.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
