falcon-dev mailing list archives

From "Sowmya Ramesh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-740) Entity kill job calls OozieClient.kill on bundle coord job ids before calling kill on bundle job id
Date Mon, 22 Sep 2014 19:20:35 GMT

    [ https://issues.apache.org/jira/browse/FALCON-740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143657#comment-14143657 ]

Sowmya Ramesh commented on FALCON-740:
--------------------------------------

The issue is reproducible with Oozie 4.1.

Oozie's behavior changed in 4.1. In 4.0, Oozie did not allow a killed coord job to be rerun.
From 4.1 onwards, Oozie allows a killed coord job to be rerun. After this change, if a user
attempts an update (such as setting the end time), Oozie throws the exception "coord cannot
be changed since it's in killed state" to indicate that the update did not go through for
the coord job.

Current code flow in Falcon:
1> Kill all coords in the bundle
2> Set the end time of the bundle
3> Kill the bundle

If it is changed to:
1> Set the end time of the bundle (this action is synchronous now, after OOZIE-1807)
2> Kill all coords in the bundle
3> Kill the bundle

the issue will be fixed. I have uploaded a patch with the fix.
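
The reordering can be sketched against a stand-in client. This is a minimal illustration, not the attached patch: `JobClient`, `FakeClient`, and both `killBundle...` methods are hypothetical names, the job ids and the "endtime" key are placeholders, and the fake only models the behavior described in this comment (a bundle change is rejected once one of its coords is killed), not Oozie's real implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class KillOrderSketch {

    /** Hypothetical stand-in for the two client calls used by killBundle. */
    interface JobClient {
        void kill(String jobId);
        void change(String bundleId, String changeValue);
    }

    /**
     * Toy model of the behavior described in this comment, not Oozie itself:
     * changing a bundle fails once any of its coords has been killed.
     */
    static class FakeClient implements JobClient {
        final List<String> calls = new ArrayList<>();
        private final Set<String> killed = new HashSet<>();
        private final List<String> coordIds;

        FakeClient(List<String> coordIds) {
            this.coordIds = coordIds;
        }

        @Override
        public void kill(String jobId) {
            killed.add(jobId);
            calls.add("kill " + jobId);
        }

        @Override
        public void change(String bundleId, String changeValue) {
            for (String coordId : coordIds) {
                if (killed.contains(coordId)) {
                    throw new IllegalStateException(
                            "coord " + coordId + " cannot be changed since it's in killed state");
                }
            }
            calls.add("change " + bundleId + " " + changeValue);
        }
    }

    /** Current order: kill coords, set end time, kill bundle. */
    static void killBundleOldOrder(JobClient client, String bundleId,
                                   List<String> coordIds, String endTime) {
        for (String coordId : coordIds) {
            client.kill(coordId);
        }
        client.change(bundleId, "endtime=" + endTime);
        client.kill(bundleId);
    }

    /** Proposed order: set end time, kill coords, kill bundle. */
    static void killBundleNewOrder(JobClient client, String bundleId,
                                   List<String> coordIds, String endTime) {
        client.change(bundleId, "endtime=" + endTime);
        for (String coordId : coordIds) {
            client.kill(coordId);
        }
        client.kill(bundleId);
    }

    public static void main(String[] args) {
        List<String> coords = Arrays.asList("C1", "C2");

        FakeClient fixed = new FakeClient(coords);
        killBundleNewOrder(fixed, "B1", coords, "2014-09-22T19:20Z");
        System.out.println("new order calls: " + fixed.calls);

        FakeClient old = new FakeClient(coords);
        try {
            killBundleOldOrder(old, "B1", coords, "2014-09-22T19:20Z");
        } catch (IllegalStateException e) {
            System.out.println("old order failed: " + e.getMessage());
        }
    }
}
```

In this model the old order fails at the change step because the coords are already killed, while the new order completes all three steps.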

> Entity kill job calls OozieClient.kill on bundle coord job ids before calling kill on bundle job id
> ---------------------------------------------------------------------------------------------------
>
>                 Key: FALCON-740
>                 URL: https://issues.apache.org/jira/browse/FALCON-740
>             Project: Falcon
>          Issue Type: Bug
>          Components: webapp
>    Affects Versions: 0.6
>            Reporter: Balu Vellanki
>            Assignee: Sowmya Ramesh
>         Attachments: FALCON-740.patch
>
>
> When a Falcon user makes an entity kill API call, Falcon does the following in org.apache.falcon.workflow.engine.OozieWorkflowEngine.killBundle(String clusterName, BundleJob job):
> {code}
>             //kill all coords
>             for (CoordinatorJob coord : job.getCoordinators()) {
>                 client.kill(coord.getId());
>                 LOG.debug("Killed coord {} on cluster {}", coord.getId(), clusterName);
>             }
>             //set end time of bundle
>             client.change(job.getId(),
>                     OozieClient.CHANGE_VALUE_ENDTIME + "=" + SchemaHelper.formatDateUTC(new Date()));
>             LOG.debug("Changed end time of bundle {} on cluster {}", job.getId(), clusterName);
>             //kill bundle
>             client.kill(job.getId());
>             LOG.debug("Killed bundle {} on cluster {}", job.getId(), clusterName);
> {code}
> Two questions:
> 1. Why should we kill the coordinator jobs before killing the bundle job? OozieClient.kill(bundle_job_id) should kill all of the bundle's coord jobs.
> 2. Why is the end time changed for the bundle job? https://oozie.apache.org/docs/4.0.1/DG_CommandLineTool.html#Changing_pausetime_of_a_Bundle_Job does not say that the end time can be changed for a bundle job.
> I think this code should be updated. Please comment if you think I made any wrong assumptions.
> Thank you



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
