falcon-dev mailing list archives

From "Shaik Idris Ali (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FALCON-766) Falcon workflow rerun by default should rerun only Failed nodes
Date Wed, 01 Oct 2014 16:17:34 GMT

    [ https://issues.apache.org/jira/browse/FALCON-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14155045#comment-14155045 ]

Shaik Idris Ali commented on FALCON-766:
----------------------------------------

Even if the sub-flow has an issue, Falcon should still rerun only the failed nodes.

In a few reported cases, Falcon's "post-processing" step failed even though the user's
workflow (the costly job) succeeded. In such cases users are left with two options:
1. Ignore the Killed status of the instance, since the user workflow has already succeeded.
2. Rerun the workflow so that the instance status is reset to Succeeded.

Ignoring the Killed status is not a good idea, because monitoring systems such as Nagios
keep alerting based on the status check. I would prefer the default rerun option to be
"Failed nodes" only.
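
The proposed default could be sketched roughly as below. This is only an illustration and not the
contents of FALCON-766.patch: the class and method names are hypothetical. The two property names
are Oozie's workflow rerun properties; Oozie expects only one of them to be set for a rerun. If the
user passes neither, the sketch defaults to rerunning failed nodes; if the user passes either, their
choice is left untouched.

```java
import java.util.Properties;

// Hypothetical helper, for illustration only (not taken from the Falcon code base).
final class RerunDefaults {
    // Oozie workflow rerun properties; only one of them should be set for a rerun.
    static final String RERUN_FAIL_NODES = "oozie.wf.rerun.failnodes";
    static final String RERUN_SKIP_NODES = "oozie.wf.rerun.skip.nodes";

    // Returns a copy of the user's properties, defaulting to "rerun failed nodes only"
    // when the user has not chosen a rerun mode explicitly.
    static Properties withRerunDefaults(Properties userProps) {
        Properties effective = new Properties();
        if (userProps != null) {
            effective.putAll(userProps);
        }
        boolean userChoseMode = effective.containsKey(RERUN_FAIL_NODES)
                || effective.containsKey(RERUN_SKIP_NODES);
        if (!userChoseMode) {
            effective.setProperty(RERUN_FAIL_NODES, "true");
        }
        return effective;
    }
}
```

With this shape, a user who wants the old behaviour (rerun every node) would pass
oozie.wf.rerun.failnodes=false in the properties given to rerun.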


> Falcon workflow rerun by default should rerun only Failed nodes
> ---------------------------------------------------------------
>
>                 Key: FALCON-766
>                 URL: https://issues.apache.org/jira/browse/FALCON-766
>             Project: Falcon
>          Issue Type: Bug
>          Components: client, oozie
>            Reporter: Shaik Idris Ali
>            Assignee: Shaik Idris Ali
>              Labels: rerun
>         Attachments: FALCON-766.patch
>
>
> Falcon workflow instance rerun reruns all the nodes in the workflow, which is very costly.
> The default behaviour should be to restart the job from the failed node.
> However, the user may still override this behaviour by passing a property to rerun.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
