falcon-dev mailing list archives

From "Pragya Mittal (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FALCON-1810) Instance status response is KILLED instead of FAILED
Date Tue, 02 Feb 2016 12:23:39 GMT

     [ https://issues.apache.org/jira/browse/FALCON-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pragya Mittal updated FALCON-1810:
----------------------------------
    Description: 
If one of the actions in a workflow fails, the instance status is reported as KILLED instead of FAILED.
{noformat}
2016-02-02T11:34Z 	ProcessMultipleClustersTest-corp-3a97a54e	-	KILLED	2016-02-02T11:57Z	2016-02-02T11:57Z
-	http://8RPCG32.corp.inmobi.com:11000/oozie?job=0000938-160202121302340-oozie-oozi-W
actions:
 hdfscommands	OK	-
 aggregator	FAILED/KILLED	http://8RPCG32.corp.inmobi.com:8088/proxy/application_1454048976510_6891/
{noformat}

Such instances are retried after some time (provided retry is enabled).

This leads to inconsistencies such as:
1. A manual kill assumes the instance is already killed and does not kill it. The retry
still kicks off later and the instance runs, although the user expects it to stay killed
forever.
2. A manual suspend assumes the instance is already killed and does not suspend it (a KILLED
instance cannot be suspended). The retry still kicks off later and the instance runs,
although the user expects it to stay suspended until someone resumes it.
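
The distinction the report asks for can be illustrated with a small sketch. This is not
Falcon or Oozie code: the names InstanceStatus, resolve_status, manual_kill_applies and the
killed_by_user flag are hypothetical, and the rule "a failed action means FAILED unless a user
killed the workflow" is an assumption drawn from the description above, not the actual fix.

```python
from enum import Enum


class InstanceStatus(Enum):
    # Simplified, hypothetical set of instance states for illustration only.
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"
    KILLED = "KILLED"


def resolve_status(workflow_status, action_statuses, killed_by_user):
    """Map a workflow state to an instance status (assumed rule, not Falcon's)."""
    if workflow_status == "KILLED":
        if killed_by_user:
            # An explicit user kill stays KILLED.
            return InstanceStatus.KILLED
        # Assumption: an action status beginning with "FAILED" (for example the
        # "FAILED/KILLED" shown in the listing) means the workflow died from a failure.
        if any(s.startswith("FAILED") for s in action_statuses):
            return InstanceStatus.FAILED
        return InstanceStatus.KILLED
    # Raises ValueError for states outside the simplified set above.
    return InstanceStatus(workflow_status)


def manual_kill_applies(status):
    """Assumed rule from the report: an instance already marked KILLED is skipped."""
    return status in (InstanceStatus.RUNNING, InstanceStatus.FAILED)


# Values taken from the instance listing in this report.
status = resolve_status("KILLED", ["OK", "FAILED/KILLED"], killed_by_user=False)
print(status.value, manual_kill_applies(status))
```

Under these assumptions the instance from the listing resolves to FAILED, so a manual kill
would still act on it and a pending retry would not surprise the user.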

> Instance status response is KILLED instead of FAILED
> ----------------------------------------------------
>
>                 Key: FALCON-1810
>                 URL: https://issues.apache.org/jira/browse/FALCON-1810
>             Project: Falcon
>          Issue Type: Bug
>          Components: client, oozie
>    Affects Versions: trunk, 0.9
>            Reporter: Pragya Mittal
>
> If one of the actions in a workflow fails, the instance status is reported as KILLED instead of FAILED.
> {noformat}
> 2016-02-02T11:34Z 	ProcessMultipleClustersTest-corp-3a97a54e	-	KILLED	2016-02-02T11:57Z
2016-02-02T11:57Z	-	http://8RPCG32.corp.inmobi.com:11000/oozie?job=0000938-160202121302340-oozie-oozi-W
> actions:
>  hdfscommands	OK	-
>  aggregator	FAILED/KILLED	http://8RPCG32.corp.inmobi.com:8088/proxy/application_1454048976510_6891/
> {noformat}
> Such instances are retried after some time (provided retry is enabled).
> This leads to inconsistencies such as:
> 1. A manual kill assumes the instance is already killed and does not kill it. The retry
> still kicks off later and the instance runs, although the user expects it to stay killed
> forever.
> 2. A manual suspend assumes the instance is already killed and does not suspend it (a KILLED
> instance cannot be suspended). The retry still kicks off later and the instance runs,
> although the user expects it to stay suspended until someone resumes it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
