hadoop-yarn-issues mailing list archives

From "Suma Shivaprasad (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (YARN-8901) Restart "NEVER" policy does not work with component dependency
Date Wed, 17 Oct 2018 22:38:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suma Shivaprasad reassigned YARN-8901:
--------------------------------------

    Assignee: Suma Shivaprasad

> Restart "NEVER" policy does not work with component dependency
> --------------------------------------------------------------
>
>                 Key: YARN-8901
>                 URL: https://issues.apache.org/jira/browse/YARN-8901
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Yesha Vora
>            Assignee: Suma Shivaprasad
>            Priority: Critical
>
> Scenario:
> 1) Launch an application with two components, master and worker, where worker depends on master (worker should be launched only after master is launched).
> 2) Set restart_policy = NEVER for both master and worker.
> {code:title=sample launch.json}
> {
>     "name": "mawo-hadoop-ut",
>     "artifact": {
>         "type": "DOCKER",
>         "id": "xxx"
>     },
>     "configuration": {
>         "env": {
>             "YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK": "hadoop"
>         },
>         "properties": {
>             "docker.network": "hadoop"
>         }
>     },
>     "components": [{
>         "dependencies": [],
>         "resource": {
>             "memory": "2048",
>             "cpus": "1"
>         },
>         "name": "master",
>         "run_privileged_container": true,
>         "number_of_containers": 1,
>         "launch_command": "start master",
>         "restart_policy": "NEVER"
>     }, {
>         "dependencies": ["master"],
>         "resource": {
>             "memory": "8072",
>             "cpus": "1"
>         },
>         "name": "worker",
>         "run_privileged_container": true,
>         "number_of_containers": 10,
>         "launch_command": "start worker",
>         "restart_policy": "NEVER"
>     }],
>     "lifetime": -1,
>     "version": 1.0
> }{code}
> When the restart policy is set to NEVER, the AM never launches the worker component. It gets stuck with the message below.
> {code}
> 2018-10-17 15:11:58,560 [Component  dispatcher] INFO  component.Component - [COMPONENT master] Transitioned from FLEXING to STABLE on CHECK_STABLE event.
> 2018-10-17 15:11:58,560 [pool-7-thread-1] INFO  instance.ComponentInstance - [COMPINSTANCE master-0 : container_e41_1539027682947_0020_01_000002] Transitioned from STARTED to READY on BECOME_READY event
> 2018-10-17 15:11:58,560 [pool-7-thread-1] INFO  component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
> 2018-10-17 15:12:28,556 [pool-7-thread-1] INFO  component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
> 2018-10-17 15:12:58,556 [pool-7-thread-1] INFO  component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
> 2018-10-17 15:13:28,556 [pool-7-thread-1] INFO  component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
> 2018-10-17 15:13:58,556 [pool-7-thread-1] INFO  component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
> 2018-10-17 15:14:28,556 [pool-7-thread-1] INFO  component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed {code}
> The 'NEVER' restart policy expects the master component to finish before workers are started, but the master component cannot finish the job without workers. This creates a deadlock.
> The logic for the 'NEVER' restart policy should be fixed so that worker components are launched as soon as the master component is in the READY state.
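
The deadlock described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the Hadoop service AM code: the `Component` class, the `ready` and `succeeded` counters, and the functions `dep_satisfied_reported`, `dep_satisfied_proposed` and `can_launch` are hypothetical names introduced for illustration. The "reported" check encodes the behaviour as the report describes it (a NEVER dependency must have finished); the "proposed" check encodes the requested fix (a READY instance is enough).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Component:
    """Hypothetical, simplified view of a service component."""
    name: str
    restart_policy: str               # e.g. "ALWAYS", "ON_FAILURE", "NEVER"
    desired: int                      # number_of_containers in the spec
    ready: int = 0                    # instances currently in READY state
    succeeded: int = 0                # instances that have already finished
    dependencies: List[str] = field(default_factory=list)


def dep_satisfied_reported(dep: Component) -> bool:
    """Behaviour as described in the report (assumed, not taken from the code):
    a NEVER dependency only counts once its instances have finished."""
    if dep.restart_policy == "NEVER":
        return dep.succeeded >= dep.desired
    return dep.ready >= dep.desired


def dep_satisfied_proposed(dep: Component) -> bool:
    """Requested behaviour: READY instances satisfy the dependency;
    instances that already finished are counted as well."""
    return dep.ready + dep.succeeded >= dep.desired


def can_launch(comp: Component,
               components: Dict[str, Component],
               check: Callable[[Component], bool]) -> bool:
    """A component may launch once every dependency passes the given check."""
    return all(check(components[d]) for d in comp.dependencies)


# State from the log: master-0 is READY, no worker has been started.
components = {
    "master": Component("master", "NEVER", desired=1, ready=1),
    "worker": Component("worker", "NEVER", desired=10, dependencies=["master"]),
}
worker = components["worker"]

print(can_launch(worker, components, dep_satisfied_reported))  # False
print(can_launch(worker, components, dep_satisfied_proposed))  # True
```

Under the reported check the worker can never launch, because master cannot finish without workers; under the proposed check the worker is launched as soon as master-0 is READY.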



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

