ambari-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMBARI-21204) Yarn stopped by itself after start. HA run
Date Thu, 08 Jun 2017 17:02:21 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-21204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16043039#comment-16043039 ]

Hadoop QA commented on AMBARI-21204:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12872100/AMBARI-21204.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in ambari-server.

Console output: https://builds.apache.org/job/Ambari-trunk-test-patch/11638//console

This message is automatically generated.

> Yarn stopped by itself after start. HA run
> ------------------------------------------
>
>                 Key: AMBARI-21204
>                 URL: https://issues.apache.org/jira/browse/AMBARI-21204
>             Project: Ambari
>          Issue Type: Bug
>    Affects Versions: 2.5.1
>            Reporter: Dmytro Sen
>            Assignee: Dmytro Sen
>            Priority: Critical
>             Fix For: 2.5.2
>
>         Attachments: AMBARI-21204.patch
>
>
> From RM logs :
> {code}
> 2017-06-07 14:23:19,191 FATAL resourcemanager.ResourceManager (ResourceManager.java:main(1240)) - Error starting ResourceManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
>         at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
>         at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:152)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:281)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1236)
> Caused by: java.io.IOException: Couldn't set ACLs on parent ZNode: /yarn-leader-election
>         at org.apache.hadoop.ha.ActiveStandbyElector.ensureParentZNode(ActiveStandbyElector.java:351)
>         at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:103)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         ... 7 more
> Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /yarn-leader-election
> {code}
> The problem is that disabling security changes the ZooKeeper ACL on the ResourceManager's znode, as part of AMBARI-19331. After the recent change in HDFS-11403, the RM checks the znode version and fails if it differs from the expected one.
> A correct fix could be to remove the znode when security is disabled, instead of breaking the election znode's consistency by manually opening its ACL to all. The RM should then create it with the proper ACL.
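
The version check described above can be illustrated with a toy in-memory model. This is only a sketch under assumptions: `ToyZnode`, `BadVersionError`, and the ACL strings are hypothetical stand-ins, not the ZooKeeper client API or the actual ResourceManager code. It shows how a versioned ACL update fails once something else has changed the ACL in between, and why starting from a freshly created znode avoids the mismatch.

```python
class BadVersionError(Exception):
    """Hypothetical stand-in for ZooKeeper's BadVersion error."""


class ToyZnode:
    """Toy znode that tracks only an ACL and a version for that ACL."""

    def __init__(self, path, acl):
        self.path = path
        self.acl = list(acl)
        self.acl_version = 0

    def set_acl(self, acl, expected_version):
        # Versioned update: reject the call if the caller's view is stale.
        if expected_version != self.acl_version:
            raise BadVersionError(f"BadVersion for {self.path}")
        self.acl = list(acl)
        self.acl_version += 1


def start_rm_after_manual_acl_change():
    znode = ToyZnode("/yarn-leader-election", ["sasl:rm:cdrwa"])
    version_expected_by_rm = znode.acl_version  # the RM's assumed version

    # Out-of-band change, as when security is disabled (assumed ACL value).
    znode.set_acl(["world:anyone:cdrwa"], znode.acl_version)

    try:
        # The RM now tries to set its ACL using the stale version.
        znode.set_acl(["world:anyone:cdrwa"], version_expected_by_rm)
    except BadVersionError as exc:
        return str(exc)
    return "started"


def start_rm_after_znode_removal():
    # Suggested direction: the old znode was removed, so the RM works
    # against a freshly created one whose version matches its expectation.
    znode = ToyZnode("/yarn-leader-election", ["world:anyone:cdrwa"])
    znode.set_acl(["world:anyone:cdrwa"], 0)
    return "started"


if __name__ == "__main__":
    print(start_rm_after_manual_acl_change())
    print(start_rm_after_znode_removal())
```

In this model the first scenario ends in the same kind of BadVersion failure seen in the RM log, while the second succeeds because no out-of-band change has moved the version.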



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
