hadoop-common-dev mailing list archives

From Sunil G <sun...@apache.org>
Subject Re: [VOTE] Release Apache Hadoop 2.9.0 (RC0)
Date Tue, 07 Nov 2017 11:14:05 GMT
Hi Subru and Arun.

Thanks for driving the 2.9 release. Great work!

I installed a cluster built from source.
- Ran a few MR jobs with application priority enabled. They run fine.
- Accessed the new UI, and it also seems fine.

However, I am also getting the same issue that Rohith reported.
- Started an HA cluster
- Pushed the RM to standby
- Pushed the RM back to active, and then saw an exception.

org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
        at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146)
        at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)

Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
        at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)
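
For anyone comparing logs from their own cluster, here is a minimal sketch (not part of Hadoop) that pulls the innermost "Caused by:" entry out of a pasted trace, so it is easy to confirm whether the failure bottoms out in the same ZooKeeper NoAuth error. The names `root_cause`, `strip_quote`, and `SAMPLE` are made up for illustration, and the sample text is an abbreviated copy of the trace above. It assumes the only damage to the trace is mail quoting and a message wrapped whole onto the next line.

```python
import re

# "Caused by: <exception class>[: <message>]" on a single (unquoted) line.
CAUSED_BY = re.compile(r"^Caused by:\s*([\w.$]+)(?::\s*(.*))?$")


def strip_quote(line):
    """Drop leading mail-quote markers ("> > ") and surrounding whitespace."""
    return re.sub(r"^(?:>\s?)+", "", line).strip()


def root_cause(trace):
    """Return (exception_class, message) for the last 'Caused by:' entry, or None."""
    lines = [strip_quote(raw) for raw in trace.splitlines()]
    cause = None
    for i, line in enumerate(lines):
        match = CAUSED_BY.match(line)
        if not match:
            continue
        cls = match.group(1)
        msg = (match.group(2) or "").strip()
        # Mail clients may wrap the whole message onto the following line;
        # take that line unless it is already a stack frame ("at ...").
        if not msg and i + 1 < len(lines):
            nxt = lines[i + 1]
            if nxt and nxt.split(" ", 1)[0] != "at":
                msg = nxt
        cause = (cls, msg)  # later entries overwrite earlier ones
    return cause


# Abbreviated copy of the trace in this message (illustrative input only).
SAMPLE = """\
org.apache.hadoop.ha.ServiceFailedException: RM could not transition to
Active
        at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
Caused by: org.apache.zookeeper.KeeperException$NoAuthException:
KeeperErrorCode = NoAuth
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
"""

if __name__ == "__main__":
    print(root_cause(SAMPLE))
```

The function keeps only the last match because, in a Java trace, the final "Caused by:" is the innermost exception. A message that is split mid-sentence across two lines is returned only up to the line break.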

Will check and post more details.

- Sunil


On Tue, Nov 7, 2017 at 12:47 PM Rohith Sharma K S <rohithsharmaks@apache.org>
wrote:

> Thanks Subru/Arun for the great work!
>
> Downloaded source and built from it. Deployed a non-secured RM HA cluster
> along with the new YARN UI and ATSv2.
>
> I am facing a basic RM HA switch issue after the first successful start.
> *Is anyone else facing this issue?*
>
> When the RM is switched from ACTIVE to STANDBY to ACTIVE, it never switches to
> active successfully. The exception trace I see in the log is:
>
> 2017-11-07 12:35:56,540 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
>     at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146)
>     at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
>     at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:473)
>     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
>     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
>     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:325)
>     at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
>     ... 4 more
> Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
>     at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:205)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1131)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1171)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1167)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1167)
>     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320)
>     ... 5 more
> Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
>     at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)
>     at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:915)
>     at org.apache.curator.framework.imps.CuratorTransactionImpl.doOperation(CuratorTransactionImpl.java:159)
>     at org.apache.curator.framework.imps.CuratorTransactionImpl.access$200(CuratorTransactionImpl.java:44)
>     at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:129)
>     at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:125)
>     at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
>     at org.apache.curator.framework.imps.CuratorTransactionImpl.commit(CuratorTransactionImpl.java:122)
>     at org.apache.hadoop.util.curator.ZKCuratorManager$SafeTransaction.commit(ZKCuratorManager.java:403)
>     at org.apache.hadoop.util.curator.ZKCuratorManager.safeSetData(ZKCuratorManager.java:372)
>     at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.getAndIncrementEpoch(ZKRMStateStore.java:493)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:754)
>     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
>     ... 13 more
>
> Thanks & Regards
> Rohith Sharma K S
>
> On 4 November 2017 at 04:20, Arun Suresh <asuresh@apache.org> wrote:
>
> > Hi folks,
> >
> >      Apache Hadoop 2.9.0 is the first stable release of the Hadoop 2.9 line and
> > will be the latest stable/production release for Apache Hadoop. It
> > includes 30 New Features with 500+ subtasks, 407 Improvements, and 787 Bug
> > fixes newly fixed since 2.8.2.
> >
> >       More information about the 2.9.0 release plan can be found here:
> > https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
> >
> >       New RC is available at:
> > http://home.apache.org/~asuresh/hadoop-2.9.0-RC0/
> >
> >       The RC tag in git is: release-2.9.0-RC0, and the latest commit id is:
> > 6697f0c18b12f1bdb99cbdf81394091f4fef1f0a
> >
> >       The maven artifacts are available via repository.apache.org at:
> > https://repository.apache.org/content/repositories/orgapachehadoop-1065/
> >
> >       Please try the release and vote; the vote will run for the usual 5
> > days, ending on 11/10/2017 at 4pm PST.
> >
> > Thanks,
> >
> > Arun/Subru
> >
>
