ambari-issues mailing list archives

From "Aravindan Vijayan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AMBARI-21705) Metrics Collector start failed due to "Unable to initialize HA controller"
Date Thu, 10 Aug 2017 17:30:01 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-21705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aravindan Vijayan updated AMBARI-21705:
---------------------------------------
    Attachment: AMBARI-21705.patch

The bug is that when the collector instance is already present in the znode, the instance
treats this as a 'failure' rather than a 'success'.
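
To illustrate the distinction, here is a minimal sketch of the two behaviours. It is not the attached patch and does not use the Helix or ZooKeeper APIs: the class InstanceRegistry, its methods, and the instance name "collector-1" are hypothetical, and an in-memory set stands in for the instance list under the znode.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for the collector instance list under the znode. */
public class InstanceRegistry {
    private final Set<String> instances = ConcurrentHashMap.newKeySet();

    /** Problematic shape: an instance that is already registered is reported as a failure. */
    public boolean registerStrict(String instanceName) {
        // Set.add returns false when the element was already present.
        return instances.add(instanceName);
    }

    /** Idempotent shape: "already present" counts as success. */
    public boolean ensureRegistered(String instanceName) {
        instances.add(instanceName); // no-op when the instance is already present
        return instances.contains(instanceName);
    }

    public static void main(String[] args) {
        InstanceRegistry registry = new InstanceRegistry();
        // "collector-1" is a made-up instance name used only for illustration.
        System.out.println("first strict register:  " + registry.registerStrict("collector-1"));
        System.out.println("second strict register: " + registry.registerStrict("collector-1"));
        System.out.println("ensureRegistered:       " + registry.ensureRegistered("collector-1"));
    }
}
```

With the strict shape, a restart against a znode that still lists the instance is reported as a failure; with the idempotent shape the same restart succeeds.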

Manually tested patch.

[~swagle] / [~dsen] / [~sumitmohanty] Can you review this simple patch?

> Metrics Collector start failed due to "Unable to initialize HA controller"
> --------------------------------------------------------------------------
>
>                 Key: AMBARI-21705
>                 URL: https://issues.apache.org/jira/browse/AMBARI-21705
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-metrics
>    Affects Versions: 2.5.2
>            Reporter: Aravindan Vijayan
>            Assignee: Aravindan Vijayan
>            Priority: Blocker
>             Fix For: 2.5.2
>
>         Attachments: AMBARI-21705.patch
>
>
> AMS start failed due to "Unable to initialize HA controller"
> STR:
> 1) Enable HA
> 2) Delete all services except HDFS, YARN, MapReduce, ZooKeeper
> 3) Add all services
> The Metrics Collector starts, but fails after a few minutes:
> {code}
> Error starting ApplicationHistoryServer
> org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.MetricsSystemInitializationException: Unable to initialize HA controller
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:118)
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.serviceInit(HBaseTimelineMetricStore.java:96)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:84)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:137)
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:147)
> Caused by: org.I0Itec.zkclient.exception.ZkException: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /ambari-metrics-cluster
>     at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:68)
>     at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:1000)
>     at org.apache.helix.manager.zk.ZkClient.delete(ZkClient.java:347)
>     at org.I0Itec.zkclient.ZkClient.deleteRecursive(ZkClient.java:791)
>     at org.apache.helix.manager.zk.ZKHelixAdmin.addCluster(ZKHelixAdmin.java:497)
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.availability.MetricCollectorHAController.initializeHAController(MetricCollectorHAController.java:156)
>     at org.apache.hadoop.yarn.server.applicationhistoryservice.metrics.timeline.HBaseTimelineMetricStore.initializeSubsystem(HBaseTimelineMetricStore.java:115)
>     ... 7 more
> Caused by: org.apache.zookeeper.KeeperException$NotEmptyException: KeeperErrorCode = Directory not empty for /ambari-metrics-cluster
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:125)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>     at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
>     at org.I0Itec.zkclient.ZkConnection.delete(ZkConnection.java:104)
>     at org.apache.helix.manager.zk.ZkClient$8.call(ZkClient.java:351)
>     at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:990)
>     ... 12 more
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
