accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-4424) Do not wait to start Thrift servers until lock is acquired
Date Mon, 29 Aug 2016 15:58:21 GMT


Josh Elser commented on ACCUMULO-4424:

Tentatively tagged this to 2.0.0 for now.

I am unsure about the impact of introducing this change in a bugfix version.

> Do not wait to start Thrift servers until lock is acquired
> ----------------------------------------------------------
>                 Key: ACCUMULO-4424
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: rpc
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>             Fix For: 2.0.0
> Had an Accumulo + Ambari user report a funny issue:
> When starting multiple masters, monitors, and GCs, they observed that, despite Accumulo
being healthy, Ambari kept reporting that two-thirds of the instances of each service were
down. This is because Ambari's service check expects the Thrift server to be up.
> Presently, for services where only one active instance is allowed, we do not start the
Thrift server until we acquire the leader ZK lock. I propose that we still start these servers
but introduce a barrier to prevent any API calls from succeeding until the leader lock is
obtained. This has a couple of benefits:
> * Better "health" check -- processes might be zombie'd, so a pidfile check would be insufficient
> * Less confusion around a process which is running but has not bound its port (have personally
dealt with a case where a user was confused and thought the services were incorrectly stuck
on startup)
> I believe this would also be pretty simple to do since the leader election is already
implemented in one place (just the znode differs).
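
The proposed barrier could be sketched roughly as follows. This is only an illustration of the idea, not Accumulo code: `StatusService`, `NotLeaderException`, `LeaderGate`, and `guard` are hypothetical names, and the ZooKeeper lock acquisition is reduced to a single `lockAcquired()` call. The server would bind its port immediately, while every API call is rejected until the gate is opened.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicBoolean;

public class LeaderGateSketch {

  /** Hypothetical interface standing in for a Thrift-generated service Iface. */
  public interface StatusService {
    String getStatus();
  }

  /** Hypothetical exception thrown while this process does not hold the leader lock. */
  public static class NotLeaderException extends RuntimeException {
    public NotLeaderException(String message) {
      super(message);
    }
  }

  /** Opened once by whatever code acquires the leader lock (assumed to live elsewhere). */
  public static class LeaderGate {
    private final AtomicBoolean leader = new AtomicBoolean(false);

    public void lockAcquired() {
      leader.set(true);
    }

    public boolean isLeader() {
      return leader.get();
    }
  }

  /** Wraps a service so that its API methods fail until the gate has been opened. */
  @SuppressWarnings("unchecked")
  public static <T> T guard(Class<T> iface, T delegate, LeaderGate gate) {
    InvocationHandler handler = (proxy, method, args) -> {
      // Only service methods are gated; toString/hashCode/equals always pass through.
      if (method.getDeclaringClass() != Object.class && !gate.isLeader()) {
        throw new NotLeaderException(
            "Leader lock not yet held; rejecting call to " + method.getName());
      }
      try {
        return method.invoke(delegate, args);
      } catch (InvocationTargetException e) {
        // Rethrow the delegate's own exception rather than the reflection wrapper.
        throw e.getCause();
      }
    };
    return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
  }

  public static void main(String[] args) {
    LeaderGate gate = new LeaderGate();
    StatusService service = guard(StatusService.class, () -> "OK", gate);

    try {
      service.getStatus();
    } catch (NotLeaderException e) {
      System.out.println("Before lock: " + e.getMessage());
    }

    gate.lockAcquired(); // in a real server this would follow ZK lock acquisition
    System.out.println("After lock: " + service.getStatus());
  }
}
```

With a wrapper like this, a port-level health check would see the server as up on standby instances, while clients that actually invoke the API on a non-leader would receive an explicit error instead of a connection failure. Whether to reject calls immediately, as here, or block them until the lock is obtained is a design choice the issue leaves open.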

This message was sent by Atlassian JIRA
