accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <>
Subject [jira] [Created] (ACCUMULO-4424) Do not wait to start Thrift servers until lock is acquired
Date Thu, 25 Aug 2016 21:31:20 GMT
Josh Elser created ACCUMULO-4424:

             Summary: Do not wait to start Thrift servers until lock is acquired
                 Key: ACCUMULO-4424
             Project: Accumulo
          Issue Type: Improvement
          Components: rpc
            Reporter: Josh Elser
            Assignee: Josh Elser

Had an Accumulo + Ambari user report a funny issue:

When starting multiple masters, monitors, and GCs, they observed that, despite Accumulo being
healthy, Ambari kept reporting that two-thirds of the instances of each service were down. This
is because Ambari's service check expects the Thrift server to be up.

Presently, for services where only one active instance is allowed, we do not put up the Thrift
server until we acquire the leader ZK lock. I propose that we still start these servers but
introduce a barrier to prevent any API calls from succeeding until the leader lock is obtained.
This has a couple of benefits:

* Better "health" check -- processes might be zombie'd, pidfile check would be insufficient
* Less confusion around a process which is running but not binding the port (have personally
dealt with a case where a user was confused and thought the services were incorrectly stuck
on startup)

I believe this would also be pretty simple to do since the leader election is already implemented
in one place (just the znode differs).
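
As a rough illustration of the barrier idea above (not a patch), here is a minimal Java sketch. Every name in it (LeaderBarrierSketch, MasterService, NotLeaderException, LeaderBarrier, guard) is hypothetical and none of it is existing Accumulo or Thrift API. It assumes the service handler can be wrapped in a java.lang.reflect.Proxy before it is handed to the server, and that whatever code acquires the ZK lock calls lockAcquired() once it succeeds.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicBoolean;

public class LeaderBarrierSketch {

  /** Hypothetical stand-in for a generated service interface. */
  public interface MasterService {
    String status();
  }

  /** Hypothetical error raised while this process is still a standby. */
  public static class NotLeaderException extends RuntimeException {
    public NotLeaderException(String message) {
      super(message);
    }
  }

  /** Starts closed; opened once by the code that acquires the leader lock. */
  public static class LeaderBarrier {
    private final AtomicBoolean leader = new AtomicBoolean(false);

    public void lockAcquired() {
      leader.set(true);
    }

    public boolean isLeader() {
      return leader.get();
    }
  }

  /**
   * Wraps a handler so the server can bind its port right away, while every
   * call made through the wrapper fails until the barrier has been opened.
   */
  public static <T> T guard(Class<T> iface, T delegate, LeaderBarrier barrier) {
    InvocationHandler handler = (proxy, method, args) -> {
      if (!barrier.isLeader()) {
        throw new NotLeaderException(
            "Not the active instance; rejecting call to " + method.getName());
      }
      try {
        return method.invoke(delegate, args);
      } catch (InvocationTargetException e) {
        // Surface the handler's own exception rather than the reflection wrapper.
        throw e.getCause();
      }
    };
    return iface.cast(
        Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler));
  }

  public static void main(String[] args) {
    LeaderBarrier barrier = new LeaderBarrier();
    MasterService service = guard(MasterService.class, () -> "ok", barrier);

    try {
      service.status();
    } catch (NotLeaderException e) {
      System.out.println("Before lock: " + e.getMessage());
    }

    barrier.lockAcquired(); // would be called by the leader-election code
    System.out.println("After lock: " + service.status());
  }
}
```

With something like this, a port-level check (such as Ambari's) would succeed as soon as the process is up, while real API calls keep failing on the standbys. A real change would presumably return a proper Thrift exception instead of a RuntimeException so that clients can retry against the active instance.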

This message was sent by Atlassian JIRA
