accumulo-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Hurley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-3880) Malformed Configuration Causes tservers To Shutdown
Date Mon, 01 Jun 2015 22:29:18 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568144#comment-14568144
] 

Jonathan Hurley commented on ACCUMULO-3880:
-------------------------------------------

Thanks [~elserj] for debugging this problem with me. You are correct in that Ambari is creating
the PID files and managing them. It's how we determine if a process is still running (aside
from the alerts that Ambari has defined for it).

> Malformed Configuration Causes tservers To Shutdown
> ---------------------------------------------------
>
>                 Key: ACCUMULO-3880
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3880
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.6.0, 1.6.1, 1.6.2, 1.7.0
>         Environment: HDP 2.2.7.0 to HDP 2.3.0.0 Upgrade
>            Reporter: Jonathan Hurley
>            Assignee: Josh Elser
>            Priority: Critical
>             Fix For: 1.6.3, 1.7.0, 1.8.0
>
>
> During a rolling upgrade from HDP 2.2 to HDP 2.3, Accumulo tracer fails to start because
it is unable to find any tabletservers. The tabletserver were updated to HDP 2.3 earlier in
the upgrade process and did come online briefly. 
> The PID file still exist, but the tservers are definitely down:
> {noformat}
> [root@c6401 accumulo]# cat accumulo-accumulo-tserver.pid
> 6075
> [root@c6401 accumulo]# ps -a | grep 6075
> {noformat}
> It seems like the problem might be located in the following piece of code:
> {code}
>     private void checkPermission(TCredentials credentials, String lock, final String
request) throws ThriftSecurityException {
>       boolean fatal = false;
>       try {
>         log.trace("Got " + request + " message from user: " + credentials.getPrincipal());
>         if (!security.canPerformSystemActions(credentials)) {
>           log.warn("Got " + request + " message from user: " + credentials.getPrincipal());
>           throw new ThriftSecurityException(credentials.getPrincipal(), SecurityErrorCode.PERMISSION_DENIED);
>         }
>       } catch (ThriftSecurityException e) {
>         log.warn("Got " + request + " message from unauthenticatable user: " + e.getUser());
>         if (getCredentials().getToken().getClass().getName().equals(credentials.getTokenClassName()))
{
>           log.error("Got message from a service with a mismatched configuration. Please
ensure a compatible configuration.", e);
>           fatal = true;
>         }
>         throw e;
>       } finally {
>         if (fatal) {
>           Halt.halt(1, new Runnable() {
>             @Override
>             public void run() {
>               gcLogger.logGCInfo(TabletServer.this.getConfiguration());
>             }
>           });
>         }
>       }
> {code}
> Where a malformed principal causes a {{Halt}}.
> From the tserver logs:
> {noformat}
> 2015-06-01 19:25:30,462 [rpc.TServerUtils] DEBUG: Instantiating default, unsecure custom
half-async Thrift server
> 2015-06-01 19:25:30,468 [tserver.TabletServer] INFO : address = c6401.ambari.apache.org:9997
> 2015-06-01 19:25:30,510 [tserver.TabletServer] INFO : Waiting for tablet server lock
> {noformat}
> There is also no content in the *.out or *.err files for tserver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message