Date: Mon, 1 Jun 2015 21:54:18 +0000 (UTC)
From: "Jonathan Hurley (JIRA)"
To: notifications@accumulo.apache.org
Subject: [jira] [Created] (ACCUMULO-3880) Malformed Configuration Causes tservers To Shutdown

Jonathan Hurley created ACCUMULO-3880:
-----------------------------------------

             Summary: Malformed Configuration Causes tservers To Shutdown
                 Key: ACCUMULO-3880
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3880
             Project: Accumulo
          Issue Type: Bug
          Components: tserver
         Environment: HDP 2.2.7.0 to HDP 2.3.0.0 Upgrade
            Reporter: Jonathan Hurley
            Priority: Critical

During a rolling upgrade from HDP 2.2 to HDP 2.3, the Accumulo tracer fails to start because it is unable to find any tabletservers. The tabletservers were updated to HDP 2.3 earlier in the upgrade process and did come online briefly.
The PID file still exists, but the tservers are definitely down:
{noformat}
[root@c6401 accumulo]# cat accumulo-accumulo-tserver.pid
6075
[root@c6401 accumulo]# ps -a | grep 6075
{noformat}

It seems like the problem might be located in the following piece of code:
{code}
  private void checkPermission(TCredentials credentials, String lock, final String request) throws ThriftSecurityException {
    boolean fatal = false;
    try {
      log.trace("Got " + request + " message from user: " + credentials.getPrincipal());
      if (!security.canPerformSystemActions(credentials)) {
        log.warn("Got " + request + " message from user: " + credentials.getPrincipal());
        throw new ThriftSecurityException(credentials.getPrincipal(), SecurityErrorCode.PERMISSION_DENIED);
      }
    } catch (ThriftSecurityException e) {
      log.warn("Got " + request + " message from unauthenticatable user: " + e.getUser());
      if (getCredentials().getToken().getClass().getName().equals(credentials.getTokenClassName())) {
        log.error("Got message from a service with a mismatched configuration. Please ensure a compatible configuration.", e);
        fatal = true;
      }
      throw e;
    } finally {
      if (fatal) {
        Halt.halt(1, new Runnable() {
          @Override
          public void run() {
            gcLogger.logGCInfo(TabletServer.this.getConfiguration());
          }
        });
      }
    }
{code}

Where a malformed principal causes a {{Halt}}.

From the tserver logs:
{noformat}
2015-06-01 19:25:30,462 [rpc.TServerUtils] DEBUG: Instantiating default, unsecure custom half-async Thrift server
2015-06-01 19:25:30,468 [tserver.TabletServer] INFO : address = c6401.ambari.apache.org:9997
2015-06-01 19:25:30,510 [tserver.TabletServer] INFO : Waiting for tablet server lock
{noformat}

There is also no content in the *.out or *.err files for tserver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
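
The control flow of the {{checkPermission}} excerpt above can be reproduced in isolation. The following is only a minimal sketch, not Accumulo code: {{CheckPermissionSketch}}, {{PermissionDenied}}, {{wouldHalt}} and the token class name strings are hypothetical stand-ins, and the halt is modeled as a {{Runnable}} callback instead of the real {{Halt.halt(1, ...)}}. It assumes the behavior is exactly what the excerpt suggests: a rejected request whose token class matches the server's own token class sets {{fatal}}, and the {{finally}} block then halts the process.

```java
public class CheckPermissionSketch {

    /** Hypothetical stand-in for ThriftSecurityException. */
    static class PermissionDenied extends Exception {
        PermissionDenied(String message) {
            super(message);
        }
    }

    /**
     * Mirrors the try/catch/finally shape of the quoted excerpt.
     * haltAction stands in for Halt.halt(1, ...), which would end the JVM.
     */
    static void checkPermission(String serverTokenClass, String callerTokenClass,
                                boolean callerCanPerformSystemActions,
                                Runnable haltAction) throws PermissionDenied {
        boolean fatal = false;
        try {
            if (!callerCanPerformSystemActions) {
                throw new PermissionDenied("PERMISSION_DENIED");
            }
        } catch (PermissionDenied e) {
            // Same token class but rejected credentials is treated as a
            // mismatched configuration, which is marked fatal.
            if (serverTokenClass.equals(callerTokenClass)) {
                fatal = true;
            }
            throw e;
        } finally {
            // Runs before the rethrown exception reaches the caller.
            if (fatal) {
                haltAction.run();
            }
        }
    }

    /** Returns true if the halt action was triggered for the given inputs. */
    static boolean wouldHalt(String serverTokenClass, String callerTokenClass,
                             boolean callerCanPerformSystemActions) {
        final boolean[] halted = {false};
        try {
            checkPermission(serverTokenClass, callerTokenClass,
                    callerCanPerformSystemActions, () -> halted[0] = true);
        } catch (PermissionDenied e) {
            // Expected whenever the caller is rejected.
        }
        return halted[0];
    }

    public static void main(String[] args) {
        // Rejected caller with the same token class as the server: halt.
        System.out.println(wouldHalt("PasswordToken", "PasswordToken", false));
        // Rejected caller with a different token class: exception only.
        System.out.println(wouldHalt("PasswordToken", "OtherToken", false));
        // Accepted caller: neither exception nor halt.
        System.out.println(wouldHalt("PasswordToken", "PasswordToken", true));
    }
}
```

If the real code path behaves like this sketch, a single rejected request from a peer using the same token class would be enough to take the tserver down. Since the real halt terminates the JVM from inside {{finally}}, that may also be why nothing shows up in the *.out or *.err files.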