From: "Josh Elser (JIRA)"
To: notifications@accumulo.apache.org
Reply-To: jira@apache.org
Date: Thu, 4 Jun 2015 00:13:38 +0000 (UTC)
Subject: [jira] [Commented] (ACCUMULO-3887) Lack of insight into `accumulo admin stop $tserver`

    [ https://issues.apache.org/jira/browse/ACCUMULO-3887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14571875#comment-14571875 ]

Josh Elser commented on ACCUMULO-3887:
--------------------------------------

Also, this was [~jonathan.hurley] again uncovering another good area that needs improvement.
> Lack of insight into `accumulo admin stop $tserver`
> ---------------------------------------------------
>
>                 Key: ACCUMULO-3887
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3887
>             Project: Accumulo
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.7.0
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>             Fix For: 1.7.1, 1.8.0
>
>
> Spent a good bit of time trying to figure out why the master _seemed_ to have shut down a tabletserver for no reason. The best explanation I could come up with is as follows.
> * Client calls {{accumulo admin stop $host}}
> * TabletServer on $host gets restarted
> * Master seeds FATE op to stop $host using only the host:port
> * FATE op will kill the fresh tserver on $host
> The amount of time between steps 1 and 3 could be arbitrarily long, so this can be a little problematic.
> One big thing we can do is to perform the sessionID calculation as early as possible instead of deferring it into the Master. Thankfully, we can also handle this gracefully and remain backwards compatible, so both of the following would work:
> * {{accumulo admin stop host:port}}
> * {{accumulo admin stop host:port\[session\]}}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
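To illustrate the two argument forms in the quoted description, here is a rough Python sketch of how a `host:port` or `host:port[session]` target might be parsed and checked against the session of whatever server is currently registered at that address. This is not Accumulo's code: `StopTarget`, `parse_target`, `should_stop`, and the sample session strings are all hypothetical names and values chosen for the example, and the matching rule is only an assumption about what pinning the session early could look like.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern: "host:port", optionally followed by "[session]".
_TARGET = re.compile(
    r"^(?P<host>[^:\[\]]+):(?P<port>\d+)(?:\[(?P<session>[^\[\]]+)\])?$"
)


class StopTarget(NamedTuple):
    host: str
    port: int
    session: Optional[str]  # None when the caller gave only host:port


def parse_target(spec: str) -> StopTarget:
    """Parse either accepted form into a StopTarget."""
    m = _TARGET.match(spec)
    if m is None:
        raise ValueError(
            f"expected host:port or host:port[session], got {spec!r}"
        )
    return StopTarget(m.group("host"), int(m.group("port")), m.group("session"))


def should_stop(target: StopTarget, live_session: str) -> bool:
    """Decide whether a stop request applies to the server now at host:port.

    Without a session the request matches any server at that address, which
    is the situation described in the issue. With a session it matches only
    the server instance the caller originally saw.
    """
    if target.session is None:
        return True
    return target.session == live_session
```

With a session recorded at request time, a tserver that was restarted in the meantime (and so has a new session) would be left alone: `should_stop(parse_target("host1:9997[abc123]"), "def456")` is false, while the bare `host1:9997` form still matches whichever instance is live.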