Date: Thu, 21 Apr 2016 22:45:13 +0000 (UTC)
From: "churro morales (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Work started] (HBASE-15618) Abort if security credentials become invalid

     [ https://issues.apache.org/jira/browse/HBASE-15618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HBASE-15618 started by churro morales.
----------------------------------------------

> Abort if security credentials become invalid
> --------------------------------------------
>
>                 Key: HBASE-15618
>                 URL: https://issues.apache.org/jira/browse/HBASE-15618
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Andrew Purtell
>            Assignee: churro morales
>
> We are investigating a production incident where a bad keytab push seems to have caused one regionserver, serving hot regions, to lose the ability to communicate with clients. After the fact we see a steady stream of GSS initiation failure messages in the logs. The affected regionserver lingered in an unhealthy state for too long. HBase did not automatically take any corrective action, like an abort of the affected process, which would have recovered service without operator intervention.
> Consider detecting and aborting if security credentials like the Kerberos keytab become invalid during runtime.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
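
The detect-and-abort idea in the description can be illustrated with a minimal sketch. It assumes the server can observe the outcome of each authentication attempt (for example, a GSS initiation failure) and has some abort hook to call. The class `CredentialFailureWatchdog`, its method names, and the failure threshold are hypothetical and chosen only for illustration; none of this is HBase's API or the change made for HBASE-15618.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch, not part of HBase: triggers an abort action once a
 * configured number of consecutive authentication failures has been seen.
 */
public class CredentialFailureWatchdog {
    private final int maxConsecutiveFailures;
    private final Runnable abortAction;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();
    private final AtomicBoolean aborted = new AtomicBoolean();

    public CredentialFailureWatchdog(int maxConsecutiveFailures, Runnable abortAction) {
        if (maxConsecutiveFailures < 1) {
            throw new IllegalArgumentException("maxConsecutiveFailures must be >= 1");
        }
        this.maxConsecutiveFailures = maxConsecutiveFailures;
        this.abortAction = abortAction;
    }

    /** Call when an authentication attempt fails (assumed to be observable). */
    public void recordFailure() {
        int failures = consecutiveFailures.incrementAndGet();
        // compareAndSet ensures the abort action runs at most once.
        if (failures >= maxConsecutiveFailures && aborted.compareAndSet(false, true)) {
            abortAction.run();
        }
    }

    /** Call when an authentication attempt succeeds; clears the failure streak. */
    public void recordSuccess() {
        consecutiveFailures.set(0);
    }

    public boolean isAborted() {
        return aborted.get();
    }

    public static void main(String[] args) {
        // In a real server the action would stop the process; here it only logs.
        CredentialFailureWatchdog watchdog = new CredentialFailureWatchdog(
                3, () -> System.out.println("aborting: credentials appear invalid"));
        watchdog.recordFailure();
        watchdog.recordFailure();
        watchdog.recordSuccess(); // streak resets, no abort yet
        watchdog.recordFailure();
        watchdog.recordFailure();
        watchdog.recordFailure(); // third consecutive failure triggers the abort
    }
}
```

Counting only consecutive failures is a deliberate choice in this sketch: one successful handshake resets the streak, so transient errors are less likely to trigger an abort than a persistently invalid keytab.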