Date: Tue, 13 Feb 2018 16:15:04 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-19533) How to do controlled shutdown in branch-2?

     [ https://issues.apache.org/jira/browse/HBASE-19533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-19533:
--------------------------
    Fix Version/s:     (was: 2.0.0-beta-2)
                   2.0.0

> How to do controlled shutdown in branch-2?
> ------------------------------------------
>
>                 Key: HBASE-19533
>                 URL: https://issues.apache.org/jira/browse/HBASE-19533
>             Project: HBase
>          Issue Type: Task
>            Reporter: stack
>            Priority: Critical
>             Fix For: 2.0.0
>
>
> Before HBASE-18946, setting shutdown of a cluster, the Master would exit immediately. RegionServers would run region closes and then try and notify the Master of the close and would spew exceptions that the Master was unreachable.
> This is different to how branch-1 used to do it. It used to keep Master up and it would be like the captain of the ship, the last to go down. As of HBASE-18946, this is again the case but there are still open issues.
> # Usually Master does all open and close of regions. On cluster shutdown, it is the one time where the Regions run the region close. Currently, the regions report the close to the Master which disregards the message since it did not start the region closes. Should we do different? Try and update state in hbase:meta setting it to CLOSE? We might not be able to write CLOSE for all regions since hbase:meta will be closing too (the RS that is hosting hbase:meta will close it last.... but that may not be enough).
> # Should the Master run the cluster shutdown sending out close for all regions? What if cluster of 1M regions? Untenable? Send a message per server? That might be better.
> Anyways, this needs attention. Filing issue in meantime.
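
The per-region versus per-server question raised in the issue can be made concrete with a small sketch. It is not HBase code: `plan_close_messages`, the region and server names, and the assignment map are all hypothetical, and the sketch only compares how many messages each approach would need.

```python
from collections import defaultdict


def plan_close_messages(region_to_server):
    """Group regions by hosting server (hypothetical helper, not an HBase API).

    Returns a dict mapping each server to the list of regions it should close,
    so a coordinator could send one close message per server instead of one
    message per region.
    """
    by_server = defaultdict(list)
    for region, server in region_to_server.items():
        by_server[server].append(region)
    return dict(by_server)


# Hypothetical assignment: 6 regions spread over 2 servers.
assignments = {f"region-{i}": f"rs-{i % 2}" for i in range(6)}
plan = plan_close_messages(assignments)

per_region_messages = len(assignments)  # one message per region
per_server_messages = len(plan)         # one batched message per server
print(per_region_messages, "per-region messages vs",
      per_server_messages, "per-server messages")
```

With batching, the message count scales with the number of servers rather than the number of regions, which is the reason the per-server option looks more tenable for a cluster with very many regions.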