Date: Sat, 4 Mar 2017 17:43:45 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-17733) Undo registering regionservers in zk with ephemeral nodes; its more trouble than its worth

[ https://issues.apache.org/jira/browse/HBASE-17733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15895790#comment-15895790 ]

stack commented on HBASE-17733:
-------------------------------

[~busbey] For example, why not just solve this problem by having the Master watch for new ephemeral znodes and only add the RS to its internals at that point? Just because we need to ask the master what name it sees for us doesn't mean we have to use that request as the time to consider the RS fully bootstrapped.

I considered versions of the above but thought there was too much risk in changing the order of registration; I was afraid I'd break something unforeseen, and queasy at the thought of doubling down on two daemons having to go via a third service to establish connectivity they've already proven. If a refactor/redo, I wanted to do it 'right' -- hence this issue.

That said, this simple 'offset' registration suggestion of yours I did not consider. It is a better idea than any I had. I took a look. In Master, we have RegionServerTracker.
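(The child-watch diffing at the core of this suggestion could be sketched like so; hypothetical, simplified types, not the actual RegionServerTracker code:)

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Sketch: instead of rebuilding the online-server list wholesale on every
 * rs-znode change, diff the latest child list against the set of servers
 * already known, and surface only the genuinely new entries. Hypothetical
 * simplified code; the real logic would live in RegionServerTracker /
 * ServerManager.
 */
public class RegionServerDiff {

  /** Returns servers present in currentChildren but not yet in known. */
  static List<String> newServers(Set<String> known, List<String> currentChildren) {
    List<String> added = new ArrayList<>();
    for (String server : currentChildren) {
      if (!known.contains(server)) {
        added.add(server); // candidate for a regionServerStartup-style call
      }
    }
    return added;
  }

  public static void main(String[] args) {
    Set<String> known = new HashSet<>(List.of("rs1,16020,100", "rs2,16020,101"));
    List<String> children = List.of("rs1,16020,100", "rs2,16020,101", "rs3,16020,102");
    // Only the entry not seen before comes back
    System.out.println(newServers(known, children));
  }
}
```

In the real tracker each child name is a ServerName-encoded znode; anything returned here would be the candidate for registration.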
It already takes care of setting watchers looking for changes in the zk RS znode. Currently, whenever there is a change, we refresh our list of online servers wholesale. I could instead do a parse and, if a server is a 'new' entry not seen before, call a new version of ServerManager#regionServerStartup (would have to make sure the watcher triggers properly in all cases). So far not bad.

But before we register a server, we check its current timestamp for clock skew (the heartbeat includes the current RS timestamp; the ephemeral znode is stamped only once, at creation). If the skew is unacceptable, we throw a ClockOutOfSyncException. This goes back as the result of the RS's first check-in ('reportForDuty') and the RS will shut itself down. We'd have to build a replacement channel for this clock skew check if we register new servers via a changed-children watcher. (Just noticed: our clock skew check is poor; it is only done on first server registration -- we should check on every heartbeat.)

So it looks like your suggestion involves a good bit of work and risk. I'd rather spend the effort on removing zk from the equation. WDYT?

> Undo registering regionservers in zk with ephemeral nodes; its more trouble than its worth
> ------------------------------------------------------------------------------------------
>
> Key: HBASE-17733
> URL: https://issues.apache.org/jira/browse/HBASE-17733
> Project: HBase
> Issue Type: Brainstorming
> Reporter: stack
>
> Elsewhere, we are undoing the use of ZK (replication current WAL offset, regions-in-transition, etc).
> I have another case where using ZK, while convenient (call-backs), has holes. The scenario is prompted by review of HBASE-9593.
> Currently, a RS registers with the Master by calling the Master's reportForDuty. After the Master responds with the name we are to use for ourselves (as well as other properties we need to 'run'), we then turn around and do a new RPC out to the zk ensemble to register an ephemeral znode for the RS.
> We notice a RS has gone away -- crashed -- because its znode evaporates and a watcher triggers, notifying the Master the RS has gone (after a zk session timeout of tens of seconds). Cumbersome (setting watchers, zk session timeouts) and indirect. The Master then trips the server shutdown handler, which reassigns regions from the crashed server.
> In HBASE-9593, we were trying to handle the rare but possible case where the RS dies after registering with the Master but before we put up our ephemeral znode. In this case the RS would live in the Master's internals forever, because there is no ephemeral znode to expire and so trigger cleanup and removal of the never-started RS.
> Let's get ZK out of the loop. Then only the Master and RS are involved, heartbeating each other.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
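The clock-skew check discussed in the comment can be sketched as follows. All names and the 30-second maximum are assumptions (modeled loosely on hbase.master.maxclockskew and ServerManager's registration-time check, not copied from HBase source):

```java
/**
 * Sketch of the clock-skew check done when a regionserver first checks in:
 * compare the RS-reported timestamp against the master's clock and reject
 * the server if the skew exceeds a configured maximum. Hypothetical names;
 * in HBase the real check would throw ClockOutOfSyncException back through
 * reportForDuty, after which the RS shuts itself down.
 */
public class ClockSkewCheck {

  // Assumption: ~30s, in the spirit of hbase.master.maxclockskew.
  static final long MAX_SKEW_MS = 30_000;

  /** Throws if the reported server time is too far from the master's time. */
  static void checkClockSkew(String serverName, long serverTimeMs, long masterTimeMs) {
    long skew = Math.abs(masterTimeMs - serverTimeMs);
    if (skew > MAX_SKEW_MS) {
      // Stand-in for ClockOutOfSyncException; fails the registration.
      throw new IllegalStateException(
          "Server " + serverName + " clock is out of sync by " + skew + "ms");
    }
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    checkClockSkew("rs1,16020", now + 1_000, now); // 1s skew: accepted
    try {
      checkClockSkew("rs2,16020", now + 60_000, now); // 60s skew: rejected
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

As the comment notes, a check done only at registration says nothing about drift afterwards; running the same comparison on every heartbeat would close that gap.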