Date: Fri, 1 Apr 2016 16:53:26 +0000 (UTC)
From: "churro morales (JIRA)"
To: dev@hbase.apache.org
Subject: [jira] [Resolved] (HBASE-12814) Zero downtime upgrade from 94 to 98

[ https://issues.apache.org/jira/browse/HBASE-12814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

churro morales resolved HBASE-12814.
------------------------------------
Resolution: Not A Problem

Most likely everyone is off the 94 branch.

> Zero downtime upgrade from 94 to 98
> ------------------------------------
>
>                 Key: HBASE-12814
>                 URL: https://issues.apache.org/jira/browse/HBASE-12814
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 0.94.26, 0.98.10
>            Reporter: churro morales
>            Assignee: churro morales
>         Attachments: HBASE-12814-0.94.patch, HBASE-12814-0.98.patch
>
>
> Here at Flurry we want to upgrade our HBase cluster from 94 to 98 with no downtime while maintaining master/master replication.
> Summary:
> Replication is done via thrift RPC between clusters. It is configurable on a peer-by-peer basis. The one caveat is that a thrift server starts up on every node and proxies requests to the ReplicationSink.
> For the upgrade process:
> * in hbase-site.xml, two required and three optional configuration parameters are added:
> ** *Required*
> *** hbase.replication.sink.enable.thrift -> true
> *** hbase.replication.thrift.server.port ->
> ** *Optional*
> *** hbase.replication.thrift.protection {default: AUTHENTICATION}
> *** hbase.replication.thrift.framed {default: false}
> *** hbase.replication.thrift.compact {default: true}
> - All regionservers can be rolling restarted (no downtime); all clusters must have the respective patch for this to work.
> - The hbase shell add_peer command takes an additional parameter for the rpc protocol.
> - Example: {code} add_peer '1', "hbase-101:2181:/hbase", "THRIFT" {code}
> Now comes the fun part: when you want to upgrade your cluster from 94 to 98, you simply pause replication to the cluster being upgraded, do the upgrade, and un-pause replication. Once you have a pair of clusters that replicate inbound and outbound only with the 98 release, you can start replicating via the native rpc protocol by adding the peer again without the _THRIFT_ parameter and subsequently deleting the peer with the thrift protocol. Because replication is idempotent, I don't see any issues as long as you wait for the backlog to drain after un-pausing replication.
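
The peer hand-over described above could be sketched roughly as the Ruby script below. The HBase shell is JRuby, so each generated string is meant to be pasted into the shell. This is an illustration under stated assumptions, not tooling from the patch: the helper name thrift_to_native_steps, the peer ids '1' and '2', and the cluster key are hypothetical, and mapping "pause" / "un-pause" onto the stock disable_peer / enable_peer commands is an assumption. The script only prints commands and never contacts a cluster.

```ruby
# Sketch only: build the HBase shell commands for moving one peer from the
# thrift protocol back to native replication. All argument values used below
# are hypothetical examples.
def thrift_to_native_steps(thrift_peer_id, native_peer_id, cluster_key)
  [
    # 1. Pause replication to the cluster that is about to be upgraded
    #    (assumption: "pause" corresponds to disable_peer).
    "disable_peer '#{thrift_peer_id}'",
    # 2. After the 0.94 -> 0.98 upgrade (done out of band), un-pause and
    #    wait for the replication backlog to drain.
    "enable_peer '#{thrift_peer_id}'",
    # 3. Once both clusters run 0.98, add the same cluster again without
    #    the "THRIFT" argument so it uses the native rpc protocol.
    "add_peer '#{native_peer_id}', \"#{cluster_key}\"",
    # 4. Delete the peer that still uses the thrift protocol.
    "remove_peer '#{thrift_peer_id}'"
  ]
end

steps = thrift_to_native_steps('1', '2', 'hbase-101:2181:/hbase')
puts steps
```

The thrift peer and the native peer overlap briefly between steps 3 and 4; the description above relies on replication being idempotent for that window to be harmless.
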
> Special thanks to Francis Liu at Yahoo for laying the groundwork and Mr. Dave Latham for his invaluable knowledge and assistance.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)