Date: Mon, 14 Jul 2014 12:59:42 -0500
Subject: Re: HBase Failover
From: Jinal Shah
To: dev@hbase.apache.org
Cc: "user@hbase.apache.org"

Hi Esteban,

I don't have access to the HBase master logs, but I'll try to get them if I
can. When the failover occurs, only the HBase service goes down. We see the
standby master become active. The clients run on different nodes and have
ZooKeeper configured correctly. Here is the post I have on Stack Overflow,
which gives more information about the error and the hbase-site.xml
configuration:

http://stackoverflow.com/questions/24726994/hbase-failover-situation

cheers,
Jinal

On Mon, Jul 14, 2014 at 12:10 AM, Esteban Gutierrez wrote:

> -dev (bcc) +user
>
> Hello Jinal,
>
> Can you pastebin the logs from both HBase masters? When this failover
> occurs, was the HBase master process killed, or were all services on that
> node killed? When the HBase master dies it takes about 1 min (default RPC
> timeout) for the standby HBase master to transition to active, and it is
> expected that clients that use the HBase master can get a connection
> refused exception until the standby master becomes the active master.
>
> However, if you run other services on the same node, like ZooKeeper, and
> you also run clients on that node, make sure that hbase.zookeeper.quorum
> is configured correctly and lists the 3 ZooKeeper nodes; otherwise clients
> running on this node will get a connection refused from localhost.
>
> cheers,
> esteban.
>
> --
> Cloudera, Inc.
>
> On Sun, Jul 13, 2014 at 2:02 PM, Jinal Shah wrote:
>
> > Hi everyone,
> >
> > I'm Jinal Shah. I'm fairly new to HBase and I'm trying to find a
> > solution for an HBase failover situation. Here is the whole picture of
> > what is happening. We have 3 ZooKeeper nodes, 2 HBase master nodes and
> > some region servers. When HBase fails over from one master to another,
> > we have to recycle our service in order to get our services to hit
> > HBase; otherwise we get a ConnectionRefused exception. I'm not sure
> > what we are doing wrong, or whether we are missing some configuration.
> > The same thing happens when we use the HBase shell: if a master
> > failover happens, it starts throwing the same error. Can anyone please
> > help me understand why this is happening? FYI, we are using HBase
> > 0.94.2.
> >
> > Thanks
> > Jinal
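
For reference, the hbase.zookeeper.quorum check suggested in the thread could be sketched roughly as below. This is only an illustration, not anything taken from the poster's cluster: the host names (zk1.example.com and so on) and the helper names quorum_hosts and quorum_problems are made up, and the script assumes the usual <configuration>/<property>/<name>/<value> layout of hbase-site.xml. It reads the quorum value and flags a missing setting, a loopback entry, or a host count other than the expected 3.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical hbase-site.xml content; the host names are placeholders.
SAMPLE_SITE_XML = """
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
  </property>
</configuration>
""".strip()

LOOPBACK = {"localhost", "127.0.0.1"}


def quorum_hosts(site_xml: str) -> List[str]:
    """Return the hosts listed under hbase.zookeeper.quorum, or [] if unset."""
    root = ET.fromstring(site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "hbase.zookeeper.quorum":
            value = prop.findtext("value", "")
            return [h.strip() for h in value.split(",") if h.strip()]
    return []


def quorum_problems(site_xml: str, expected: int = 3) -> List[str]:
    """Collect human-readable warnings about the quorum setting."""
    hosts = quorum_hosts(site_xml)
    problems = []
    if not hosts:
        problems.append("hbase.zookeeper.quorum is not set")
    # A loopback entry only works for clients on the ZooKeeper node itself.
    if any(h.split(":")[0].lower() in LOOPBACK for h in hosts):
        problems.append("quorum contains a loopback address")
    if hosts and len(hosts) != expected:
        problems.append(f"expected {expected} hosts, found {len(hosts)}")
    return problems


if __name__ == "__main__":
    for line in quorum_problems(SAMPLE_SITE_XML) or ["quorum looks fine"]:
        print(line)
```

Running the same check against the hbase-site.xml on each client node (not only on the masters) would be the natural use, since the concern raised in the thread is about what the clients resolve.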