hbase-user mailing list archives

From Esteban Gutierrez <este...@cloudera.com>
Subject Re: HBase Failover
Date Mon, 14 Jul 2014 18:34:01 GMT
Hi Jinal,

I see that the exception occurred while the client was attempting to fetch
the table descriptor via HTable.getTableDescriptor(). Operations that
interact with the HBase Master cannot be retried in the version of HBase
that you are using, so you need to catch the IOException and retry the
call once the hbase.rpc.timeout has expired. Since HBase 0.95.2 those
operations can be retried; see
https://issues.apache.org/jira/browse/HBASE-8764
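
The catch-and-retry approach described above could look roughly like the
sketch below. It is only an illustration and not code from HBase itself:
the class name MasterCallRetry, the method callWithRetry and its
parameters are hypothetical, and the helper uses only the Java standard
library so the master-facing call is passed in as a Callable.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper (not part of HBase): retries a call that may fail
// with an IOException while the standby master is taking over.
public class MasterCallRetry {

    /**
     * Runs the call up to maxAttempts times. After each IOException it
     * sleeps for waitMillis before trying again. If every attempt fails,
     * the last IOException is rethrown. Other exceptions are not retried.
     */
    public static <T> T callWithRetry(Callable<T> call, int maxAttempts, long waitMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Assumption: the caller picks a wait close to hbase.rpc.timeout.
                    Thread.sleep(waitMillis);
                }
            }
        }
        throw last;
    }
}
```

A caller would pass a Callable that invokes table.getTableDescriptor() on
its HTable instance, with waitMillis set to the configured
hbase.rpc.timeout value. The number of attempts is a judgment call that
depends on how long the failover usually takes in your cluster.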

cheers,
esteban.




--
Cloudera, Inc.



On Mon, Jul 14, 2014 at 10:59 AM, Jinal Shah <jinalshah2007@gmail.com>
wrote:

> Hi esteban,
>
> I don't have access to the HBase master logs, but I'll try to get them
> if I can. When the failover occurs, only the HBase service goes down. We
> see the standby Master becoming active.
>
> The clients run on different nodes and have ZooKeeper configured
> correctly. Here is my post on Stack Overflow with more information
> about the error and the hbase-site.xml configuration:
> http://stackoverflow.com/questions/24726994/hbase-failover-situation
>
> cheers,
> Jinal
>
>
> On Mon, Jul 14, 2014 at 12:10 AM, Esteban Gutierrez <esteban@cloudera.com>
> wrote:
>
> > -dev (bcc) +user
> >
> > Hello Jinal,
> >
> > Can you pastebin the logs from both HBase masters? When this failover
> > occurs, was only the HBase master process killed, or were all services
> > on that node killed? When the HBase master dies, it takes about 1 min
> > (the default RPC timeout) for the standby HBase master to transition to
> > active, and it is expected that clients that use the HBase master can
> > get a connection refused exception until the standby master becomes
> > the active master.
> >
> > However, if you run other services such as ZooKeeper on the same node
> > and you also run clients on that node, make sure that
> > hbase.zookeeper.quorum is configured correctly and lists all 3
> > ZooKeeper nodes; otherwise clients running on this node will get a
> > connection refused from localhost.
> >
> > cheers,
> > esteban.
> >
> >
> >
> >
> >
> >
> >
> >
> > --
> > Cloudera, Inc.
> >
> >
> >
> > On Sun, Jul 13, 2014 at 2:02 PM, Jinal Shah <jinalshah2007@gmail.com>
> > wrote:
> >
> > > Hi everyone,
> > >
> > > I'm Jinal Shah. I'm fairly new to HBase and I'm trying to find a
> > > solution for an HBase failover situation. Here is the whole picture
> > > of what is happening. We have 3 ZooKeeper nodes, 2 HBase master nodes
> > > and some region servers. When HBase fails over from one master to
> > > another, we have to recycle our service in order for it to reach
> > > HBase; otherwise we get a ConnectionRefused exception. I'm not sure
> > > what we are doing wrong or whether we are missing some configuration.
> > > The same thing happens when we use the HBase shell: if a master
> > > failover happens, it starts throwing the same error. Can anyone
> > > please help me understand why this is happening? FYI, we are using
> > > HBase 0.94.2.
> > >
> > > Thanks
> > > Jinal
> > >
> >
>
