hbase-user mailing list archives

From Dhaval Shah <prince_mithi...@yahoo.co.in>
Subject Re: NoRouteToHostException when zookeeper crashes
Date Tue, 06 Aug 2013 15:37:44 GMT
Four of the five servers are still in the quorum, so this is not a majority issue.

Also, services that are already running keep running fine for many hours, so this is not an
issue with the new leader either. It seems that the HBase client code, when looking up the
zookeeper quorum to connect to, does not handle the NoRouteToHostException and errors out
right there. Because the exception is unhandled, it never retries the other zookeeper servers.
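
To make the expected fail-over behavior concrete, here is a minimal sketch in plain Java. It is
not the HBase 0.92.1 or ZooKeeper client code: the class name QuorumFailover, the Connector
interface, and the host names zk1/zk2/zk3 are all hypothetical and exist only for illustration.
It shows a connect loop that treats a NoRouteToHostException on one quorum member as a reason
to move on to the next member instead of aborting.

```java
import java.io.IOException;
import java.net.NoRouteToHostException;
import java.util.Arrays;
import java.util.List;

public class QuorumFailover {

    /** Hypothetical hook that opens a connection to one host; not an HBase or ZooKeeper API. */
    @FunctionalInterface
    public interface Connector<T> {
        T connect(String hostPort) throws IOException;
    }

    /**
     * Tries each quorum member in order and returns the first successful connection.
     * If every member fails, the last failure is rethrown.
     */
    public static <T> T connectToAny(List<String> quorum, Connector<T> connector)
            throws IOException {
        IOException lastFailure = new IOException("No quorum members configured");
        for (String hostPort : quorum) {
            try {
                return connector.connect(hostPort);
            } catch (IOException e) {
                // NoRouteToHostException is an IOException (via SocketException),
                // so an unreachable host lands here and the loop tries the next one.
                lastFailure = e;
            }
        }
        throw lastFailure;
    }

    public static void main(String[] args) throws IOException {
        // Assumed host names; zk1 stands in for the crashed, unreachable server.
        List<String> quorum = Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181");
        String connected = connectToAny(quorum, hostPort -> {
            if (hostPort.startsWith("zk1")) {
                throw new NoRouteToHostException("No route to host: " + hostPort);
            }
            return hostPort; // a real connector would return a live session here
        });
        System.out.println("Connected to " + connected);
    }
}
```

With the simulated failure on zk1, the loop falls through to the second host and prints
"Connected to zk2:2181". The exception is only propagated when every quorum member has failed.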
 
Regards,
Dhaval


----- Original Message -----
From: Ted Yu <yuzhihong@gmail.com>
To: user@hbase.apache.org; Dhaval Shah <prince_mithibai@yahoo.co.in>
Cc: 
Sent: Tuesday, 6 August 2013 11:32 AM
Subject: Re: NoRouteToHostException when zookeeper crashes

bq. one of my zookeeper server goes down

How many servers were left in the quorum ? Was the new leader elected
properly afterwards ?

Thanks

On Tue, Aug 6, 2013 at 8:18 AM, Dhaval Shah <prince_mithibai@yahoo.co.in> wrote:

> HBase - 0.92.1
> Zookeeper - 3.4.3
>
> Regards,
> Dhaval
>
>
> ----- Original Message -----
> From: Ted Yu <yuzhihong@gmail.com>
> To: user@hbase.apache.org; Dhaval Shah <prince_mithibai@yahoo.co.in>
> Cc:
> Sent: Tuesday, 6 August 2013 11:08 AM
> Subject: Re: NoRouteToHostException when zookeeper crashes
>
> What HBase / zookeeper versions are you using ?
>
> On Tue, Aug 6, 2013 at 7:48 AM, Dhaval Shah <prince_mithibai@yahoo.co.in> wrote:
>
> > I have a weird (and pretty serious) issue on my HBase cluster. Whenever
> > one of my zookeeper servers goes down, already running services work fine
> > for a few hours, but when I try to restart any service (be it region
> > servers or clients), it fails with a NoRouteToHostException while trying
> > to connect to zookeeper, and I cannot restart any service successfully at
> > all. I do realize that "No route to host" is coming from my network
> > infrastructure (ping gives the same error), but why would one zookeeper
> > server going down bring down the entire HBase cluster? Why doesn't HBase
> > ride over the exception and try some other zookeeper server?
> >
> > Is this an issue other people face, or is it just me? We are running these
> > on DHCP (but the IPs don't change because we have long leases). Do you guys
> > think it's a DHCP-specific issue? Do you have pointers to avoid this issue
> > with DHCP, or do I have to move to static IPs?
> >
> > Regards,
> > Dhaval
> >
>
>

