hbase-dev mailing list archives

From Nitay <nit...@gmail.com>
Subject Re: [jira] Created: (HBASE-1312) ZooKeeper: Master's ephemeral node went away while it was still up and functioning normally
Date Sun, 05 Apr 2009 22:14:09 GMT
The master did not respond correctly to a SessionExpired event. I don't
think there's a ZK bug. This is like HBASE-1232. Both the master and
regionserver got a SessionExpired event. The bug I fixed for Ryan was just
with the client getting a SessionExpired. Andrew's cluster shows us that
it's just as likely for the master/RS to get this event.

The only thing you can do on a SessionExpired event is to completely restart
the node. SessionExpired means your ZooKeeper handle is dead, and your
ephemeral nodes will go away. Since every server in HBase has some ephemeral
node that indicates its liveness (e.g. /hbase/master, /hbase/rs/...), the
node has to completely restart.
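
A minimal sketch of that rule, not HBase's actual code. It has no ZooKeeper dependency: the `SessionState` enum is a stand-in for ZooKeeper's `Watcher.Event.KeeperState`, and `SessionStateHandler` and the `restartServer` callback are hypothetical names used only for illustration. The sketch assumes a disconnect is recoverable (the ZooKeeper client reconnects on its own while the session is still valid), whereas an expired session is terminal and triggers a full restart.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for ZooKeeper's Watcher.Event.KeeperState (only the states discussed here). */
enum SessionState { SYNC_CONNECTED, DISCONNECTED, EXPIRED }

/** Hypothetical handler a master or regionserver could register for session events. */
class SessionStateHandler {
    private final Runnable restartServer;          // assumed hook that restarts the whole server
    private final List<String> log = new ArrayList<>();
    private boolean sessionDead = false;

    SessionStateHandler(Runnable restartServer) {
        this.restartServer = restartServer;
    }

    void process(SessionState state) {
        switch (state) {
            case SYNC_CONNECTED:
                log.add("connected");
                break;
            case DISCONNECTED:
                // Session may still be valid; ephemeral nodes survive until it expires.
                log.add("disconnected, waiting for reconnect");
                break;
            case EXPIRED:
                // Handle is dead and ephemeral nodes are gone: only a restart helps.
                sessionDead = true;
                log.add("session expired, restarting");
                restartServer.run();
                break;
        }
    }

    boolean isSessionDead() { return sessionDead; }

    List<String> log() { return log; }
}
```

The point of the split is that only the expired case is treated as fatal; retrying with the old handle there would leave the server running with no liveness node, which is the symptom reported in HBASE-1312.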

HBASE-1232, HBASE-1311, and HBASE-1312 are all the same problem, just with
three different points of view (client, RS, master).

On Sun, Apr 5, 2009 at 2:32 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:

> ZK keeps the node up as long as the session is still valid.
>
> So the question is:
> - did the master not respond correctly to an expired session?
> - is there a ZK bug (HOPE NOT!)
>
> -ryan
>
> On Sun, Apr 5, 2009 at 2:22 PM, Andrew Purtell (JIRA) <jira@apache.org>
> wrote:
>
> > ZooKeeper: Master's ephemeral node went away while it was still up and
> > functioning normally
> >
> >
> > -------------------------------------------------------------------------------------------
> >
> >                 Key: HBASE-1312
> >                 URL: https://issues.apache.org/jira/browse/HBASE-1312
> >             Project: Hadoop HBase
> >          Issue Type: Bug
> >            Reporter: Andrew Purtell
> >
> >
> > Does the master watch its own znode? Right around the time of the
> > regionserver problems described in HBASE-1311, clients could no longer
> > find the master, but according to its log it was up and functioning
> > normally. I think the master and regionserver sessions expired at the
> > same time, as they were started within seconds of each other.
> >
> > --
> > This message is automatically generated by JIRA.
> > -
> > You can reply to this email to add a comment to the issue online.
> >
> >
>
