hbase-dev mailing list archives

From Nitay <nit...@gmail.com>
Subject Re: ZK rethink?
Date Tue, 07 Apr 2009 20:12:36 GMT
Hi Andrew,

I agree with you that getting a SessionExpired is a problem for us, and we
didn't really consider it when we initially put in the ZooKeeper code.
However, I don't necessarily think a complete rethink is necessary.

The main issue here is how often a SessionExpired is going to happen, and
why it is happening that often. Most people using ZooKeeper use a session
timeout of 2 or 3 seconds. A SessionExpired occurs when you lose connection
to the ZooKeeper instance you were talking to and are unable to connect to
another one within this time frame. In HBase, we use 10 seconds for this
interval. Given that, I think we should do some recon work first to
determine what's going on. When does it happen? Why? Is the ZooKeeper IO
thread getting starved for long periods of time? Can we prevent it? The
ZooKeeper folks describe SessionExpired as a very, very rare event, yet that
does not seem to be the case for us.

Issues like HBASE-1314 are certainly bugs. If we think a node is dead
because its ephemeral ZNode has vanished, we should not try talking to it
anymore. We cannot have a case where we both think it's dead and are talking
to it.

If, after some investigation, we come to the conclusion that these
SessionExpired events are unavoidable things that will happen quite
frequently, then yes, I think something like what you suggest is a good idea.
But if these events only really do happen once in a blue moon as it seems
they're supposed to, then perhaps simply internally restarting the node in
question is not so bad?

Within the solutions you propose I would opt for the timer option. I don't
think that not using ephemeral nodes with watches is a good solution. It
shifts us away from using the power that ZooKeeper provides. Assuming at
some point ZooKeeper gets more reliable with its sessions, we will have a
lot of code to change if we want to undo the decision.
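
To make the timer option concrete, here is a rough sketch of the idea in
plain Java. It is only an illustration under assumptions: the class name
ZNodeGraceTimer, its methods, the callback and the example paths are all
hypothetical and exist in neither HBase nor ZooKeeper, and the actual
watcher wiring is left out. A real watcher would call nodeDeleted() and
nodeCreated() from its event callback.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not HBase or ZooKeeper code): waits for a grace
 * period after a znode disappears before declaring its owner dead.
 */
class ZNodeGraceTimer {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();
    // path -> token of the timer currently running for that path
    private final Map<String, Long> pending = new HashMap<>();
    private final long graceMillis;
    private final Consumer<String> onConfirmedDead;
    private long nextToken = 0;

    ZNodeGraceTimer(long graceMillis, Consumer<String> onConfirmedDead) {
        this.graceMillis = graceMillis;
        this.onConfirmedDead = onConfirmedDead;
    }

    /** The znode at this path vanished: start a timer unless one is running. */
    synchronized void nodeDeleted(String path) {
        if (pending.containsKey(path)) {
            return;
        }
        long token = nextToken++;
        pending.put(path, token);
        scheduler.schedule(() -> expire(path, token),
            graceMillis, TimeUnit.MILLISECONDS);
    }

    /** The znode came back: forget the timer so it does nothing when it fires. */
    synchronized void nodeCreated(String path) {
        pending.remove(path);
    }

    private void expire(String path, long token) {
        synchronized (this) {
            Long current = pending.get(path);
            // Skip if the node reappeared, or if this timer belongs to an
            // earlier disappearance than the one currently pending.
            if (current == null || current != token) {
                return;
            }
            pending.remove(path);
        }
        // Grace period elapsed without reappearance: take action.
        onConfirmedDead.accept(path);
    }

    void shutdown() {
        scheduler.shutdownNow();
    }
}
```

With something like this, a region server whose session expires and which
re-creates its ephemeral node within the grace period is never reported
as dead. The obvious cost is that a real death is detected graceMillis
later than it is today.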

Regardless of what we end up going with, we need to do _something_ on the
RS/master when they get a SessionExpired, because currently they just get
wedged. That's what I'm working on right now (HBASE-1311, HBASE-1312).

Thanks for bringing this up, Andrew. I'm glad we have a cluster like yours to
bring out these sorts of problems. I look forward to further discussion on
this topic and hearing other people's thoughts.


On Tue, Apr 7, 2009 at 11:10 AM, Andrew Purtell <apurtell@apache.org> wrote:

> Hi Chad,
> In my testing the session expiration happens due to missed IO,
> as with ZOOKEEPER-344, which is currently open.
>  https://issues.apache.org/jira/browse/ZOOKEEPER-344
> Also a Google search for "zookeeper session expired" turns up
> some conversation already on the topic.
>  - Andy
> > From: Chad Walters
> > Subject: RE: ZK rethink?
> > To: "hbase-dev@hadoop.apache.org" <hbase-dev@hadoop.apache.org>
> > Date: Tuesday, April 7, 2009, 10:57 AM
> >
> > Has this been discussed at all with the ZooKeeper
> > developers?
> >
> > Chad
> >
> > -----Original Message-----
> > From: Andrew Purtell [mailto:apurtell@apache.org]
> > Sent: Tuesday, April 07, 2009 10:53 AM
> > To: hbase-dev@hadoop.apache.org
> > Subject: ZK rethink?
> >
> >
> > I think an assumption about ZK has been made that is wrong:
> > The assumption is that ZK sessions are reliable, so taking
> > immediate action from a watcher when an ephemeral node goes
> > away is safe, but ZK sessions can expire for a number of
> > reasons not related to the process holding the handle going
> > away. So serious issues like HBASE-1314 result.
> >
> > Some problems related to session expiration can be easily
> > handled by having the ZK wrapper reinitialize the ZK handle
> > and recreate ephemeral nodes when it is informed that its
> > session has expired. However the problem with watchers
> > seeing deletions and taking (inappropriate) action remains.
> > In my opinion, every place in the code where watchers on
> > znodes are used to determine the state of something needs
> > to be reworked.
> >
> > One option is to start a timer when a znode disappears and
> > watch for its reappearance while the timer is running. If
> > the timer expires without reappearance, then take action.
> >
> > Another option is to not use ephemeral nodes. Have the
> > readers discover their znodes of interest and then poll
> > them. Include timestamps in the stored data to determine
> > freshness. Declare a node expired beyond some delta between
> > last update and current time, and then take action. (The
> > poller can delete the znode also to clean up.)
> >
> >    - Andy
