hbase-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject RE: ZK rethink?
Date Tue, 07 Apr 2009 18:10:38 GMT

Hi Chad,

In my testing, session expiration happens due to missed IO,
as with ZOOKEEPER-344, which is currently open.


Also a Google search for "zookeeper session expired" turns up
some conversation already on the topic.

  - Andy

> From: Chad Walters
> Subject: RE: ZK rethink?
> To: "hbase-dev@hadoop.apache.org" <hbase-dev@hadoop.apache.org>
> Date: Tuesday, April 7, 2009, 10:57 AM
> Has this been discussed at all with the ZooKeeper
> developers?
> Chad
> -----Original Message-----
> From: Andrew Purtell [mailto:apurtell@apache.org] 
> Sent: Tuesday, April 07, 2009 10:53 AM
> To: hbase-dev@hadoop.apache.org
> Subject: ZK rethink?
> I think an assumption about ZK has been made that is wrong.
> The assumption is that ZK sessions are reliable, so taking
> immediate action from a watcher when an ephemeral node goes
> away is safe. But ZK sessions can expire for a number of
> reasons unrelated to the process holding the handle going
> away, so serious issues like HBASE-1314 result.
>
> Some problems related to session expiration can be easily
> handled by having the ZK wrapper reinitialize the ZK handle
> and recreate ephemeral nodes when it is informed that its
> session has expired. However, the problem of watchers
> seeing deletions and taking (inappropriate) action remains.
> In my opinion, every place in the code where watchers on
> znodes are used to determine the state of something needs
> to be reworked.
>
> One option is to start a timer when a znode disappears and
> watch for its reappearance while the timer is running. If
> the timer expires without reappearance, then take action.
>
> Another option is to not use ephemeral nodes. Have the
> readers discover their znodes of interest and then poll
> them. Include timestamps in the stored data to determine
> freshness. Declare a node expired once the delta between
> its last update and the current time exceeds some threshold,
> and then take action. (The poller can also delete the znode
> to clean up.)
>
>    - Andy
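
The timer option in the quoted message could be sketched roughly as follows. This is only an illustration of the bookkeeping, not HBase or ZooKeeper code: the class name `GracePeriodTracker`, its methods, and the example znode paths are all hypothetical, and the caller is assumed to feed in deletion and creation events from its own watchers and to pass the current time explicitly.

```python
class GracePeriodTracker:
    """Defers reaction to a znode deletion until a grace period has passed.

    Hypothetical helper: it knows nothing about ZooKeeper itself and only
    records when a path was last seen to disappear.
    """

    def __init__(self, grace_seconds):
        self.grace_seconds = grace_seconds
        self._missing_since = {}  # path -> time the deletion was observed

    def node_deleted(self, path, now):
        # Keep the earliest deletion time if the path is already tracked.
        self._missing_since.setdefault(path, now)

    def node_created(self, path):
        # The node reappeared in time, so cancel the pending action.
        self._missing_since.pop(path, None)

    def expired(self, now):
        """Return paths missing for at least the grace period, then forget them."""
        lost = [
            path
            for path, since in self._missing_since.items()
            if now - since >= self.grace_seconds
        ]
        for path in lost:
            del self._missing_since[path]
        return lost
```

A caller would invoke `expired(now)` periodically and act only on the paths it returns, so a node that is recreated after a session expiry never triggers the action.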

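The polling option in the quoted message might look something like the sketch below. The payload format (JSON with an `updated_at` field), the function names, and the znode paths are assumptions made for illustration; the message does not specify any of them. The `snapshot` argument stands in for whatever data a poller has just read from its znodes of interest. Comparing a writer's timestamp against the reader's clock also assumes the two clocks are reasonably synchronized.

```python
import json


def encode_heartbeat(now):
    """Build the znode payload a writer would store (assumed JSON format)."""
    return json.dumps({"updated_at": now}).encode("utf-8")


def is_stale(data, now, max_age_seconds):
    """True if the payload's timestamp is older than max_age_seconds."""
    updated_at = json.loads(data.decode("utf-8"))["updated_at"]
    return now - updated_at > max_age_seconds


def stale_nodes(snapshot, now, max_age_seconds):
    """Return the sorted paths in a {path: payload} snapshot that are stale."""
    return sorted(
        path
        for path, data in snapshot.items()
        if is_stale(data, now, max_age_seconds)
    )
```

The poller would then take action on, and optionally delete, each path returned by `stale_nodes`.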
