On Thu, Feb 3, 2011 at 2:57 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
> The result was the client never realized that it's session was
> actually timed out, and the HBase processes continued to run. Kill -9
> and a restart fixed it.
Hi Ryan,
there are two issues at play here, session timeout and session
expiration. Correct me if I'm wrong but I think you meant to say "the
client never realized that it's session was actually _expired_". Which
is correct behavior. Clients can only determine if a session is
expired once they reconnect to the cluster. Session timeout on the
other hand happens when the server heartbeat is not received by the
client w/in the session timeout period. Clients who are disconnected
from the cluster will attempt to reconnect back to the cluster until
they are successful. When a client is disconnected the client's
watchers will be notified about the disconnect. (same for expiration).
See questions 1 & 2 here in the faq, specifically "Example state
transitions" in question 2:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/FAQ
Your clients were stuck btw steps 4 and 5 (which they will never reach
in your scenario).
Does that help?
Patrick
|