zookeeper-bookkeeper-user mailing list archives

From Flavio Junqueira <...@yahoo-inc.com>
Subject Re: ZooKeeper Session Expiration
Date Tue, 01 May 2012 13:29:44 GMT
I don't know if this is your case, but in the past we have seen such issues with ZooKeeper
caused by GC pauses. I remember one case with HBase, and I think it is this one:

	https://issues.apache.org/jira/browse/HBASE-1316

We have seen ZooKeeper clusters serving thousands of clients, so ~100 shouldn't be a problem.
Still, session expiration is part of ZooKeeper, so we need to deal with it here as well.

-Flavio

On May 1, 2012, at 3:14 PM, John Nagro wrote:

> Flavio -
> 
> We're trying to get to the bottom of it. As I understand it, in a properly configured
> and operating ZK cluster we should never see a session expiration exception. Globally (including
> all systems) we see them perhaps once a week for the last month - and it causes some issues
> in our system. We saw one last night, and bookkeeper had an issue a couple days ago.
> 
> We do have a lot of nodes connecting to zookeeper for various things. We have a home-built
> configuration management tool that uses zk as the data store, the bookkeeper stuff obviously
> does, my coordination on top of the bookkeeper ledgers uses it, etc. So yes, lots of machines
> (dozens up to ~100) talk to this zk cluster in some fashion or another - we have other clusters
> too. Ultimately, more machines will talk to the configuration stuff in the long term. I could
> potentially move my zk stuff off that cluster if you think it would help.
> 
> -John
> 
> On Tue, May 1, 2012 at 8:44 AM, Flavio Junqueira <fpj@yahoo-inc.com> wrote:
> This is definitely not ideal. If you lose your zookeeper session, then you're not able
> to close your open ledgers, which will force ledger recovery. It is not a correctness issue,
> but it is certainly inconvenient. We need to fix it, and I'm glad that Uma is already looking
> into it.
> 
> I'm curious about why you're getting session expirations, though. Is it frequent, or did
> you get it just once? Do you have many nodes connecting to your ZooKeeper instance?
> 
> -Flavio
> 
> 
> On May 1, 2012, at 2:07 PM, John Nagro wrote:
> 
>> Thanks Uma - that is exactly what I am looking for. The way I am handling it now is
>> to pass a bookkeeper client factory rather than an instance. When I encounter zk session
>> expiration, I create a new client and discard the old one - getting a fresh set of
>> connections to zk. Perhaps not ideal, but it gets the job done.
>> 
>> thanks!
>> 
>> -John
>> 
>> On Tue, May 1, 2012 at 12:09 AM, Uma Maheswara Rao G <maheswara@huawei.com> wrote:
>> Hi John,
>> 
>> The BK client needs to handle session expiration events from ZK. The issue tracking
>> that is BOOKKEEPER-225.
>> We will implement it soon. I hope this answers your question. Please correct me if I
>> have misinterpreted it.
>> 
>> Thanks a lot,
>> Uma
>> From: John Nagro [jnagro@hubspot.com]
>> Sent: Tuesday, May 01, 2012 1:20 AM
>> To: bookkeeper-user@zookeeper.apache.org
>> Subject: ZooKeeper Session Expiration
>> 
>> Hello -
>> 
>> If I start seeing ZKExceptions in the BK client, which appear to be due to SessionExpiration
>> errors... it seems that the BookKeeper client never recovers from that? Is that correct?
>> 
>> Thanks!
>> 
>> -John Nagro
>> 
> 
> 
> 
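
For readers of the archive: the workaround John describes in the quoted thread (passing a client factory instead of a client instance, then discarding and rebuilding the client on session expiration) could be sketched roughly as below. This is a minimal illustration, not BookKeeper code. `SessionExpiredException`, `RecreatingClient`, and `FakeClient` are all hypothetical names that stand in for the real ZooKeeper error and the real BookKeeper client, and the retry-once policy is an assumption. As Flavio notes in the thread, ledgers that were open on the discarded client cannot be closed and will go through ledger recovery; the sketch does nothing about that.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-in for a ZooKeeper session-expiration error surfaced by the client.
class SessionExpiredException extends RuntimeException {
    SessionExpiredException(String message) { super(message); }
}

// Holds a client built by a factory and replaces it when its session expires.
final class RecreatingClient<C extends AutoCloseable> {
    private final Supplier<C> factory;
    private C current;
    private int rebuilds = 0;

    RecreatingClient(Supplier<C> factory) {
        this.factory = factory;
        this.current = factory.get();
    }

    // Runs op against the current client. On session expiration the old client
    // is closed and discarded, a fresh one is built, and op is retried once.
    synchronized <R> R call(Function<C, R> op) {
        try {
            return op.apply(current);
        } catch (SessionExpiredException e) {
            closeQuietly(current);
            current = factory.get();
            rebuilds++;
            return op.apply(current);
        }
    }

    synchronized int rebuilds() { return rebuilds; }

    private static void closeQuietly(AutoCloseable c) {
        try { c.close(); } catch (Exception ignored) { /* old client is being discarded */ }
    }
}

// Hypothetical fake client used only to exercise the wrapper.
final class FakeClient implements AutoCloseable {
    final int id;
    boolean expired;
    boolean closed;

    FakeClient(int id) { this.id = id; }

    String write(String entry) {
        if (expired) throw new SessionExpiredException("session expired on client " + id);
        return "client-" + id + ":" + entry;
    }

    @Override public void close() { closed = true; }
}

final class SessionExpiryDemo {
    public static void main(String[] args) {
        int[] nextId = {0};
        FakeClient[] latest = new FakeClient[1];
        RecreatingClient<FakeClient> client = new RecreatingClient<>(() -> {
            latest[0] = new FakeClient(++nextId[0]);
            return latest[0];
        });

        System.out.println(client.call(c -> c.write("entry-1")));
        latest[0].expired = true; // simulate a session expiration
        System.out.println(client.call(c -> c.write("entry-2")));
        System.out.println("rebuilds: " + client.rebuilds());
    }
}
```

Keeping the factory behind one wrapper means callers never hold a reference to a client whose session is gone, which is the point of passing a factory rather than an instance.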

flavio
junqueira
senior research scientist
 
fpj@yahoo-inc.com
direct +34 93-183-8828
 
avinguda diagonal 177, 8th floor, barcelona, 08018, es
phone (408) 349 3300    fax (408) 349 3301

