zookeeper-dev mailing list archives

From Andor Molnar <an...@apache.org>
Subject Re: KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum
Date Tue, 13 Aug 2019 15:18:03 GMT
Subscriber API
https://issues.apache.org/jira/browse/ZOOKEEPER-153

Is it supposed to be something like a generic Observer API on the client side?
Observers essentially consume the ordered updates of ZAB, so we would need to provide a way for
users to implement their own “observers”. For convenience, they should be able to filter
updates by path.
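
To make the idea concrete, here is a minimal sketch of what a path-filtered subscription over an ordered update stream could look like. It is only an in-memory simulation: `UpdateLog`, `Txn`, `subscribe`, and `path_matches` are hypothetical names invented for illustration, not part of the ZooKeeper client API or of the ZOOKEEPER-153 proposal.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Txn:
    zxid: int    # position in the ordered update stream
    path: str
    data: bytes


def path_matches(prefix: str, path: str) -> bool:
    """True if path is the prefix itself or lies below it in the tree."""
    return path == prefix or path.startswith(prefix.rstrip("/") + "/")


class UpdateLog:
    """Hypothetical stand-in for the ordered stream of committed updates."""

    def __init__(self) -> None:
        self._subscribers: List[Tuple[str, Callable[[Txn], None]]] = []
        self._next_zxid = 1

    def subscribe(self, prefix: str, callback: Callable[[Txn], None]) -> None:
        # Register a user-supplied "observer" restricted to one subtree.
        self._subscribers.append((prefix, callback))

    def commit(self, path: str, data: bytes) -> Txn:
        # Assign the next zxid and deliver to matching subscribers in order.
        txn = Txn(self._next_zxid, path, data)
        self._next_zxid += 1
        for prefix, callback in self._subscribers:
            if path_matches(prefix, path):
                callback(txn)
        return txn


log = UpdateLog()
seen: List[Txn] = []
log.subscribe("/brokers", seen.append)

log.commit("/brokers/topics/t1", b"isr=1,2")
log.commit("/config/x", b"not delivered")
log.commit("/brokers/topics/t1", b"isr=1")
```

In this sketch the subscriber receives every update under `/brokers`, with its data and in commit order, and nothing from other subtrees. A real implementation would also have to deal with catch-up after a disconnect, which is where keeping tlogs around would matter.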

Andor



> On 2019. Aug 2., at 20:48, Patrick Hunt <phunt@apache.org> wrote:
> 
> Michael I think you are describing subscribe - this?
> https://issues.apache.org/jira/browse/ZOOKEEPER-153
> Wasn't there some work done to keep tlogs around for a while? Or am I
> misremembering? (fb folks?)
> 
> I'll also add that we haven't done any benchmarking in quite some time. It
> would be interesting to collect a few of these use cases from the
> community, esp downstreams, and evaluate performance, see if we can address.
> 
> Patrick
> 
> On Fri, Aug 2, 2019 at 11:03 AM Michael Han <hanm@apache.org> wrote:
> 
>> Folks,
>> 
>> Some of you might have already seen this. Comments?
>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-500%3A+Replace+ZooKeeper+with+a+Self-Managed+Metadata+Quorum
>> 
>> 
>> What caught my eye is:
>> 
>> *Worse still, although ZooKeeper is the store of record, the state in
>> ZooKeeper often doesn't match the state that is held in memory in the
>> controller.  For example, when a partition leader changes its ISR in ZK,
>> the controller will typically not learn about these changes for many
>> seconds.  There is no generic way for the controller to follow the
>> ZooKeeper event log.  Although the controller can set one-shot watches, the
>> number of watches is limited for performance reasons.  When a watch
>> triggers, it doesn't tell the controller the current state-- only that the
>> state has changed.  By the time the controller re-reads the znode and sets
>> up a new watch, the state may have changed from what it was when the watch
>> originally fired.  If there is no watch set, the controller may not learn
>> about the change at all.  In some cases, restarting the controller is the
>> only way to resolve the discrepancy.*
>> 
>> I've seen some similar zookeeper use cases that ended up like what's
>> described here. How can ZooKeeper solve this? It seems to me that the only
>> solution is to provide linearizable read on watched operations. Thoughts?
>> 
>> Michael.
>> 
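
The one-shot watch gap described in the quoted KIP text can be reproduced with a small simulation. This is a sketch under stated assumptions: `Znode` is a hypothetical in-memory class, not the ZooKeeper client API. It models only two properties mentioned above: a watch fires once, and the notification carries no data.

```python
class Znode:
    """Hypothetical in-memory znode with one-shot watches."""

    def __init__(self, data: str) -> None:
        self.data = data
        self.version = 0
        self._watchers = []

    def get(self, watch=None):
        # Reading may register a watch for the next change only.
        if watch is not None:
            self._watchers.append(watch)
        return self.data, self.version

    def set(self, data: str) -> None:
        self.data = data
        self.version += 1
        # One-shot: clear the watch list before notifying.
        fired, self._watchers = self._watchers, []
        for notify in fired:
            notify()  # the notification says "changed", not what changed


notifications = []
on_change = lambda: notifications.append("changed")

node = Znode("isr=1,2,3")
node.get(watch=on_change)   # controller reads and sets a watch

node.set("isr=1,2")         # fires the watch
node.set("isr=1")           # no watch is registered yet: silent

# Controller finally reacts to the first notification and re-reads.
data, version = node.get(watch=on_change)
```

Here the controller receives a single notification although two changes happened, and the intermediate state `isr=1,2` is never observed; only the version number (2) hints that an update was skipped.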

