zookeeper-issues mailing list archives

From "Michael Han (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-3418) Improve quorum throughput through eager ACL checks of requests on local servers
Date Fri, 14 Jun 2019 05:12:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16863694#comment-16863694 ]

Michael Han commented on ZOOKEEPER-3418:

[~lvfangmin] I like this idea. If a transaction does not change the state of the system,
we should not let it flow through the consensus part, which is by design not scalable. This
is similar to what this Jira is doing, though this Jira scopes the check to the local server only.



I was thinking about what could go wrong if we do this, in particular what would happen if
a leader election (LE) happens, since error transactions will no longer be synced to followers.
Because error transactions don't change the state of the data tree, their absence on followers
seems OK from a recovery point of view. I'd imagine we just need a conditional processing
pipeline on the leader: instead of unconditionally sending every request from the preprocessor
to the proposal processor, we only send requests that pass validation. Is this the "protocol
changes" you referred to, or are there other cases we need to consider here?
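
A minimal sketch of the conditional pipeline idea, not the actual ZooKeeper processor chain. All names here (`ConditionalLeaderPipeline`, `WriteRequest`, `submit`) are hypothetical, and the validator is just a stand-in for whatever check the preprocessor would run:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; none of these types are ZooKeeper classes.
final class ConditionalLeaderPipeline {

    // Minimal stand-in for a write request arriving at the leader.
    static final class WriteRequest {
        final long id;
        final String path;

        WriteRequest(long id, String path) {
            this.id = id;
            this.path = path;
        }
    }

    // Stand-in for the validation done in the preprocessing stage.
    private final Predicate<WriteRequest> validator;

    // Requests that would be handed to the proposal stage (quorum path).
    final List<WriteRequest> proposed = new ArrayList<>();

    // Requests answered with an error on the leader, with no proposal sent.
    final List<WriteRequest> rejectedWithoutProposal = new ArrayList<>();

    ConditionalLeaderPipeline(Predicate<WriteRequest> validator) {
        this.validator = validator;
    }

    // Forward only requests that pass validation; short-circuit the rest.
    void submit(WriteRequest request) {
        if (validator.test(request)) {
            proposed.add(request);
        } else {
            rejectedWithoutProposal.add(request);
        }
    }
}
```

With a validator such as `r -> !r.path.startsWith("/locked")`, a write to `/locked/x` would end up in `rejectedWithoutProposal` and never reach the followers, which is the case the recovery question above is about.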


> Improve quorum throughput through eager ACL checks of requests on local servers
> -------------------------------------------------------------------------------
>                 Key: ZOOKEEPER-3418
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3418
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>    Affects Versions: 3.6.0
>            Reporter: Michael Han
>            Assignee: Michael Han
>            Priority: Major
>              Labels: Twitter, pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
> Serving write requests that change the state of the system requires quorum operations,
> and in some cases the quorum operations can be avoided if the requests are doomed to fail.
> An ACL check failure is such a case. To optimize for it, we elevate the ACL check logic
> and perform an eager ACL check on the local server (where the requests are received), failing
> fast before sending the requests to the leader.
> As with any feature, there is a feature flag that turns this feature on or off (off is the
> default). The feature is also forward compatible: any new Op code (and some existing Op codes
> we do not explicitly check) will pass the check and (potentially) fail on the leader side,
> instead of being prematurely filtered out on the local server.
> The end result is better throughput and stability of the quorum for certain workloads.
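
The decision logic described in the issue could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the pull request: `EagerAclCheck`, `Decision`, the string op names, and the toy path-to-writers table are all hypothetical, and real ZooKeeper ACLs are richer than a set of user names per path.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of an eager, local ACL check; not ZooKeeper code.
final class EagerAclCheck {

    enum Decision { FORWARD_TO_LEADER, FAIL_FAST }

    // Assumed set of operations the local check knows how to evaluate.
    private static final Set<String> CHECKED_OPS =
            new HashSet<>(Arrays.asList("create", "delete", "setData"));

    // Feature flag; the issue description says the feature is off by default.
    private final boolean enabled;

    // Toy ACL table: path -> users allowed to write to it.
    private final Map<String, Set<String>> writersByPath;

    EagerAclCheck(boolean enabled, Map<String, Set<String>> writersByPath) {
        this.enabled = enabled;
        this.writersByPath = writersByPath;
    }

    Decision check(String op, String path, String user) {
        // Flag off: behave as before and let the leader decide.
        if (!enabled) {
            return Decision.FORWARD_TO_LEADER;
        }
        // Forward compatibility: ops we do not explicitly check pass through
        // and may still fail later on the leader.
        if (!CHECKED_OPS.contains(op)) {
            return Decision.FORWARD_TO_LEADER;
        }
        // No local ACL information for this path: defer to the leader.
        Set<String> writers = writersByPath.get(path);
        if (writers == null) {
            return Decision.FORWARD_TO_LEADER;
        }
        // Known op, known path: reject locally only if the check clearly fails.
        return writers.contains(user) ? Decision.FORWARD_TO_LEADER : Decision.FAIL_FAST;
    }
}
```

The only branch that returns `FAIL_FAST` is the one where the check is enabled, the op is one the local server knows how to check, and the caller is not permitted. Every other case falls through to the leader, so the sketch never rejects a request the leader might have accepted under these assumptions.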

This message was sent by Atlassian JIRA
