zookeeper-dev mailing list archives

From "Marco P. (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-2748) Four-letter command to voluntarily drop client connections
Date Mon, 10 Apr 2017 16:28:41 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963114#comment-15963114
] 

Marco P. commented on ZOOKEEPER-2748:
-------------------------------------

Thanks for the comment, Michael.
While I agree ZOOKEEPER-571 is the actual, proper, long-term solution to client imbalance,
I am not going to hold my breath for that JIRA :-). It has been open for 8 years with no visible
work on it to date.
This patch is not a holistic solution, but is a useful tool to have available, if needed.
And it is available today.
So I think there is still value here. 
No denying we would all like to see it deprecated by something better in the future.
The risk also seems pretty small since, unless one goes and manually hits this admin command,
this new code is unreachable (i.e. not changing any working component, just adding some behaviors).

I will look into making this work with ZooKeeperAdmin if that's the proper way to do this
in 3.5+.


> Four-letter command to voluntarily drop client connections
> ----------------------------------------------------------
>
>                 Key: ZOOKEEPER-2748
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2748
>             Project: ZooKeeper
>          Issue Type: New Feature
>          Components: server
>            Reporter: Marco P.
>            Assignee: Marco P.
>            Priority: Minor
>
> In certain circumstances, it would be useful to be able to move clients from one server
to another.
> One example: a quorum that consists of 3 servers (A, B, C) with 1000 active client sessions,
where 900 clients are connected to server A, and the remaining 100 are split over B and C
(see below for an example of how this can happen).
> A will do a lot more work than B and C.
> Overall throughput will benefit by having the clients more evenly divided.
> If A fails, all of its clients will create an avalanche by migrating en masse to
a different server.
> There are other possible use cases for a mechanism to move clients: 
>  - Migrate away all clients before a server restart
>  - Migrate away some of the clients in response to runtime metrics (CPU/Memory usage, ...)
>  - Shuffle clients after adding more server capacity (e.g. adding Observer nodes)
> The simplest form of rebalancing which does not require major changes of protocol or
client code consists of requesting a server to voluntarily drop some number of connections.
> Clients should be able to transparently move to a different server.
> Patch introducing 4-letter commands to shed clients:
> https://github.com/apache/zookeeper/pull/215
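
To make the "drop some number of connections" idea concrete, here is a minimal sketch of how an operator might decide how many connections to ask each server to shed. It is not taken from the patch: the function name `connections_to_shed` and the "shed everything above an even share" policy are assumptions made purely for illustration.

```python
def connections_to_shed(counts):
    """Return, per server, how many connections exceed an even share.

    Hypothetical helper: the policy (shed anything above the ceiling of
    the mean) is an assumption, not what the patch implements.
    """
    total = sum(counts.values())
    fair_share = -(-total // len(counts))  # ceiling division
    return {name: max(0, n - fair_share) for name, n in counts.items()}


# The 900/100 split from the description, with B and C at 50 each.
print(connections_to_shed({"A": 900, "B": 50, "C": 50}))
```

Dropped clients choose their next server themselves, so a single round of shedding would likely only approximate an even split.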
> -- -- --
> How client imbalance happens in the first place, an example.
> Imagine servers A, B, C and 1000 clients connected.
> Initially clients are spread evenly (i.e. 333 clients per server).
> A: 333 (restarts: 0)
> B: 333 (restarts: 0)
> C: 334 (restarts: 0)
> Now restart servers a few times, always in A, B, C order (e.g. to pick up software
upgrades or configuration changes).
> Restart A:
> A: 0 (restarts: 1)
> B: 499 (restarts: 0)
> C: 500 (restarts: 0)
> Restart B:
> A: 250 (restarts: 1)
> B: 0 (restarts: 1)
> C: 750 (restarts: 0)
> Restart C:
> A: 625 (restarts: 1)
> B: 375 (restarts: 1)
> C: 0 (restarts: 1)
> The imbalance is pretty bad already. C is idle while A has a lot of work.
> A second round of restarts makes the situation even worse:
> Restart A:
> A: 0 (restarts: 2)
> B: 688 (restarts: 1)
> C: 313 (restarts: 1)
> Restart B:
> A: 344 (restarts: 2)
> B: 0 (restarts: 2)
> C: 657 (restarts: 1)
> Restart C:
> A: 673 (restarts: 2)
> B: 328 (restarts: 2)
> C: 0 (restarts: 2)
> Larger clusters (5, 7, 9 servers) make the imbalance even more evident.
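
The restart example above can be reproduced with a short simulation. This is only a sketch under a simplifying assumption: when a server restarts, its clients are split as evenly as possible across the remaining servers, whereas real clients pick a server at random. The names `restart` and `simulate` are illustrative and not part of ZooKeeper. Because the sketch keeps the total at exactly 1000, individual counts can differ by one from the rounded figures in the example.

```python
def restart(counts, server):
    """Move all clients of `server` to the others, split as evenly as possible."""
    others = [s for s in counts if s != server]
    share, extra = divmod(counts[server], len(others))
    counts[server] = 0  # the restarted server comes back with no clients
    for i, s in enumerate(others):
        counts[s] += share + (1 if i < extra else 0)


def simulate(initial, rounds):
    """Restart every server once per round, in dictionary order."""
    counts = dict(initial)
    for _ in range(rounds):
        for server in list(counts):
            restart(counts, server)
    return counts


start = {"A": 333, "B": 333, "C": 334}
print(simulate(start, 1))  # after the first round of restarts
print(simulate(start, 2))  # after the second round
```

Passing a dictionary with 5, 7, or 9 entries is a quick way to explore the larger-cluster case.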



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
