zookeeper-dev mailing list archives

From "Yicheng Fang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (ZOOKEEPER-2899) Zookeeper not receiving packets after ZXID overflows
Date Fri, 15 Sep 2017 17:56:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168260#comment-16168260
] 

Yicheng Fang edited comment on ZOOKEEPER-2899 at 9/15/17 5:55 PM:
------------------------------------------------------------------

We tried running 'ZxidRolloverTest' with different setups but failed to reproduce the issue,
so we decided to use the same hardware as production. The experiments below used a 5-node
ZK ensemble, with zookeeper.testingonly.initialZxid set to a high enough value (a sketch of how the property can be passed is included after the experiment list):

1. Using tiny kazoo scripts, spawn client processes that each continuously create ZK nodes
at random and set data on them, generating the same number of connections as production,
while another set of clients randomly reads data from the nodes (see the kazoo sketch after this list).
   - Result: the ZXID overflowed. Leader election completed within 5 seconds. A short burst
of errors was seen on the client side, but the clients recovered right after the election.

2. Set up an 85-node Kafka broker cluster, then trigger the overflow with the same method as
in 1.
  - Result: same as 1. The Kafka brokers behaved normally.

3. Set up a test tool to generate ~100k messages/s for the Kafka cluster, and as many consumers
as needed to reach the 1500-per-node connection count. The consumers write consumption
offsets to ZK every 10 ms.
  - We noticed that after the ZXID overflowed a couple of times, the whole system began
acting strangely: metrics from the brokers became sporadic, ISRs became flappy, the metric volume
sent by Kafka dropped, etc. See the attachments 'message_in_per_sec.png', 'metric_volume.png',
and 'GC_metric.png' for screenshots.
  - From the 'srvr' stats, latency became '0/[>100]/[>200]', vs. '0/0/[<100]' under normal
conditions (see the 'srvr' polling sketch after this list). Profiling ZK revealed that this was
because the ensemble received such a high QPS of write traffic (presumably from the Kafka
consumers) that the 'submittedRequests' queue in the leader's 'PrepRequestProcessor' filled up
(>6500 entries in the queue each time a new request was added), causing even reads to have high latencies.
  - It looked to us as if electing a new leader on overflow somehow caused the consumers
to align, thus DDoSing the ensemble. However, we have not observed the same behavior after
bouncing the leader process BEFORE the overflow, even though the ensemble should behave similarly
in both cases since both trigger new leader elections. One difference we noticed was that
in the overflow case the leader election port was left open, so the downed leader would participate
in the new round of leader election. Not sure if it's related, but we thought it might be worth bringing
up.
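
For reference, a minimal sketch of one way the property can be passed to the servers; the file
location and value below are illustrative assumptions (the exact value/format the property
accepts may be version-dependent), not necessarily what we used:

    # conf/java.env on each server -- illustrative only; picks a value close to
    # the 32-bit counter limit so rollover happens after a few thousand writes
    export JVMFLAGS="-Dzookeeper.testingonly.initialZxid=4294960000 $JVMFLAGS"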
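
For experiment 1, a minimal sketch of the kind of kazoo script we mean; hosts, paths, payload
sizes, and rates are placeholders, not our actual parameters:

    # load_writer.py -- one writer process; spawn as many of these (plus a
    # similar set of readers calling zk.get()/zk.get_children()) as needed to
    # match the production connection count
    import os
    import random
    import time
    from kazoo.client import KazooClient
    from kazoo.exceptions import NodeExistsError

    ZK_HOSTS = "zk1:2181,zk2:2181,zk3:2181,zk4:2181,zk5:2181"  # placeholder hosts

    def writer():
        zk = KazooClient(hosts=ZK_HOSTS)
        zk.start()
        zk.ensure_path("/loadtest")
        while True:
            path = "/loadtest/node-%d" % random.randint(0, 10000)
            data = os.urandom(64)                     # small random payload
            try:
                zk.create(path, data, makepath=True)  # create a new node ...
            except NodeExistsError:
                zk.set(path, data)                    # ... or overwrite an existing one
            time.sleep(random.uniform(0, 0.01))       # throttle each client a little

    if __name__ == "__main__":
        writer()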
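
The 'srvr' latency numbers above were read via the standard four-letter-word interface; a small
sketch of how that can be polled (the host name is a placeholder):

    # poll_srvr.py -- print the "Latency min/avg/max" line reported by 'srvr'
    import socket

    def srvr(host, port=2181):
        with socket.create_connection((host, port), timeout=5) as s:
            s.sendall(b"srvr")
            chunks = []
            while True:
                data = s.recv(4096)
                if not data:                  # server closes the connection when done
                    break
                chunks.append(data)
        return b"".join(chunks).decode()

    for line in srvr("zk1").splitlines():     # "zk1" is a placeholder host
        if line.startswith("Latency"):
            print(line)                       # e.g. "Latency min/avg/max: 0/0/23"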
    



> Zookeeper not receiving packets after ZXID overflows
> ----------------------------------------------------
>
>                 Key: ZOOKEEPER-2899
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2899
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: leaderElection
>    Affects Versions: 3.4.5
>         Environment: 5 host ensemble, 1500+ client connections each, 300K+ nodes
> OS: Ubuntu precise
> JAVA 7
> JuniperQFX510048T NIC, 10000Mb/s, ixgbe driver
> 6 core Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
> 4 HDD 600G each 
>            Reporter: Yicheng Fang
>         Attachments: GC_metric.png, image12.png, image13.png, message_in_per_sec.png,
metric_volume.png, zk_20170309_wo_noise.log
>
>
> ZK was used with Kafka (version 0.10.0) for coordination. We had a lot of Kafka consumers
writing consumption offsets to ZK.
> We observed the issue twice within the last year. Each time, after the ZXID overflowed,
ZK was not receiving packets even though leader election looked successful from the logs
and the ZK servers were up. As a result, the whole Kafka system came to a halt.
> In an attempt to reproduce (and hopefully fix) the issue, I set up test ZK and Kafka
clusters and fed them production-like test traffic. Though not really able to reproduce
the issue, I did see that the Kafka consumers, which used ZK clients, essentially DoSed the
ensemble, filling up the `submittedRequests` queue in `PrepRequestProcessor` and causing even
100ms+ read latencies.
> More details are included in the comments.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
