zookeeper-dev mailing list archives

From "Ruslan Nigmatullin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-3086) [server] Lack of write timeouts causes quorum to stuck
Date Fri, 20 Jul 2018 17:50:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16551051#comment-16551051 ]

Ruslan Nigmatullin commented on ZOOKEEPER-3086:

The ZK codebase uses `Socket.setSoTimeout` to set up a timeout; however, according to the [documentation|https://docs.oracle.com/javase/8/docs/api/java/net/Socket.html#setSoTimeout-int-],
it applies only to read operations.
{quote}With this option set to a non-zero timeout, a read() call on the InputStream associated
with this Socket will block for only this amount of time.{quote}
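To illustrate the read-side behavior described in the quote, here is a minimal sketch (not ZooKeeper code; the class and method names are hypothetical). It connects to a loopback listener that never sends anything and checks that `read()` gives up with `SocketTimeoutException` once the `setSoTimeout` interval expires.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class, not part of ZooKeeper.
class SoTimeoutReadDemo {

    /**
     * Connects to a silent loopback peer and reports whether read()
     * was interrupted by the SO_TIMEOUT set via setSoTimeout.
     */
    static boolean readTimesOut(int timeoutMs) {
        // The listener never accepts or writes; the connection simply sits in its backlog.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            client.setSoTimeout(timeoutMs);
            try {
                client.getInputStream().read(); // blocks: the peer sends nothing
                return false;                   // only reached if data or EOF arrived
            } catch (SocketTimeoutException expected) {
                return true;                    // the read was bounded by the timeout
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

Nothing comparable happens for `OutputStream.write`: the timeout configured here does not bound a write that blocks on a full send buffer.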

> [server] Lack of write timeouts causes quorum to stuck
> ------------------------------------------------------
>                 Key: ZOOKEEPER-3086
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3086
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.5.4, 3.4.12
>         Environment: Linux 4.13.0-32-generic, Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
>            Reporter: Ruslan Nigmatullin
>            Priority: Major
>         Attachments: zookeeper-threads.txt
> A network outage on the leader host can cause the `QuorumPeer` thread to get stuck for a prolonged period
of time (2+ hours, depending on TCP keepalive settings). This effectively stalls the whole ZooKeeper
server, making it inoperable. We found it during one of our internal DRTs (Disaster Recovery
> The scenario that triggers the behavior (requires a relatively high ping load to the follower):
>  # `Follower.processPacket` processes a `Leader.PING` message
>  # The leader becomes network partitioned
>  # `Learner.ping` attempts to write to the leader socket
>  # If the socket's write buffer is full (due to other ping/sync calls), `Learner.ping` blocks
>  # Because the leader is partitioned, `Learner.ping` blocks forever due to the lack of a write timeout
>  # `QuorumPeer` is the only thread reading from the leader socket, so the whole server
is stuck and can't recover without a manual process restart.
> A thread dump from the affected server is attached.
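One possible way to bound a blocking write, sketched under assumptions: a watchdog task closes the socket if the write has not finished within a deadline, relying on the documented behavior of `Socket.close()` that a thread blocked in I/O on that socket gets a `SocketException`. This is not the fix adopted by ZooKeeper; `WriteTimeoutSketch`, `writeWithTimeout`, and `writeToStalledPeer` are hypothetical names, and the 64 MB payload in the usage below is an assumption chosen to exceed typical loopback socket buffers.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch, not part of ZooKeeper.
class WriteTimeoutSketch {

    /**
     * Writes data to the socket. If the write is still blocked after timeoutMs,
     * a watchdog closes the socket, which unblocks the writer with an exception.
     * Returns true if the write completed, false if the deadline fired first.
     * Note: after a timeout the socket is closed and cannot be reused. If the
     * watchdog fires just as the write completes, the socket may be closed even
     * though true is returned.
     */
    static boolean writeWithTimeout(Socket socket, byte[] data, long timeoutMs) throws IOException {
        ScheduledExecutorService watchdog = Executors.newSingleThreadScheduledExecutor();
        AtomicBoolean timedOut = new AtomicBoolean(false);
        ScheduledFuture<?> killer = watchdog.schedule(() -> {
            timedOut.set(true);
            try {
                socket.close(); // forces the blocked write() to throw
            } catch (IOException ignored) {
                // nothing useful to do if close itself fails
            }
        }, timeoutMs, TimeUnit.MILLISECONDS);
        try {
            OutputStream out = socket.getOutputStream();
            out.write(data);
            out.flush();
            return true;
        } catch (IOException e) {
            if (timedOut.get()) {
                return false; // the failure was caused by our own deadline
            }
            throw e;          // a genuine I/O error unrelated to the deadline
        } finally {
            killer.cancel(false);
            watchdog.shutdownNow();
        }
    }

    /** Writes payloadBytes to a loopback peer that never reads (assumed stand-in for a partitioned leader). */
    static boolean writeToStalledPeer(int payloadBytes, long timeoutMs) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            return writeWithTimeout(client, new byte[payloadBytes], timeoutMs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("small write completed: " + writeToStalledPeer(16, 2000));
        System.out.println("large write completed: " + writeToStalledPeer(64 * 1024 * 1024, 500));
    }
}
```

The small write fits into the kernel buffers and returns immediately, while the large one fills them and would otherwise block indefinitely, mirroring step 5 of the scenario. A non-blocking `SocketChannel` with a `Selector` and a select timeout is another option if closing the socket on timeout is not acceptable.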
