kudu-commits mailing list archives

From t...@apache.org
Subject kudu git commit: KUDU-1788. Increase Raft RPC timeout to 30sec to avoid fruitless retries.
Date Thu, 05 Oct 2017 17:42:45 GMT
Repository: kudu
Updated Branches:
  refs/heads/master a2485f501 -> caff45f5d

KUDU-1788. Increase Raft RPC timeout to 30sec to avoid fruitless retries.

The Raft leader's behavior on a timeout is to simply retry the request,
potentially aggregating more data into the new attempt if new data is
waiting in the queue.

However, as described in the JIRA, this behavior is counterproductive in
the case that the network pipe or associated reactor thread is
saturated. The original request may be in the middle of transmission
already, and so the retry ends up re-sending bytes which have already
been sent, increasing "throughput" but not increasing "goodput".
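
As a rough illustration of the throughput/goodput gap described above, the
following C++ sketch models a saturated pipe that needs a fixed time to
transmit one request, while the sender queues another full copy every time
the timeout fires. This is a toy model under stated assumptions, not Kudu
code: RetryCost and SimulateRetries are hypothetical names, the pipe is
assumed to be FIFO, and each retry is assumed to resend exactly the same
bytes (the real leader may aggregate more data into a retry).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type for this illustration only.
struct RetryCost {
  int64_t attempts;      // copies of the request queued onto the pipe
  int64_t bytes_queued;  // total bytes queued ("throughput")
  int64_t bytes_useful;  // bytes the receiver actually needed ("goodput")
};

// Toy model: the first copy finishes transmitting after transmit_ms.
// Until then, the sender queues one more full copy at every multiple
// of timeout_ms (t = 0, timeout_ms, 2 * timeout_ms, ...).
RetryCost SimulateRetries(int64_t request_bytes, int64_t transmit_ms,
                          int64_t timeout_ms) {
  assert(request_bytes > 0 && transmit_ms > 0 && timeout_ms > 0);
  // Number of send times strictly before transmit_ms:
  // ceil(transmit_ms / timeout_ms).
  const int64_t attempts = (transmit_ms + timeout_ms - 1) / timeout_ms;
  return {attempts, attempts * request_bytes, request_bytes};
}
```

In this model, a 1 MB request that needs 5 seconds on the saturated pipe is
queued five times with a 1000 ms timeout (SimulateRetries(1000000, 5000,
1000)), but only once with a 30000 ms timeout, so every queued byte is
useful in the second case.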

The original Raft timeout was set to 1 second mainly due to KUDU-699, an
old bug in which the leader would block waiting on outstanding requests
to followers before it would step down. That was fixed quite a long time
back, though, so there is no longer any good reason to have such a short
timeout on a Raft request.

This patch bumps the default timeout to 30 seconds. I tested this on an
8-node cluster by using iptables to inject 1% packet loss on all nodes
and running an insertion workload as described in the JIRA. Without the
patch, if I did a 'kill -STOP' of a node and waited a couple of seconds
before allowing it to continue, I would see that node log "deduplicated
request" messages for 30-60 seconds before it eventually caught up.
During that time, the tablet was effectively using only two replicas,
causing increased latency, etc.

With the higher timeout, I didn't see these messages, and the unpaused
replica caught up much more quickly.

Change-Id: I5f47dc006dc3dfb1659a224172e1905b6bf3d2a4
Reviewed-on: http://gerrit.cloudera.org:8080/8037
Reviewed-by: David Ribeiro Alves <davidralves@gmail.com>
Reviewed-by: Mike Percy <mpercy@apache.org>
Tested-by: Kudu Jenkins

Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/caff45f5
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/caff45f5
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/caff45f5

Branch: refs/heads/master
Commit: caff45f5d6e38d21625122b5755fee10cc94df9d
Parents: a2485f5
Author: Todd Lipcon <todd@apache.org>
Authored: Mon Sep 11 21:46:54 2017 -0700
Committer: Todd Lipcon <todd@apache.org>
Committed: Thu Oct 5 17:36:45 2017 +0000

 src/kudu/consensus/consensus_peers.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/kudu/consensus/consensus_peers.cc b/src/kudu/consensus/consensus_peers.cc
index 546f46d..15b472e 100644
--- a/src/kudu/consensus/consensus_peers.cc
+++ b/src/kudu/consensus/consensus_peers.cc
@@ -48,7 +48,7 @@
 #include "kudu/util/pb_util.h"
 #include "kudu/util/threadpool.h"
-DEFINE_int32(consensus_rpc_timeout_ms, 1000,
+DEFINE_int32(consensus_rpc_timeout_ms, 30000,
              "Timeout used for all consensus internal RPC communications.");
 TAG_FLAG(consensus_rpc_timeout_ms, advanced);
