impala-dev mailing list archives

From "Juan Yu (Code Review)" <>
Subject [Impala-CR](cdh5-trunk) IMPALA-3575: Add retry to backend connection request and rpc timeout
Date Wed, 22 Jun 2016 17:24:37 GMT
Juan Yu has uploaded a new patch set (#12).

Change subject: IMPALA-3575: Add retry to backend connection request and rpc timeout

IMPALA-3575: Add retry to backend connection request and rpc timeout

This patch adds a configurable timeout to all backend client
RPC calls to avoid query hangs.

Impala doesn't set a socket send/recv timeout for backend clients, so
RPC calls wait forever for data. In extreme cases, such as a bad network
or a kernel panic on the destination host, the sender never gets a
response and the RPC call hangs. Query hangs are hard to detect. If a
hang happens in ExecRemoteFragment() or CancelPlanFragments(), the query
cannot be cancelled unless you restart the coordinator.

Added a send/recv timeout to all RPC calls to avoid query hangs. Also
fixed a bug where the reporting thread does not quit even after the
query is cancelled. For the catalog client, keep the default timeout at 0
(no timeout) because ExecDdl() can take a very long time if the table has
many partitions, mainly waiting for HMS API calls.
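
To illustrate the effect of a send/recv timeout, here is a minimal
Python sketch (not Impala's C++ backend code). It connects to a local
peer that never replies; with a timeout set on the socket, recv() gives
up instead of blocking forever. The function name and the timeout value
are made up for this example.

```python
import socket


def recv_with_timeout(timeout_s):
    """Send to a peer that never replies; return the data, or None on timeout."""
    # The listening socket completes the TCP handshake through its backlog
    # but never accepts or answers, standing in for an unresponsive host.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    client = socket.create_connection(server.getsockname(), timeout=timeout_s)
    try:
        # Applies to every later blocking send/recv on this socket.
        client.settimeout(timeout_s)
        client.sendall(b"ping")
        try:
            return client.recv(16)
        except socket.timeout:
            # Without a timeout this recv() would block indefinitely.
            return None
    finally:
        client.close()
        server.close()
```

Calling recv_with_timeout(0.2) returns after roughly 0.2 seconds rather
than hanging. A timeout of None corresponds to the "no timeout" case
that the patch keeps for the catalog client.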

Added a new RPC call, DoRpcTimedWait(), that waits longer for the
receiver's response. Certain RPCs need this. For example, with
TransmitData() from DataStreamSender, the receiver can hold back its
response due to back pressure.
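
One plausible shape for such a "timed wait" is sketched below in Python.
This is an assumption about the general idea, not the actual
DoRpcTimedWait() implementation: the request is sent once, and only the
receive is retried after each per-call socket timeout until an overall
deadline passes. The names recv_timed_wait and demo_delayed_reply are
hypothetical.

```python
import socket
import threading
import time


def recv_timed_wait(sock, nbytes, max_wait_s):
    """Retry recv() after per-call socket timeouts until max_wait_s elapses."""
    deadline = time.monotonic() + max_wait_s
    while True:
        try:
            return sock.recv(nbytes)
        except socket.timeout:
            # The short socket timeout fired; keep waiting unless the
            # overall deadline has also passed.
            if time.monotonic() >= deadline:
                raise


def demo_delayed_reply():
    """The peer replies after 0.3 s while the socket timeout is only 0.05 s."""
    a, b = socket.socketpair()
    a.settimeout(0.05)
    timer = threading.Timer(0.3, b.sendall, args=(b"ok",))
    timer.start()
    try:
        return recv_timed_wait(a, 2, max_wait_s=2.0)
    finally:
        timer.join()
        a.close()
        b.close()
```

A plain recv() in the demo would fail after 0.05 seconds; the retry loop
tolerates a receiver that is slow to respond while still bounding the
total wait.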

If an RPC call fails, we close the underlying connection instead of
putting it back in the cache. This ensures that a bad connection state
doesn't cause more RPC failures.
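
The close-on-failure policy can be sketched as follows. This is a
simplified Python illustration, not the code in client-cache.h; the
ClientCache class here, its methods, and the connect factory are all
hypothetical.

```python
class ClientCache:
    """Toy connection cache: reuse connections only after a successful RPC."""

    def __init__(self, connect):
        self._connect = connect  # hypothetical factory: address -> connection
        self._idle = {}          # address -> list of idle connections

    def idle_count(self, address):
        return len(self._idle.get(address, []))

    def do_rpc(self, address, rpc):
        idle = self._idle.setdefault(address, [])
        conn = idle.pop() if idle else self._connect(address)
        try:
            result = rpc(conn)
        except Exception:
            # The connection may hold a half-sent request or a stale
            # reply, so close it rather than returning it to the cache.
            conn.close()
            raise
        # Only a connection that completed its RPC goes back to the cache.
        idle.append(conn)
        return result
```

The next call for the same address after a failure opens a fresh
connection, at the cost of one extra connection setup.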

Besides the new EE test, I used the following iptables rules to
inject network failures and make sure RPC calls never hang.
1. Block network traffic on a port completely
  iptables -A INPUT -p tcp -m tcp --dport 22002 -j DROP
2. Randomly drop 5% of TCP packets to slow down the network
  iptables -A INPUT -p tcp -m tcp --dport 22000 -m statistic --mode random --probability 0.05 -j DROP

Change-Id: Id6723cfe58df6217f4a9cdd12facd320cbc24964
M be/src/runtime/
M be/src/runtime/client-cache.h
M be/src/runtime/
M be/src/runtime/
M be/src/runtime/
M be/src/runtime/
M be/src/service/
M be/src/statestore/
M common/thrift/
A tests/custom_cluster/
M tests/query_test/
M tests/verifiers/
12 files changed, 219 insertions(+), 33 deletions(-)

  git pull ssh:// refs/changes/43/3343/12
To view, visit
To unsubscribe, visit

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id6723cfe58df6217f4a9cdd12facd320cbc24964
Gerrit-PatchSet: 12
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Juan Yu <>
Gerrit-Reviewer: Alan Choi <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Henry Robinson <>
Gerrit-Reviewer: Juan Yu <>
Gerrit-Reviewer: Sailesh Mukil <>
