impala-reviews mailing list archives

From "Impala Public Jenkins (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5537: Retry RPC on some exceptions with SSL connection
Date Wed, 21 Jun 2017 10:04:16 GMT
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5537: Retry RPC on some exceptions with SSL connection
......................................................................


IMPALA-5537: Retry RPC on some exceptions with SSL connection

After the fix for IMPALA-5388, every TSSLException thrown is treated
as a fatal error and the query fails. This turns out to be too strict:
in a secure cluster under load, queries can easily hit a timeout
while waiting for an RPC response.

When running without SSL, we call RetryRpcRecv() to retry the recv
part of an RPC if the TSocket underlying the RPC gets an EAGAIN
during recv(). This change extends that logic to cover secure
connections. In particular, we pattern match against the exception
string "SSL_read: Resource temporarily unavailable", which corresponds
to the EAGAIN error code being hit in the SSL_read() path.
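
The string match described above could be sketched roughly as follows.
This is only an illustration, not the code in the patch: the helper name
IsSslRecvTimeout() is hypothetical, and only the substring quoted in
this message is checked.

```cpp
#include <string>

// Hypothetical helper (name not taken from the patch): returns true if
// the exception text contains the SSL_read() EAGAIN message quoted in
// the commit message, i.e. the recv part may be worth retrying.
bool IsSslRecvTimeout(const std::string& exception_msg) {
  static const std::string kSslEagain =
      "SSL_read: Resource temporarily unavailable";
  return exception_msg.find(kSslEagain) != std::string::npos;
}
```

A caller would pass the what() text of the caught exception and retry
only the recv part when the helper returns true.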

Similarly, we handle a closed connection in the send() path of a
secure connection by pattern matching against the exception string
"TTransportException: Transport not open". To verify that the exception
is thrown during the send part of an RPC call, the RPC client interface
has been augmented to take a bool* argument which is set to true after
the send part of the RPC has completed but before the recv part starts.
If DoRPC() catches an exception and the send part isn't done yet, the
entire RPC is retried if the exception string matches certain
substrings which are safe to retry.
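
A minimal sketch of the send-side decision, assuming a callable that
receives the bool* flag. The names IsSendRetryable() and
DoRpcWithRetry() are hypothetical, the single retry is an assumption,
and a real client would also reopen the connection before retrying.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical helper: true if the message marks a closed connection,
// using the substring quoted in the commit message.
bool IsSendRetryable(const std::string& msg) {
  return msg.find("TTransportException: Transport not open") !=
         std::string::npos;
}

// Hypothetical wrapper: 'rpc' must set *send_done to true after the send
// part completes and before the recv part starts. The whole RPC is
// retried once only if it failed before the send finished and the
// exception text is known to be safe to retry; otherwise the exception
// is rethrown to the caller.
void DoRpcWithRetry(const std::function<void(bool*)>& rpc) {
  bool send_done = false;
  try {
    rpc(&send_done);
  } catch (const std::exception& e) {
    if (send_done || !IsSendRetryable(e.what())) throw;
    // Assumption: the connection would be reopened here in real code.
    send_done = false;
    rpc(&send_done);
  }
}
```

The flag matters because an exception raised after the send has
completed may mean the remote side already received the request, so
resending the entire RPC is not known to be safe.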

The fault injection utility has also been updated to distinguish between
a timeout and a lost connection, so that tests can exercise the
different error handling in the send and recv paths.

Change-Id: I8243d4cac93c453e9396b0e24f41e147c8637b8c
Reviewed-on: http://gerrit.cloudera.org:8080/7229
Reviewed-by: Dan Hecht <dhecht@cloudera.com>
Tested-by: Impala Public Jenkins
---
A be/src/catalog/catalog-service-client-wrapper.h
M be/src/exec/catalog-op-executor.cc
M be/src/rpc/thrift-server-test.cc
M be/src/rpc/thrift-util.cc
M be/src/runtime/backend-client.h
M be/src/runtime/client-cache-types.h
M be/src/runtime/client-cache.h
M be/src/service/client-request-state.cc
A be/src/statestore/statestore-service-client-wrapper.h
A be/src/statestore/statestore-subscriber-client-wrapper.h
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
M be/src/testutil/fault-injection-util.cc
M be/src/testutil/fault-injection-util.h
M tests/custom_cluster/test_rpc_exception.py
17 files changed, 425 insertions(+), 92 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Dan Hecht: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/7229
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I8243d4cac93c453e9396b0e24f41e147c8637b8c
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
