impala-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From sail...@apache.org
Subject [3/7] incubator-impala git commit: IMPALA-3875: Thrift threaded server hang in some cases
Date Mon, 05 Dec 2016 22:25:01 GMT
IMPALA-3875: Thrift threaded server hang in some cases

We use socket timeouts for our backend connections. We set these
timeouts only after we've open()'d the connection, which ideally
should be fine. However, our TSaslTransport stack does read()'s and
write()'s over the network on an open(), which means that on a secure
cluster we send and recieve non-TCP-handshake packets on open(). This
is because the current code tries to establish a SASL handshake during
open().

If for any reason the peer server does not respond to the read()'s
during the open() call (after connect() is successful), the client
will wait on read() indefinitely. This patch sets the socket timeout
before we call open(), so that the read()'s and write()'s during the
open() are subject to the timeout as well.

We should also consider making a larger change where this SASL
handshake does not take place during an open(), but instead after the
open() call is completed, so as to have the open() semantics be the
same for both secure and insecure clusters.

Change-Id: I6c8f91a88f723e0e58e81bb385c5a8f190021868
Reviewed-on: http://gerrit.cloudera.org:8080/5263
Reviewed-by: Sailesh Mukil <sailesh@cloudera.com>
Tested-by: Internal Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/ff629c2d
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/ff629c2d
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/ff629c2d

Branch: refs/heads/master
Commit: ff629c2deb97b4ef25e80745bf4689dcbe8407fe
Parents: 7efa083
Author: Sailesh Mukil <sailesh@cloudera.com>
Authored: Tue Nov 29 13:12:48 2016 -0800
Committer: Internal Jenkins <cloudera-hudson@gerrit.cloudera.org>
Committed: Sat Dec 3 23:21:10 2016 +0000

----------------------------------------------------------------------
 be/src/rpc/thrift-client.h           |  2 ++
 be/src/runtime/client-cache.cc       | 15 +++++++++++----
 be/src/statestore/statestore-test.cc |  6 +-----
 3 files changed, 14 insertions(+), 9 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/ff629c2d/be/src/rpc/thrift-client.h
----------------------------------------------------------------------
diff --git a/be/src/rpc/thrift-client.h b/be/src/rpc/thrift-client.h
index fae193a..a194551 100644
--- a/be/src/rpc/thrift-client.h
+++ b/be/src/rpc/thrift-client.h
@@ -67,6 +67,8 @@ class ThriftClientImpl {
   /// Set send timeout on the underlying TSocket.
   void setSendTimeout(int32_t ms) { socket_->setSendTimeout(ms); }
 
+  Status socket_create_status() { return socket_create_status_; }
+
  protected:
   ThriftClientImpl(const std::string& ipaddress, int port, bool ssl)
       : address_(MakeNetworkAddress(ipaddress, port)), ssl_(ssl) {

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/ff629c2d/be/src/runtime/client-cache.cc
----------------------------------------------------------------------
diff --git a/be/src/runtime/client-cache.cc b/be/src/runtime/client-cache.cc
index 578ef29..4f403ad 100644
--- a/be/src/runtime/client-cache.cc
+++ b/be/src/runtime/client-cache.cc
@@ -110,15 +110,22 @@ Status ClientCacheHelper::CreateClient(const TNetworkAddress& address,
     ClientFactory factory_method, ClientKey* client_key) {
   shared_ptr<ThriftClientImpl> client_impl(factory_method(address, client_key));
   VLOG(2) << "CreateClient(): creating new client for " << client_impl->address();
-  Status status = client_impl->OpenWithRetry(num_tries_, wait_ms_);
-  if (!status.ok()) {
-    *client_key = NULL;
-    return status;
+
+  if (!client_impl->socket_create_status().ok()) {
+    *client_key = nullptr;
+    return client_impl->socket_create_status();
   }
+
   // Set the TSocket's send and receive timeouts.
   client_impl->setRecvTimeout(recv_timeout_ms_);
   client_impl->setSendTimeout(send_timeout_ms_);
 
+  Status status = client_impl->OpenWithRetry(num_tries_, wait_ms_);
+  if (!status.ok()) {
+    *client_key = nullptr;
+    return status;
+  }
+
   // Because the client starts life 'checked out', we don't add it to its host cache.
   {
     lock_guard<mutex> lock(client_map_lock_);

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/ff629c2d/be/src/statestore/statestore-test.cc
----------------------------------------------------------------------
diff --git a/be/src/statestore/statestore-test.cc b/be/src/statestore/statestore-test.cc
index 2c518a1..8c81383 100644
--- a/be/src/statestore/statestore-test.cc
+++ b/be/src/statestore/statestore-test.cc
@@ -94,8 +94,4 @@ TEST(StatestoreSslTest, SmokeTest) {
 
 }
 
-int main(int argc, char **argv) {
-  InitCommonRuntime(argc, argv, false);
-  ::testing::InitGoogleTest(&argc, argv);
-  return RUN_ALL_TESTS();
-}
+IMPALA_TEST_MAIN();


Mime
View raw message