impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "John Sherman (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5394: Set socket timeouts while opening TSaslTransport
Date Tue, 06 Jun 2017 19:12:15 GMT
John Sherman has posted comments on this change.

Change subject: IMPALA-5394: Set socket timeouts while opening TSaslTransport
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7061/2/be/src/transport/TSaslServerTransport.cpp
File be/src/transport/TSaslServerTransport.cpp:

PS2, Line 158: socket->setRecvTimeout(0);
             :     socket->setSendTimeout(0);
> Ah, that's what I was missing, thanks John. Still trying to make sure I und
Sorry, reposting this comment so it's not all on one line: If I'm reading the thrift code
correctly, fixing the lock doesn't really help the specific issue I've seen. The C++ thrift
server implementation ends up calling getTransport on the same thread that does the accept
of the connection. So if the getTransport implementation does reads/writes that ends up blocking,
it prevents new connections from being accepted until the read/write completes or times out.
We ran into this issue when a client connecting to the hs2_port started the SASL handshake
then for some reason quits communicating with the server (the packets from the client to impalad
started to not make it to impalad even the RST packet that the client sends after retrying).
In this case all other connection requests to the hs2_port end up being queued until I think
the tcp keepalive kicks in (I think the default time is 2 hours) or impalad is restarted.
(The Java implementation of TThreadPoolServer puts the accept!
 ed connection on a different thread before calling getTransport https://github.com/apache/thrift/blob/master/lib/java/src/org/apache/thrift/server/TThreadPoolServer.java)
If the C++ thrift implementation put the connection on a different thread before calling getTransport,
fixing the locking would be helpful.

C++ implementation: https://github.com/apache/thrift/blob/0.10.0/lib/cpp/src/thrift/server/TServerFramework.cpp
lines 147-157
 
Backtrace ends up lookings something like this (from IMPALA-5268):
#6  0x0000000001b394d2 in apache::thrift::transport::TSSLSocket::read(unsigned char*, unsigned
int) ()
No symbol table info available.
#7  0x0000000001b36003 in unsigned int apache::thrift::transport::readAll<apache::thrift::transport::TSocket>(apache::thrift::transport::TSocket&,
unsigned char*, unsigned int) ()
No symbol table info available.
#8  0x0000000000b52e15 in apache::thrift::transport::TSaslTransport::receiveSaslMessage(apache::thrift::transport::NegotiationStatus*,
unsigned int*) ()
No symbol table info available.
#9  0x0000000000b50733 in apache::thrift::transport::TSaslServerTransport::handleSaslStartMessage()
()
No symbol table info available.
#10 0x0000000000b52f9e in apache::thrift::transport::TSaslTransport::open() ()
No symbol table info available.
#11 0x0000000000b51431 in apache::thrift::transport::TSaslServerTransport::Factory::getTransport(boost::shared_ptr<apache::thrift::transport::TTransport>)
()
No symbol table info available.
#12 0x0000000001b4066d in apache::thrift::server::TThreadPoolServer::serve() ()
No symbol table info available.
#13 0x00000000009f2e60 in impala::ThriftServer::ThriftServerEventProcessor::Supervise() ()
No symbol table info available.


-- 
To view, visit http://gerrit.cloudera.org:8080/7061
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I56a5f3d9cf931cff14eae7f236fea018236a6255
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Sherman <jfs@arcadiadata.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: John Sherman <jfs@arcadiadata.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message