impala-dev mailing list archives

From "Juan Yu (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3575: Add retry to backend connection request and rpc timeout
Date Sat, 09 Jul 2016 03:43:23 GMT
Juan Yu has posted comments on this change.

Change subject: IMPALA-3575: Add retry to backend connection request and rpc timeout
......................................................................


Patch Set 21:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/3343/21/be/src/runtime/client-cache.h
File be/src/runtime/client-cache.h:

PS21, Line 227: is
> delete "is"
Done


Line 304:   TNetworkAddress address_;
> can't we get this from client_->address()?
"client_" is not an instance of ThriftClientImpl


http://gerrit.cloudera.org:8080/#/c/3343/21/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

PS21, Line 134: 300000
> Is there a short comment you could write to justify how this was chosen (5 
Done


PS21, Line 134: The time after "
              :     "which a backend client send/recv RPC call will timeout.
> The send/recv connection timeout in milliseconds for a backend client RPC.
This is the underlying TSocket send/recv call timeout, not the connection timeout.
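
To illustrate the distinction, here is a minimal sketch at the plain POSIX socket level (it does not use Thrift's TSocket, and the helper names RecvWithTimeout and DemoRecvTimeout are hypothetical). The socket pair is already connected, so no connection timeout is involved; the receive timeout applies to each individual recv() call instead.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical helper: sets a per-call receive timeout on an already
// connected socket, then issues a recv() that no peer will satisfy.
// Returns the errno of the failed call, or 0 if recv() did not fail.
int RecvWithTimeout(int fd, int timeout_ms) {
  assert(timeout_ms > 0);
  timeval tv;
  tv.tv_sec = timeout_ms / 1000;
  tv.tv_usec = (timeout_ms % 1000) * 1000;
  if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) != 0) return errno;
  char buf[1];
  ssize_t n = recv(fd, buf, sizeof(buf), 0);
  return n < 0 ? errno : 0;
}

// Creates a connected socket pair, so connecting has already succeeded
// before any timeout is configured, and waits on one end with a 50 ms
// receive timeout while the other end stays silent.
int DemoRecvTimeout() {
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return errno;
  int err = RecvWithTimeout(fds[0], 50);
  close(fds[0]);
  close(fds[1]);
  return err;
}
```

On Linux, the timed-out recv() is expected to fail with EAGAIN or EWOULDBLOCK even though the connection itself is healthy.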


PS21, Line 138:  
> same
Done


PS21, Line 157: 0
> why is this 0? (wait_ms)
This value is used when retrying to open a connection. Each retry usually takes several
seconds already, so waiting even longer between attempts won't help much.
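
The reasoning above can be sketched as a small retry loop. This is only an illustration under stated assumptions: OpenWithRetry, open_fn and max_attempts are hypothetical names, not the client cache's actual interface; only wait_ms comes from the review.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper, not Impala's API: calls open_fn up to max_attempts
// times and sleeps wait_ms between failed attempts. Returns the number of
// attempts used on success, or -1 if every attempt failed.
int OpenWithRetry(const std::function<bool()>& open_fn, int max_attempts,
                  int wait_ms) {
  assert(max_attempts > 0);
  for (int attempt = 1; attempt <= max_attempts; ++attempt) {
    if (open_fn()) return attempt;
    // With wait_ms == 0 the next attempt starts immediately. A failed
    // attempt to open a connection has typically already spent time
    // inside the connect call, so an extra sleep adds little.
    if (wait_ms > 0 && attempt < max_attempts) {
      std::this_thread::sleep_for(std::chrono::milliseconds(wait_ms));
    }
  }
  return -1;
}
```

A caller would pass the real connection-opening routine as open_fn; with wait_ms set to 0 the total time spent is bounded by the per-attempt cost alone.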


PS21, Line 162: 100
> how was this chosen?
I'll set this to 0.


Line 223:             "", !FLAGS_ssl_client_ca_certificate.empty())),
> not your change, but it's really unfortunate we duplicate this code. let's 
I'll add a TODO here.


http://gerrit.cloudera.org:8080/#/c/3343/21/be/src/testutil/fault-injection-util.h
File be/src/testutil/fault-injection-util.h:

PS21, Line 36: RPC_RANDOM
> comment that this must be last
Done


PS21, Line 39: call
> delete
Done


PS21, Line 40: timeout
> this is the recv connection timeout, correct? if so, how about saying "recv
Done


PS21, Line 41: RpcCallType my_type, int32_t rpc_type, int32_t delay_ms
> document these.
Done


PS21, Line 44: rpc_type == RPC_NULL
> what is specifying RPC_NULL used for?
This is just for easy testing: you can enable or disable fault injection by changing the
value, with no need to add or remove the startup flag. In the future, we could change this
value dynamically to do more testing.


Line 50:       FLAGS_fault_injection_rpc_type, FLAGS_fault_injection_rpc_delay_ms)
> why pass these as arguments rather than just having InjectRpcDelay() read t
For a similar reason as above: we could test with dynamic values without needing to restart the cluster.
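
A minimal sketch of the design discussed in the last two replies, assuming a free function that takes the target RPC type and delay as arguments instead of reading flags directly. The parameter list and the names RPC_NULL and RPC_RANDOM come from the review; the placeholder call types, the ShouldInjectDelay helper and the behavior of RPC_RANDOM (picking one concrete call type at random) are assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>
#include <thread>

enum RpcCallType : int32_t {
  RPC_NULL = 0,  // fault injection disabled
  RPC_CALL_A,    // placeholder call type (hypothetical)
  RPC_CALL_B,    // placeholder call type (hypothetical)
  RPC_RANDOM     // must be last: used as the upper bound when picking a type
};

// Hypothetical helper: returns true if a delay should be injected into a
// call of type my_type, given the requested target type and delay.
bool ShouldInjectDelay(RpcCallType my_type, int32_t rpc_type, int32_t delay_ms) {
  assert(delay_ms >= 0);
  if (rpc_type == RPC_NULL || delay_ms == 0) return false;
  if (rpc_type == RPC_RANDOM) {
    // Assumption: RPC_RANDOM targets one concrete call type chosen at random.
    static std::mt19937 gen{std::random_device{}()};
    std::uniform_int_distribution<int32_t> pick(RPC_NULL + 1, RPC_RANDOM - 1);
    rpc_type = pick(gen);
  }
  return rpc_type == my_type;
}

// Sleeps for delay_ms when the call type matches the injection target.
// Because the target and delay are arguments, a caller can supply values
// that change at runtime rather than fixed startup flags.
void InjectRpcDelay(RpcCallType my_type, int32_t rpc_type, int32_t delay_ms) {
  if (ShouldInjectDelay(my_type, rpc_type, delay_ms)) {
    std::this_thread::sleep_for(std::chrono::milliseconds(delay_ms));
  }
}
```

Passing RPC_NULL as rpc_type turns the call into a no-op, which is what lets injection be switched off by changing a value.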


http://gerrit.cloudera.org:8080/#/c/3343/21/tests/custom_cluster/test_rpc_timeout.py
File tests/custom_cluster/test_rpc_timeout.py:

Line 119:     self.execute_query_verify_metrics(self.TEST_QUERY, 10)
> how long do all these tests take to execute?  let's run them only in exhaus
About 5 minutes. OK, I'll change them to run only in exhaustive mode.


-- 
To view, visit http://gerrit.cloudera.org:8080/3343
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id6723cfe58df6217f4a9cdd12facd320cbc24964
Gerrit-PatchSet: 21
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Juan Yu <jyu@cloudera.com>
Gerrit-Reviewer: Alan Choi <alan@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Huaisi Xu <hxu@cloudera.com>
Gerrit-Reviewer: Juan Yu <jyu@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-HasComments: Yes
