impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sailesh Mukil (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5558: Reopen stale client connection
Date Sun, 25 Jun 2017 01:23:06 GMT
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5558: Reopen stale client connection
......................................................................


Patch Set 7:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/7284/7//COMMIT_MSG
Commit Message:

PS7, Line 13: It may still be buffered in the kernel
            : if there is room for it.
If you have a link that explains how this could happen, it would be a good idea to reference
it here.


PS7, Line 15: If an Impalad node is restarted, a stale client connection may
            : still allow send() to succeed
Reword for clarity:

If an Impalad node is restarted, a stale client connection to that node may still allow send()
to appear succeed even though the payload wasn't sent.


http://gerrit.cloudera.org:8080/#/c/7284/7/be/src/runtime/client-cache.h
File be/src/runtime/client-cache.h:

Line 203:       client_is_unrecoverable_(false) {
Not for this patch, but this would be a good place to inject a fault, i.e. not return a client.
That would have caught IMPALA-5576.
Feel free to leave a TODO if you agree.


http://gerrit.cloudera.org:8080/#/c/7284/7/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

PS7, Line 356: connect_success
What is 'connect_success' trying to achieve? It looks like it wants to keep track of whether
we were able to ever get a connection at some point in the retry loop (except the last iteration).

So if we were able to get a connection (which could have been stale) but were not able to
successfully send the Cancel RPC, we still would send true here.


http://gerrit.cloudera.org:8080/#/c/7284/7/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS7, Line 237: !client_status.ok()
> I think that adding this here is essentially equivalent to the bug IMPALA-5
No, this is still outside the loop, so we're still at least trying 3 times before calling
it quits. Adding this check inside the loop would be equivalent to IMPALA-5576.


-- 
To view, visit http://gerrit.cloudera.org:8080/7284
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4d722c8ad3bf0e78e89887b6cb286c69ca61b8f5
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Juan Yu <jyu@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message