impala-reviews mailing list archives

From "Michael Ho (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5558/IMPALA-5576: Reopen stale client connection
Date Sun, 25 Jun 2017 23:54:16 GMT
Michael Ho has posted comments on this change.

Change subject: IMPALA-5558/IMPALA-5576: Reopen stale client connection

Patch Set 9:

Commit Message:

PS9, Line 18: to appear succeed
> nit: typo, to appear to succeed
Missed that before pushing. Will update before merging.
File be/src/rpc/

PS8, Line 198: TTransportException::END_OF_FILE &&
             :              strstr(e.what(), "No more data to read.
> Can you put this in a comment as well?
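
For readers following the thread, a minimal sketch of the kind of check quoted above. It assumes the condition is meant to recognize a connection closed by the peer. FakeTransportException and LooksLikeClosedByPeer() are hypothetical stand-ins and are not the actual Thrift or Impala code; only the END_OF_FILE type and the "No more data to read" substring come from the quoted lines.

```cpp
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a Thrift-style transport exception.
// It only models the error type and the message text.
class FakeTransportException : public std::runtime_error {
 public:
  enum Type { UNKNOWN, NOT_OPEN, TIMED_OUT, END_OF_FILE };

  FakeTransportException(Type type, const std::string& msg)
    : std::runtime_error(msg), type_(type) {}

  Type getType() const { return type_; }

 private:
  Type type_;
};

// Returns true only if both conditions hold: the error type is END_OF_FILE
// and the message contains the expected substring. Matching on the message
// text is fragile, which is one reason to document the check in a comment.
inline bool LooksLikeClosedByPeer(const FakeTransportException& e) {
  return e.getType() == FakeTransportException::END_OF_FILE &&
      std::strstr(e.what(), "No more data to read") != nullptr;
}
```

Checking both the type and the message keeps an unrelated END_OF_FILE error, or a timeout with similar text, from being treated the same way.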
File be/src/rpc/

PS9, Line 200:  
> nit: extra space
File be/src/runtime/client-cache.h:

PS9, Line 243: ) {
> not suggesting we do this now, but why do we not handle recv cxn closed err
Yes, this may be worth some re-thinking. FWIW, most callers cannot handle recv-side failures
anyway. I believe only TransmitData() / ReportExecStatusAux() and Cancel() will handle them.

PS9, Line 301: rpc send $3 done
> nit
File be/src/runtime/

PS8, Line 362: true;
> I was also wondering if this return value isn't very useful. What if we ins
The status and the backtrace are most likely already logged when the status is first
constructed deep down the call stack.

I will keep the return value as-is for now to avoid more unexpected complications.
File be/src/runtime/

PS9, Line 247: duplicated
> nit: duplicate

PS9, Line 248: if (instance_exec_status.done && instance_stats->done_) continue;
> Is there any reason we want to process a not done status from a fragment th
Good point. We should simply ignore it. The current check is to make sure we don't subtract
num_remaining_instances_ more than once for a given fragment instance.
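
A minimal sketch of the behavior discussed above: once an instance is marked done, every later report for it is ignored, whether or not that report is itself marked done, so the remaining-instance counter is decremented at most once. BackendStateSketch, InstanceStatsSketch, and the integer instance ids are hypothetical and do not reflect the actual coordinator code.

```cpp
#include <unordered_map>

// Hypothetical per-instance bookkeeping.
struct InstanceStatsSketch {
  bool done = false;
};

// Hypothetical stand-in for the coordinator's per-backend state.
struct BackendStateSketch {
  int num_remaining_instances = 0;
  std::unordered_map<int, InstanceStatsSketch> stats;  // keyed by instance id

  // Registers an instance that is expected to send a final report.
  void AddInstance(int id) {
    stats[id] = InstanceStatsSketch();
    ++num_remaining_instances;
  }

  // Applies one status report. Returns false if the report was ignored
  // because the instance had already delivered its final report.
  bool ApplyReport(int id, bool report_done) {
    InstanceStatsSketch& s = stats.at(id);
    if (s.done) return false;  // duplicate or late report: ignore it
    if (report_done) {
      s.done = true;
      --num_remaining_instances;  // happens at most once per instance
    }
    return true;
  }
};
```

With this shape, a resent final report and a stale not-done report arriving after the final one are both dropped by the same early return.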
File be/src/runtime/coordinator-backend-state.h:

PS9, Line 138: duplicated
> nit: duplicate

PS9, Line 138: is done
> can you make this more specific here? e.g. maybe "true if the final report 
File be/src/runtime/

PS9, Line 227: d
> nit: duplicate
File be/src/testutil/fault-injection-util.h:

PS9, Line 64:   ///
> not clear this is the mod
Rephrased the comment.

PS9, Line 66: freq
> maybe this should be every_nth_rpc or such
Rephrased the comment.
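
To illustrate the naming suggestion, a minimal sketch of a fault injector that fires on every nth call, assuming the parameter acts as a modulus on a running RPC count. EveryNthInjector and ShouldInject() are hypothetical names and are not the contents of fault-injection-util.h.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: reports true on every nth call to ShouldInject().
class EveryNthInjector {
 public:
  explicit EveryNthInjector(int32_t every_nth_rpc)
    : every_nth_rpc_(every_nth_rpc), count_(0) {}

  // Returns true when the running call count is a multiple of every_nth_rpc.
  // A non-positive value disables injection entirely.
  bool ShouldInject() {
    if (every_nth_rpc_ <= 0) return false;
    const int64_t n = ++count_;  // atomic pre-increment, safe across threads
    return n % every_nth_rpc_ == 0;
  }

 private:
  const int32_t every_nth_rpc_;
  std::atomic<int64_t> count_;
};
```

A name like every_nth_rpc states the modulus semantics directly, whereas freq could also be read as a probability or a rate.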

File tests/custom_cluster/

PS9, Line 72: execute_test_query("Called read on non-open socket"
> why does this result in a query failure whereas recv_timed_out does not ?  
Most likely, in this case, the TSocket was closed (potentially due to a programming error on
our part). In the case of a timeout, the socket is still open; it just hits the timeout we
specify when waiting for data to show up.


Gerrit-MessageType: comment
Gerrit-Change-Id: I4d722c8ad3bf0e78e89887b6cb286c69ca61b8f5
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Juan Yu <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-Reviewer: Mostafa Mokhtar <>
Gerrit-Reviewer: Sailesh Mukil <>
Gerrit-HasComments: Yes
