impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sailesh Mukil (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Date Wed, 30 Nov 2016 22:36:44 GMT
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.

Patch Set 6:

File be/src/runtime/

PS5, Line 72: qs->refcnt_.Add(1);
> i'm not sure what you're talking about. the qs is reused for subsequent fra
Another example: an ExecPlanFragment RPC can arrive late after some instances for the same
query have already arrived and finished executing and GC'd the QS.

The example you mention also sounds scary, since it doesn't seem like the code anticipates
that. That could mean FInstances that arrive late may start execution after the query has
been cancelled, wasting resources.

PS5, Line 125: int32_t cnt = qs->refcnt_.Add(1);
> why? it's not just the execution threads that need qs references.
For the same reason I mention in the comments in L141.

Line 141: 
> i'm not sure what you mean by 'unnecessarily'. the qs is gc'd when the refc
With the current state of things, after all fragment instances finish executing (either successfully
or not), there is no use of servicing any of the async calls. Those are UpdateFilter, TransmitData,
CancelPlanFragment, etc.

Having the code this way could lead to a chain effect where these late RPCs that arrive after
all FInstances have completed (successful or not) keep getting references to the QS only to
find that the query is already complete, causing unnecessary contention on the qs_map_lock_
and having the QS longer than necessary.

Late RPCs will be more evident when we move to a model where we queue RPCs rather than send
them immediately (i.e. after we move to KuduRPC), so I feel we need to keep that in mind with
future patches moving forward.

So what I'm proposing is to know when all FInstances have completed execution (via another
counter) and not give away new references to async RPCs after that.

Of course this is all moot if you think we will be introducing RPCs in the future that will
still need to be serviced after ALL FInstances have completed executing.
File be/src/runtime/

PS5, Line 338: local_query_ctx_
> the .h file points out that it's only used when there's no querystate, ie, 
Ok, makes sense.
File be/src/runtime/runtime-state.h:

PS5, Line 351: /// Use pointer to avoid i
> it's initialized to an invalid id and used in l127 of this file.
Sorry I meant that it doesn't seem to be initialized. Where is it initialized to an invalid

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Henry Robinson <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Sailesh Mukil <>
Gerrit-HasComments: Yes

View raw message