impala-reviews mailing list archives

From "Alex Behm (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4014: HEADERS ONLY: Introduce query-wide execution state.
Date Sat, 17 Sep 2016 01:14:33 GMT
Alex Behm has posted comments on this change.

Change subject: IMPALA-4014: HEADERS ONLY: Introduce query-wide execution state.

Patch Set 1:


Mostly clarifying questions. Sorry if Henry and you had already reached consensus on some
of these.
File be/src/runtime/fragment-instance-state.h:

Line 27: /// Tear-down happens automatically in the d'tor and frees all memory allocated for
Didn't we want to move away from implicit destruction? Why is it ok to rely on that here?

Line 31: /// ReportProfile() is invoked periodically to report the execution status. The
No ReportProfile() in here. Wouldn't the reporting be controlled by the QueryState by periodically
collecting and aggregating the profile() from all fragment instance states?

Line 52:   /// It is an error to delete a FragmentInstanceState before Open()/GetNext() (depending
What if Prepare() fails?

Line 94:   Status Open();
Given my comment below, should this be Exec() or similar?

Line 100:   Status GetNext(RowBatch** batch);
What's this call for? Don't all fragment instances end in a sink? I know that today the coord
fragment is an exception, but that's going to change.

Line 120:   void ReleaseThreadToken();
When is this token acquired exactly?
File be/src/runtime/query-state.h:

Line 24: /// Central class for all backend execution state created for a particular query,
This first paragraph could use some cleanup, e.g., the grammar of the first sentence is weird and "such as" seems misplaced. Also, there seems to be some redundancy with the FragmentInstanceState.

Line 30: /// The lifetime of an instance of this class is dictated by a reference count.
This sounds much like a shared_ptr, i.e., implicit lifetime management. What is the motivation for this design? It does not seem to prevent certain interesting races, e.g., RPCs wanting to access the query state after other threads are done with it (i.e., after it has transitioned to a ref count of 0).

Incoming RPCs for data exchanges seem especially interesting, e.g. when there is a limit somewhere
downstream that could cause cancellation.

Line 41:   class Guard {
ScopedQueryState? That makes it clear that the access is scoped, and we already have similarly named classes such as ScopedSessionState.

Line 66:   /// Illegal to call any other function of this class after this call returns.
When is this legal to call with respect to the reference count? Will this call block until
reference count has reached 0?

Line 76:   Status PublishFilter(const TUniqueId& dst_finstance_id, int32_t filter_id,
Isn't a filter really published to a plan node contained in all relevant fragment instances?
In other words, why not have:
PublishFilter(TPlanNodeId dst_plan_node_id, int32_t filter_id, const TBloomFilter& thrift_bloom_filter);

That way we'll only have to deserialize the thrift_bloom_filter once and we can shield callers
from knowing how many target fragment instances there are.

Another thought: Aren't runtime filters really shared among all relevant fragment instances?
It might make more sense to have the RuntimeFilterBanks in here instead of the FragmentInstanceStates.
Are we worried about exploding the memory consumption of runtime filters by DOP?

Line 78: };
We'll deal with local filter aggregation separately, correct?


Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Henry Robinson <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-HasComments: Yes
