impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Henry Robinson (Code Review)" <>
Subject [Impala-ASF-CR] PREVIEW: IMPALA-2550 Introduce query-wide execution context.
Date Thu, 11 Aug 2016 07:07:55 GMT
Henry Robinson has posted comments on this change.

Change subject: PREVIEW: IMPALA-2550 Introduce query-wide execution context.

Patch Set 4:


This is my first pass, and so doesn't capture all the style, comment etc issues. I'm first
focusing on whether the ScopedQueryExecStateRef works as intended, and whether the lookup-by-fragment
ID is required. 

I'm not yet convinced that this is coming out more cleanly than a shared_ptr with a custom
deleter (which is effectively what we are doing here).
File be/src/runtime/

PS4, Line 418: // TODO: Should we create a QueryExecState for the coordinator fragment?
yes, otherwise you wind up with special-case logic to handle the paths where the QES is null.

Why does this QES not go in the local QEM?
File be/src/runtime/

PS4, Line 71: QueryExecMgr::ScopedQueryExecStateRef s(query_exec_mgr, fragment_instance_id);
This is confusing: presumably this exists to ensure that *this has a lifetime at least as
long as this method. But if there's a danger that *this will be destroyed during this method,
that could happen before this line ever gets executed. So this won't have a reference to anything
(and will actually crash).

Why not have a fragment exec state add a reference to the QES that is decremented when the
FES completes? Then the reference is guaranteed to be > 0 after line 49 above, and you
don't need to take the reference here.
File be/src/runtime/query-exec-state.h:

Line 1: // Copyright 2016 Cloudera Inc.
Fix license headers here and elsewhere.

Line 36: /// The lifetime of this class is maintained by QueryExecMgr.
No need to talk about what owners of this class might do with it.

PS4, Line 41: NULL
nit: use nullptr now.

PS4, Line 84: std::shared_ptr
Why shared?

PS4, Line 101: boost
File be/src/runtime/runtime-state.h:

Line 274:   QueryExecState* query_exec_state_; // not owned
comment what this is.
File be/src/service/

Line 14: #include "service/request-exec-state.h"
add new line before this one

Line 570:       "query-exec-state", "wait-thread", &ImpalaServer::ClientRequestExecState::Wait,
long line

Line 677: void ImpalaServer::ClientRequestExecState::UpdateNonErrorQueryState(QueryState::type
query_state) {
long line

Line 829: Status ImpalaServer::ClientRequestExecState::GetRowValue(TupleRow* row, vector<void*>*
long line
File be/src/service/

PS4, Line 37: bind<void>(mem_fn(&QueryExecMgr::DestroyQueryExecState), this, _1,
avoid bind, prefer lambdas

PS4, Line 48:  const TUniqueId& query_id = exec_params.query_ctx.query_id;
            :   // Preparing and opening the fragment creates a thread and consumes a non-trivial
            :   // amount of memory. If we are already starved for memory, cancel the fragment
            :   // early as possible to avoid digging the hole deeper.
            :   MemTracker* process_mem_tracker = ExecEnv::GetInstance()->process_mem_tracker();
            :   if (process_mem_tracker->LimitExceeded()) {
            :     string msg = Substitute("Instance $0 of plan fragment $1 of query $2 could
not "
            :         "start because the backend Impala daemon is over its memory limit",
            :         PrintId(exec_params.fragment_instance_ctx.fragment_instance_id),
            :         exec_params.fragment_ctx.fragment.display_name,
            :         PrintId(query_id));
            :     return process_mem_tracker->MemLimitExceeded(NULL, msg, 0);
            :   }
this logic looks like it should live inside RegisterFragmentInstance

PS4, Line 65: // Create or retrieve the QueryExecState for this query id.
            :   QueryExecState* query_exec_state = FindOrInsertQueryExecState(exec_params.query_ctx);
            :   DCHECK(query_exec_state != NULL);
            :   // This is a no-op if another fragment of the same query already called Init().
            :   query_exec_state->Init();
Why not change the interface to the following:

  // Creates a QES if one doesn't exist.
  qes = QEM->GetQueryExecState(query_id);

Then the Init() is only ever done in one place (GetQES()), and you directly interact with
the QES itself.

PS4, Line 103: i->second->num_current_references_++;

PS4, Line 107: void QueryExecMgr::ReleaseQueryExecState(QueryExecState* query_exec_state)
             :   lock_guard<SpinLock> l(query_exec_state_map_lock_);
             :   query_exec_state->num_current_references_--;
             :   // If this thread was the last to use it, schedule it for destruction.
             :   if (query_exec_state->num_current_references_ == 0) {
             :     destroy_thread_.Offer(query_exec_state);
             :   }
             : }
A QES will never be destroyed unless at least one reference is ever taken to it. That works
in this patch, because a fragment will always take at least one reference, but it's a bit
subtle. Can you think of a way to make the guarantee of at-least-one-ref more explicit?

PS4, Line 133: std
remove std::

PS4, Line 137: query_exec_state_map_[query_id].get()
nit: you're doing the lookup twice. insert returns a pair <iterator, bool> - just do:

  i = qesm_.insert(...).first;

on line 134 and on this line:

  return i->second.get();

PS4, Line 142: params.fragment_instance_id
The right thing to do is to augment the rpc params with the query ID, so you don't have to
maintain the fragment->query map.
File be/src/service/query-exec-mgr.h:

Line 1: // Copyright 2014 Cloudera Inc.
Replace header with ASF one.

PS4, Line 29: fragments

PS4, Line 40: /// TODO: Remove Thrift args from methods where it would improve readability;
            : /// ImpalaInternalService can take care of translation to / from Thrift, as
it already
            : /// does for RegisterFragmentInstanceWithQuery()'s return value.
Can you do that in this patch? (Already noticed one instance below). Getting the interface
right is critical for this patch.

PS4, Line 61: void CancelPlanFragment(TCancelPlanFragmentResult& return_val,
            :       const TCancelPlanFragmentParams& params);
This shouldn't have RPC-style signature (since this isn't a proxy class): return a Status,
have explicit parameters.

PS4, Line 80: GetQueryExecStateFromFragmentId
When do we know the fragment instance id, but not the query id?

Line 91:   class ScopedQueryExecStateRef {

PS4, Line 94: const TUniqueId& fragment_id
This should take a query ID. You should try to get rid of the fragment->query ID map and
only access the QEM->QES->FES hierarchy from the top down (if I have a process I can
get a QEM, if I have a query ID and a QEM can get a QES, if I have a QES and a fragment ID
I can get an FES).

PS4, Line 98: query_exec_state_ = query_exec_mgr_->GetQueryExecStateFromFragmentId(fragment_id);
What prevents this from returning NULL? You need some mechanism to ensure that the QES actually
exists and has not been torn down.

PS4, Line 117: boost

PS4, Line 130: destroy_thread_
not just a single thread then.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <>
Gerrit-Reviewer: Henry Robinson <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Sailesh Mukil <>
Gerrit-HasComments: Yes

View raw message