impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Marcel Kornacker (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Date Mon, 29 Aug 2016 20:37:42 GMT
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 11:

(43 comments)

this needs a follow-on discussion regarding the existing class hierarchy (and what it should
turn into).

we should disentangle that before adding the per-query rpc functionality.

http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/runtime/fragment-instance-exec-state.h
File be/src/runtime/fragment-instance-exec-state.h:

Line 39:       executor_(query_exec_state, exec_env, lambda(
please use a regular c++ lambda


Line 54:   Status Prepare();
yet more indirection.

the division of labor between this class and FragmentInstanceExecutor is unclear (who does
the execution). i find the control flow very hard to follow.

here you have to go through the state to get to the executor. and the state in here is basically
all static (exec params) aside from things like status, which are really execution-related.


Line 77:   /// Contains the context for the current fragment instance.
in general: separate w/ blank line if you have member variables with comments

this particular member is redundant, because it's already contained in exec_params_


Line 84:   boost::scoped_ptr<Thread> exec_thread_;
unused?

if not: who is doing the execution then (who has the execution logic), this class or InstanceExecutor?
this seems convoluted.


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/runtime/fragment-instance-executor.cc
File be/src/runtime/fragment-instance-executor.cc:

Line 113:     cgroup = ExecEnv::GetInstance()->cgroups_mgr()->UniqueIdToCgroup(PrintId(query_id_,
"_"));
long lines (here and elsewhere)


Line 336:             &FragmentInstanceExecutor::ReportProfile, this));
indent 2 spaces for the 2nd level


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/runtime/fragment-instance-executor.h
File be/src/runtime/fragment-instance-executor.h:

Line 75:   /// TODO: Made reports per-query vs per fragment-instance.
make


Line 76:   typedef boost::function<
std::


Line 82:   FragmentInstanceExecutor(QueryState* query_exec_state, ExecEnv* exec_env,
might as well also call the param query_state.

drop exec_env, you can call GetInstance()


Line 145:   QueryState* query_exec_state() { return query_exec_state_; }
now that we're doing a global rename, let's drop 'state' uniformly


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/runtime/query-exec-state.cc
File be/src/runtime/query-exec-state.cc:

Line 64:       CreateFragmentInstanceState(exec_params);
does this get called from anywhere else? if not, inline here and get rid of function. there
is too much indirection here, and no clear separation of responsibilities (in the current
code Create does the registration, not Register).


Line 82:     fragment_inst_exec_state_map_.insert(make_pair(GetInstanceIdx(
change line breaks to improve readability (don't break up a function call if you don't have
to)


Line 106: Status QueryState::CancelPlanFragment(const TUniqueId& fragment_instance_id)
{
this and the next function are simply wrapper around FIS member functions, it looks like you
only need GetFIS and the caller can do the rest. this will make the code easier to follow
because it removes a level of indirection


Line 109:   if (fragment_inst_exec_state.get() == NULL) {
why not move this into GetInstanceState()?


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/runtime/query-exec-state.h
File be/src/runtime/query-exec-state.h:

Line 57:   // TODO: Remove once we refcator code in coordinator.
spelling


Line 75:   void CleanFragmentInstanceStates();
clear?

also, shouldn't this be a general TearDown()?


Line 82:   FragmentInstanceState* CreateFragmentInstanceState(
does this need to be public, in addition to the preceding function? who calls this?


Line 114:   /// decrement this to 0 and clean up the query exec state.
move this comment to the actor class (qem).

rephrase (it's the last one to dereference this that will remove the state) if still useful.


Line 115:   int32_t num_current_references_;
not atomic? (or does it even make a difference?)


Line 118:   SpinLock fragment_inst_exec_state_map_lock_;
i started using finstance instead of frag_inst or fragment_inst or frag_instance


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

Line 71:       QueryState* query_state = NULL);
pass in FIS (which contains a pointer to the query state as well as the exec params)


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/service/client-request-exec-state.h
File be/src/service/client-request-exec-state.h:

Line 47: /// Execution state of a query and also captures all state required for servicing
this is limited to client-related state. differentiate more clearly against QueryState.


Line 57: class ImpalaServer::ClientRequestExecState {
in the interest of removing extraneous 'Exec' qualifiers, drop this here as well


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/service/query-exec-mgr.cc
File be/src/service/query-exec-mgr.cc:

Line 39:         bind<void>(mem_fn(&QueryExecMgr::DestroyQueryState), this, _1,
_2)) {
lambda is clearer because it contains a signature


Line 42: Status QueryExecMgr::RegisterFragmentInstanceWithQuery(
this does more than just registration (it starts a thread that executes the instance).

i find the control flow hard to follow because there are so many indirections between the
classes, and there doesn't seem to be a clear separation of responsibilities.

maybe the qem should only deal with creation/registration and destruction of qes, and the
qes does everything else (including starting the execution thread). or we also add a QueryExecutor
which does the execution, but that may be very little functionality, and qes already does
other activities. in that case, we can also call this class QueryStateRegistry or something
like that, not anything with 'mgr' in it, which implies active involvement.

have this Register only do the registration and return a QES*, and then have the caller do
an explicit qes->ExecInstance().

let's talk about the details tomorrow.


Line 48:              << exec_params.fragment_instance_ctx.instance_state_idx;
doesn't exist anymore


Line 59:   ImpaladMetrics::IMPALA_SERVER_NUM_FRAGMENTS_IN_FLIGHT->Increment(1L);
why not move this into ExecInstance as well, so it's symmetric?


Line 62:   FragmentInstanceState* fragment_inst_exec_state =
fix var name


Line 70:   // TODO: Move this responsibility to the QueryExecutor when such an abstraction
is
collect those todo's in the .h class comment


Line 74:       Substitute("exec-plan-fragment-$0", PrintId(fragment_id)),
fix label


Line 80: void QueryExecMgr::ExecInstance(QueryState* query_state,
the instance state already has the query state, drop this param


Line 139:   TUniqueId query_id = query_ctx.query_id;
const ref


Line 145:   ImpaladMetrics::IMPALA_SERVER_NUM_RUNNING_QUERIES->Increment(1L);
indentation


Line 155:   if (scoped_qes.query_state_ == nullptr) {
might as well use the getter


Line 158:         PrintId(fragment_instance_id))));
indentation


http://gerrit.cloudera.org:8080/#/c/3817/11/be/src/service/query-exec-mgr.h
File be/src/service/query-exec-mgr.h:

Line 31: /// Manages execution of queries by creating a QueryState object which owns
you're missing one of the most important aspects of this class: it's a singleton across the
process


Line 47: /// aware of FragmentInstanceStates. The API should be more like RegisterQuery().
we also decided to add another class at that point, QueryExecutor, that runs in a separate
thread (created by QEM) and creates global shared state, creates the instance threads, etc.

augment your todo here to contain a full list of all items that will get done once the per-query
rpc arrives (new classes, which functions migrate, etc.)


Line 62:   Status RegisterFragmentInstanceWithQuery(const TExecPlanFragmentParams& params);
remove 'with query' (is there one without?)


Line 78:   Status CancelPlanFragment(const TUniqueId& query_id,
remove query_id in situations where you already have a fragment instance id, you can now derive
the former from the latter.

also, rename everything 'fragment' or 'plan fragment' to 'fragment instance' if applicable.

also, i started using FInstance as an abbreviation for FragmentInstance (which is very long)


Line 83:   Status PublishFilter(const TUniqueId& query_id, const TUniqueId& dst_instance_id,
query_id is redundant


Line 93:   class QESGuard {
move into QES itself and call it Guard.


Line 136:   /// must previously have been registered in fragment_inst_exec_state_map_. Runs
in the
this map doesn't exist here.

migrate into QueryState


Line 140:       FragmentInstanceState* fragment_inst_exec_state);
have param name match its type


-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: David Knupp <dknupp@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message