impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Henry Robinson (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Date Mon, 21 Nov 2016 20:56:16 GMT
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 3:

(18 comments)

Looked over the ref count logic so far.

http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS4, Line 502:  QueryState* qs = ExecEnv::GetInstance()->query_exec_mgr()->GetQueryState(query_id_);
ScopedQueryState ?


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS4, Line 59: boost::
remove


Line 66:     } else {
Why is there no refcnt increment here?


PS4, Line 69: 
remove here and below.


PS4, Line 86: ImpaladMetrics::IMPALA_SERVER_NUM_FRAGMENTS_IN_FLIGHT->Increment(1L);
            :   ImpaladMetrics::IMPALA_SERVER_NUM_FRAGMENTS->Increment(1L);
            :   return Status::OK();
Why is this not in the QueryState? It seems like it should be part of RegisterFInstance().


PS4, Line 95: stance completed. in
Call this ExecFInstance() to make it clear that it blocks until instance completion.


PS4, Line 115: VLOG_FILE << "GetQueryState(): query_i
You have a bug here - if a fragment instance finishes quickly (perhaps it didn't prepare()
successfully) the query state will get GC'ed before another fragment instance can be started.


I realise this will be addressed by batching fragment start-up, but until then I suggest a
workaround like adding the number of expected fragment instances to the fragment instance
params, so that when the QEM is created it can take the right number of references.


PS4, Line 124: id QueryExecMgr::ReleaseQueryState(QueryState* qs) {
             :   VLOG_FILE << "
Prefer atomics rather than heavyweight mutexes for the ref count. Fewer locks -> fewer
locking bugs.


PS4, Line 126:           << " refcnt=" <<
What are you checking for here - overflow or that the refcnt_ didn't start below 0? If it's
the latter, move this to line 125 and check before increment that its GE(..., 0).


PS4, Line 147: 
How? Only one caller to ReleaseQueryState() can set refcnt == 0.


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

PS4, Line 47: In both cases it brackets the instance execution with an increment/decrement
            :   /// of the refcount.
This needs clarification - why is this relevant to callers? I think you mean that the QEM
retains a reference to the QueryState until this FInstance has completed.


PS4, Line 58: Otherwise returns nullptr
AFAICT, GetQueryState() will return a QueryState even if its refcount == 0. It should not,
because then you can avoid the complex logic in ReleaseQueryState() that deals with the possibility
that some other caller took a refcount between setting the count to 0 and trying to GC the
query state.


PS4, Line 66: /// TODO: isn't this available in std:: now?
remove, once we switch to std::mutex we'll get this one as well.


PS4, Line 72: clean up.
Comment on how it's allocated.


PS4, Line 75: 
Clean up what? Does this block until instance has finished?


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS4, Line 42: released_r
set this to 0. Callers should always manage the refcount with a balanced number of increment
/ decrement operations, otherwise bugs will be harder to understand.


http://gerrit.cloudera.org:8080/#/c/4418/4/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS4, Line 79: /// don't access query_ctx().desc_tbl, it's most likely for a different fragment
instance
I don't understand this comment. Is this going to change after batching? Otherwise let's think
about ways to avoid having to tell a caller not to touch a certain part of this data structure.


PS4, Line 107: 
Does access to this need to be synchronised?


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message