impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sailesh Mukil (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Date Wed, 30 Nov 2016 21:01:54 GMT
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 5:

(11 comments)

Did a quick first pass.

http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS5, Line 938: ||
joining condition should go before the line break?


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS5, Line 72: qs->refcnt_.Add(1);
If this is the logic we're going with, I would suggest that this patch and the per-query RPC
patch make it in with a relatively short gap between them.

Else we will have the penalty of tearing down and setting up a QS multiple times for the same
query.


PS5, Line 125: int32_t cnt = qs->refcnt_.Add(1);
After we have a per-query RPC, I think we should disallow getting the QS if the refcount has
already reached 0 (since that would mean all the fragments have completed).

If you agree, you can add that as a TODO here.


PS5, Line 135: qs->refcnt_.Load()
I think it will be better to put this LOG after getting the local 'cnt' value and printing
it as "refcnt=" << cnt + 1.

So we can at least via the logs understand what decision was made by this function (attempt
to gc vs return) based on the log line.
Eg: If we see "refcnt=1", we will know that this would have attempted a GC.

Also, its good to not lock the bus twice.


Line 141: 
One problem I see with this is, every time we hit refcount=0, there is some leeway for some
async call to get a reference.

However, those async calls (late reports, late filters, etc.) will mostly be of no use since
if the refcount hit 0, that means all the fragments are done executing. So why keep the QS
around unnecessarily?


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

PS5, Line 66: /// lock order:
            :   /// 1. qs_map_lock_
            :   /// 2. QueryState::refcnt_lock_
Remove now that refcnt is an AtomicInt


PS5, Line 69: boost::mutex qs_map_lock_
This will soon need to be a RWlock, or it could introduce scaling issues. But that will have
its own issues of reader or writer starvation, so won't be a trivial change. What do you think?


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

PS5, Line 338: local_query_ctx_
If we're going to have a local_query_ctx_ anyway, why bother with accessing the query_state_->query_ctx()
?


http://gerrit.cloudera.org:8080/#/c/4418/5/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS5, Line 49: class ExecEnv;
            : class DataStreamMgr;
            : class HBaseTableFactory;
            : class TPlanFragmentCtx;
            : class TPlanFragmentInstanceCtx;
            : class QueryState;
Alphabetical order?


PS5, Line 329:   // needed?
             :   // static const int DEFAULT_BATCH_SIZE = 1024;
Can remove? since its moved to query-state.h


PS5, Line 351: TUniqueId no_instance_id_;
Who sets this? It doesn't seem to be used now. My guess is it should be set in the second
constructor?


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message