impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc
Date Fri, 21 Apr 2017 20:37:31 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc

Patch Set 9:


Did a review pass while I don't have access to my desktop.
File be/src/runtime/

PS9, Line 293: UpdateExecStats
The similarity between UpdateExecStats() and UpdateExecStatus() is potentially confusing.
Spell out Statistics() maybe?

Line 455:     // TODO: we don't track cpu time per node now. Do that.
More generally we should have an aggregate profile for each backend that includes things like
this (query memory consumption, I/O, other aggregates). We don't need to do this now but it
will be helpful
File be/src/runtime/coordinator-backend-state.h:

Line 132:     const FInstanceExecParams& exec_params_;
Might be helpful to document what owns the object being referenced.

Line 164:   std::vector<const FInstanceExecParams*> instance_params_list_;
Might be helpful to document what owns the objects being referenced.

Line 175:   /// Protects fields below. Can be held while doing an RPC, so SpinLock is a bad
Our current SpinLock implementation should be fine (it blocks after spinning for a short period).
Mutex should also be fine since this won't be heavily contented - I don't feel strongly.

Line 219:   /// TODO: including the median doesn't compile, looks like some includes are missing
This TODO has been in the code since 2012 - maybe just remove it?

Line 253:   /// Root profile for all fragment instances for this fragment
Stored in obj_pool?
File be/src/runtime/

Line 364:           f->targets()->push_back(target);

  ->emplace_back(t_target, fragment_params.fragment.idx);

Line 402:   int num_instances = schedule_.GetNumFragmentInstances();
nit: intermediate variable seems unnecessary.

Line 410:   // TODO: rename this metric
There are a handful of minor TODOs like this. I assume there's some reason not to do them
in this patch, but it's not clear when/if it makes sense to address them. Do we have a JIRA
or anything to track them?

Line 938:   // Notify that we completed with an error.
This comment seems wrong - the cancellation path is used to clean up even if there isn't an
error. Maybe just remove the comment?

Line 956:   // TODO: return here if returned_all_results_? 
I believe we need to do cancellation even if we returned all results. Having returned all
results from the coordinator fragment doesn't imply that all fragments have returned all their

The cancellation path is convoluted so I may be misunderstanding something though.

Henry had a related patch in review:

Line 983:   // TODO: clarify control flow here, it's unclear we should even process this status
Henry had a patch that clarified this and improved the comment:
File be/src/runtime/coordinator.h:

Line 108: /// TODO: clean up locking behavior; in particular, clarify dependency on lock_
Does any of the locking information in the impala-server.h need to be updated?

Line 261:   boost::mutex wait_lock_;
A lot of these locks could safely be SpinLocks - unless they're used with a condition variable.
Now that we have a blocking SpinLock it's better in most cases until you need a CV or strong
fairness guarantees.
File be/src/runtime/

Line 507: 
Extra line
File be/src/runtime/fragment-instance-state.h:

Line 162:   /// True if and only if Cancel() has been called.
I had a hard time reasoning about concurrent accesses to this until I realised that it didn't
actually influence query execution. Document that it's only used for DCHECKs? Or just remove
File be/src/runtime/

Line 111:   //VLOG_QUERY << "AddChildTracker(): parent=" << this << " child="
<< tracker << " " << tracker->label_ << "\n" << GetStackTrace();
I assume this will be removed.
File be/src/runtime/query-exec-mgr.h:

Line 77:   /// Execute instances and decrement refcount.
I find it clearer to understand as: "Acquires ownership of 'qs'".
File be/src/runtime/

Line 43: #define RETRY_SLEEP_MS 100
Make a class-level constant?

Line 311:   // don't return until every instance is prepared and record the first non-OK status
Comment isn't really accurate - it doesn't record the first non-OK status because of how it
handles CANCELLED.
File be/src/runtime/runtime-state.h:

Line 257:   void set_is_cancelled(bool v) { is_cancelled_ = v; }
Not your change but related. This bool is accessed from multiple threads yet we don't do any
synchronisation. Make it an Atomic32 so that we actually have some guarantees here?
File be/src/service/

Line 96:   VLOG_QUERY << "FeSupport::NativeEvalExprs: " << texprs.size() <<
" exprs";
Could get very noisy depending on how many constant exprs there are in the query and how many
partitions predicates on partition keys get evaluated against.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Henry Robinson <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message