impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Marcel Kornacker (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc
Date Sat, 15 Apr 2017 23:05:16 GMT
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc

Patch Set 4:

File be/src/exec/

PS4, Line 66:   
> nit: wrong indentation
how so?
File be/src/runtime/

PS4, Line 58: //done_(false) {
> delete

Line 64:   VLOG_QUERY << "BS::Init(): query_id=" << PrintId(query_id_) <<
" state_idx=" << state_idx_ << " host=" << host_;
> long line
i'll remove all of the extra debug logging at the end.

PS4, Line 104: rpc_params->fragment_ctxs.back().fragment.idx
             :              != params.fragment_exec_params.fragment.idx) {
> The assumption is that all fragment instances belonging to the same fragmen

PS4, Line 112: params.fragment_exec_params.fragment.idx
> May as well factor this out as a local variable as it's used above too. The

PS4, Line 170: lock_guard<mutex> l(lock_);
> Is it necessary to hold the lock while creating BackendConnection below ?
the lock is low-contention, and it protects status_.

PS4, Line 173: client_connect_status;
> Is this not used or did you intend to use this instead of status_ ?

PS4, Line 184:   const string ERR_TEMPLATE =
             :       "ExecQueryFInstances rpc query_id=$0 failed: $1";
> Shouldn't this live in common/thrift/ ?
out of scope for this patch.

PS4, Line 332:  lock_guard<mutex> l2(lock_);
> How heavily contended is this lock ? Will this become a bottleneck if we ha
no, it'll (eventually) be the backend doing the reporting, not the individual instances.

PS4, Line 346:   
> nit: indentation is off.

Line 365:     if (status_.ok()) {
> Is this block empty now ? It only contains comment, right ?
i left that in there as a reminder to myself to think this through again.

PS4, Line 391: num_remaining_instances_ == 0 || !status_.ok();
> Why not call IsDone() here ?
IsDone() simply returns a bool, not sure what you mean.

PS4, Line 393: //if (*done) stopwatch_.Stop();
> delete

PS4, Line 434:   if (!status.ok()) return false;
> Does this warrant a log message ? It's not entirely clear what's the status
it wasn't there before, and it has the potential to spam the log.

PS4, Line 513:  NULL
> nullptr

PS4, Line 549:    //DCHECK_EQ(fragment_profiles_[instance_state.fragment_idx()].num_instances,
             :         //node_exec_summary.exec_stats.size());
> ?
File be/src/runtime/coordinator-backend-state.h:

PS4, Line 45: .
> Can you please also add a comment stating that multiple fragments run a sam
the hierarchy fragment -> fragment instance is already well-documented elsewhere, so i'm
not going to point that out again.

PS4, Line 100:   //const vector<InstanceStats>& instance_stats() const { return
instance_stats_; }
             :   //MonotonicStopWatch* stopwatch() { return &stopwatch_; }
> delete

PS4, Line 109: //InstanceStats* GetInstanceStats(const TUniqueId& instance_id);
> delete

Line 111:     return num_remaining_instances_ == 0 || !status_.ok();
> nit: one line ?
only for getters and setters.

PS4, Line 113:  &error_log_;
> How does this work if multiple callers modify the map at the same time ?
they don't, the comment above stipulates that you need to hold lock_

PS4, Line 114:   //const std::vector<RuntimeProfile*>& instance_profiles() const
             :     //return instance_profiles_;
             :   //}
> delete

PS4, Line 133: #if 0
> delete

PS4, Line 143: /
> delete

PS4, Line 223: //bool done_;
> Still needed ?
File be/src/runtime/

Line 113:     torn_down_(false) {}
> Should the constructor also initialize coord_state_idx_ to -1 in case there

PS4, Line 187: FinishBackendStartup
> Would a better name be VerifyBackendStartup(); ?
Not particularly, because this really blocks until startup finished.

PS4, Line 234: new FragmentStats(avg_profile_name, root_profile_name, num_instances, obj_pool())
> Should this be added to obj_pool() too ? fragment_stats_ only contains poin
i think that was the intention.

Line 340:           // Set the 'pending_count_' to zero to indicate that for a filter with
> long line

Line 397:   //CountingBarrier* exec_complete_barrier = exec_complete_barrier_.get();
> Delete ?

PS4, Line 400:         [backend_state, this, &debug_options]() {
             :           backend_state->Exec(query_ctx_, debug_options, filter_routing_table_,
             :             exec_complete_barrier_.get());
             :         })
> IMHO, it seems easier to read if this is defined as a function like the exi
what would the function contain?

PS4, Line 421: //VLOG_QUERY << "backend status: idx=" << backend_state->st
> ?
File be/src/runtime/coordinator.h:

PS4, Line 246: Filled in
             :   /// StartBackendExec().
> Filled in by InitBackendStates() ?

PS4, Line 270:  = nullptr;
> Do we have a guideline on whether pointer like this should be initialized i
i like it with the declaration, then you don't need to repeat it for every c'tor, if you have
several. also, you won't forget to initialize null pointers.
File be/src/runtime/

Line 139:       new RuntimeState(query_state_, fragment_ctx_, instance_ctx_, ExecEnv::GetInstance()));
> long line

PS4, Line 249:     report_thread_.reset(
             :         new Thread("plan-fragment-executor", "report-profile",
             :             &FragmentInstanceState::ReportProfileThread, this));
> Is one reporting thread per fragment instance a bit much ? Have you conside
out of scope.

PS4, Line 267:  //RETURN_IF_ERROR(quer
> delete

PS4, Line 298: RETURN_IF_ERROR(status);
> Why not RETURN_IF_ERROR(exec_tree_->GetNext(...))) at line 294 instead ?
File be/src/runtime/fragment-instance-state.h:

PS4, Line 52: for
> long line

PS4, Line 83: void Cancel();
> Please also document whether this is idempotent.

PS4, Line 89: If the preparation phase encountered an error, GetOpenStatus() will
            :   /// return that error without blocking.
> I don't see the opened_promise_ being updated when Prepare() fails in Exec(
good catch.
File be/src/runtime/

PS4, Line 56: // start new thread for query startup
> This is done so the RPC won't be blocked waiting for all FInstances to star
File be/src/runtime/

PS4, Line 194: stringstream s;
> Did you mean to use s somewhere ? It looks like dead code as it stands now.
it's used in the line below.

PS4, Line 246: 3
> How is this determined ? Should we use #define or constexpr to represent it
we have this same retry loop elsewhere as well, we should probably fix this globally, but
that's out of scope.

PS4, Line 250: rpc_status = Status(res.status);
> Is this needed given line 256 below which assigns res.status to result_stat

PS4, Line 317: Status status 
> Not an error but there is a variable named status already in the outer scop
File be/src/service/

Line 56: void ImpalaInternalService::CancelQueryFInstances(TCancelQueryFInstancesResult&
> nit: long line
File common/thrift/ImpalaInternalService.thrift:

PS4, Line 399: accross
> across
there are two indices, which come in handy in different situation:
- fragment index, which is part of fragment_instance_id
- per_fragment_instance_idx

Line 411:   // Id of this instance in its role as a sender.
> Used by DataStreamSender ?
or whoever needs to deal with data streams.

PS4, Line 433: coord
> Will backend_state_idx or coord_backend_state_idx be better ?
it's the index of the coordinator's private per-backend state. i find the name clear enough.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Henry Robinson <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-HasComments: Yes

View raw message