impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthew Jacobs (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Date Fri, 23 Sep 2016 03:08:37 GMT
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances
on a single backend

Patch Set 10:


How much testing have you done? I'd think you should be able to run the existing tests in
ST mode.
File be/src/runtime/

PS10, Line 294: boost
Is there any reason for the boost sets?

PS10, Line 750: instance_params
why doesn't this params need a cref, but the one in StartRemoteFragments() needed it?

PS10, Line 1971:  TODO-MT: remove
please mention this only happens for ST because in the MT case it is set up in MtInitExecSummary

Line 1975:     }
in the else case you can DCHECK that exec_summary.exec_stats is the right size.

It should be either 0 (ST case when it wasn't initialized lazily) or the total num_instances.
(Unless I'm misunderstanding.)

PS10, Line 2083: // TODO: use FragmentInstanceState::error_log_ instead
After the coord is treated as a remote FragmentInstanceState?

PS10, Line 2253: // Make a PublishFilter rpc to 'impalad' for given fragment_instance_id
               : // and params.
               : // This takes by-value parameters because we cannot guarantee that the
               : // originating coordinator won't be destroyed while this executes.
nit wrap to 90chars
File be/src/runtime/coordinator.h:

PS10, Line 259:   /// TODO: document lock ordering; it looks like it's:
              :   /// 1. FragmentInstanceState::lock_
              :   /// 2. lock_
this seems to contradict the wording in FragmentInstanceState::lock_

It also looks to me like lock_ gets taken first.
File be/src/runtime/

PS10, Line 84: QUERY
will this (and below) be noisy?
File be/src/scheduling/query-schedule.h:

PS10, Line 90:  shared across fragment instances
for all fragment instances

otherwise it's weird for this to own a vector of FInstanceExecParams

Line 106: 
remove extra line

Line 153:   /// TODO-MT: get rid of special casing and always return the total
when coordinator is handled as a remote fragment?
If so, please specify

Line 157:   /// TODO-MT: remove
when coordinator is handled as a remote fragment?
If so, can you specify?

PS10, Line 143:   /// TODO-MT: remove; this is actually only the number of remote instances
              :   /// (from the coordinator's perspective)
              :   void set_num_fragment_instances(int64_t num_fragment_instances) {
              :     num_fragment_instances_ = num_fragment_instances;
              :   }
              :   /// Returns the number of fragment instances registered with this schedule.
              :   /// MT: total number of fragment instances
              :   /// ST: value set with set_num_fragment_instances(); excludes coord instance
              :   /// (in effect the number of remote instances)
              :   /// TODO-MT: get rid of special casing and always return the total
              :   int GetNumFragmentInstances() const;
              :   /// Returns the total number of fragment instances, incl. coordinator fragment.
              :   /// TODO-MT: remove
              :   int GetTotalFInstances() const;
              :   /// Returns the number of remote fragment instances (excludes coordinator).
              :   /// Works for both MT and ST.
              :   int GetNumRemoteFInstances() const;
This is all still pretty confusing.

My understanding is that after Henry's coordinator change, we can remove 2 of these, is that
right? Is it this last one that we'll keep?

PS10, Line 172:   /// Map node ids to the index of their fragment in TQueryExecRequest.fragments.
              :   int32_t GetFragmentIdx(PlanNodeId id) const { return plan_node_to_fragment_idx_[id];
              :   /// Map node ids to the id of their containing fragment.
              :   FragmentId GetFragmentId(PlanNodeId id) const { return plan_node_to_fragment_id_[id];
this is very confusing. I think you can get rid of the first one if you change the code in
SimpleScheduler::ComputeScanRangeAssignment to use GetContainingFragment().

Then I think you can also remove plan_node_to_fragment_idx_, I don't think it'll be necessary
for anything else.

PS10, Line 187: 
              :   /// Map node ids to the index of the node inside their plan.nodes list.
              :   int32_t GetNodeIdx(PlanNodeId id) const { return plan_node_to_plan_node_idx_[id];
I think you can remove this- there's only 1 usage I can see in SimpleScheduler::ComputeScanRangeAssignment,
and I think you can use GetNode() instead

PS10, Line 210:   const MtFragmentExecParams* GetCoordFragmentExecParams() const {
              :     const TPlanFragment& coord_fragment =  request_.mt_plan_exec_info[0].fragments[0];
              :     if (coord_fragment.partition.type != TPartitionType::UNPARTITIONED) return
              :     return &mt_fragment_exec_params_[];
              :   }
I only see this used by the fn below, and I only see the fn below used once by Coordinator.
Can we keep 1 of these? E.g. either merge this into the fn below or keep this one and perform
the logic in the fn below where it is needed in Coordinator.

There are a lot of fns in this class already, so we should consider simplifying by removing
those that are only used once.

PS10, Line 244:   /// Maps from plan node id to its fragment index. Filled in c'tor.
              :   /// TODO-MT: remove
              :   std::vector<int32_t> plan_node_to_fragment_idx_;
I think we can get rid of this by changing ComputeScanRangeAssignment, and I think we should
do it now. Too many vectors of ids/idxs.
File be/src/scheduling/scheduler.h:

PS10, Line 49:  whose execution is to be coordinated by coord

PS10, Line 51:   /// If resource management is enabled, also reserves resources from the central
             :   /// resource manager (Yarn via Llama) to run the query in. This function
blocks until
             :   /// the reservation request has been granted or denied.
whoops looks like we missed this in Henry's change. Can you change this to say something like:

The schedule is submitted to admission control before returning.
File be/src/scheduling/

PS10, Line 900:  // TODO-MT: call AdmitQuery()
just to follow up on our conversation in person: 

I don't see a reason not to do this, it should at least work with the # of queries limit.
While we plan on doing future in AC work to help MT (i.e. cores), maybe this is useful to
limit the number of queries to 1. Were you thinking we would enable this only after we have
support for # of cores in AC?
File common/thrift/ExecStats.thrift:

PS10, Line 69: 
             :   // One entry for each BE executing this plan node. True if this plan node
is still
             :   // running.
             :   // TODO: is this used?
             :   8: optional list<bool> is_active
I don't see it used anywhere. I tested removing it and everything still compiles. Can you
remove it?
File common/thrift/Planner.thrift:

PS10, Line 35:   // TODO: should this be called idx, to distinguish more clearly from a
             :   // globally unique id?
I'd prefer to call this an idx given that we have fragment_instance_id of type TUniqueId on
TPlanFragmentInstanceCtx and TReportExecStatusParams. That alone might be OK, but it requires
more thought than necessary to read the code (e.g. in query-schedule) where we use it to index
into maps, sort on it, etc., and I have to remember it isn't the TUniqueId.

PS10, Line 37: TFragmentId
... and then maybe this would be better as an int? I'm not sure the typedef gets us anything.
File fe/src/main/java/com/cloudera/impala/service/

PS10, Line 914:    * TODO-MT: if we need to apply the same updates to this function and
              :    * createExecRequest(), pull out the common functionality into a separate
We've already accepted a split between MT & ST paths, so I think this conditional TODO
can be removed. We know refactoring is possible in a number of places but we'll remove the
ST code--maybe leave a TODO-MT to remove there.

PS10, Line 982:  

PS10, Line 1029:     // create plan

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Henry Robinson <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Mostafa Mokhtar <>
Gerrit-HasComments: Yes

View raw message