Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 3A79F200B88 for ; Thu, 22 Sep 2016 18:37:39 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 391C6160AAD; Thu, 22 Sep 2016 16:37:39 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 55EE1160AA9 for ; Thu, 22 Sep 2016 18:37:38 +0200 (CEST) Received: (qmail 18721 invoked by uid 500); 22 Sep 2016 16:37:37 -0000 Mailing-List: contact reviews-help@impala.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list reviews@impala.incubator.apache.org Received: (qmail 18705 invoked by uid 99); 22 Sep 2016 16:37:37 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 22 Sep 2016 16:37:37 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id E1C7CC836C for ; Thu, 22 Sep 2016 16:37:36 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.362 X-Spam-Level: X-Spam-Status: No, score=0.362 tagged_above=-999 required=6.31 tests=[RDNS_DYNAMIC=0.363, SPF_PASS=-0.001] autolearn=disabled Received: from mx2-lw-us.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id oTf_pxSuABym for ; Thu, 22 Sep 2016 16:37:35 +0000 (UTC) Received: from ip-10-146-233-104.ec2.internal (ec2-75-101-130-251.compute-1.amazonaws.com [75.101.130.251]) by mx2-lw-us.apache.org (ASF Mail Server at mx2-lw-us.apache.org) with ESMTPS id EB1765F19B for ; Thu, 22 Sep 2016 16:37:34 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ip-10-146-233-104.ec2.internal (8.14.4/8.14.4) with ESMTP id u8MGbYW1011664; Thu, 22 Sep 2016 16:37:34 GMT Message-Id: <201609221637.u8MGbYW1011664@ip-10-146-233-104.ec2.internal> Date: Thu, 22 Sep 2016 16:37:34 +0000 From: "Marcel Kornacker (Code Review)" To: impala-cr@cloudera.com, reviews@impala.incubator.apache.org CC: Henry Robinson , Matthew Jacobs , Alex Behm , Dan Hecht , Mostafa Mokhtar , Lars Volker Reply-To: marcel@cloudera.com X-Gerrit-MessageType: comment Subject: =?UTF-8?Q?=5BImpala-ASF-CR=5D_IMPALA-3902=3A_Scheduler_improvements_for_running_multiple_fragment_instances_on_a_single_backend=0A?= X-Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 X-Gerrit-ChangeURL: X-Gerrit-Commit: e18318041e4f4f508ce9e13a61ceb85007ea7832 In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/2.12.2 archived-at: Thu, 22 Sep 2016 16:37:39 -0000 Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend ...................................................................... Patch Set 10: (29 comments) http://gerrit.cloudera.org:8080/#/c/4054/3/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS3, Line 713: > ? Done PS3, Line 1654: (!rpc_status.ok()) { > Not sure what that means - you've added code to explicitly skip the coordin the coordinator previously didn't have an associated FragmentInstanceState. i added that, and am now skipping it in this path, since it's not a remote fragment. http://gerrit.cloudera.org:8080/#/c/4054/9/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS9, Line 203: const > I think what the original comment was trying to get at was that the returne Done PS9, Line 227: int per_fragment_instance_idx_; : > not used Done PS9, Line 485: > inconsistent use of int/int32_t Done PS9, Line 486: > nit: space Done PS9, Line 488: progress_.Init(str, schedule_.num_scan_ranges()); > move this above near other usage of num_remote_instances Done PS9, Line 494: : bool has_coordinator_fragment = schedule_.GetCoordFragment() != NULL; : if (has_coordinator_fragment) { > I don't understand why this comment belongs here. Am I missing something or it's a comment on the lock acquisition. added a blank line. PS9, Line 542: ken on the coordin > Seems weird that we wouldn't have to call a function called 'FinishQuerySta renamed PS9, Line 637: } : executor_.reset(new PlanFragmentExecutor( : exec_env_, PlanFragmentExecutor::ReportStatusCallback())); > Why leave this as a comment? I'd imagine it will be called for the coord fr added todo, but i don't want to remove it, just to make sure we don't forget. PS9, Line 645: s before optimizing the LLVM module. The other expr > this and other cases where we index into fragment_instance_states_ would be Done Line 680: fragment_idx < request.fragments.size(); ++fragment_idx) { > comment where they are created. Done PS9, Line 711: > why cref(params)? Won't the reference (not the value) be copied by bind(), without the cref it's trying to copy the parameters and then complains about a missing copy c'tor (i'm glad there wasn't one). PS9, Line 712: > several of these can be omitted if schedule_ is a member variable. Again, d Done PS9, Line 735: for (const MtFragmentExecParams& fragment_params: s > why is this a comment? Done PS9, Line 746: gOptions* instance > same thing about using state_idx_ Done PS9, Line 1453: u > please make sure to run git-clang-format over this patch when you commit it Done PS9, Line 1555: SetExecPlanFragmentParams(exec_pa > It'd be nice to avoid passing these through the fn. I don't see any reason these are shared among all instances to which they apply, so i'd rather not create copies everywhere. they're also only used during instance startup, so i don't see why passing it as a parameter is inappropriate. PS9, Line 1564: rpc_params.fragment_instance_ctx.per_node_scan_ranges); : VLOG_FILE << "making rpc: ExecPlanFragment" > same thing about the state_idx_ parameter Done PS9, Line 1934: &fragment_profiles_[fragment_id]; > was removing the bounds checks intentional? Here and below. Done http://gerrit.cloudera.org:8080/#/c/4054/6/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: Line 101: class Coordinator { > nit: extra line Done http://gerrit.cloudera.org:8080/#/c/4054/9/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: PS9, Line 547: e. : void MtStartRemoteFInstances(); > Looks to me like it sets up the state for all non-coordinator fragment inst Done http://gerrit.cloudera.org:8080/#/c/4054/9/be/src/scheduling/query-resource-mgr.cc File be/src/scheduling/query-resource-mgr.cc: PS9, Line 53: void ResourceResolver::GetResourceHostport(const TNetworkAddress& src, : TNetworkAddress* dest) const { : if (ExecEnv::GetInstance()->is_pseudo_distributed_llama()) { : auto entry = impalad_to_dn_.find(src); : if (entry != impalad_to_dn_.end()) { : *dest = entry->second; : } else { : *dest = TNetworkAddress(); : } : } else { : dest->hostname = src.hostname; : dest->port = 0; : } > I take it you'll rebase on the llama removal patch? i was going to. http://gerrit.cloudera.org:8080/#/c/4054/3/be/src/scheduling/query-schedule.h File be/src/scheduling/query-schedule.h: Line 184: const TPlanFragment& GetContainingFragment(PlanNodeId node_id) const { > What's 'it' - the QuerySchedule? It doesn't get created in the scheduler, b i'm not sure i find that particularly problematic, but either way this doesn't need to get addressed as part of this change. PS3, Line 196: d::vector* exec_params() { return &fragment_exec_params_; } : const std::vector& exec_params() const { : return fragment_exec_params_; > Done the dcheck provides additional guarantees. http://gerrit.cloudera.org:8080/#/c/4054/4/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: PS4, Line 541: // We adjust all replicas with memory distance less than base_distance to base_distance : // and collect all replicas with equal or better distance > i'm not sure what to add. added an explanation http://gerrit.cloudera.org:8080/#/c/4054/9/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: PS9, Line 942: int64_t assigned_bytes = 0; > any reason you can't do this now? it's purely additive, and there's lots of functionality that's still missing (joins, for instance) and will need to get implemented at a later date. http://gerrit.cloudera.org:8080/#/c/4054/4/be/src/scheduling/simple-scheduler.h File be/src/scheduling/simple-scheduler.h: PS4, Line 499: > QuerySchedule is really just a helper class which functions as a container added to class comment http://gerrit.cloudera.org:8080/#/c/4054/9/be/src/service/fragment-mgr.cc File be/src/service/fragment-mgr.cc: PS9, Line 148: VLOG_FILE > This might be noisy for QUERY Done -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel Kornacker Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes