Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 67C6A200C6B for ; Tue, 18 Apr 2017 06:44:24 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 66589160BAE; Tue, 18 Apr 2017 04:44:24 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id AE29C160BAB for ; Tue, 18 Apr 2017 06:44:23 +0200 (CEST) Received: (qmail 7362 invoked by uid 500); 18 Apr 2017 04:44:22 -0000 Mailing-List: contact reviews-help@impala.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list reviews@impala.incubator.apache.org Received: (qmail 7351 invoked by uid 99); 18 Apr 2017 04:44:22 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 18 Apr 2017 04:44:22 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 1C8151B04AF for ; Tue, 18 Apr 2017 04:44:22 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.362 X-Spam-Level: X-Spam-Status: No, score=0.362 tagged_above=-999 required=6.31 tests=[RDNS_DYNAMIC=0.363, SPF_PASS=-0.001] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id WaNy1PFhTjjL for ; Tue, 18 Apr 2017 04:44:18 +0000 (UTC) Received: from ip-10-146-233-104.ec2.internal (ec2-75-101-130-251.compute-1.amazonaws.com [75.101.130.251]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTPS id 728625FBB9 for ; Tue, 18 Apr 2017 04:44:18 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ip-10-146-233-104.ec2.internal (8.14.4/8.14.4) with ESMTP id v3I4iHgT019116; Tue, 18 Apr 2017 04:44:17 GMT Message-Id: <201704180444.v3I4iHgT019116@ip-10-146-233-104.ec2.internal> Date: Tue, 18 Apr 2017 04:44:17 +0000 From: "Marcel Kornacker (Code Review)" To: Henry Robinson , impala-cr@cloudera.com, reviews@impala.incubator.apache.org Reply-To: marcel@cloudera.com X-Gerrit-MessageType: comment Subject: =?UTF-8?Q?=5BImpala-ASF-CR=5D_IMPALA-4856=3A_Port_ImpalaInternalService_to_KRPC=0A?= X-Gerrit-Change-Id: I95229290566a8ccffd80ed2d74c1c57cf1479238 X-Gerrit-ChangeURL: X-Gerrit-Commit: c936fa290a0f047944a88fa58c3274bf381a12c1 In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/2.12.7 archived-at: Tue, 18 Apr 2017 04:44:24 -0000 Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4856: Port ImpalaInternalService to KRPC ...................................................................... Patch Set 3: (17 comments) first look http://gerrit.cloudera.org:8080/#/c/5888/3/be/src/runtime/data-stream-mgr.h File be/src/runtime/data-stream-mgr.h: Line 60: /// first batch. Since the sender may start sending before the receiver is ready, the data we could relatively easily remove that restriction, the execrpc changes are going to reduce the startup time. Line 77: /// fixed-size buffer for later processing, or discard the batch if the buffer is full. In how can the is-full case come up? shouldn't flow control keep the sender from getting to that point? Line 84: /// queue. An error code is returned which causes the sender to retry the previous batch. sounds complicated. is that really necessary? Line 113: /// time-out and cancel itself. However, it is usual that the coordinator will initiate i guess that extra complication is necessary because the non-existence of an in-flight query id doesn't mean it won't show up later. Line 124: /// The sender has a default timeout of 2 minutes for TransmitData() calls. If the timeout why so long? can we not distinguish "the rpc failed because the receiver went away" and "the data stream is paused because the client hasn't called fetch in a while"? Line 131: /// immediately. Both of these cases are designed so that the sender can prepare a new in the queued case, it seems like the response should go out when the queue length drops below a limit, no? Line 137: /// notification should not exceed 60s and so the sender will not time out. that also sounds complicated. Line 228: /// Adds a row batch to the recvr identified by fragment_instance_id/dest_node_id if the there is no dest_node_id, sender_id. Line 245: void AddData(const TUniqueId& fragment_instance_id, TransmitDataCtx&& payload); don't use an rvalue param here. rather, make ownership (and change of ownership) explicit. http://gerrit.cloudera.org:8080/#/c/5888/3/be/src/runtime/descriptors.h File be/src/runtime/descriptors.h: Line 546: /// Serialize to the row_tuples field of a RowBatchPb. something called ToProto should materialize a message that corresponds to this class. that's not the case here. i don't see why RowBatch::ToProto wouldn't do this itself. http://gerrit.cloudera.org:8080/#/c/5888/3/be/src/service/impala_internal_service.proto File be/src/service/impala_internal_service.proto: Line 1: // Licensed to the Apache Software Foundation (ASF) under one let's move all .proto files into common/proto, that'll make it a lot easier to find them. Line 32: // Tuples for all rows incomprehensible comment. since protos don't allow typedefs, we should probably define messages for all the ids, that makes it clear in the code what it's meant to represent. Line 46: // Of type CatalogObjects.THdfsCompression (TODO(KRPC): native enum) why not use the enum? Line 50: message TransmitDataRequestPb { you left out the service version (example: ImpalaInternalServiceVersion in ImpalaInternalService.thrift). for all params proto: - must include the service version - all fields are optional - the comment needs to indicate whether it's optional or required in v1 for all result protos: the latter two points apply as well Line 115: service ExecControlService { why not put the services in separate .proto files Line 131: service DataStreamService { would it make sense to have separate patches for the two services? it feels like the data stream service is pretty much separate from the control service, and in particular it doesn't conflict with the ongoing coordinator/"control" changes. Line 140: // Called by the coordinator to deliver global runtime filters to fragment i consider the filters to be control structures. why wouldn't they be in execcontrolservice? -- To view, visit http://gerrit.cloudera.org:8080/5888 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I95229290566a8ccffd80ed2d74c1c57cf1479238 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Anonymous Coward #168 Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes