Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 71F1E200CD4 for ; Sat, 15 Jul 2017 00:11:14 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 7087216E9DD; Fri, 14 Jul 2017 22:11:14 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 8F9B716E9DB for ; Sat, 15 Jul 2017 00:11:13 +0200 (CEST) Received: (qmail 20854 invoked by uid 500); 14 Jul 2017 22:11:12 -0000 Mailing-List: contact reviews-help@impala.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list reviews@impala.incubator.apache.org Received: (qmail 20840 invoked by uid 99); 14 Jul 2017 22:11:12 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 14 Jul 2017 22:11:12 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 0DB0CC0C8D for ; Fri, 14 Jul 2017 22:11:12 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.363 X-Spam-Level: X-Spam-Status: No, score=0.363 tagged_above=-999 required=6.31 tests=[RDNS_DYNAMIC=0.363, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id 5rWiYbxeUDVW for ; Fri, 14 Jul 2017 22:10:56 +0000 (UTC) Received: from ip-10-146-233-104.ec2.internal (ec2-75-101-130-251.compute-1.amazonaws.com [75.101.130.251]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTPS id B18165F6BF for ; Fri, 14 Jul 2017 22:10:56 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ip-10-146-233-104.ec2.internal (8.14.4/8.14.4) with ESMTP id v6EMAttT018350; Fri, 14 Jul 2017 22:10:55 GMT Message-Id: <201707142210.v6EMAttT018350@ip-10-146-233-104.ec2.internal> Date: Fri, 14 Jul 2017 22:10:54 +0000 From: "Sailesh Mukil (Code Review)" To: Michael Ho , Henry Robinson , impala-cr@cloudera.com, reviews@impala.incubator.apache.org Reply-To: sailesh@cloudera.com X-Gerrit-MessageType: newpatchset Subject: =?UTF-8?Q?=5BImpala-ASF-CR=5D_IMPALA-4856=3A_Port_data_stream_service_to_KRPC=0A?= X-Gerrit-Change-Id: Ia66704be7a0a8162bb85556d07b583ec756c584b X-Gerrit-ChangeURL: X-Gerrit-Commit: 182966ded50d00f9a576a632615e4da950c8951b In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/2.12.7 archived-at: Fri, 14 Jul 2017 22:11:14 -0000 Sailesh Mukil has uploaded a new patch set (#6). Change subject: IMPALA-4856: Port data stream service to KRPC ...................................................................... IMPALA-4856: Port data stream service to KRPC This patch ports the data-flow parts of ImpalaInternalService to KRPC. * ImpalaInternalService is split into two services. The first, ImpalaInternalService, deals with control messages for plan fragment instance execution, cancellation and reporting, and remains implemented in Thrift for now. The second, DataStreamService, handles large-payload RPCs for transmitting runtime filters and row batches between hosts. * In the DataStreamService, all RPCs use 'native' protobuf. The DataStreamService starts on the port previously reserved for the StatestoreSubscriberService (which is also a KRPC service), to avoid having to configure another port when starting Impala. When the ImpalaInternalService is ported to KRPC, all services will run on one port. * To support needing to address two different backend services, a data service port has been added to TBackendDescriptor. * This patch adds support for asynchronous RPCs to the RpcMgr and Rpc classes. Previously, Impala used fixed size thread pools + synchronous RPCs to achieve some parallelism for 'broadcast' RPCs like filter propagation, or a dedicated per-sender+receiver pair thread on the sender side in the DataStreamSender case. In this patch, the PublishFilter() and TransmitData() RPCs are sent asynchronously using KRPC's thread pools. * The TransmitData() protocol has changed to adapt to asynchronous RPCs. The full details are in data-stream-mgr.h. * As a result, DataStreamSender no longer creates a thread-per-connection on the sender side. * Both tuple transmission and runtime filter publication use sidecars to minimise the number of copies and serialization steps required. * Also include a fix for KUDU-2011 that properly allows sidecars to be shared between KRPC and the RPC caller (fixing IMPALA-5093, a corruption bug). * A large portion of this patch is the replacement of TRowBatch with its Protobuf equivalent, RowBatchPb. The replacement is a literal port of the data structure, and row-batch-test, row-batch-list-test and row-batch-serialize-benchmark continue to execute without major logic changes. * Simplify FindRecvr() logic in DataStreamManager. No-longer need to handle blocking sender-side, so no need for complex promise-based machinery. Instead, all senders with no receiver are added to a per-receiver list, which is processed when the receiver arrives. If it does not arrive promptly, the DataStreamManager cleans them up after FLAGS_datastream_sender_timeout_ms. * This patch also begins a clean-up of how ImpalaServer instances are created (by removing CreateImpalaServer), and clarifying the relationship between ExecEnv and ImpalaServer. ImpalaServer now follows the standard construct->Init()->Start()->Join() lifecycle that we use for other services. * Ensure that all addresses used for KRPCs are fully resolved, avoiding the need to resolve them for each RPC. TESTING ------- * New tests added to rpc-mgr-test. TO DO ----- * Re-enable throughput and latency measurements per data-stream sender when that information is exposed from KRPC (KUDU-1738). * TLS and Kerberos are still not supported by KRPC in this patch. Change-Id: Ia66704be7a0a8162bb85556d07b583ec756c584b --- M .clang-format M CMakeLists.txt M be/CMakeLists.txt M be/src/benchmarks/bloom-filter-benchmark.cc M be/src/benchmarks/row-batch-serialize-benchmark.cc M be/src/common/init.cc M be/src/common/status.cc M be/src/common/status.h M be/src/exprs/expr-test.cc M be/src/kudu/rpc/rpc_sidecar.cc M be/src/kudu/rpc/rpc_sidecar.h M be/src/rpc/CMakeLists.txt M be/src/rpc/TAcceptQueueServer.cpp M be/src/rpc/common.proto M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-mgr.inline.h M be/src/rpc/rpc.h M be/src/runtime/backend-client.h M be/src/runtime/client-cache-types.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/data-stream-mgr.cc M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-recvr.cc M be/src/runtime/data-stream-recvr.h M be/src/runtime/data-stream-sender.cc M be/src/runtime/data-stream-sender.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch-serialize-test.cc M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/scheduling/backend-config-test.cc M be/src/scheduling/backend-config.cc M be/src/scheduling/backend-config.h M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/CMakeLists.txt A be/src/service/data_stream_service.proto M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/impalad-main.cc M be/src/service/session-expiry-test.cc M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestore-test.cc M be/src/statestore/statestore.cc M be/src/testutil/fault-injection-util.h M be/src/testutil/in-process-servers.cc M be/src/testutil/in-process-servers.h M be/src/util/bloom-filter-test.cc M be/src/util/bloom-filter.cc M be/src/util/bloom-filter.h M be/src/util/hdfs-util-test.cc M be/src/util/network-util.cc M be/src/util/network-util.h M be/src/util/thread-pool.h M bin/impala-config.sh M common/thrift/CMakeLists.txt M common/thrift/ImpalaInternalService.thrift M common/thrift/Results.thrift M common/thrift/StatestoreService.thrift M common/thrift/generate_error_codes.py M tests/custom_cluster/test_breakpad.py M tests/custom_cluster/test_exchange_delays.py M tests/custom_cluster/test_rpc_timeout.py M tests/query_test/test_runtime_filters.py M tests/stress/test_mini_stress.py 83 files changed, 2,793 insertions(+), 1,980 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/7103/6 -- To view, visit http://gerrit.cloudera.org:8080/7103 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia66704be7a0a8162bb85556d07b583ec756c584b Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil