Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 8A61F200C12 for ; Sun, 22 Jan 2017 05:34:46 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 7EC08160B56; Sun, 22 Jan 2017 04:34:46 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 79F16160B4A for ; Sun, 22 Jan 2017 05:34:45 +0100 (CET) Received: (qmail 9192 invoked by uid 500); 22 Jan 2017 04:34:44 -0000 Mailing-List: contact reviews-help@impala.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list reviews@impala.incubator.apache.org Received: (qmail 9172 invoked by uid 99); 22 Jan 2017 04:34:44 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 22 Jan 2017 04:34:44 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id A8AF21A0355 for ; Sun, 22 Jan 2017 04:34:43 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.362 X-Spam-Level: X-Spam-Status: No, score=0.362 tagged_above=-999 required=6.31 tests=[RDNS_DYNAMIC=0.363, SPF_PASS=-0.001] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id OCKPXLHPY6sE for ; Sun, 22 Jan 2017 04:34:40 +0000 (UTC) Received: from ip-10-146-233-104.ec2.internal (ec2-75-101-130-251.compute-1.amazonaws.com [75.101.130.251]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 197375F252 for ; Sun, 22 Jan 2017 04:34:40 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ip-10-146-233-104.ec2.internal (8.14.4/8.14.4) with ESMTP id v0M4Yc6K011587; Sun, 22 Jan 2017 04:34:38 GMT Message-Id: <201701220434.v0M4Yc6K011587@ip-10-146-233-104.ec2.internal> Date: Sun, 22 Jan 2017 04:34:38 +0000 From: "Henry Robinson (Code Review)" To: impala-cr@cloudera.com, reviews@impala.incubator.apache.org CC: Marcel Kornacker , Taras Bobrovytsky , Sailesh Mukil , David Knupp Reply-To: henry@cloudera.com X-Gerrit-MessageType: comment Subject: =?UTF-8?Q?=5BImpala-ASF-CR=5D_IMPALA-4670_/_IMPALA-4672=3A_Add_RpcMgr_and_port_Statestore_services_to_KRPC=0A?= X-Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 X-Gerrit-ChangeURL: X-Gerrit-Commit: d82b4991949dc21dfe1184d6ce4b3dec363e318b In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/2.12.2 archived-at: Sun, 22 Jan 2017 04:34:46 -0000 Henry Robinson has posted comments on this change. Change subject: IMPALA-4670 / IMPALA-4672: Add RpcMgr and port Statestore services to KRPC ...................................................................... Patch Set 3: (50 comments) http://gerrit.cloudera.org:8080/#/c/5720/2/be/src/exec/kudu-util.h File be/src/exec/kudu-util.h: PS2, Line 60: inline Status FromKuduStatus > Leave a TODO if we're planning on standardizing the Status class. I don't think that's imminent. http://gerrit.cloudera.org:8080/#/c/5720/2/be/src/rpc/CMakeLists.txt File be/src/rpc/CMakeLists.txt: Line 36: rpc-trace.cc > where is this defined? See cmake_modules/FindKRPC.cmake. Part of the KRPC library inclusion patch. (https://github.com/henryr/Impala/blob/krpc/cmake_modules/FindKRPC.cmake#L25) http://gerrit.cloudera.org:8080/#/c/5720/2/be/src/rpc/rpc-builder.h File be/src/rpc/rpc-builder.h: Line 40 > use actual type, not auto I think it encourages readability here. The type of Rpc::Make() is Rpc. It seems repetitive to have that on both sides of the assignment. The google style guide says: (Encouraged) ... long/cluttery type names, particularly when the type is clear from context. Line 41 > it feels like the service itself should have reasonable defaults, instead o Timeouts are a per-invocation construct in KRPC. They're set on the per-RPC RpcController object. I agree there should be a default timeout (it's better to occasionally timeout too early than to never return). I don't think it's necessary to change that per-service. However, there are cases where the timeout should be different for different methods on the same service. For example, Statestore::Update() has a relatively long timeout (because it can take a long time). Statestore::Heartbeat() has a short one because it should execute very quickly. I think it makes sense to keep per-RPC timeouts, but have the Rpc class have a reasonable (60s?) timeout. Line 84 > how about making this a static function in the proxy class instead? same fo The proxy class is generated by the KRPC toolchain. It could be changed, but I'd prefer to share that code with Kudu for now. PS2, Line 109: : : : > Should we add a comment stating what sort of error/error codes don't warran I think the method comment covers this. Let me know if you think it should be expanded upon. PS2, Line 131: > Check return val? Done PS2, Line 157: > This function name sounds like it's executing the RPC, akin to ExecuteRPC() I see your point, but I think the likelihood of confusion is small. Brevity is important for this method name particularly. http://gerrit.cloudera.org:8080/#/c/5720/2/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: Line 18: #ifndef IMPALA_RPC_RPC_MGR_H > since this is part of the runtime system, let's move it under /runtime. Will postpone for now. There's a dependency on security code which currently lives in rpc/ that hasn't been integrated yet. After that let's take a view if this still belongs in runtime/. Line 21: #include "kudu/rpc/messenger.h" > do we really need all of these includes here, or can some of those move int Moved some to the .inline. Line 45: /// > RpcServiceMgr then? Refer to comment on line 40. This section is only describing the management of services, but there's more to it than that. Line 77: /// these pools may be configured by RegisterService(). > what is 'the io system'? Done Line 122: const scoped_refptr result_tracker() const { > move to -inline.h Done PS2, Line 143: red with all servic > Do you think this struct is really very necessary? We can do the service re We would need to retain num_service_threads until we call Init(). But I put Init() into RegisterService() as well, and now StartServices() is really about starting the acceptor threads. So this structure can be removed. Line 145: > how widely does this .h file get included? if it's not super-rare, let's tr Done Line 158 > why is this needed? Removed now. Line 178 > why not inline? scoped_refptr is a kind of shared_ptr. This reference needs to be shared with all service objects. I've added a comment. Line 183 > explain why this needs to be a shared_ptr, ie, who it's shared with. Done Line 190 > move to -inline.h Done Line 198 > emplace_back instead Done http://gerrit.cloudera.org:8080/#/c/5720/2/be/src/rpc/thrift-util.h File be/src/rpc/thrift-util.h: Line 143: return serializer.Serialize(thrift, proto->mutable_thrift_struct()); > protos have a standard function called mutable_thrift_struct()? See the method comment. Type P must have a member called thrift_struct. Although this is expected only to apply to ThriftWrapperPB objects, keeping the templatization here means I don't have to declare that class in this header, keeping protobuf out of the way of other consumers of this file. http://gerrit.cloudera.org:8080/#/c/5720/2/be/src/statestore/statestore-subscriber.cc File be/src/statestore/statestore-subscriber.cc: Line 36: #include "statestore/statestore.proxy.h" > are these generated? yes Line 76: > where does this come from? rpc::StatestoreSubscriberIf comes from statestore.service.h, which is auto-generated. Line 79: StatestoreSubscriberImpl(RpcMgr* rpc_mgr, StatestoreSubscriber* subscriber) > why not const&? Done Line 89: status = subscriber_->UpdateState(thrift_request.topic_deltas, > if this is !ok(), wouldn't thrift_response.status be OK? thrift_response.status gets set on line 100. Line 91: &thrift_response.skipped); > why isn't !__isset an error? Good point! I made it required in the thrift struct. PS2, Line 101: const ThriftWrapperPB* request, ThriftWrapperPB* respons > Check the return value? That's a fair question, but what should we do with it? If we can't write the response object, how can we signal this to caller? (One option is to add an explicit status field to the protobuf, but I'd rather not confuse matters). I think it's ok for now to return the response and have the caller fail to deserialize; the error will be caught there. Line 102: THeartbeatRequest thrift_request; > what does this mean if there was a deserialization error? Deserialization errors are application level. This is used to indicate RPC-level failures. There is facility in RpcContext to return application errors, but that would change quite a lot of our error handling (e.g. how we assign error codes), and so I consider using that API out of scope for now. PS2, Line 114: StatestoreSubscriber* subscriber_; > Same here. Same as above. Line 179: if (status.ok()) connected_to_statestore_metric_->set_value(true); > no auto See comments elsewhere. Line 180: if (response.__isset.registration_id) { > repeating all these template params feels redundant. what does this look li You have to pass all the parameters at once, which I think is less readable: DoRpc(rpc_mgr_, address, request, &response, 100, 3, 500) a) Who knows what the numeric constants mean? b) There's no optionality to the arguments (so no neat way to change defaults across RPCs without visiting all call sites) c) The set of parameters will expand over time to include security parameters, deadlines, feature flags and sidecars. DoRpc() would become really unwieldy. I eliminated the template parameters (except for the proxy type) by pushing them down to the Execute() method, where they can be inferred. Line 210: return Status("--statestore_subscriber_svc_queue_depth must be a positive integer"); > can't it deduce the type param? type param removed. *Impl is the same naming convention as Kudu. I don't think it's so clear cut that the Impl 'is' the service. Fixed the numeric constants. http://gerrit.cloudera.org:8080/#/c/5720/3/be/src/statestore/statestore-subscriber.cc File be/src/statestore/statestore-subscriber.cc: Line 59: "The number of threads dedicated" this is clang-format weirdness. I'll fix in the final patch. http://gerrit.cloudera.org:8080/#/c/5720/2/be/src/statestore/statestore-subscriber.h File be/src/statestore/statestore-subscriber.h: Line 266 > what happened here? No longer [[noreturn]]. Adding the shutdown flag allows this to exit. Line 73: StatestoreSubscriber(const std::string& subscriber_id, > explain param Done Line 111: > there is no heartbeat service Done http://gerrit.cloudera.org:8080/#/c/5720/2/be/src/statestore/statestore.cc File be/src/statestore/statestore.cc: Line 132: const ThriftWrapperPB* request, ThriftWrapperPB* response, RpcContext* context) { > why not make this a c'tor param Ok, changed RegisterService() so it takes a mandatory service implementation, so that now the responsibility for calling the c'tor is with the caller. Line 366: subscribers_.erase(subscriber_it); > if you care about this, shouldn't it be called before the if? No, otherwise there's a race: - ShouldExit() returns false - SetExitFlag() is called - Try to offer to threadpool - Wrong error message printed out. This is only to get the error message right when the thread pool rejects an update. Line 734: OfferUpdate(make_pair(deadline_ms, subscriber->id()), is_heartbeat ? > why the logic change there? (and why schedule another update if you just un There was a deadlock possibility that we never hit (https://issues.cloudera.org/browse/IMPALA-2642) that became much more likely because the tests now shut down the statestore and make it possible that OfferUpdate() failed. You're right that we should return from line 732. PS2, Line 772: > Should we make these const uints for clarity? Made them flags. Line 779: } > move the body into Start() and remove this function Start() should not block. Otherwise tests become very awkward to write. Renamed to Join() for clarity. http://gerrit.cloudera.org:8080/#/c/5720/2/be/src/statestore/statestore.h File be/src/statestore/statestore.h: Line 109: /// Blocks until the exit flag is set. Does not call Start(). > who calls this and does it need to be public? Yes - anything that starts a statestore and wants to block until it finishes execution. Line 403: RpcMgr rpc_mgr_; > why not inline? Done http://gerrit.cloudera.org:8080/#/c/5720/2/be/src/statestore/statestore.proto File be/src/statestore/statestore.proto: Line 19: // common/thrift/StatestoreService.thrift for complementary Thrift data structures. > why a different namespace? I have a slight preference for this organisationally (keep generated classes separate), but don't feel strongly so have removed. Line 21: package impala; > what for? Done Line 22: > leave note explaining what .thrift file this is paired up with Done Line 23: import "rpc/common.proto"; > these typically end in Msg KRPC seems to have standardized on *PB, which is more concise. The protobuf style guide is silent on the subject: https://developers.google.com/protocol-buffers/docs/style http://gerrit.cloudera.org:8080/#/c/5720/2/be/src/testutil/in-process-servers.cc File be/src/testutil/in-process-servers.cc: Line 163 > remove commented out code Leaving in as a reference for what this class used to do. I have a patch to remove all of this, however, which will get squashed into this patch. http://gerrit.cloudera.org:8080/#/c/5720/2/be/src/util/collection-metrics.h File be/src/util/collection-metrics.h: Line 234: int64_t count() const { return boost::accumulators::count(acc_); } > this isn't truly a getter how not? I don't mind changing it, but I think count(acc_) acts effectively as a getter on acc_. http://gerrit.cloudera.org:8080/#/c/5720/2/common/thrift/StatestoreService.thrift File common/thrift/StatestoreService.thrift: Line 147 > leave a note explaining where the service definition that uses the structs Done -- To view, visit http://gerrit.cloudera.org:8080/5720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes