Return-Path: X-Original-To: apmail-impala-dev-archive@minotaur.apache.org Delivered-To: apmail-impala-dev-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 87B6219728 for ; Thu, 21 Apr 2016 23:20:08 +0000 (UTC) Received: (qmail 9751 invoked by uid 500); 21 Apr 2016 23:20:08 -0000 Delivered-To: apmail-impala-dev-archive@impala.apache.org Received: (qmail 9710 invoked by uid 500); 21 Apr 2016 23:20:08 -0000 Mailing-List: contact dev-help@impala.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@impala.incubator.apache.org Delivered-To: mailing list dev@impala.incubator.apache.org Received: (qmail 9699 invoked by uid 99); 21 Apr 2016 23:20:08 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 21 Apr 2016 23:20:08 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 8C883C0D3C for ; Thu, 21 Apr 2016 23:20:07 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.362 X-Spam-Level: X-Spam-Status: No, score=0.362 tagged_above=-999 required=6.31 tests=[RDNS_DYNAMIC=0.363, SPF_PASS=-0.001] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id mD3cIoFqAG1G for ; Thu, 21 Apr 2016 23:20:05 +0000 (UTC) Received: from ip-10-146-233-104.ec2.internal (ec2-75-101-130-251.compute-1.amazonaws.com [75.101.130.251]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTPS id B0D305F5A4 for ; Thu, 21 Apr 2016 23:20:05 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ip-10-146-233-104.ec2.internal (8.14.4/8.14.4) with ESMTP id u3LNK4xh021181; Thu, 21 Apr 2016 23:20:04 GMT Message-Id: <201604212320.u3LNK4xh021181@ip-10-146-233-104.ec2.internal> Date: Thu, 21 Apr 2016 23:20:04 +0000 From: "Thomas Tauber-Marshall (Code Review)" To: impala-cr@cloudera.com, dev@impala.incubator.apache.org CC: Matthew Jacobs , Marcel Kornacker , Wes McKinney Reply-To: tmarshall@cloudera.com X-Gerrit-MessageType: comment Subject: =?UTF-8?Q?[hs2client-CR]_Initial_structure_for_the_C++_hiveserver2_client.=0A?= X-Gerrit-Change-Id: I7be82d5897ad6093116915b17924674c4ae508a0 X-Gerrit-ChangeURL: X-Gerrit-Commit: 9a9ea9baeb7fb899da35093e07954950889baa22 In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/2.10-rc0 Thomas Tauber-Marshall has posted comments on this change. Change subject: Initial structure for the C++ hiveserver2 client. ...................................................................... Patch Set 6: (49 comments) Updated to the latest reviews and also finished implementing the rest of the operation functions. Still need to write more tests. http://gerrit.cloudera.org:8080/#/c/2645/6/CMakeLists.txt File CMakeLists.txt: Line 253: src/hs2client/operation-test.cc > I don't think you need to put these here -- the test files are compiled as Done http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/columnar-row-set.cc File src/hs2client/columnar-row-set.cc: Line 37: row_set->columns[i] > this is copying columns[i] into column. Should this be a pointer? Done Line 38: impl > For what we see here, I don't actually think you need ColumnImpl or Column: Done http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/columnar-row-set.h File src/hs2client/columnar-row-set.h: Line 35: return (*(nulls + (i / 8)) & (1 << (i % 8))) != 0; > nulls[i / 8] Done http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/hs2service-test.cc File src/hs2client/hs2service-test.cc: Line 45: coon > connect Done Line 52: connection > service? Done Line 55: EXPECT_OK(service->Close()); > Why is Close() a no-op on this call but not on l58? Done http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/hs2service.cc File src/hs2client/hs2service.cc: Line 56: assert(!IsConnected()); > Maybe DCHECK here? Done Line 66: try { : impl_->transport->close(); : } catch (TException& tx) {} > just call Close()? Done Line 76: isOpen > Are we sure this can't throw? Yes. The documentation says it shouldn't, and I also looked at the code to verify, though obviously it wouldn't hurt to put it in a 'try' http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/hs2service.h File src/hs2client/hs2service.h: Line 76: Close > What happens if you call this on a closed session -- no-op or error? Done Line 78: Reconnect > Do we really need this functionality, or could we ask users to just create If the connection to Impala is broken but Impala is still running, just creating a new service would leave any already existing sessions orphaned, since they rely on the rpc object from the service that created them in our current model, though I'm not sure Reconnect is the right solution to that problem. This brings up the important point of how we want to deal with transient errors, eg. right now we're just returning errors immediately when rpcs fail, but maybe we want to add automatic retries, and relatedly how we're going to test our behavior in the face of transient errors (as well as what Thrift's behavior is in the case of transient errors, which I need to look up). http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/hs2session-test.cc File src/hs2client/hs2session-test.cc: Line 44: EXPECT_OK(session_ok->ExecuteStatement("select * from "+ TEST_TBL, &select_op)); > nit: space Done Line 55: "+ > nit space Done http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/hs2session.cc File src/hs2client/hs2session.cc: Line 30: _ > consistent naming with underscores please, I think we don't use them traili Done Line 40: > maybe return immediately if !open_ ? Done http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/hs2session.h File src/hs2client/hs2session.h: Line 40: Close > can be called twice? Done Line 48: Status GetSchemas(const std::string& schema_name, std::shared_ptr* out); > In this new function, schema_name can be a wildcard, is that right? If so, Done Line 59: Status GetFunctions(const std::string& schema_name, std::shared_ptr* out); > Same question with these others. Do they accept wildcards (e.g. "foo_*") Done http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/macros.h File src/hs2client/macros.h: Line 19: DISALLOW_COPY_AND_ASSIGN > we shouldn't let this macro leak into the public headers (this file is incl Done Line 24: if (!status.ok()) { \ : return status; \ : } > Yeah, in general you should wrap all multi-line macros in do {... } while(0 Done http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/operation-test.cc File src/hs2client/operation-test.cc: Line 60: EXPECT_TRUE(FindRow(1, "a", int_col.get(), string_col.get())); : EXPECT_TRUE(FindRow(2, "b", int_col.get(), string_col.get())); : EXPECT_FALSE(FindRow(3, int_col.get())); : EXPECT_FALSE(FindRow(4, int_col.get())); > This is fine, but the results are ordered so you should be able to check ex Done http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/operation.cc File src/hs2client/operation.cc: Line 31: using std::vector; > I think it's better to use std members explicitly most of the time (unlike Done Line 36: 512 > is this from somewhere else? in our code we often pick 1024 so if this is j Done Line 90: row_set_impl->row_set = resp.results > This involves a memory copy of the data. A few ways around this but maybe t Done Line 112: PrintResults > Let's see if we can move PrintResults to test code or at least some utility Done Line 113: namespace { > ? This hides these symbols outside of this .cc file to prevent linking conflicts, similar to declaring them static. https://google.github.io/styleguide/cppguide.html#Namespaces http://stackoverflow.com/questions/154469/unnamed-anonymous-namespaces-vs-static-functions http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/operation.h File src/hs2client/operation.h: Line 28: enum > I think we can use "enum class" here. Done Line 28: enum FetchOrientation { : FETCH_NEXT, : FETCH_PRIOR, : FETCH_RELATIVE, : FETCH_ABSOLUTE, : FETCH_FIRST, : FETCH_LAST : }; : : enum OperationState { : INITIALIZED_STATE, : RUNNING_STATE, : FINISHED_STATE, : CANCELED_STATE, : CLOSED_STATE, : ERROR_STATE, : UKNOWN_STATE, : PENDING_STATE, : }; > Are these just mirroring the definitions in hs2? Can you add a comment. Can Done Line 59: GetState > Brief comment. When is this valid to call? Done Line 63: Wait > Is this necessary? Seems like a useful function for testing, but I don't kn Moved to Util. Line 66: GetLog > Brief comment, when is this valid to call? Done Line 68: GetProfile > same here and the other actions below Done Line 84: PrintResults > Agree -- I think we can leave this to users of the library to implement the Done http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/sample-usage.cc File src/hs2client/sample-usage.cc: Line 21: TCLIService > I would even go so far as to add a unit test exhibiting that Thrift is not Done Line 50: CHECK_OK(status); > I actually don't think we want to be using this macro everywhere (here and Done Line 56: CHECK_OK(status); > Here you need to close the session. Done Line 60: CHECK_OK > here you would close the execute_op and then the session. Done Line 66: int_col->length > I think this is wrong. length seems to be the number of elements in the dat I don't think so - the data vector will have length equal to the total number of elements, null and non-null. See OperationTest::TestFetch for proof of this. The entries in the vector for the null values will be 0 for numeric types, '' for string, etc. I'll still add the assert and always use the same column, as suggested. Line 68: auto > would you mind using the real types here for clarity? Done Line 86: Fetch > please make sure that the Fetch() API documents how this works with end-of- Done Line 87: CHECK_OK > same Done http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/status.h File src/hs2client/status.h: Line 22: StatusCode > can you add a comment this is referencing the HS2 status codes, and brief c Done Line 23: Status > we don't need to append Status on each one. Done http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/test-util.h File src/hs2client/test-util.h: Line 37: Get > I don't see a method "Get()" Done http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/thrift-internal.cc File src/hs2client/thrift-internal.cc: Line 93: case > can you define a stringstream only where you need it, i.e. here and on l108 Done Line 95: if (i != 0) { : ss << ","; : } > 1 line Done http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/thrift-internal.h File src/hs2client/thrift-internal.h: Line 68: h > nit extra space Done Line 74: > I can add automated clang-tidy style cleaning in a follow up patch. Done -- To view, visit http://gerrit.cloudera.org:8080/2645 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7be82d5897ad6093116915b17924674c4ae508a0 Gerrit-PatchSet: 6 Gerrit-Project: hs2client Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Wes McKinney Gerrit-HasComments: Yes