impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thomas Tauber-Marshall (Code Review)" <ger...@cloudera.org>
Subject [hs2client-CR] Initial structure for the C++ hiveserver2 client.
Date Thu, 21 Apr 2016 23:20:04 GMT
Thomas Tauber-Marshall has posted comments on this change.

Change subject: Initial structure for the C++ hiveserver2 client.
......................................................................


Patch Set 6:

(49 comments)

Updated to the latest reviews and also finished implementing the rest of the operation functions.

Still need to write more tests.

http://gerrit.cloudera.org:8080/#/c/2645/6/CMakeLists.txt
File CMakeLists.txt:

Line 253:   src/hs2client/operation-test.cc
> I don't think you need to put these here -- the test files are compiled as 
Done


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/columnar-row-set.cc
File src/hs2client/columnar-row-set.cc:

Line 37: row_set->columns[i]
> this is copying columns[i] into column. Should this be a pointer?
Done


Line 38: impl
> For what we see here, I don't actually think you need ColumnImpl or Column:
Done


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/columnar-row-set.h
File src/hs2client/columnar-row-set.h:

Line 35:     return (*(nulls + (i / 8)) & (1 << (i % 8))) != 0;
> nulls[i / 8]
Done


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/hs2service-test.cc
File src/hs2client/hs2service-test.cc:

Line 45: coon
> connect
Done


Line 52: connection
> service?
Done


Line 55: EXPECT_OK(service->Close());
> Why is Close() a no-op on this call but not on l58?
Done


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/hs2service.cc
File src/hs2client/hs2service.cc:

Line 56:   assert(!IsConnected());
> Maybe DCHECK here? 
Done


Line 66: try {
       :     impl_->transport->close();
       :   } catch (TException& tx) {}
> just call Close()?
Done


Line 76: isOpen
> Are we sure this can't throw?
Yes. The documentation says it shouldn't, and I also looked at the code to verify, though
obviously it wouldn't hurt to put it in a 'try'


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/hs2service.h
File src/hs2client/hs2service.h:

Line 76: Close
> What happens if you call this on a closed session -- no-op or error?
Done


Line 78: Reconnect
> Do we really need this functionality, or could we ask users to just create 
If the connection to Impala is broken but Impala is still running, just creating a new service
would leave any already existing sessions orphaned, since they rely on the rpc object from
the service that created them in our current model, though I'm not sure Reconnect is the right
solution to that problem.

This brings up the important point of how we want to deal with transient errors, eg. right
now we're just returning errors immediately when rpcs fail, but maybe we want to add automatic
retries, and relatedly how we're going to test our behavior in the face of transient errors
(as well as what Thrift's behavior is in the case of transient errors, which I need to look
up).


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/hs2session-test.cc
File src/hs2client/hs2session-test.cc:

Line 44:   EXPECT_OK(session_ok->ExecuteStatement("select * from "+ TEST_TBL, &select_op));
> nit: space
Done


Line 55: "+
> nit space
Done


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/hs2session.cc
File src/hs2client/hs2session.cc:

Line 30: _
> consistent naming with underscores please, I think we don't use them traili
Done


Line 40:  
> maybe return immediately if !open_ ?
Done


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/hs2session.h
File src/hs2client/hs2session.h:

Line 40: Close
> can be called twice?
Done


Line 48:   Status GetSchemas(const std::string& schema_name, std::shared_ptr<Operation>*
out);
> In this new function, schema_name can be a wildcard, is that right? If so, 
Done


Line 59:   Status GetFunctions(const std::string& schema_name, std::shared_ptr<Operation>*
out);
> Same question with these others. Do they accept wildcards (e.g. "foo_*")
Done


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/macros.h
File src/hs2client/macros.h:

Line 19: DISALLOW_COPY_AND_ASSIGN
> we shouldn't let this macro leak into the public headers (this file is incl
Done


Line 24: if (!status.ok()) {                   \
       :     return status;                      \
       :   }
> Yeah, in general you should wrap all multi-line macros in do {... } while(0
Done


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/operation-test.cc
File src/hs2client/operation-test.cc:

Line 60:   EXPECT_TRUE(FindRow(1, "a", int_col.get(), string_col.get()));
       :   EXPECT_TRUE(FindRow(2, "b", int_col.get(), string_col.get()));
       :   EXPECT_FALSE(FindRow(3, int_col.get()));
       :   EXPECT_FALSE(FindRow(4, int_col.get()));
> This is fine, but the results are ordered so you should be able to check ex
Done


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/operation.cc
File src/hs2client/operation.cc:

Line 31: using std::vector;
> I think it's better to use std members explicitly most of the time (unlike 
Done


Line 36: 512
> is this from somewhere else? in our code we often pick 1024 so if this is j
Done


Line 90: row_set_impl->row_set = resp.results
> This involves a memory copy of the data. A few ways around this but maybe t
Done


Line 112: PrintResults
> Let's see if we can move PrintResults to test code or at least some utility
Done


Line 113: namespace {
> ?
This hides these symbols outside of this .cc file to prevent linking conflicts, similar to
declaring them static.

https://google.github.io/styleguide/cppguide.html#Namespaces
http://stackoverflow.com/questions/154469/unnamed-anonymous-namespaces-vs-static-functions


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/operation.h
File src/hs2client/operation.h:

Line 28: enum
> I think we can use "enum class" here.
Done


Line 28: enum FetchOrientation {
       :   FETCH_NEXT,
       :   FETCH_PRIOR,
       :   FETCH_RELATIVE,
       :   FETCH_ABSOLUTE,
       :   FETCH_FIRST,
       :   FETCH_LAST
       : };
       : 
       : enum OperationState {
       :   INITIALIZED_STATE,
       :   RUNNING_STATE,
       :   FINISHED_STATE,
       :   CANCELED_STATE,
       :   CLOSED_STATE,
       :   ERROR_STATE,
       :   UKNOWN_STATE,
       :   PENDING_STATE,
       : };
> Are these just mirroring the definitions in hs2? Can you add a comment. Can
Done


Line 59: GetState
> Brief comment. When is this valid to call?
Done


Line 63: Wait
> Is this necessary? Seems like a useful function for testing, but I don't kn
Moved to Util.


Line 66: GetLog
> Brief comment, when is this valid to call?
Done


Line 68: GetProfile
> same here and the other actions below
Done


Line 84: PrintResults
> Agree -- I think we can leave this to users of the library to implement the
Done


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/sample-usage.cc
File src/hs2client/sample-usage.cc:

Line 21: TCLIService
> I would even go so far as to add a unit test exhibiting that Thrift is not 
Done


Line 50: CHECK_OK(status);
> I actually don't think we want to be using this macro everywhere (here and 
Done


Line 56: CHECK_OK(status);
> Here you need to close the session.
Done


Line 60: CHECK_OK
> here you would close the execute_op and then the session.
Done


Line 66: int_col->length
> I think this is wrong. length seems to be the number of elements in the dat
I don't think so - the data vector will have length equal to the total number of elements,
null and non-null. See OperationTest::TestFetch for proof of this. The entries in the vector
for the null values will be 0 for numeric types, '' for string, etc.

I'll still add the assert and always use the same column, as suggested.


Line 68: auto
> would you mind using the real types here for clarity?
Done


Line 86: Fetch
> please make sure that the Fetch() API documents how this works with end-of-
Done


Line 87: CHECK_OK
> same
Done


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/status.h
File src/hs2client/status.h:

Line 22: StatusCode
> can you add a comment this is referencing the HS2 status codes, and brief c
Done


Line 23: Status
> we don't need to append Status on each one.
Done


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/test-util.h
File src/hs2client/test-util.h:

Line 37: Get
> I don't see a method "Get()"
Done


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/thrift-internal.cc
File src/hs2client/thrift-internal.cc:

Line 93: case
> can you define a stringstream only where you need it, i.e. here and on l108
Done


Line 95: if (i != 0) {
       :           ss << ",";
       :         }
> 1 line
Done


http://gerrit.cloudera.org:8080/#/c/2645/6/src/hs2client/thrift-internal.h
File src/hs2client/thrift-internal.h:

Line 68:   h
> nit extra space
Done


Line 74:  
> I can add automated clang-tidy style cleaning in a follow up patch.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/2645
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7be82d5897ad6093116915b17924674c4ae508a0
Gerrit-PatchSet: 6
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Wes McKinney <wes@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message