impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thomas Tauber-Marshall (Code Review)" <ger...@cloudera.org>
Subject [hs2client-CR] Initial structure for the C++ hiveserver2 client.
Date Fri, 08 Apr 2016 21:14:45 GMT
Thomas Tauber-Marshall has posted comments on this change.

Change subject: Initial structure for the C++ hiveserver2 client.
......................................................................


Patch Set 4:

(25 comments)

http://gerrit.cloudera.org:8080/#/c/2645/4/src/hs2client/columnar-row-set.h
File src/hs2client/columnar-row-set.h:

Line 30:     char c = (*((std::string*)nulls))[i / 8];
> if you use `const uint8_t*` above, this is not needed. I don't think it's a
I'm having a hard time getting this to work. I'm going to punt on it for now, so if anyone
knows how to do it some pointers would be appreciated. Otherwise, I'll get back to it after
I write the tests.


http://gerrit.cloudera.org:8080/#/c/2645/4/src/hs2client/hs2client.cc
File src/hs2client/hs2client.cc:

Line 49: using apache::hive::service::cli::thrift::TStatusCode;
> You could instead use a namespace alias:
Done


Line 58: using namespace std;
> Better to use qualified names
Done


Line 103: TFetchOrientation::type FetchOrientationToTFetchOrientation(
> Methods that are not exported should be static
Done


Line 132:     unique_ptr<HS2Service>* out) {
> Need an HS2 protocol argument. We will eventually need to support non-colum
Done


Line 180:   }
> It may be worth creating a macro like RETURN_NOT_OK but that captures and t
Done


Line 194:   TProtocolVersion hs2_protocol_version_;
> Struct data members do not have trailing underscores (https://google.github
Done


Line 198:     : impl_(new HS2SessionImpl()), rpc_(move(rpc)), open_(false) {}
> std::move is not necessary with std::shared_ptr
Its not necessary, but (I believe) it is slightly more efficient, since it saves a copy and
therefore two atomic operations (increment/decrement) on the ref counter.

There may not be a huge performance difference, but it could be a scalability issue since
every session and operation gets a copy of this shared_ptr.

At the very least, it doesn't hurt anything, so I think its worth keeping, unless you have
a good reason to remove it.


Line 208:   rpc_->client_->CloseSession(resp, req);
> Can this fail?
Done


Line 218:   rpc_->client_->OpenSession(resp, req);
> Can this fail?
Done


Line 220:   Status status = TStatusToStatus(resp.status);
> RETURN_NOT_OK(TStatusToStatus(resp.status));
Done


Line 233:   Status Open(TSessionHandle sessionHandle, const string& statement,
> session_handle
Done


Line 241:     rpc_->client_->ExecuteStatement(resp, req);
> failure possible?
Done


Line 255:     const HS2ClientConfig& confOverlay, shared_ptr<Operation>* out) {
> conf_overlay
Sorry about this (and the other places I made the same mistake). Still getting used to switching
back and forth between Java and C++ regularly.


Line 266:       const std::string& tableName) {
> Please use Google C++ naming conventions
Done


Line 273:     rpc_->client_->GetTables(resp, req);
> can this fail?
Done


Line 306:   rpc_->client_->CancelOperation(resp, req);
> handle exception thrown
Done


Line 314:   rpc_->client_->CloseOperation(resp, req);
> may throw exception
Done


Line 330:   rpc_->client_->FetchResults(resp, req);
> may throw exception
Done


Line 331:   Status status = TStatusToStatus(resp.status);
> RETURN_NOT_OK(...) ?
Done


Line 348:   impl->column_ = row_set->columns[i];
> I believe this performs a copy of the column in the row set, which must be 
Seems plausible (Operation::Fetch may have a similar problem). However, after playing around
with it for awhile, I couldn't come up with any different way to do this that improves the
performance.

Any suggestions for what the right thing to do here is?


http://gerrit.cloudera.org:8080/#/c/2645/4/src/hs2client/hs2session.h
File src/hs2client/hs2session.h:

Line 61:   Status GetFunctions(const std::string& schemaName, std::shared_ptr<Operation>*
out);
> May prefer C++-style variable naming in this methods. Can you add documenta
Done


http://gerrit.cloudera.org:8080/#/c/2645/4/src/hs2client/operation.h
File src/hs2client/operation.h:

Line 46:   static const int DEFAULT_MAX_ROWS;
> Is this intended to be configurable? If yes, state if the user is expected 
Done


Line 52:   Operation& operator=(Operation const&) = delete;
> You may want to pull in the gutil DISALLOW_COPY_AND_ASSIGN https://github.c
Done


http://gerrit.cloudera.org:8080/#/c/2645/4/src/hs2client/sample-usage.cc
File src/hs2client/sample-usage.cc:

Line 103:   service->Close();
> Is there concern about "zombie sessions"? We might be able to apply RAII-st
Impala supports idle_query/session_timeout parameters, so that should help prevent zombies
from being too much of a concern.

What Matt and I had talked about was checking in the destructors that the operation/session/service
has been closed, and throwing an error if it hasn't. This has the advantage of requiring the
user to be explicit about the lifetimes of these objects, but is also obviously more work
for the user to handle cleanup during errors than if we just closed it for them RAII-style.

I don't have a strong opinion about which is better.


-- 
To view, visit http://gerrit.cloudera.org:8080/2645
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7be82d5897ad6093116915b17924674c4ae508a0
Gerrit-PatchSet: 4
Gerrit-Project: hs2client
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Wes McKinney <wes@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message