hbase-dev mailing list archives

From Devaraj Das <d...@hortonworks.com>
Subject Re: What's going on? Two C++ clients being developed at the moment?
Date Mon, 18 Apr 2016 23:57:24 GMT
Your mail is very timely, Stack. Since I have been involved in this work for some time now,
let me try replying to this email...

Vamsi and his team are working with us (Hortonworks) to get a C++ HBase client implementation
up and running. We were discussing internally that we should reach out to the dev community
now, discuss, and get help on having the two implementations (Elliott's and Vamsi's) converge,
if possible.

Elliott is right - since he already had something in the works at Facebook, we thought it'd
be good to start with that. One of the major things in the patch Elliott put up was the use
of Buck to build the C++ library. We thought that Make is a much more common tool for building
C++ and went with that instead (and granted, we should have discussed this aspect earlier in
the dev cycle).

Vamsi submitted patches for review a month and a half back; they were reviewed, and a few
iterations happened. The current state is that most of the APIs like get/put/scan are in place,
and a few DDL operations are in place as well - Kerberos/SASL is being worked on. It's still
a work in progress and would need more work on error/exception handling and such...

One of the major changes in the recent patch was to adopt recent C++ features and best practices.
The major comments that remain unaddressed are these:
1. Be able to do async RPC. Vamsi's current implementation is sync. I know Vamsi is already
looking at that aspect. Just to be clear, in my mind this doesn't qualify as a blocker
- since we don't have an async client API to support yet.
2. Switch to C++ ways of configuring the HBase client. This is something I am really not
sure about. By going this route, we'd have to manage two different ways of configuring things
- one for Java and another for C++. That leads to unnecessary duplication of configs and such
(and the deployment tools would now have to be aware of a new way of configuring C++ clients).
But we can take a look at making this configuration method pluggable if it makes sense.
3. Use Facebook's Folly instead of POCO for the RPC layer implementation. This is under consideration.
Maybe, if we decide to have one implementation going forward, this would be an area of active work.
4. Use of Facebook's Buck build system. This I already talked about above.
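
To make the sync-vs-async point in item 1 concrete, here is a minimal sketch of the two API shapes. All names (HBaseClient, Get, Result) are hypothetical illustrations, not code from either patch, and std::async stands in for what a real client would drive from an I/O event loop:

```cpp
#include <future>
#include <map>
#include <string>

// Hypothetical types for illustration only - not the actual patch's API.
struct Get { std::string row; };
struct Result { std::map<std::string, std::string> cells; };

class HBaseClient {
 public:
  // Synchronous call, as in the current implementation: blocks the caller
  // for the whole RPC round trip (elided here; a canned result is returned).
  Result get(const Get& req) {
    return Result{{{"cf:qual", "value-for-" + req.row}}};
  }

  // Async variant: returns immediately with a future the caller can wait on.
  // A real client would complete a promise from its network event loop
  // rather than spawn a thread per call; the API shape is the point here.
  std::future<Result> getAsync(const Get& req) {
    return std::async(std::launch::async, [this, req] { return get(req); });
  }
};
```

The async shape is the one that's easy to wrap a blocking convenience method around; going the other direction is where the mess comes from.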

Elliott, regarding your concern as to who would support the large code drop: we did talk
about breaking the patch up into smaller ones if it makes sense for reviews and such. I personally
would like to avoid having multiple implementations of the C++ client, and would like to see
how we can work together... I think Vamsi has already addressed the other concerns to do with
copyright headers, etc. Vamsi can add more color wherever needed...

From: Elliott Clark <eclark@apache.org>
Sent: Monday, April 18, 2016 3:38 PM
To: dev@hbase.apache.org
Subject: Re: What's going on? Two C++ clients being developed at the moment?

Yeah, there are currently two different implementation efforts ongoing.

I started working on a C++ client a while ago. Then there was some interest
in working on the C++ client from other parties, so I put some of the
implementation up. Things stayed there for a while. Then interest surged
again. Vamsi had been working on some code away from the JIRAs.

As I see it right now, the C++ client is a great place to learn from our
mistakes. An async client is needed; it's simpler and cleaner. Retrofitting
async on top of a synchronous implementation leads to a very large mess (I
give you AsyncProcess). So I've been working on a fully async client using
Boost, Folly, and Wangle. These are the libraries that power Thrift at
Facebook, so I have some good faith in them being very fast and well
maintained. Folly and Wangle together allow for a network layer with very
few copies. I've provided a convenience Dockerfile that has all the libraries
needed. This allows everyone that's building to build against the exact same
dependencies.

Vamsi et al have created something that works now. It's able to connect and
send some commands. It's synchronous, building on the POCO library. It
builds using autotools.

For me, I have a few concerns around the other implementation:

* Who will support it? The HBase community has not had good luck with large
code drops from people who are not running the code every day.
* A sync client has been very hard to keep as a clean code base. Why start
with that when there's a way forward that doesn't have that problem?
* POCO: it's a library that I haven't heard of, and I don't know the
scale/testing of it.
* There's code with other people's copyrights in the headers. For me this
is just a no-go. Importing code that has questions about who wrote what is
just a recipe to have Apache's lawyers get upset. Dima raised some points
that some things look to be GNU-licensed.
* It uses XML for configuration of a native lib. That's something that is
VERY strange. I don't know another client lib that does that.
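
The XML-configuration concern and the pluggability idea floated earlier in the thread could meet in the middle with a small interface. A sketch, where ConfigSource, MapConfigSource, and Configuration are all hypothetical names, not code from either patch:

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical: the client depends only on this interface, not on where
// configuration values come from.
class ConfigSource {
 public:
  virtual ~ConfigSource() = default;
  virtual std::string get(const std::string& key,
                          const std::string& dflt) const = 0;
};

// A C++-native source backed by an in-memory map; it could be populated
// from command-line flags or environment variables.
class MapConfigSource : public ConfigSource {
 public:
  explicit MapConfigSource(std::map<std::string, std::string> kv)
      : kv_(std::move(kv)) {}
  std::string get(const std::string& key,
                  const std::string& dflt) const override {
    auto it = kv_.find(key);
    return it == kv_.end() ? dflt : it->second;
  }
 private:
  std::map<std::string, std::string> kv_;
};

// An XML-backed source reading the same hbase-site.xml keys as the Java
// client could be swapped in here without touching client code.
class Configuration {
 public:
  explicit Configuration(std::unique_ptr<ConfigSource> src)
      : src_(std::move(src)) {}
  std::string get(const std::string& key, const std::string& dflt) const {
    return src_->get(key, dflt);
  }
 private:
  std::unique_ptr<ConfigSource> src_;
};
```

That would let deployments that already ship hbase-site.xml keep using it, while native-only deployments skip XML entirely.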

For the async implementation there are still some things that need to be
cleaned up:
* Some people would like to use a build system other than Buck. That's
fine; I think anyone who wanted to add a CMake file would be making a nice
addition.
* There's still more work to go. Right now we can connect, send the header,
send a request header, and serialize across the request body. Getting the
response isn't there, and locating things in meta isn't done.
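
For anyone following along, the very first of those connection steps is the fixed preamble a client writes on a fresh socket. A sketch of building it - the byte values (4-byte "HBas" magic, RPC version 0, auth code 80 for SIMPLE) are from my reading of the Java client's wire format and worth double-checking against the Java RpcClient before relying on them:

```cpp
#include <cstdint>
#include <vector>

// Builds the connection preamble an HBase client sends first: the magic
// bytes "HBas", one RPC-version byte, and one auth-method byte. Defaults
// (version 0, auth 80 = SIMPLE) are assumptions based on the Java client.
std::vector<uint8_t> buildPreamble(uint8_t rpcVersion = 0,
                                   uint8_t authCode = 80) {
  std::vector<uint8_t> p = {'H', 'B', 'a', 's'};  // 4-byte magic
  p.push_back(rpcVersion);                        // protocol version
  p.push_back(authCode);                          // auth method
  return p;
}
```

After the preamble comes the protobuf ConnectionHeader and then the per-request headers and bodies mentioned above.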

On Mon, Apr 18, 2016 at 2:56 PM, Stack <stack@duboce.net> wrote:

> Correct me if I am wrong, but it seems like there are two (different?) C++
> clients underway? There is the work by Vamsi Mohan V S Thattikota that is
> going on in HBASE-15534 and then there is what seems like a different
> effort over in HBASE-14850 C++ client implementation by the mighty Elliott.
> What's up? We going to carry two C++ clients? Work together?
> Thanks,
> St.Ack
