hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: What's going on? Two C++ clients being developed at the moment?
Date Tue, 19 Apr 2016 17:32:04 GMT
On Mon, Apr 18, 2016 at 4:57 PM, Devaraj Das <ddas@hortonworks.com> wrote:

> Your mail is very timely, Stack. Since I have been involved in this work
> for some time now, let me try replying to this email...
>
> Vamsi and his team are working with us (Hortonworks) to get a C++ HBase
> client implementation up and running. We were discussing internally that we
> should reach out to the Dev community now, discuss, and get help on having
> these two implementations (Elliott's and Vamsi's) converge, if possible.
>
>
+1

Let's not have two C++ clients or, to put it another way, I don't think we
can check in two C++ clients. We will only confuse our users (thrift1 vs
thrift2 redux!)



> Elliott is right - since he already had something in the works at
> Facebook, we thought it'd be good to start with that.. One of the major
> things that existed in the patch Elliott put up was the use of Buck as the
> way to build the C++ library. We thought that Make is a much more common
> tool for building C++ and we went with that (and granted we should have
> discussed this aspect earlier in the dev cycle).
>
>
Yes, to both of the above.

...

> 1. Be able to do async RPC. Vamsi's current implementation does sync. I
> know Vamsi is already looking at that aspect. Just to be clear, in my
> mind this doesn't qualify as a blocker - since we don't have an async
> client API to support yet.
>


See org.apache.hadoop.hbase.ipc.AsyncRpcClient (HBASE-12684 and the still
open HBASE-13784 where we'd surface an async API). Also consider
asynchbase, which many consider a superior client to our own. The other
arguments -- sync on async is easier to do than async on sync and
scaling/threads -- make sense to me.
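
To make the sync-on-async point concrete, here is a minimal C++ sketch of a
blocking call layered on a callback-style one. Everything in it is an
assumption for illustration: FakeAsyncClient, AsyncGet and SyncGet are
hypothetical names, not code from HBASE-14850 or HBASE-15534, and the "RPC"
is just a worker thread that echoes the row key back.

```cpp
#include <functional>
#include <future>
#include <memory>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for an async client: the callback runs on a worker
// thread. Not thread-safe for concurrent AsyncGet calls; illustration only.
class FakeAsyncClient {
 public:
  ~FakeAsyncClient() {
    for (auto& t : workers_) t.join();  // wait for outstanding callbacks
  }

  void AsyncGet(const std::string& row,
                std::function<void(std::string)> done) {
    workers_.emplace_back(
        [row, done = std::move(done)] { done("value-for-" + row); });
  }

 private:
  std::vector<std::thread> workers_;
};

// Sync on async: hand the async call a callback that fulfils a promise,
// then block the caller on the matching future.
std::string SyncGet(FakeAsyncClient& client, const std::string& row) {
  auto result = std::make_shared<std::promise<std::string>>();
  std::future<std::string> ready = result->get_future();
  client.AsyncGet(row, [result](std::string value) {
    result->set_value(std::move(value));
  });
  return ready.get();  // blocks until the callback has run
}
```

The wrapper is a few lines and leaves the async path untouched. Going the
other way, making a blocking call look async, generally means parking a
thread per outstanding request, which is where the scaling/threads concern
comes from.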



> 2. Switch to C++ ways of configuring the hbase client configuration. This
> is something I am really not sure about. By going this route, we'd have to
> be able to manage two different ways of configuring things - one for Java
> and another for C++. This will lead to unnecessary duplication of configs
> and such (and the deployment tools would now have to be aware about a new
> way of configuring c++ clients). But we can take a look at making this
> configuration method pluggable if it makes sense.
> 3. Use Facebook's Folly instead of POCO for the RPC layer implementation.
> This is under consideration. Maybe, if we decide to have one implementation
> going forward, this would be an area of active collaboration.
> 4. Use of Facebook's Buck build system. This I already talked about above.
>
> Elliott, regarding your concern as to who would support the large code
> drop .. we did talk about breaking the patch up into smaller ones if it
> makes sense for reviews and such. I personally would like to avoid having
> multiple implementations of the C++ client, and would like to see how we
> can work together... I think Vamsi has already addressed the other concerns
> to do with Copyright headers, etc. Vamsi can add more color wherever
> needed...
>


The discussion here seems overdue but seems to be making good progress now.

Thanks,
St.Ack



>
> ________________________________________
> From: Elliott Clark <eclark@apache.org>
> Sent: Monday, April 18, 2016 3:38 PM
> To: dev@hbase.apache.org
> Subject: Re: What's going on? Two C++ clients being developed at the
> moment?
>
> Yeah, there are currently two different implementation efforts ongoing.
>
> I started working on a cpp client a while ago. Then there was some interest
> in working on the cpp client from other parties. So I put some of the
> implementation up. Things stayed there for a while. Then interest surged
> again. Vamsi had been working on some code outside of JIRA.
>
> As I see it right now, the cpp client is a great place to learn from our
> mistakes. Async client is needed; it's simpler and cleaner. Retrofitting
> async on top of a synchronous implementation leads to a very large mess (I
> give you AsyncProcess). So I've been working on a fully async client using
> Boost, Folly and Wangle. These are the libraries that power thrift at
> Facebook. So I have some good faith in them being very fast and well
> maintained. Folly and Wangle together allow for a network layer with very
> few copies. I've provided a convenience Dockerfile that has all libraries
> needed. This allows everyone that's building to build against the exact
> same versions.
>
> Vamsi et al have created something that works now. It's able to connect and
> send some commands. It's synchronous and built on the POCO library. It
> builds using autotools.
>
>
> For me I have a few concerns around the other implementations:
>
> * Who will support it? The HBase community has not had good luck with large
> code drops from people who are not running the code every day.
> * It has been very hard to keep a clean code base with a sync client. Why
> start with that when there's a way forward that doesn't have that problem?
> * Poco: It's a library that I haven't heard of and I don't know the
> scale/testing of it.
> * There's code with other people's copyrights on the headers. For me this
> is just a no-go. Importing code that has questions about who wrote what is
> just a recipe to have Apache's lawyers get upset. Dima raised some points
> that some things look to be gnu licensed.
> * It uses XML for configuration of a native lib. That's something that is
> VERY strange. I don't know another client lib that does that.
>
>
> For the async implementation there are still some things that need to be
> cleaned up:
> * Some people would like to use a build system other than Buck. That's
> fine; I think a CMake file would be a nice addition if anyone wants to add
> one.
> * There's still more work to go. Right now we can connect, send the header,
> send a request header, and serialize across the request body. Getting the
> response isn't there, and locating things in meta isn't done.
>
> On Mon, Apr 18, 2016 at 2:56 PM, Stack <stack@duboce.net> wrote:
>
> > Correct me if I am wrong, but it seems like there are two (different?)
> > C++ clients underway? There is the work by Vamsi Mohan V S Thattikota
> > that is going on in HBASE-15534 and then there is what seems like a
> > different effort over in HBASE-14850 C++ client implementation by the
> > mighty Elliott.
> >
> > What's up? We going to carry two C++ clients? Work together?
> >
> > Thanks,
> > St.Ack
> >
>
