hbase-user mailing list archives

From Joe Pallas <joseph.pal...@oracle.com>
Subject Re: Thrift2 interface
Date Tue, 28 Aug 2012 19:07:41 GMT
Thanks for the info, Karthik (and sorry that I didn’t see it for so long, it got auto-filed).

I think the reasoning behind the native client approach makes sense.  I don’t know how much
of the extra hop overhead is network and how much is serialization/deserialization, so for
now I have been hoping that co-locating the thrift proxy with the client will give adequate

Of course, putting knowledge about .META. into the client creates a strong coupling between
the client and the server, which means changes that affect .META. may break compatibility.
 That is the price to pay for avoiding the extra hop.

The driving force behind our move to thrift2 is checkAndPut/checkAndDelete.  I see that there
is checkAndMutate support in the native client thrift file at <https://github.com/facebook/native-cpp-hbase-client/blob/master/hbase/hbase.thrift>,
but I honestly don’t understand how that works with the embedded thrift server.  I don’t
understand the relationship between that thrift interface file and the thrift interface exported
by the embedded server.  Is the native client actually able to use those routines?
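
For anyone following the thread, the two checkAndPut cases described in the quoted message below (value must match vs. no cell with that column qualifier) can be sketched roughly as follows. This is only a minimal in-memory illustration in C++, not the HBase or Thrift API: `Row`, `checkAndPut`, and `demo` are hypothetical names, and a null pointer stands in for the omitted optional value. The point is that a binding which always sends some value (for example an empty string) has no way to express the second case.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a single row: column qualifier -> stored value.
// Assumption for illustration only; this is not how HBase stores cells.
using Row = std::map<std::string, std::string>;

// Sketch of the checkAndPut semantics discussed in this thread.
// expected == nullptr : the check passes only if no cell exists for qualifier.
// expected != nullptr : the check passes only if the stored value equals *expected.
// Returns true if the check passed and the new value was written.
bool checkAndPut(Row& row, const std::string& qualifier,
                 const std::string* expected, const std::string& newValue) {
    auto it = row.find(qualifier);
    bool checkPassed = (expected == nullptr)
                           ? (it == row.end())
                           : (it != row.end() && it->second == *expected);
    if (checkPassed) {
        row[qualifier] = newValue;
    }
    return checkPassed;
}

// Usage sketch exercising both cases.
void demo() {
    Row row;
    assert(checkAndPut(row, "q", nullptr, "v1"));   // no cell yet: "absent" check passes
    assert(!checkAndPut(row, "q", nullptr, "v2"));  // cell now exists: "absent" check fails
    const std::string v1 = "v1";
    assert(checkAndPut(row, "q", &v1, "v2"));       // stored value matches: put is applied
    assert(row["q"] == "v2");
}
```

Note that an empty string passed as `expected` is a different request from a null pointer: it asks for an existing cell whose value is empty, which is why the optional argument cannot simply be defaulted.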

On a side note, that file also describes an API that seems to use prefixes as generalized
column families (like a column family + filter).  That looks like it would be really handy.


On Aug 23, 2012, at 9:15 AM, Karthik Ranganathan wrote:

> Hey Joe,
> We have tried a few different things wrt the C++ clients and thrift. Just
> putting out some of our thoughts here.
> First, we used the existing Thrift proxy as a separate tier (Thrift proxy
> tier). The issue there was that we just didn't get enough throughput (for
> various reasons). Independently, adoption of HBase from C++ was increasing
> - so we thought it made sense to write a native client.
> So we wrote the native C++ client and embedded the thrift proxy into the
> region server (embedded thrift proxy). Cutting the redirect from the
> client was one gain (as the native client is a smart client), but the real
> advantage came from short-circuiting the flow. In the thrift proxy tier
> case, the Thrift client would talk to the proxy using Thrift
> serialization, proxy would deserialize the Thrift call and re-serialize it
> into the Java client format, then send it to the region server which would
> deserialize the java formatted buffers again. But in the embedded proxy +
> native client, we can short-circuit on the embedded proxy and make a
> function call to the region server which is running in the same JVM (which
> helps cut one round of serialization and deserialization).
> The issues, however, with the thrift based approach are that the Java
> objects (HTable, Scan, Get, Put, etc) are not thrift definitions, so they
> need to be updated as a separate (and often very different) set of api's
> every time there is an enhancement to the Java side of things. The proxy
> tier has to be separately configured/tuned/bug fixed from the region
> server to make sure it is as performant as the region server - as the
> overall system will perform like the slowest component in the stack.
> The ideal solution (IMHO) is to have a C++ client which has a compatible
> protocol with the Java client, so that there are no significant perf
> differences between the two approaches, and there is no separate proxy to
> tune. Just a thought of course, might be hard to achieve. Of course we have
> just talked about this :) but with the move to protocol buffers in trunk,
> this should be easier.
> Out of curiosity, why thrift2 - do you specifically need thrift api's to
> region servers? Why not " efficient C/C++ client for HBase"?
> Thanks
> Karthik
> On 8/22/12 4:06 PM, "Joe Pallas" <joseph.pallas@oracle.com> wrote:
>> On Aug 21, 2012, at 9:29 AM, Stack wrote:
>>> On Mon, Aug 20, 2012 at 6:18 PM, Joe Pallas <joseph.pallas@oracle.com>
>>> wrote:
>>>> Anyone out there actively using the thrift2 interface in 0.94?  Thrift
>>>> bindings for C++ don't seem to handle optional arguments too well (that
>>>> is to say, it seems that optional arguments are not optional).
>>>> Unfortunately, checkAndPut uses an optional argument for value to
>>>> distinguish between the two cases (value must match vs no cell with
>>>> that column qualifier).  Any clues on how to work around that
>>>> difficulty would be welcome.
>>> If you make a patch, we'll commit it Joe.
>> Well, I think the patch really needs to be in Thrift; the only workaround
>> I can see is to restructure the hbase.thrift interface file to avoid
>> having routines with optional arguments.  It seems a shame to break
>> compatibility with existing clients for that, and I am not sure if there
>> is a way to do it without breaking compatibility.  (On the other hand,
>> we're talking about thrift2, so it isn't like there are many existing
>> clients.)
>> The state of Thrift documentation is lamentable.  The original white
>> paper is the most detailed information I can find about compatibility
>> rules.  It has enough information to tell me that Thrift doesn't support
>> overloading of routine names within a service, because the names are the
>> identifiers used to identify the routines.  I think that means it isn't
>> possible to make a compatible change that would only affect the client
>> side.
>>> Have you seen this?
>>> https://github.com/facebook/native-cpp-hbase-client  Would it help?
>> The native client stuff is certainly interesting, but, as near as I can
>> tell, it expects the in-region-server Thrift server, which I would like
>> to give a chance to mature a bit before playing with.  I'm also puzzled
>> by the hbase.thrift file in that repository.  It seems to be based on the
>> older HBase Thrift interface, but it adds some functions.  I can't see
>> how a client could use them, though, since there are no HBase-side
>> patches.
>> Anyone involved with FB's native client efforts care to enlighten me?
>> joe