hbase-user mailing list archives

From Amit Mor <amit....@hotmail.com>
Subject Re: Re: Protobuf Proxy - your feedback is welcomed !
Date Thu, 06 Aug 2015 08:05:15 GMT

>> I've been thinking of creating a new Twemproxy/Thrift (
>> https://github.com/twitter/twemproxy) like proxy for HBase that would be
>> based on using (as the underlying transport protocol) the native
>> protocol-buffer templates that come with HBase source.
> So, what'd be the memcached/redis in this case? What is the problem that is
> being solved? Thanks Amit,
> St.Ack
Hi St.Ack and thanks,

I mentioned Twemproxy only as an example of the concept and the need; there is no Redis/Memcached
in this case. The core of the concept (if it solves anything...) is that every client node
should run a single local proxy.

The problems the proxy is intended to solve:

1) Reduce the number of connections to the RegionServers (RS) - for example, with very
elastic, "micro-serviced" applications (as we have, with Akka actors), a single machine can
host 10s or 100s of independent apps that all try to connect to the RS. It might be
beneficial to let them connect instead to a proxy service running on the same machine
(fan in).

2) Following 1) above, multiple clients on a machine can use their local proxy and benefit
from the proxy being able to 'pipeline' (or chain) requests directed at the same RS. For example:
if client1 and client2 both send a 'get' request to t1rs1, the proxy can turn
these 2 requests into a single 'multi-get', thus reducing the number of TCP requests to t1rs1.

3) Easier collection of metrics, logs and monitoring data when multiple clients run on
a single machine. All of it would be the proxy's responsibility.

4) Another problem is that the hbase-client lib supports synchronous req/res, and every application
that tries to be asynchronous (and pay the price) needs to implement the somewhat intricate
semantics of async programming. The proxy can hide some of these complexities by handling
its own async mechanisms, buffers and 'threads'.

5) Thrift - I was thinking that since protobufs are already in use, Hbase.thrift is redundant
and 'invites' continuous maintenance to keep the Thrift API in step with the latest HBase protocol
buffers, i.e. someone needs to update the Thrift API even though a templating system
is already in place in the shape of protobuf.
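
To make point 2) a bit more concrete, here is a minimal sketch of the grouping step only. It is
plain Python and does not talk to HBase; coalesce_gets, toy_locator and the server name t1rs2
are hypothetical names made up for illustration, and a real proxy would look up region
locations through the HBase client/meta rather than a toy function.

```python
from collections import defaultdict


def coalesce_gets(pending, locate_region_server):
    """Group pending single-row gets by the RegionServer hosting each row.

    pending: iterable of (client_id, table, row_key) tuples.
    locate_region_server: callable (table, row_key) -> server name.
    Returns a dict mapping server name -> list of requests, so that each
    list could be sent as one multi-get instead of many single gets.
    """
    batches = defaultdict(list)
    for client_id, table, row_key in pending:
        server = locate_region_server(table, row_key)
        batches[server].append((client_id, table, row_key))
    return dict(batches)


def toy_locator(table, row_key):
    # Hypothetical stand-in for a region lookup: rows sorting before "m"
    # live on t1rs1, everything else on t1rs2.
    return "t1rs1" if row_key < "m" else "t1rs2"


pending = [
    ("client1", "t1", "apple"),
    ("client2", "t1", "banana"),
    ("client3", "t1", "zebra"),
]
batches = coalesce_gets(pending, toy_locator)
```

Here the gets from client1 and client2 end up in the same batch for t1rs1, so the proxy
could issue one multi-get for both and then hand each client its own result.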

Apropos Redis - the proxy could support cache-aside/read-through caching in front of HBase, but
that's one feature among many others that could be added to the proxy.
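
As a rough illustration of the cache-aside idea, here is a small sketch, again in plain Python
with no HBase dependency. CacheAsideProxy, the two callables passed to it and the in-memory
dict standing in for HBase are all assumptions for the example, not an existing API.

```python
class CacheAsideProxy:
    """Toy cache-aside layer: read from the cache, fall back to the store."""

    def __init__(self, fetch_from_store, write_to_store):
        self._fetch = fetch_from_store   # callable (table, row_key) -> value
        self._write = write_to_store     # callable (table, row_key, value)
        self._cache = {}
        self.store_reads = 0             # counts reads that reached the store

    def get(self, table, row_key):
        key = (table, row_key)
        if key in self._cache:
            return self._cache[key]      # cache hit: the store is not touched
        value = self._fetch(table, row_key)
        self.store_reads += 1
        self._cache[key] = value
        return value

    def put(self, table, row_key, value):
        self._write(table, row_key, value)
        # Invalidate so that the next get re-reads the fresh value.
        self._cache.pop((table, row_key), None)


# Hypothetical in-memory stand-in for HBase.
fake_store = {("t1", "row1"): "v1"}
proxy = CacheAsideProxy(
    fetch_from_store=lambda table, row: fake_store.get((table, row)),
    write_to_store=lambda table, row, value: fake_store.__setitem__((table, row), value),
)

first = proxy.get("t1", "row1")
second = proxy.get("t1", "row1")   # served from the cache
proxy.put("t1", "row1", "v2")
third = proxy.get("t1", "row1")    # re-read after invalidation
```

A real proxy would also need eviction and some answer to writes that bypass it, which this
sketch ignores.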

>> This could be viewed as a Thrift alternative, also, and CMIIW, I don't see
>> any reason why the HBase community should support 2 protocol types, at
>> least from 'singularity' version onward, where protobuf is in heavy use
>> internally/externally.
>> I envision that the proxy would provide the following benefits:
>> 1) Async semantics - clients would not need to implement anything new
>> except wrapping their calls in something like futures and passing the
>> calls to the proxy, using the same 'commands' as the Java native cmds
>> (that are already defined as protos). Only (well, mostly...) the proxy
>> would have to deal with the concerns associated with contention,
>> starvation, blocking etc.
>> 2) Batching/Pipelining - the service would be able to batch queries sent
>> from several clients connected to the (local) proxy service.
>> 3) The proxy service could act as a Facade between the client and HBase
>> (i.e. stabilize a protocol that withstands API changes in the HBase client).
>> 4) Separation of concerns - no need for the application to 'tame' the
>> HBase driver in terms of IO resources, memory and buffer size, connection
>> pool size, etc.
>> 5) Naturally - language agnostic.
>> 6) Testing - the proxy could hold a mock connection to HBase or a mock
>> HBase for easier testing.
>> Your kind feedback on the idea, concept and usefulness is very welcome
>> Thanks,
>> Amit Mor