accumulo-user mailing list archives

From Christopher <>
Subject Re: thrift versions in accumulo
Date Wed, 19 Aug 2015 20:19:09 GMT
On Wed, Aug 19, 2015 at 4:01 PM, Josh Elser <> wrote:
> Christopher wrote:
>> Well, Thrift serves two purposes for us. It manages the RPC API and
>> handles the data serialization for the RPC.
> Hit the nail on the head. It's nice because it does solve formatting the
> data in a compatible way and also delivering that data from client to
> server. It also has lots of nice things like SSL and SASL support baked in
> too. Take a look at the 2k lines of code HBase maintains just to set up an
> RPC connection. :)

I'd bet we have at least 2k lines of copied/slightly-modified Thrift
code baked in to Accumulo, reimplemented so we could access APIs which
have repeatedly regressed in visibility.
And, I'm sure we have more than that in "utility code" to set up a
Thrift server and/or client.

>> Long term, I'd like to look at Netty for handling the API (because it
>> seems stable, widely used, and feature-rich). I'm not particularly
>> concerned about the serialization, as long as the library is stable.
>> Avro makes as much sense to me as any other, but I haven't actually
>> used it yet, so I'll reserve judgment.
>> Even if we move away from Thrift in the long-term for the
>> client/server and server/server RPCs, we'll probably not get rid of it
>> entirely. It's still pretty useful for the accumulo-proxy. However,
>> since that's a small, optional add-on, it'd be pretty easy to isolate
>> thrift to just be a dependency of that... and presumably it'd be easy
>> to rebuild that add-on component for different thrift versions or to
>> support multiple versions.
> I know Sean had expressed a desire in trying to pull out Thrift from the
> core of things to make this kind of experimentation easier. Being able to
> express our RPCs in a way that isn't tied to Thrift would be step #1. Then,
> make Thrift be an "implementation" (convert to that generic way from step
> #1). After that, it should be straightforward to use whatever serialization
> and transport mechanism your heart desires (and your fingers code).
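
To make those two steps a bit more concrete, here is a rough Java sketch of
the shape I have in mind. Every name in it (TableRpc, ThriftTableRpc,
FakeGeneratedThriftClient, InMemoryTableRpc) is hypothetical; none of it is
existing Accumulo or Thrift API, and the "generated client" is a stand-in so
the example compiles without Thrift on the classpath.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class RpcAbstractionSketch {

    /** Step 1: the RPC contract, with no Thrift types in its signature. */
    interface TableRpc {
        boolean tableExists(String tableName);
    }

    /** Hypothetical stand-in for a Thrift-generated client stub. */
    static class FakeGeneratedThriftClient {
        private final Set<String> tables;

        FakeGeneratedThriftClient(Set<String> tables) {
            this.tables = tables;
        }

        boolean tableExists(String name) {
            return tables.contains(name);
        }
    }

    /** Step 2: Thrift is just one implementation of the contract. */
    static class ThriftTableRpc implements TableRpc {
        private final FakeGeneratedThriftClient client;

        ThriftTableRpc(FakeGeneratedThriftClient client) {
            this.client = client;
        }

        @Override
        public boolean tableExists(String tableName) {
            // Only this adapter knows that Thrift is underneath.
            return client.tableExists(tableName);
        }
    }

    /** A second implementation, showing that callers need not change. */
    static class InMemoryTableRpc implements TableRpc {
        private final Set<String> tables;

        InMemoryTableRpc(Set<String> tables) {
            this.tables = tables;
        }

        @Override
        public boolean tableExists(String tableName) {
            return tables.contains(tableName);
        }
    }

    /** Caller code depends only on the neutral interface. */
    static String describe(TableRpc rpc, String table) {
        return table + (rpc.tableExists(table) ? " exists" : " is missing");
    }

    public static void main(String[] args) {
        Set<String> tables = new HashSet<>(Arrays.asList("trades"));
        TableRpc viaThrift = new ThriftTableRpc(new FakeGeneratedThriftClient(tables));
        TableRpc inMemory = new InMemoryTableRpc(tables);
        System.out.println(describe(viaThrift, "trades"));
        System.out.println(describe(inMemory, "quotes"));
    }
}
```

The point is only that describe() never mentions Thrift, so swapping the
transport means writing another TableRpc implementation rather than touching
callers.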

We might be able to change the way we leverage Thrift, to inoculate
ourselves from some of its nuisances as a stepping stone in this

For example, we could make sure we catch and properly return any
server-side exceptions at the Accumulo layer, so we're not relying on
Thrift to do any exception-handling/serialization for us. (That was a
pain point with 0.9.1.) Similarly, we can simplify the parameters in
our RPC methods, so they take a data structure (which can be treated
as a byte source) rather than a parameter list (also a pain point in
0.9.1, because an implicit serialVersionUID changed).
That would allow us to explicitly control the API evolution, and would
ease swapping out serialization while still using Thrift to manage the
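
As a minimal sketch of both ideas, assuming a single request/response
envelope per call: the server-side wrapper catches exceptions itself and
returns them as plain data, and each RPC takes one struct whose payload is
just bytes. The names Request, Response, Handler and invoke are made up for
illustration and are not existing Accumulo or Thrift classes.

```java
import java.nio.charset.StandardCharsets;

public class RequestEnvelopeSketch {

    /** One struct per call instead of a parameter list; payload is opaque bytes. */
    static final class Request {
        final int apiVersion;   // explicit, so API evolution is under our control
        final byte[] payload;   // serialized by whatever library we choose

        Request(int apiVersion, byte[] payload) {
            this.apiVersion = apiVersion;
            this.payload = payload;
        }
    }

    /** Errors travel as ordinary fields, not as RPC-layer exceptions. */
    static final class Response {
        final boolean ok;
        final String errorType;
        final String errorMessage;
        final byte[] payload;

        private Response(boolean ok, String errorType, String errorMessage, byte[] payload) {
            this.ok = ok;
            this.errorType = errorType;
            this.errorMessage = errorMessage;
            this.payload = payload;
        }

        static Response success(byte[] payload) {
            return new Response(true, null, null, payload);
        }

        static Response failure(Exception e) {
            return new Response(false, e.getClass().getName(), e.getMessage(), new byte[0]);
        }
    }

    /** The application-level handler; it may throw anything. */
    interface Handler {
        byte[] handle(Request request) throws Exception;
    }

    /** Catch at our layer so the RPC library never sees a server-side exception. */
    static Response invoke(Handler handler, Request request) {
        try {
            return Response.success(handler.handle(request));
        } catch (Exception e) {
            return Response.failure(e);
        }
    }

    public static void main(String[] args) {
        Request ping = new Request(1, "ping".getBytes(StandardCharsets.UTF_8));

        Response echoed = invoke(r -> r.payload, ping);
        System.out.println(echoed.ok + " " + new String(echoed.payload, StandardCharsets.UTF_8));

        Response failed = invoke(r -> {
            throw new IllegalStateException("table offline");
        }, ping);
        System.out.println(failed.ok + " " + failed.errorType + ": " + failed.errorMessage);
    }
}
```

With this shape, the RPC method signature would stay the same (one Request
in, one Response out) no matter how the payload is encoded.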

> +1 To Thrift being around in some form for the proxy for the foreseeable
> future.
>> --
>> Christopher L Tubbs II
>> On Wed, Aug 19, 2015 at 3:41 PM, Max Thomas <> wrote:
>>> I'm not opposed to the idea...
>>> What about the long term future (e.g., Christopher's comment) for Thrift?
>>> Are there particularly attractive alternatives (Avro) that you have in
>>> mind for the project? Asking both as an interested user and general
>>> technologist.
