hadoop-common-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Multi-language serialization discussion
Date Sat, 25 Oct 2008 00:44:28 GMT
Chad Walters wrote:
> Re-open that discussion and I imagine you might get some interested parties.

I think I just did, no?

> Bumping up a level, rather than inventing a whole new set of Hadoop-specific RPC and
> serialization mechanisms

Whatever we use, we'd probably end up recycling much of Hadoop's 
client/server implementation, since it's been finely tuned for Hadoop's 
performance needs, and I've not yet seen a Thrift transport that looks 
appropriate.  We also need to add authentication and authorization 
layers to Hadoop's RPC, which don't exist in Thrift either, as far as I 
can tell.  So mostly what we'd use from Thrift directly is object 
serialization.

That said, if we use Thrift for object serialization then we'd probably 
eventually contribute our transport, authentication and authorization 
stuff to the Thrift project.  We'd probably want to build it first in 
Hadoop, since it's critical kernel stuff for Hadoop, but, once it's 
stable, contribute it to Thrift if it seemed useful to others.

As a serialization layer, Thrift lacks the self-describing stuff that I 
think is critical.  If JSON will be the primary text format, then it 
looks to me that it would be easier and more natural to base a binary 
self-describing format on JSON schema than on Thrift IDL, but perhaps I 
can be convinced otherwise.
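
To make "self-describing" concrete, here is a minimal sketch in Python. It is 
purely illustrative: the schema layout, the field names, the two supported 
types and the length-prefixed framing are all assumptions made up for this 
example, not anything defined by Hadoop, Thrift or JSON schema.  The idea is 
only that the writer stores a JSON description of the records ahead of the 
binary data, so a reader can decode the file with no generated code.

```python
import io
import json
import struct

# Hypothetical schema: a record name plus an ordered list of [field, type] pairs.
SCHEMA = {"name": "Point", "fields": [["x", "int"], ["y", "int"], ["label", "string"]]}


def write_container(records, schema):
    """Layout: 4-byte schema length, JSON schema, 4-byte record count, binary records."""
    out = io.BytesIO()
    header = json.dumps(schema).encode("utf-8")
    out.write(struct.pack(">I", len(header)))
    out.write(header)
    out.write(struct.pack(">I", len(records)))
    for rec in records:
        for name, ftype in schema["fields"]:
            if ftype == "int":
                # Fixed-width 64-bit signed integer, big-endian.
                out.write(struct.pack(">q", rec[name]))
            elif ftype == "string":
                # Length-prefixed UTF-8 bytes.
                data = rec[name].encode("utf-8")
                out.write(struct.pack(">I", len(data)))
                out.write(data)
            else:
                raise ValueError("unsupported type: %s" % ftype)
    return out.getvalue()


def read_container(blob):
    """Decode records using only the schema embedded in the blob itself."""
    buf = io.BytesIO(blob)
    (header_len,) = struct.unpack(">I", buf.read(4))
    schema = json.loads(buf.read(header_len).decode("utf-8"))
    (count,) = struct.unpack(">I", buf.read(4))
    records = []
    for _ in range(count):
        rec = {}
        for name, ftype in schema["fields"]:
            if ftype == "int":
                (rec[name],) = struct.unpack(">q", buf.read(8))
            elif ftype == "string":
                (str_len,) = struct.unpack(">I", buf.read(4))
                rec[name] = buf.read(str_len).decode("utf-8")
            else:
                raise ValueError("unsupported type: %s" % ftype)
        records.append(rec)
    return schema, records


if __name__ == "__main__":
    blob = write_container([{"x": 1, "y": -2, "label": "origin-ish"}], SCHEMA)
    schema, records = read_container(blob)
    print(schema["name"], records)
```

Note that read_container never refers to SCHEMA: everything it needs comes 
from the blob, which is the property an IDL-compiled format does not give 
you by itself.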

Doug
