hadoop-general mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: [PROPOSAL] new subproject: Avro
Date Wed, 08 Apr 2009 03:33:32 GMT
To be clear, since a few folks have missed this point: Avro is not 
complete.  At some point in the future, before people start using it as 
a format for persistent data, we'll need to stop altering its 
specification, or at least do so much more cautiously.  But before then, 
my immediate goal is to move development from private to open so that we 
have a chance to incorporate feedback before we lock down the specification.

For example, several folks have raised the issue of compatibility with 
Thrift.  We certainly want to avoid gratuitous incompatibilities.  There 
are also features clearly missing from Avro that we expect to add before 
we make a release, like default values, a more efficient RPC handshake, 
etc.  And some features that we might consider removing, if they're not 
broadly useful and inhibit interoperability, like single-float, which 
isn't in Thrift, Python, etc.  And I expect there will be more such 
issues raised in the coming weeks and months.

But before we can discuss and resolve such issues we need a forum in 
which to do so.  That's all I am after at this point: mailing lists, a 
bug database, a public source code repository, etc., so that we can 
start accepting patches, adding committers, etc.

Three days have now passed since I initially proposed this, the nominal 
time for an Apache vote.  Is there anyone who strongly opposes taking 
the development of Avro public as a Hadoop subproject?  Only PMC votes 
are binding, but I would vastly prefer that the broader community also 
supports this step in the process.

Thanks,

Doug

Doug Cutting wrote:
> I propose we add a new Hadoop subproject for Avro, a serialization 
> system.  My ambition is for Avro both to replace Hadoop's RPC and to be 
> used for most Hadoop data files, e.g., by Pig, Hive, etc.
> 
> Initial committers would be Sharad Agarwal and me, both existing Hadoop 
> committers.  We are the sole authors of this software to date.
> 
> The code is currently at:
> 
> http://people.apache.org/~cutting/avro.git/
> 
> To learn more:
> 
> git clone http://people.apache.org/~cutting/avro.git/ avro
> cat avro/README.txt
> 
> Comments?  Questions?
> 
> Doug
