hadoop-general mailing list archives

From Eric Sammer <esam...@cloudera.com>
Subject Re: [VOTE] Direction for Hadoop development
Date Tue, 14 Dec 2010 04:49:17 GMT
On Mon, Dec 13, 2010 at 10:08 PM, Owen O'Malley <omalley@apache.org> wrote:

> On Dec 7, 2010, at 2:37 PM, Roy T. Fielding wrote:
>>> The proposal is to change the extension mechanism incompatibly with
>>> unclear benefits,
>> Good, these are technical reasons.  The benefits can be cleared by docs.
>> By incompatible, I assume you mean forward-compatibility of old versions
>> of Hadoop reading newer files.  Can we fix that by having the new
>> implementation use the old file format by default until it is configured
>> to use one of the new interfaces for writing?
> There are two goals here. The first is to extend the serialization plugin
> interface. The current patch does things completely compatibly including a
> shim that will use the previous plugins to satisfy the new API. The benefits
> are also clear. Avro serialization is possible when it wasn't previously. It
> also provides a wide range of opportunities that weren't previously
> possible.
> The file format was changed as a demonstration that the serialization
> interface was useful and complete. The file change is also backwards
> compatible and will automatically read old versions of the file. Old
> versions of the code will complain with an error message if they are given a
> new version. This is exactly the pattern we have used in the past.
> So, no there are no technical issues with the patch as it stands.

One technical issue is that this precludes users from using Protocol Buffers
(or Thrift or Avro) in their jobs if the version they require conflicts with
the one Hadoop proper has on the classpath. We've already seen these kinds of
conflicts with other libraries in the wild, and I would like to minimize this
possibility in the future. Was there something in the patch that addressed
this? (I may have missed it; I only did a cursory scan.)
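
To illustrate how such a conflict can be diagnosed, here is a minimal sketch
that reports where a class was actually loaded from. It is not part of the
patch or of Hadoop: the class name ClasspathOrigin, the method locate, and the
"bootstrap or unknown" label are hypothetical, and it uses only the standard
java.security.CodeSource API.

```java
import java.security.CodeSource;

// Hypothetical helper (not part of Hadoop): reports which jar or directory
// supplied a class, to spot a library version shadowed on the classpath.
final class ClasspathOrigin {

    // Returns the location the class was loaded from, or a fixed label when
    // the JVM does not expose one (e.g. classes from the bootstrap loader).
    static String locate(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "bootstrap or unknown";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name; defaults to a JDK class.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " <- " + locate(Class.forName(name)));
    }
}
```

Running this inside a task with a class name such as
com.google.protobuf.Message would show whether the job's own jar or the copy
shipped with Hadoop was picked up first.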

Jumping back to the "non-technical" issue, I really think it would help to
develop a course of action for resolution similar to what I suggested
earlier. It doesn't need to be specifically what I suggested, but I do think
that consensus building and conflict resolution are in the best interest of
the community. I feel like we could debate what people said, did, meant, or
the specifics of this issue for a long time.

Thanks and regards.
Eric Sammer
twitter: esammer
data: www.cloudera.com
