avro-dev mailing list archives

From Christophe Taton <christophe.ta...@gmail.com>
Subject Re: Effort towards Avro 2.0?
Date Wed, 04 Dec 2013 18:37:42 GMT
Hi Douglas,

When you write middleware that lets users define custom types, extensions
are pretty much required.

Middleware doesn't need to know, and shouldn't need to know, these
user-defined custom types ahead of time: you don't want to rebuild and
restart your middleware every time a user defines a new type they want
handled by the

An explicit bytes field always works, but is both inefficient and unwieldy:

   - inefficient because you'll end up serializing your data twice, once
   from the actual type into the bytes field, then a second time as a bytes
   - unwieldy because as a user, I'll have to encode and decode the bytes
   field manually every time I want to access this field from the original
   record, unless I keep track of the decoded extension externally to the Avro
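
To make the two points concrete, here is a minimal sketch in Python of the
explicit-bytes approach. It hand-rolls the Avro binary encoding for long,
string, bytes and fixed rather than using an Avro library. The UserEvent
record, the helper names, and the choice of an MD5 over the schema JSON as
the 16-byte fingerprint are assumptions made for illustration only; the
thread does not specify any of them.

```python
import hashlib
import json


def encode_long(n):
    """Avro long: zigzag-encode, then base-128 varint, low group first."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf, pos):
    """Return (value, next position) for a long starting at buf[pos]."""
    shift = 0
    z = 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def encode_bytes(data):
    """Avro bytes (and string, once UTF-8 encoded): length as long, then data."""
    return encode_long(len(data)) + data


# Hypothetical user-defined type that the middleware knows nothing about.
USER_SCHEMA = {
    "type": "record",
    "name": "UserEvent",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "count", "type": "long"},
    ],
}

# Assumption: a 16-byte MD5 of the schema JSON serves as the fingerprint.
FINGERPRINT = hashlib.md5(
    json.dumps(USER_SCHEMA, sort_keys=True).encode("utf-8")
).digest()


def encode_user_event(name, count):
    """Pass 1: serialize the user type itself (fields in schema order)."""
    return encode_bytes(name.encode("utf-8")) + encode_long(count)


def decode_user_event(buf):
    n, pos = decode_long(buf, 0)
    name = buf[pos:pos + n].decode("utf-8")
    count, _ = decode_long(buf, pos + n)
    return name, count


def encode_extension(fingerprint, payload):
    """Pass 2: wrap the already-serialized payload as fixed(16) + bytes."""
    assert len(fingerprint) == 16
    return fingerprint + encode_bytes(payload)


def decode_extension(buf):
    n, pos = decode_long(buf, 16)
    return buf[:16], buf[pos:pos + n]


payload = encode_user_event("click", 3)        # first serialization
wire = encode_extension(FINGERPRINT, payload)  # second serialization

# The reader has to undo both layers by hand to reach the user's fields.
fingerprint, raw = decode_extension(wire)
name, count = decode_user_event(raw)
```

The payload is copied once when the user type is encoded and again when it
is written as a bytes field, and the reader only gets at name and count
after a second, manual decode step.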


On Wed, Dec 4, 2013 at 8:07 AM, Douglas Creager <douglas@creagertino.net> wrote:

> On Tue, Dec 3, 2013, at 07:49 AM, Doug Cutting wrote:
> > On Mon, Dec 2, 2013 at 1:42 PM, Christophe Taton
> > <christophe.taton@gmail.com> wrote:
> > > - New extension data type, similar to ProtocolBuffer extensions
> > > (incompatible change).
> >
> > Extensions might be implemented as something like:
> >
> >   {"type":"record", "name":"extension", "fields":[
> >     {"name":"fingerprint", "type": {"type":"fixed", "size":16}},
> >     {"name":"payload", "type":"bytes"}
> >     ]
> >   }
> I'd also want to know more about the kind of use cases that you'd need
> protobuf-style extensions for.  I like Doug's solution if each record
> can have a different set of extensions.  If all of the records will have
> the same set of extensions, my hunch is that you'd only need to use
> extra fields and schema resolution.  Either way, I can't think of a use
> case where a new data type in the spec is a noticeable improvement.
> –doug
