avro-dev mailing list archives

From Doug Cutting <cutt...@gmail.com>
Subject Re: Avro union compatibility mode enhancement proposal
Date Fri, 10 Jun 2016 21:00:36 GMT
Matthieu,

Can you please provide an example of how this would work?

Thanks,

Doug

On Thu, Jun 9, 2016 at 6:47 PM, Matthieu Monsch <monsch@alum.mit.edu> wrote:

> Thinking about this a bit more (and a couple months later…), maybe there
> is a simpler alternative.
>
> Currently, a reason why writer evolution is hard (the union issue
> described below is a special case of this) is that aliases are only used on
> the reader side. Why not also allow readers to use the writer’s aliases?
>
> Resolution would first be done on names, then fall back to reader aliases,
> and finally fall back to writer aliases. In the example below, it would be
> enough to add an alias to the base record inside any new records to have
> evolution work.
>
> -Matthieu
>
>
>
> > On Apr 22, 2016, at 8:42 AM, Matthieu Monsch <monsch@alum.mit.edu> wrote:
> >
> > The second solution sounds like a great alternative.
> >
> > Branch aliases are more straightforward than an implicit order-sensitive
> > policy. They also have the additional benefit of giving users a bit more
> > flexibility: since defaults are specified on the branches’ types, it is
> > possible to have different branches have different defaults inside the same
> > union. There are probably a few edge cases (e.g. allowing multiple such
> > aliases would be useful) but they should be simple to address.
> >
> > What would be a good attribute name for this? `baseTypes`?
> >
> > -Matthieu
> >
> >
> >
> >> On Apr 21, 2016, at 10:52 AM, Doug Cutting <cutting@gmail.com> wrote:
> >>
> >> On Wed, Apr 20, 2016 at 9:09 PM, Ryan Blue <rblue@netflix.com.invalid> wrote:
> >>> Making the default a property of an
> >>> inner schema makes me think that we will have to deal with multiple schemas
> >>> with such a label at some point.
> >>
> >> On Thu, Apr 21, 2016 at 6:54 AM, Matthieu Monsch <monsch@alum.mit.edu> wrote:
> >>> Delegating default selection to the branches themselves is a great idea but it
> >>> will be tricky to handle reference branches smoothly. More minor but it also
> >>> doesn’t feel intuitive to not have the union “own” its default attribute.
> >>
> >> If I understand your concerns correctly, I attempted to address this above:
> >>
> >> "Note however that, when using a record as the default branch, one
> >> could not then use that same record as a non-default branch in
> >> another union.  To ameliorate that, we might permit multiple default
> >> branches in a union to be specified as default with the convention
> >> that the first such is used."
> >>
> >> Does that make sense?
> >>
> >> This isn't ideal syntax, but it's not terrible, and it doesn't change
> >> schema syntax incompatibly, which seems important, especially when it's
> >> unlikely that all implementations would implement such a syntax change
> >> in a synchronized manner.
> >>
> >> Alternately, one might annotate each derived record with the name of
> >> its base record, then one wouldn't need to alter union definitions.
> >> This would work like an alias.  If a record doesn't exist in the
> >> reader's schema, then an alias to the missing record would be added in
> >> the reader's schema to the base record it names in the writer's
> >> schema.  Aliases work by rewriting the writer's schema at read-time,
> >> updating names, including those in unions.  Might that work?  It seems
> >> like perhaps a more elegant approach.  It has compatible syntax and
> >> only alters behavior of a case that fails today.
> >>
> >> Doug
> >
>
>
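
The resolution order proposed in the Jun 9 message (names first, then reader aliases, then writer aliases) might be sketched roughly as follows. This is only an illustration under stated assumptions, not how any Avro implementation works: `resolve_branch` and the plain-dict schema shapes are hypothetical, and step 3 (writer aliases) is the proposed behavior from this thread rather than existing behavior.

```python
def resolve_branch(writer_type, reader_types):
    """Return the name of the reader type that a writer record resolves to.

    writer_type  -- dict with "name" and optional "aliases" (writer's schema)
    reader_types -- list of such dicts from the reader's schema, e.g. the
                    record branches of a union
    Returns None when nothing matches. Hypothetical helper, for illustration.
    """
    writer_name = writer_type["name"]

    # 1. Match on names.
    for reader_type in reader_types:
        if reader_type["name"] == writer_name:
            return reader_type["name"]

    # 2. Fall back to reader aliases.
    for reader_type in reader_types:
        if writer_name in reader_type.get("aliases", []):
            return reader_type["name"]

    # 3. Fall back to writer aliases (the proposal in this thread).
    for alias in writer_type.get("aliases", []):
        for reader_type in reader_types:
            if reader_type["name"] == alias:
                return reader_type["name"]

    return None


# An old reader that only knows the base record.
old_reader_branches = [{"name": "Base"}]

# A newer writer adds a record and aliases it to the base record.
new_writer_record = {"name": "Derived", "aliases": ["Base"]}

resolved = resolve_branch(new_writer_record, old_reader_branches)
print(resolved)
```

Under these assumptions the old reader resolves the new `Derived` record to its own `Base` branch without any change to the reader's schema, which is the writer-evolution case described above. Without step 3 the lookup would return `None`.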
