avro-dev mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: Union with a single branch
Date Tue, 07 Dec 2010 18:14:53 GMT

On Dec 7, 2010, at 2:58 AM, Thiruvalluvan M. G. wrote:

> The Java implementation allows unions with just one branch. But C++
> implementation doesn't. The spec is silent in this respect.
> Is there a need for single-branch unions?
> There could be an argument that single-branch unions can be used for future
> extensions. But I don't think it is needed because our resolution spec
> allows matching standalone entities with unions as long as the entity's type
> is one of the branches in the union.


> Another argument could be that data written using single-branch union can be
> read by multi-branch union without using schema resolution. But we do not
> want to encourage such usage. If the schemas for reader and writer are
> different (in whatever way) we want people to use schema resolution.

Not only that, but the single-branch union would have to coincide with the first branch of
the multi-branch union.  That's asking for trouble.  Reader/writer schema resolution is always
required unless the schemas are identical.  The resolver could note that the written union's
branches are a smaller subset of the reader's, in the same order, and thus compatible, but this
compatibility check needs to be in the resolver, not left to the user.
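
To make the failure mode concrete, here is a rough Python sketch of the union wire format (a zigzag varint branch index followed by the value).  It does not use any Avro library; every function name in it is made up for illustration, it handles only "null" and "string", and the prefix test at the end is just one reading of "smaller and in the same order".

```python
import io

def write_long(buf, n):
    # Avro int/long encoding: zigzag, then variable-length base-128 groups.
    z = (n << 1) ^ (n >> 63)
    while z > 0x7F:
        buf.write(bytes([(z & 0x7F) | 0x80]))
        z >>= 7
    buf.write(bytes([z]))

def read_long(buf):
    shift, z = 0, 0
    while True:
        b = buf.read(1)[0]
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1)

def matches(typ, value):
    # Hypothetical helper: only the two types used in this sketch.
    return (typ == "null" and value is None) or (
        typ == "string" and isinstance(value, str))

def write_value(buf, typ, value):
    if typ == "null":
        return  # null is written as zero bytes
    if typ == "string":
        raw = value.encode("utf-8")
        write_long(buf, len(raw))
        buf.write(raw)
        return
    raise ValueError("unsupported type in this sketch: " + typ)

def read_value(buf, typ):
    if typ == "null":
        return None
    if typ == "string":
        return buf.read(read_long(buf)).decode("utf-8")
    raise ValueError("unsupported type in this sketch: " + typ)

def write_union(buf, branches, value):
    # A union is the zero-based branch index, then the value itself.
    index = next(i for i, t in enumerate(branches) if matches(t, value))
    write_long(buf, index)
    write_value(buf, branches[index], value)

def read_union(buf, branches):
    # Reads with the given branches directly, i.e. WITHOUT schema resolution.
    return read_value(buf, branches[read_long(buf)])

def same_branch_indexes(writer_branches, reader_branches):
    # Assumption: indexes only line up when the writer's branches are a
    # prefix of the reader's.
    return reader_branches[:len(writer_branches)] == writer_branches

out = io.BytesIO()
write_union(out, ["string"], "a")  # single-branch writer union
encoded = out.getvalue()

lucky = read_union(io.BytesIO(encoded), ["string", "null"])
wrong = read_union(io.BytesIO(encoded), ["null", "string"])
print(encoded, lucky, wrong)
```

With the reader union ["string", "null"] the value comes back intact, but with ["null", "string"] index 0 is taken as null and the string bytes are silently left unread.  That is the sort of thing a resolver can check for and a user will eventually get wrong.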

> The only valid argument I could think of is that someone may already be
> using single-branch unions.

I fear that even if only the Java implementation supported single-branch unions, there would
likely still be persisted single-branch unions out in the wild.

> Tightening the spec will break their code.
> Tightening the spec will also mean that all language implementations should fix
> the problem, if they haven't already. In any case we need to make the
> implementations consistent and make the specification explicit in this
> regard.

Making all implementations capable of reading already persisted single-branch unions but incapable
of writing them doesn't seem like a good way forward.  We probably have to just support
single-branch unions and put them in the spec.  I don't think that is a burden from a code maintenance
point of view -- a single branch isn't much of a special case, it's more of a degenerate case.
We should discourage their use though.  Any idea what the other implementations do?  I suppose
we need to add a single-branch union to the interop tests and find out.

> Any thoughts?
> Thanks
> Thiru
