avro-user mailing list archives

From Doug Cutting <cutt...@gmail.com>
Subject Re: schema resolution rules issue
Date Wed, 15 Feb 2017 17:30:05 GMT
On Mon, Feb 13, 2017 at 10:21 PM, Torche Guillaume <gtorche@gmail.com> wrote:
> if both are unions:
> The first schema in the reader's union that matches the selected writer's
> union schema is recursively resolved against it. if none match, an error is
> signaled.
> [...]
> The only difference between these two schemas are on the metroCode field
> where the writer type is the following union: [\"null\",\"int\"] and the
> reader type is the following union: [\"null\",\"string\"].
> As far as I understand the union rule, these two schemas match because null
> match with null.

That is not correct.  Resolution is described here not as a static
analysis, but as a dynamic process while reading data.  The "selected"
schema refers to the branch of the writer's schema that was actually
written.  So, if you'd written selecting a "null" branch of a union,
then this is resolved against any null branch in the reader's union,
successfully in your example.  If however you wrote selecting the
"int" branch, then resolution would fail, as there is no matching
"int" branch in the reader's union.

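A minimal sketch of this dynamic process, under a deliberately simplified
model: a union is a list of primitive type names, and two branches match only
when the names are equal. The function name resolve_branch is hypothetical and
is not part of any Avro library; Avro's type promotions (for example int to
long) are ignored here.

```python
# Simplified model: a union is a list of primitive type names, and a writer
# branch matches a reader branch only if the names are equal. Avro's
# promotion rules are intentionally left out of this sketch.

def resolve_branch(written_branch, reader_union):
    """Return the first reader branch that matches the branch actually written."""
    for branch in reader_union:
        if branch == written_branch:
            return branch
    raise ValueError(f"no reader branch matches writer branch {written_branch!r}")


writer_union = ["null", "int"]
reader_union = ["null", "string"]

# Resolution happens per datum, against whichever branch the writer selected.
for written in writer_union:
    try:
        print(written, "->", resolve_branch(written, reader_union))
    except ValueError as err:
        print(written, "-> error:", err)
```

With the unions from the example, a datum written as "null" resolves, while a
datum written as "int" raises an error, since the reader's union has no "int"
branch.
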
A static analysis of whether two schemas are compatible could detect
three cases:
  1. All data written by one can be read by the other.
  2. No data written by one can be read by the other.
  3. Some but not all data written by one can be read by the other.

Your example is an instance of (3).  Rejecting these when checking
static compatibility would be the safest strategy, grouping cases (2)
and (3) together as incompatible.
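
The three cases could be sketched as follows, again assuming unions of
primitive type names matched by name only. The names classify and
is_compatible are hypothetical, chosen for illustration; this is not how any
Avro library implements its compatibility check.

```python
# Static classification of a writer union against a reader union, under the
# simplifying assumption that branches match only by identical type name.

def classify(writer_union, reader_union):
    """Return "all", "none", or "some" for cases (1), (2), and (3)."""
    readable = [b for b in writer_union if b in reader_union]
    if len(readable) == len(writer_union):
        return "all"    # case 1: every writer branch has a matching reader branch
    if not readable:
        return "none"   # case 2: no writer branch has a matching reader branch
    return "some"       # case 3: only some writer branches can be read


def is_compatible(writer_union, reader_union):
    """Conservative check: treat cases (2) and (3) alike as incompatible."""
    return classify(writer_union, reader_union) == "all"


print(classify(["null", "int"], ["null", "string"]))
print(is_compatible(["null", "int"], ["null", "string"]))
```

Under these assumptions the example unions fall into case (3), so the
conservative check rejects them.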

