avro-user mailing list archives

From Satish Duggana <satish.dugg...@gmail.com>
Subject Parsing canonical forms with schemas having default values.
Date Tue, 06 Jun 2017 18:41:04 GMT
> Parsing Canonical Form is a transformation of a writer's schema that
lets us define what it means for two schemas to be "the same" for the
purpose of reading data written against the schema. It is called Parsing
Canonical Form because the transformations strip away parts of the schema,
like "doc" attributes, that are irrelevant to readers trying to parse
incoming data. It is called Canonical Form because the transformations
normalize the JSON text (such as the order of attributes) in a way that
eliminates unimportant differences between schemas. If the Parsing
Canonical Forms of two different schemas are textually equal, then those
schemas are "the same" as far as any reader is concerned, i.e., there is no
serialized data that would allow a reader to distinguish data generated by
a writer using one of the original schemas from data generated by a writer
using the other original schema. (We sketch a proof of this property in a
companion document.)

Currently, the transformation keeps only the type, name, fields, symbols,
items, values, and size attributes and strips all others, including the
default attribute. Shouldn't the default attribute also be kept? A schema
that declares a default value for a field and one that does not are not
equivalent with respect to schema evolution, yet their canonical forms are
identical.
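
To make the question concrete, here is a minimal sketch in plain Python (standard library only) of the attribute-stripping step. It is not the Avro implementation: the names KEPT, strip_schema and canonical_text are made up for illustration, and the sketch leaves out the other normalization rules (fullname resolution, collapsing of primitive types, string and integer normalization). It only shows that two schemas differing in "default" end up with the same text.

```python
import json

# Attributes retained by Parsing Canonical Form, in the order used for output.
# (Hypothetical constant name; the list itself is the one quoted above.)
KEPT = ("name", "type", "fields", "symbols", "items", "values", "size")

# Attributes whose values may themselves contain schemas or fields.
NESTED = ("type", "fields", "items", "values")


def strip_schema(node):
    """Simplified sketch: recursively drop every attribute not in KEPT."""
    if isinstance(node, list):  # a union, or a record's list of fields
        return [strip_schema(item) for item in node]
    if isinstance(node, dict):
        out = {}
        for key in KEPT:  # iterating KEPT also fixes the attribute order
            if key in node:
                value = node[key]
                out[key] = strip_schema(value) if key in NESTED else value
        return out
    return node  # a type name such as "int"


def canonical_text(schema):
    """Serialize the stripped schema as JSON without whitespace."""
    return json.dumps(strip_schema(schema), separators=(",", ":"))


with_default = {
    "type": "record",
    "name": "User",
    "doc": "example record",
    "fields": [{"name": "age", "type": "int", "default": 0}],
}
without_default = {
    "type": "record",
    "name": "User",
    "fields": [{"name": "age", "type": "int"}],
}

# Both schemas reduce to the same text, although only the first one lets a
# reader fill in "age" when the writer's data does not contain that field.
print(canonical_text(with_default))
print(canonical_text(with_default) == canonical_text(without_default))
```

With this sketch, both schemas produce the same string, so a fingerprint computed over that text cannot tell them apart.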

