avro-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: clarification on schema resolution
Date Wed, 18 Nov 2009 21:51:35 GMT
Scott Banachowski wrote:
> What I am really wondering is what flagging an error means in this context.
> What if this is a field of a record--in this case, does error mean that the
> entire record is bad?  Or do I simply ignore this field and process the rest
> of the record?  And if I do the latter, how is this different than "unset"?

The entire record is bad.  We leave a field unset when the written
record has no field with that name.  When the written record does have
a field with that name, but of a different type, that is a type
mismatch, which the spec defines as an error.
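
As a rough illustration of the distinction above (a simplified sketch, not
code from any Avro implementation): the names `resolve_record`, `UNSET`, and
`SchemaMismatch` are hypothetical, schemas are reduced to plain
name-to-type-name mappings, and types are compared by simple equality, which
ignores any other resolution rules the spec may define.

```python
# Hypothetical sentinel meaning "the reader's field was left unset".
UNSET = object()


class SchemaMismatch(Exception):
    """Raised when a field exists in both schemas but with different types."""


def resolve_record(writer_fields, reader_fields, datum):
    """Resolve one written record against the reader's fields.

    writer_fields / reader_fields: dict mapping field name -> type name.
    datum: dict mapping field name -> value, as written.
    """
    result = {}
    for name, reader_type in reader_fields.items():
        if name not in writer_fields:
            # The writer has no field with this name: leave it unset.
            result[name] = UNSET
        elif writer_fields[name] != reader_type:
            # Same name, different type: the entire record is bad.
            raise SchemaMismatch(
                f"field {name!r}: writer wrote {writer_fields[name]!r}, "
                f"reader expects {reader_type!r}"
            )
        else:
            result[name] = datum[name]
    return result
```

Here a reader field that the writer never wrote comes back as `UNSET`, while
a field written as "float" and read as "int" aborts the whole record instead
of being skipped.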

I don't know that these conventions are always the best.  Consider the 
extreme: if the reader's top-level schema is "int" and the writer wrote 
using a "float" schema, should we:

  a. require that implementations signal an error;
  b. require that implementations ignore this, e.g., returning null or
     unset or somesuch; or,
  c. permit implementations to vary?

Currently the spec indicates (a).  This applies to field type
mismatches just as it does to top-level type mismatches.  That said,
there might be cases where an application wants to read all the fields
it can map into an existing data structure, ignoring whatever it
cannot read.

Thus we might change the spec to say that implementations are 
recommended (but not required) to signal errors in such cases.  An 
implementation that signals such errors can state in its documentation 
that it implements the recommended resolution algorithm.  An 
implementation that does not should state how it deviates.
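
One way such an optional policy could look is sketched below. This is only
an assumption about how an implementation might expose the choice: the
`read_fields` function, its `strict` parameter, and the `SchemaMismatch`
exception are hypothetical names, and types are again compared by plain
equality for brevity.

```python
class SchemaMismatch(Exception):
    """Raised in strict mode when a field's written type differs."""


def read_fields(writer_fields, reader_fields, datum, strict=True):
    """Copy every readable field of datum; return (values, skipped_names).

    strict=True follows the recommended behavior and signals an error on
    a type mismatch.  strict=False is the documented deviation: mismatched
    fields are ignored and reported in skipped_names.
    """
    values = {}
    skipped = []
    for name, reader_type in reader_fields.items():
        writer_type = writer_fields.get(name)
        if writer_type is None:
            # Not written at all: leave unset in either mode.
            continue
        if writer_type != reader_type:
            if strict:
                raise SchemaMismatch(f"type mismatch on field {name!r}")
            skipped.append(name)
            continue
        values[name] = datum[name]
    return values, skipped
```

An application that wants to read whatever it can would call this with
`strict=False` and inspect the returned list of skipped names; the default
keeps the recommended error-signalling behavior.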

