avro-user mailing list archives

From Scott Carey <sc...@richrelevance.com>
Subject Re: schema defaults not reflected in generated objects (1.3.2)
Date Mon, 07 Jun 2010 22:34:52 GMT
Initializing fields to their schema defaults would have a performance cost if each default
were a new instance -- new Utf8() and new YourClassHere() are not free.  So foo = new Utf8("foo");
in every constructor is not right, but assignment from a static default would be fine.  But
unless these objects are immutable, a client could change the shared default.
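
The aliasing hazard with a shared static default can be sketched as follows. This is a minimal illustration, not Avro code: MutableText is a hypothetical stand-in for a type that can be modified in place (as the concern about Utf8 assumes), and SharedDefaultMessage is a hypothetical generated-style class.

```java
// Hypothetical stand-in for a string type that can be modified in place.
final class MutableText {
    private String value;
    MutableText(String value) { this.value = value; }
    void set(String value) { this.value = value; }
    @Override public String toString() { return value; }
}

// Hypothetical generated-style record whose field starts at a shared static default.
class SharedDefaultMessage {
    static final MutableText DEFAULT_FOO = new MutableText("foo");
    MutableText foo = DEFAULT_FOO; // cheap: no allocation per instance
}

class SharedDefaultDemo {
    // Returns what a second, untouched message sees after the first one
    // modifies its field in place.
    static String leakedDefault() {
        SharedDefaultMessage first = new SharedDefaultMessage();
        SharedDefaultMessage second = new SharedDefaultMessage();
        first.foo.set("changed");       // the client mutates "its" field...
        return second.foo.toString();   // ...and every other instance sees the change
    }
}
```

Because both instances hold the same object, the change made through the first message is visible through the second. An immutable default would not have this problem.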

Ideally, the specific and generic APIs can handle this better.  A getter can return a default
value if its field is null, or generated classes can be more sophisticated, removing default
constructors and providing constructors or static factory methods that require a user to provide
all 'default-less' fields up front.  In the long run I'd like to have this sort of more powerful
and flexible generated object, with various user-configured options, but at this time it is
not there.
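
As a rough sketch of the two ideas in the paragraph above (a getter that falls back to a default, and a static factory that demands the 'default-less' fields), something like the following could work. MessageSketch and its field names are hypothetical and hand-written; nothing here is produced by the specific compiler.

```java
// Hypothetical hand-written class; not output of the Avro specific compiler.
class MessageSketch {
    // An immutable String is safe to share between instances.
    private static final String DEFAULT_FOO = "foo";

    private final String id; // hypothetical field with no schema default
    private String foo;      // hypothetical field with a default; null means "unset"

    // No public default constructor: callers must go through the factory.
    private MessageSketch(String id) { this.id = id; }

    // Static factory that requires every field lacking a default up front.
    static MessageSketch of(String id) {
        if (id == null) {
            throw new IllegalArgumentException("id has no default and must be provided");
        }
        return new MessageSketch(id);
    }

    String getId() { return id; }

    // Getter returns the default when the field was never set.
    String getFoo() { return foo != null ? foo : DEFAULT_FOO; }

    void setFoo(String foo) { this.foo = foo; }
}
```

The default costs nothing per instance here, since nothing is allocated until a caller sets the field explicitly.
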
Also, consider how Unions complicate things here.  Right now they are not much fun to deal with
unless the union is only NULL and one other type.  Client code has to know the exact classes/types
to inspect to resolve the union.
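
To illustrate the kind of inspection client code ends up doing, here is a minimal sketch. It assumes the union field reaches the client as a plain Object and that the union is a hypothetical ["null", "string", "long"]; UnionSketch and describe are illustrative names only.

```java
// Hypothetical helper; assumes a union of null, string and long exposed as Object.
class UnionSketch {
    // The caller has to know every branch type in advance and test for each one.
    static String describe(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof CharSequence) { // string branch
            return "string:" + value;
        }
        if (value instanceof Long) {         // long branch
            return "long:" + value;
        }
        throw new IllegalArgumentException(
                "not a branch of the union: " + value.getClass().getName());
    }
}
```

With only NULL and one other type, this collapses to a single null check, which is why that case is the tolerable one.
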

Until there are enhancements to the API, if you are concerned with users putting garbage in,
I suggest you write a wrapper class that handles this.  Have users use that class rather than
the classes generated by the specific compiler.
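
A wrapper along these lines might look like the sketch below. GeneratedMessage is a hypothetical stand-in for a class emitted by the specific compiler (public fields, no validation), and the field names and the default value "foo" are assumptions chosen for illustration.

```java
// Hypothetical stand-in for a generated class with public, unvalidated fields.
class GeneratedMessage {
    public CharSequence foo;
    public CharSequence body;
}

// Wrapper that users work with instead of touching the generated class directly.
class MessageWrapper {
    // Assumed to mirror the default declared in the schema.
    private static final String DEFAULT_FOO = "foo";

    private final GeneratedMessage delegate = new GeneratedMessage();

    // Required fields are demanded up front; defaulted fields are pre-filled.
    MessageWrapper(String body) {
        if (body == null || body.isEmpty()) {
            throw new IllegalArgumentException("body is required");
        }
        delegate.body = body;
        delegate.foo = DEFAULT_FOO;
    }

    // Optional override of the defaulted field, still rejecting garbage.
    MessageWrapper withFoo(String foo) {
        if (foo == null || foo.isEmpty()) {
            throw new IllegalArgumentException("foo must be non-empty when set");
        }
        delegate.foo = foo;
        return this;
    }

    // Hands the fully populated generated object to the writer.
    GeneratedMessage toGenerated() { return delegate; }
}
```

The writer then only ever sees objects in which every field is populated, whether the value came from the user or from the default.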

On Jun 7, 2010, at 3:11 PM, Bill de hOra wrote:

> Scott Carey wrote:
>> No, it should not initialize the field to the default.
>> Default values are for readers, not writers.   The intended use case is schema evolution.
> This means writers can't leverage schema defaults, so writers should do 
> something like this?
>  Message message = new Message();
>  // no defaults set
>  String quux = message
>      .getSchema()
>      .getField("foo")
>      .defaultValue()
>      .getTextValue();
>  message.foo=new Utf8(quux);
> [ignoring that the writer needs to know the schema type]. I suspect 
> people will just write in garbage (like empty strings).
>> A writer must always correctly provide
>> data for all of the fields in the schema it declared
>> it is writing.
> Why is it incorrect to not provide defaults when defaults are part of 
> the schema author's intention? Or put another way, why is reader/writer 
> asymmetry a goal under a given schema?
> I see in the code that SpecificDatumWriter/GenericDatumWriter is passed 
> the Schema - By all means crash on fields with no defaults, but I'm not 
> clear on what harm is done by using default field data. The current code 
> seems fragile in comparison.
> Bill
