avro-dev mailing list archives

From Ryan Blue <rb...@netflix.com.INVALID>
Subject Re: (Default) values for logical types in human-readable form
Date Fri, 20 Oct 2017 16:47:22 GMT
> So if I understand correctly, you support the idea of human-readable
> defaults but not logical-type-dependent interpretation. I don't see how we
> could achieve the first without the second, since different logical types
> have different human-readable representations. So it seems that the
> optional nature of logical types actually makes this feature impossible.

Sorry, I can see how that was confusing. I would use the logical type to
determine the transform, but wouldn't require the user to have configured a
conversion for the type, since configuring conversions is optional.
Basically, I'm saying that this feature would support parsing decimal, date,
time, and timestamp defaults from strings.

The conversion from string would happen when parsing a schema and would set
the "default" field from another string-based field, if "default" doesn't
already exist. That way, we always have the "default" that all readers use.
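
A rough sketch of that parse-time step, using the date logical type (stored as
days since the Unix epoch). This is only an illustration: "string-default" is
the attribute name proposed in this thread, not an existing Avro feature, and
fill_date_default is a hypothetical helper, not part of any Avro library.

```python
from datetime import date

EPOCH = date(1970, 1, 1)

def fill_date_default(field):
    """Return a copy of `field` with "default" derived from "string-default".

    An explicit "default" always wins; the string form is only consulted
    when "default" is absent.
    """
    out = dict(field)
    if "default" not in out and "string-default" in out:
        parsed = date.fromisoformat(out["string-default"])
        out["default"] = (parsed - EPOCH).days
    return out

field = {
    "name": "day",
    "type": {"type": "int", "logicalType": "date"},
    "string-default": "2017-10-20",
}
resolved = fill_date_default(field)

# A field that already carries "default" is left untouched.
explicit = fill_date_default({"name": "day", "default": 0,
                              "string-default": "2017-10-20"})
```

Because the result is an ordinary "default", a reader that knows nothing about
"string-default" still sees a usable value.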

For example, instead of defining
{"name":"d","type":"fixed","logicalType":"decimal",...,"default":"\u000C\u006C"},
you would use
{"name":"d","type":"fixed",...,"string-default":"31.80"},
which gets translated at parse time into the form with a "default" field. I
think that would work in all cases.
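
To make the decimal case concrete, here is a minimal Python sketch of the
translation, assuming a scale of 2 (the elided "..." in the schemas above would
carry precision and scale). decimal_default is a hypothetical name; the
encoding follows the Avro rules that a decimal is the two's-complement
big-endian unscaled integer and that bytes/fixed defaults are written as
strings with one code point (0-255) per byte.

```python
from decimal import Decimal

def decimal_default(text, scale, size=None):
    """Encode a decimal string as an Avro-style bytes/fixed default string.

    `size` is the fixed size in bytes; when None, a length that fits the
    value is chosen (as one would for the "bytes" type).
    """
    scaled = Decimal(text).scaleb(scale)
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{text!r} has more than {scale} fractional digits")
    unscaled = int(scaled)
    if size is None:
        # One extra bit for the sign, rounded up to whole bytes.
        size = (unscaled.bit_length() + 8) // 8
    raw = unscaled.to_bytes(size, "big", signed=True)
    # latin-1 maps each byte 0-255 to the code point of the same value.
    return raw.decode("latin-1")

default = decimal_default("31.80", scale=2)
assert default == "\u000C\u006C"  # unscaled 3180 == 0x0C6C
```

Rejecting strings with too many fractional digits, rather than rounding, is a
design choice in this sketch; the thread does not settle that question.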

On the subject of AVDL, I think we clearly have a case where people are
editing schemas directly so it makes sense to support this. We can also
extend IDL to use an annotation or something that bakes down to
"string-default" in the schema. I'm not very familiar with the IDL, though,
so I can't say exactly what we would need to do here.

rb

On Fri, Oct 20, 2017 at 6:08 AM, Bridger Howell <bhowell@sofi.org> wrote:

> On Fri, Oct 20, 2017 at 2:04 AM, Frédéric SOUCHU <
> Frederic.SOUCHU@ingenico.com> wrote:
>
> > In line with Philip Zeyliger on IDL being a good tool for a human to
> > produce schema.
> > Key features (IMHO):
> > - support for includes (killer feature)
> > - simpler syntax (a *lot* less '{' and '['...)
> > - simpler comments syntax
> >
>
> I'm afraid you're missing my point. I'm not arguing that IDL isn't a "good"
> tool for producing schemas. I'm arguing that I don't think, as it is, it
> should be the preferred tool for writing schemas.
>
> Implicitly, I'm also extending that to mean we shouldn't currently prefer
> to give new features like this only to IDL, unless we want to make the
> process for using IDL for schemas cleaner and simpler. If we're willing to
> do that, then I have no issues.
>
> > I have a toolchain going from IDL to Java + C# classes that wouldn't work
> > using JSON schema (the many holes in the AVRO C# side not helping
> > either..).
> >
>
> Cool. I helped build something similar and I work with others who use it
> regularly.
>
>
> >  (btw, how did we end up with different json/IDL logical names?!?!)
> >
>
> I assume you're referring to keywords like "timestamp_ms" and "date"
> added to IDL to refer to the "timestamp-millis" and "date" logical types?
>
> These are special keywords, not a general mechanism that produces schemas
> with the given logical type so there's no particular reason that they have
> to match the logical type that they implement (although it does seem
> inconsistent). I looked around AVRO-1684
> (https://issues.apache.org/jira/browse/AVRO-1684), where this was
> implemented, for some justification, but I didn't find anything.
>
> > Suggest to have an 'encoding' attribute to indicate how the default value
> > is defined.
> > {
> >   "type": "bytes",
> >   "logicalType": "decimal",
> >   "precision": 4,
> >   "scale": 2,
> >   "default":"3.151351351",
> >   "default-encoding":"string" // default encoding being 'AVRO' default (e.g. binary)
> > }
> >
>
> This has the same problem as one of my earlier suggestions; if you change
> the meaning of the default field, then older readers will read the schema
> with an incorrect default value.
>
> - Bridger Howell
>



-- 
Ryan Blue
Software Engineer
Netflix
