avro-dev mailing list archives

From Bridger Howell <bhow...@sofi.org>
Subject Re: (Default) values for logical types in human-readable form
Date Fri, 20 Oct 2017 18:13:01 GMT
>
> Sorry, I can see how that was confusing. I would use the logical type to
> determine the transform, but wouldn't require the user to have configured a
> conversion for the type, which is optional. Basically, I'm saying that this
> feature would support decimal, date, time, and timestamp from string.


I think I confused things a bit by suggesting that we shouldn't base this
default on logical type.

My initial motivation for this was primarily based on the idea of
representing bytes in base64. I wouldn't really think of "base64" as a
logical type, since it isn't really a "higher-level" encoding of bytes.
It's more convenient for humans, but I don't think I'd really want to be
manipulating base64 strings in my code very often, and definitely not just
because I wanted to write the default value in base64.

My other reason is that there might be more than one distinct way of
representing a default as a string for a given type. Going back to the
bytes example, I think both hex and base64 are reasonable human-readable
encodings for bytes.
I don't have any good examples of this for logical types, though.
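
To make the bytes case concrete, here is a small Python sketch (my own illustration, not Avro code) that writes one byte sequence three ways: the way the Avro spec's JSON encoding currently spells a bytes default (each byte as the Unicode code point with the same value), hex, and base64. The sample bytes and the function name are made up for the example.

```python
import base64

def readable_forms(raw: bytes) -> dict:
    """Return three string spellings of the same byte sequence."""
    return {
        # Current Avro JSON default for bytes: code points 0-255 map to byte values.
        "avro_json": raw.decode("latin-1"),
        "hex": raw.hex(),
        "base64": base64.b64encode(raw).decode("ascii"),
    }

sample = b"\x00\xff\x10Avro"  # arbitrary sample value
forms = readable_forms(sample)
for name, text in forms.items():
    print(f"{name:10} {text!r}")

# Every spelling decodes back to the same bytes, so a parser would need to be
# told which encoding a given string default uses.
assert forms["avro_json"].encode("latin-1") == sample
assert bytes.fromhex(forms["hex"]) == sample
assert base64.b64decode(forms["base64"]) == sample
```

Since a hex string is not always distinguishable from a base64 string by inspection, the encoding probably has to be stated explicitly rather than guessed.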

> We can also extend IDL to use an annotation or something that bakes
> down to "string-default" in the schema. I'm not very familiar with the
> IDL, though, so I can't say exactly what we would need to do here.


We could use the existing, lightly documented IDL feature that maps
annotations to properties on fields.

So a field declared like:
timestamp_ms @strdefault("2000-01-01T00:00:00.000Z") createdDt;

is converted (I just tested this) to a schema field like:
{"name":"createdDt","type":{"type":"long","logicalType":"timestamp-millis"},"strdefault":"2000-01-01T00:00:00.000Z"}
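
For what it's worth, turning that string into the underlying long is not much work. Here is a minimal Python sketch, assuming the hypothetical "strdefault" property from the example above and a single fixed format (ISO 8601, UTC, millisecond precision). The function names are made up. A real implementation would live in the Avro schema parser and would have to settle which formats are accepted.

```python
import json
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def timestamp_millis_from_string(text: str) -> int:
    """Parse 'YYYY-MM-DDTHH:MM:SS.mmmZ' into milliseconds since the Unix epoch."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    delta = parsed - EPOCH
    # Integer arithmetic avoids floating-point rounding on large timestamps.
    return delta.days * 86_400_000 + delta.seconds * 1000 + delta.microseconds // 1000

def resolve_default(field: dict) -> dict:
    """Return a copy of the field with "default" derived from "strdefault" (hypothetical)."""
    resolved = dict(field)
    field_type = field["type"]
    logical = field_type.get("logicalType") if isinstance(field_type, dict) else None
    if "strdefault" in field and logical == "timestamp-millis":
        resolved["default"] = timestamp_millis_from_string(field["strdefault"])
    return resolved

field = json.loads(
    '{"name":"createdDt","type":{"type":"long","logicalType":"timestamp-millis"},'
    '"strdefault":"2000-01-01T00:00:00.000Z"}'
)
print(resolve_default(field)["default"])  # 946684800000
```

The transform is picked from the logical type alone, so no user-configured conversion is needed.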

There would still be some work needed on the IDL side if we went with a
field named "string-default" or "default-as-string", since IDL treats
"string" as a keyword regardless of the context.

Alternatively, having some special syntax for human-readable defaults might
be okay, but I don't have any great suggestions.

On Fri, Oct 20, 2017 at 11:08 AM, Doug Cutting <cutting@gmail.com> wrote:

> On Fri, Oct 20, 2017 at 12:13 AM, Bridger Howell <bhowell@sofi.org> wrote:
>
> > But then would I end up with a few bits of ugliness:
> > - the schemas have to be wrapped in a made-up protocol
> > - the generated code includes a generated protocol class I wouldn't care
> > about
> >
>
> Wouldn't these be eliminated if we simply permitted 'record', 'enum',
> 'union' and 'fixed' as top-level items in the IDL syntax?  That would not
> be a difficult change.
>
>
Correct, that would be a relatively easy change and that should also
address my third concern (since all the weird protocol stuff is hidden
behind protocol blocks).

I assume then we'd want to change the output of the IDL parser to be a
mixture of schemas and protocols instead of just one protocol (since that
would be equivalent to unwrapping a protocol)?
I'm not sure whether, with the new SchemaResolver logic, we can deal with
IDL schemas living in different files.

- Bridger Howell
