avro-dev mailing list archives

From "Dmitry Kovalev (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AVRO-739) Add Date/Time data types
Date Tue, 12 Aug 2014 14:53:14 GMT

    [ https://issues.apache.org/jira/browse/AVRO-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14094121#comment-14094121 ]

Dmitry Kovalev commented on AVRO-739:

bq. But what I'm trying to get at is whether your IPC use case for the string representations
could be solved another way.

In short - of course it could, just in a more laborious way.

bq. Maybe I'm wrong about this, but it seems like using strings would probably be most helpful
in debugging the application. And if that's the case, we can provide a few simple tools for
working with these types rather than changing the representation to avoid the conversion.
What about adding a set of helpers...

Having these in the Avro distribution would certainly encourage more people to go with binary
representations if they are going to be the only standard, although this level of support
is of course not the same: when you use toString() to dump the object as JSON, or inspect
an object in a debugger, you will still see just a byte sequence. Other binary-encoded types
are mostly first-class primitives which get translated to strings by standard tools, so this
is not an issue for them.

However, I used debugging as just one illustration of why I thought it would be worth having
standardised string representations where compactness and performance are not absolutely critical.

Another reason, which we also touched on above, is that there is a real lack of common _binary_
representations (and platform support) for anything beyond simple timestamps and dates, and
this is what has led people to misuse e.g. Date, confusing UTC/local/zoned time, fixed durations
vs durations in months/days, etc.
Even in this spec we don't have a separate type/binary representation for "local" date+time,
only separate types for each component, so undoubtedly some people will decide to use timestamp-millis,
despite the spec saying explicitly that it represents a UTC date-time. And the representation
specified for Duration may be the most efficient, but it is not something that can be called commonly
used or easy to interpret. If you remember the issue of higher-precision time, which we have omitted
from the spec: is it going to have a separate binary representation as well?
ISO-8601 provides a basis to represent all of these "naturally", in a way instantly understandable
by a human. It makes it easy to standardise different types of date-time information and promote
their correct usage, and it also provides a "bridge" to binary representations.
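
To make the distinction concrete, here is a minimal Python sketch (an illustration only, not part of the proposed spec or of any Avro API). It takes one wall-clock reading and shows how ISO-8601 text keeps "local", UTC and zoned values visibly different, while the epoch-millis form is just a number. The 12-byte duration layout (months, days, milliseconds as little-endian unsigned 32-bit integers) is an assumption made for the example, as this message does not spell the layout out. The names `epoch_millis`, `duration_bytes` and `duration_text` are hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

# The same wall-clock reading, 2014-08-12 14:53:14, with three different meanings.
local = datetime(2014, 8, 12, 14, 53, 14)                   # "local": no zone or offset
utc = local.replace(tzinfo=timezone.utc)                    # an instant in UTC
zoned = local.replace(tzinfo=timezone(timedelta(hours=2)))  # an instant at offset +02:00

# ISO-8601 text keeps the three cases visibly distinct.
local_text = local.isoformat()
utc_text = utc.isoformat()
zoned_text = zoned.isoformat()

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis(instant):
    """Milliseconds since 1970-01-01T00:00:00Z for a zone-aware datetime (hypothetical helper)."""
    return (instant - EPOCH) // timedelta(milliseconds=1)


# As plain integers, nothing says whether a value was meant as UTC, local or zoned.
utc_millis = epoch_millis(utc)
zoned_millis = epoch_millis(zoned)

# ASSUMED duration layout for illustration: months, days, milliseconds,
# each a little-endian unsigned 32-bit integer (12 bytes in total).
duration_bytes = struct.pack("<III", 1, 2, 0)
duration_text = "P1M2D"  # the same "1 month and 2 days" as an ISO-8601 duration

print(local_text, utc_text, zoned_text)
print(utc_millis, zoned_millis)
print(duration_bytes.hex(), duration_text)
```

The local value has no offset to subtract, so it cannot be turned into epoch millis without picking a zone first; that is exactly the step that tends to get skipped when a "local" date+time is stored as timestamp-millis.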

Having said that, I absolutely don't insist on including these in the spec; I have just attempted
to explain the reasons I am using them currently and initially suggested them.

> Add Date/Time data types
> ------------------------
>                 Key: AVRO-739
>                 URL: https://issues.apache.org/jira/browse/AVRO-739
>             Project: Avro
>          Issue Type: New Feature
>          Components: spec
>            Reporter: Jeff Hammerbacher
>             Fix For: 1.7.8
>         Attachments: AVRO-739-datetime-spec.xml.patch, AVRO-739.patch

This message was sent by Atlassian JIRA
