arrow-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bryan Cutler <cutl...@gmail.com>
Subject Re: Date/Time fields values in Java
Date Fri, 10 Mar 2017 21:47:35 GMT
Thanks for the info Julien.  I'll open a JIRA for fixing the type layout
for TIME, and I'll give the documentation a shot.

Regards,
Bryan

On Thu, Mar 9, 2017 at 9:01 PM, Julien Le Dem <julien@dremio.com> wrote:

> Hi Bryan,
> In the JSON representation we should use the integer representation of the
> Timestamp. We should not depend on joda for this.
>
> DATE is on 8 bytes => 64bits:
> https://github.com/apache/arrow/blob/6b3ae2aecc8cd31425035a021fa04b
> 9ed3385a8d/format/Message.fbs#L79
> https://github.com/apache/arrow/blob/6b3ae2aecc8cd31425035a021fa04b
> 9ed3385a8d/java/vector/src/main/codegen/data/ValueVectorTypes.tdd#L73
>
> Time in on 4 bytes => 32 bits and has a unit:
> https://github.com/apache/arrow/blob/6b3ae2aecc8cd31425035a021fa04b
> 9ed3385a8d/format/Message.fbs#L85
> https://github.com/apache/arrow/blob/6b3ae2aecc8cd31425035a021fa04b
> 9ed3385a8d/java/vector/src/main/codegen/data/ValueVectorTypes.tdd#L60
> It should the time in {unit} since midnight stored in a 32 bit integer.
> It should not have a default unit IMO
>
> So as you pointed out it looks like a bug both on the C++ and java side for
> Time
> https://github.com/apache/arrow/blob/6b3ae2aecc8cd31425035a021fa04b
> 9ed3385a8d/java/vector/src/main/java/org/apache/arrow/
> vector/schema/TypeLayout.java#L163
> tests TODO here:
> https://issues.apache.org/jira/browse/ARROW-510
>
> We need to add the Date, Time, Timestamp description in the doc:
> https://github.com/apache/arrow/blob/master/format/Metadata.md
> You are welcome to take a stab at it and send a Pull request if you feel
> like it.
> Otherwise I'll update it.
>
> On Thu, Mar 9, 2017 at 3:37 PM, Bryan Cutler <cutlerb@gmail.com> wrote:
>
> > I guess it would make sense to just store the time of day value in
> > milliseconds to go along with the DATE type that contains days since
> epoch,
> > which would fit into a 4 byte value.  Only I see conflicting code in
> > TypeLayout.java that defines the schema as 64 bit width
> >
> > public TypeLayout visit(Time type) {
> >     return newFixedWidthTypeLayout(dataVector(64));
> > }
> >
> > And in C++ there is this comment
> >   // Exact time encoded with int64, default unit millisecond
> >   TIME,
> >
> > Does the TIME type still need to go through some discussion to get pinned
> > down?
> >
> > Thanks,
> > Bryan
> >
> > On Thu, Mar 9, 2017 at 10:53 AM, Bryan Cutler <cutlerb@gmail.com> wrote:
> >
> > > Hello All,
> > >
> > > I've started work on ARROW-582 to add Date/Time support for Java JSON
> > > files and would just like to clear up a few things.  I believe the Java
> > > Time type is supposed to represent milliseconds since epoch, it is
> stored
> > > as a FixedValueVector with a width of 4 bytes (equivalent to Java
> 'int')
> > > and it retrieved by constructing a org.joda.time.DateTime with that
> > value.
> > > Shouldn't this be an 8 byte width, equivalent to Java 'long'?
> > >
> > >     <#elseif minor.class == "Time">
> > >     @Override
> > >     public DateTime getObject(int index) {
> > >
> > >         org.joda.time.DateTime time = new org.joda.time.DateTime(get(
> > index),
> > > org.joda.time.DateTimeZone.UTC);
> > >         time = time.withZoneRetainFields(org.joda.time.DateTimeZone.
> > > getDefault());
> > >         return time;
> > >     }
> > >
> > > Thanks,
> > > Bryan
> > >
> >
>
>
>
> --
> Julien
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message