drill-issues mailing list archives

From "Jiang Wu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-6242) Output format for nested date, time, timestamp values in an object hierarchy
Date Fri, 23 Mar 2018 18:48:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-6242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16411881#comment-16411881 ]

Jiang Wu commented on DRILL-6242:
---------------------------------

Hmm.  The tests are failing because of the time zone handling in the date/time code.
Looking at the code, when a value is read out of a Drill vector, it does:
{code:java}
    <#if minor.class == "Date">
    @Override
    public ${friendlyType} getObject(int index) {
      org.joda.time.DateTime date = new org.joda.time.DateTime(get(index), org.joda.time.DateTimeZone.UTC);
      date = date.withZoneRetainFields(org.joda.time.DateTimeZone.getDefault());
      return new java.sql.Date(date.getMillis());
    }
{code}
The call "withZoneRetainFields(<local-zone>)" keeps the date/time field values but changes
the underlying time value in milliseconds.  While this produces a textual representation that
looks the same as the UTC textual representation, wouldn't it cause CTAS to write out a
different actual value?
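
To make the concern concrete, here is a minimal standalone sketch.  It uses java.time rather than Joda-Time so that it runs without extra dependencies; ZonedDateTime.withZoneSameLocal is the java.time analog of withZoneRetainFields.  The class name, the helper retainFieldsMillis, and the fixed UTC-8 offset standing in for the local zone are assumptions for illustration, not Drill code.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

class RetainFieldsDemo {
    // Hypothetical helper: interpret the millis as UTC, then keep the same
    // wall-clock fields while switching to another zone. The returned millis
    // differ from the input by the zone's offset.
    public static long retainFieldsMillis(long utcMillis, ZoneId zone) {
        ZonedDateTime utc = Instant.ofEpochMilli(utcMillis).atZone(ZoneOffset.UTC);
        ZonedDateTime local = utc.withZoneSameLocal(zone);
        return local.toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        long stored = 1520977974940L;          // 2018-03-13T21:52:54.940Z, from the issue
        ZoneId zone = ZoneOffset.ofHours(-8);  // assumed local zone for this demo
        long shifted = retainFieldsMillis(stored, zone);
        System.out.println("stored  = " + stored);
        System.out.println("shifted = " + shifted);
        System.out.println("delta   = " + (shifted - stored) + " ms");
    }
}
```

With a UTC-8 zone, the returned value is 8 hours (28,800,000 ms) later than the stored one, even though both print the same wall-clock fields in their own zone.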

 

> Output format for nested date, time, timestamp values in an object hierarchy
> ----------------------------------------------------------------------------
>
>                 Key: DRILL-6242
>                 URL: https://issues.apache.org/jira/browse/DRILL-6242
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>    Affects Versions: 1.12.0
>            Reporter: Jiang Wu
>            Priority: Major
>
> Some storage systems (MapR-DB, MongoDB, etc.) have hierarchical objects that contain nested
fields of date, time, and timestamp types.  When a query returns these objects, the nested
date, time, and timestamp values are displayed as the internal object (org.joda.time.DateTime)
rather than the logical data value.
> For example, suppose that in MongoDB we have a single object that looks like this:
> {code:java}
> > db.test.findOne();
> {
>     "_id" : ObjectId("5aa8487d470dd39a635a12f5"),
>     "name" : "orange",
>     "context" : {
>         "date" : ISODate("2018-03-13T21:52:54.940Z"),
>         "user" : "jack"
>     }
> }
> {code}
> Then connect Drill to the above MongoDB storage, and run the following query within
Drill:
> {code:java}
> > select t.context.`date`, t.context from test t; 
> +--------+---------+ 
> | EXPR$0 | context | 
> +--------+---------+ 
> | 2018-03-13 | {"date":{"dayOfYear":72,"year":2018,"dayOfMonth":13,"dayOfWeek":2,"era":1,"millisOfDay":78774940,"weekOfWeekyear":11,"weekyear":2018,"monthOfYear":3,"yearOfEra":2018,"yearOfCentury":18,"centuryOfEra":20,"millisOfSecond":940,"secondOfMinute":54,"secondOfDay":78774,"minuteOfHour":52,"minuteOfDay":1312,"hourOfDay":21,"zone":{"fixed":true,"id":"UTC"},"millis":1520977974940,"chronology":{"zone":{"fixed":true,"id":"UTC"}},"afterNow":false,"beforeNow":true,"equalNow":false},"user":"jack"} |
> {code}
> The output above shows that when the date field is retrieved as a top-level
column, Drill outputs a logical date value.  But when the same field is within an object
hierarchy, Drill outputs the internal object used to hold the date value.
> The expected output displays the date field the same way whether it is a top-level
column or nested within an object hierarchy:
> {code:java}
> > select t.context.`date`, t.context from test t; 
> +--------+---------+ 
> | EXPR$0 | context | 
> +--------+---------+ 
> | 2018-03-13 | {"date":"2018-03-13","user":"jack"} |
> {code}
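
The expected behavior described above can be sketched outside Drill as a recursive walk that replaces each nested temporal value with its logical string form.  This is only an illustration under stated assumptions: the class NestedDateRender, its render method, and the use of java.time.Instant to stand in for the stored date are hypothetical and are not how Drill represents or serializes these values.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.LinkedHashMap;
import java.util.Map;

class NestedDateRender {
    // Hypothetical helper: copy a nested map, replacing every Instant with
    // its UTC calendar date (yyyy-MM-dd) and recursing into nested maps.
    public static Map<String, Object> render(Map<String, Object> in) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : in.entrySet()) {
            Object v = e.getValue();
            if (v instanceof Instant) {
                v = ((Instant) v).atZone(ZoneOffset.UTC).toLocalDate().toString();
            } else if (v instanceof Map) {
                @SuppressWarnings("unchecked")
                Map<String, Object> nested = (Map<String, Object>) v;
                v = render(nested);
            }
            out.put(e.getKey(), v);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> context = new LinkedHashMap<>();
        context.put("date", Instant.parse("2018-03-13T21:52:54.940Z"));
        context.put("user", "jack");
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("context", context);
        System.out.println(render(row));
    }
}
```

The nested "date" entry comes out as the string 2018-03-13, the same logical value that the top-level column shows in the example.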



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
