drill-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Khurram Faraaz <kfar...@maprtech.com>
Subject Re: Behavior of TIME type column in Drill
Date Wed, 10 Jun 2015 22:24:11 GMT
Thanks Daniel.

On Wed, Jun 10, 2015 at 3:08 PM, Daniel Barclay <dbarclay@maprtech.com>
wrote:

> Khurram Faraaz wrote:
>
>> Hi All,
>>
>> Can someone please review this ?
>>
>> SQL standard (ISO/IEC 9075-1:2011(E)) specifies that a value of datatype
>> TIME comprises values of DATETIME fields HOUR, MINUTE and SECONDS.
>> (HH:MM:SS)
>>
> Note that the SECOND field can have a fractional seconds precision.
>
> (Your mention of "HH:MM:SS" suggests that maybe you were thinking that
> seconds can't be fractional.)
>
> Daniel
>
>  The question here is for the same input data Drill and Postgres return
>> different output/results, when we cast the column to TIME in Drill.
>>
>> Drill 1.0 results below are with cast to TIME. Note that there are three
>> digits after the seconds. (HH:MM:SS.sss)
>>
>> {code}
>>
>> 0: jdbc:drill:schema=dfs.tmp> SELECT cast( columns[8] as TIME ) FROM
>> `allData.csv` limit 10;
>>
>> *+---------------+*
>>
>> *| **   EXPR$0    ** |*
>>
>> *+---------------+*
>>
>> *| *08:16:08.580 * |*
>>
>> *| *09:11:49.170 * |*
>>
>> *| *18:44:25.430 * |*
>>
>> *| *20:52:08.560 * |*
>>
>> *| *19:46:10.420 * |*
>>
>> *| *15:21:34.390 * |*
>>
>> *| *10:31:55.240 * |*
>>
>> *| *01:37:47.310 * |*
>>
>> *| *01:50:05.110 * |*
>>
>> *| *11:28:25.100 * |*
>>
>> *+---------------+*
>>
>> 10 rows selected (0.173 seconds)
>>
>>
>> explain plan for above query
>>
>>
>> *| *00-00    Screen
>>
>> 00-01      SelectionVectorRemover
>>
>> 00-02        Limit(fetch=[10])
>>
>> 00-03          Project(EXPR$0=[CAST(ITEM($0, 8)):TIME(0)])
>>
>> 00-04            Scan(groupscan=[EasyGroupScan
>> [selectionRoot=/tmp/allData.csv, numFiles=1, columns=[`columns`[8]],
>> files=[maprfs:///tmp/allData.csv]]])
>>
>>
>> {code}
>>
>>
>> Drill results below are without cast to TIME, these are same as Postgres
>> results. However, we should note that there are only two digits after SS
>> in
>> the results, and this is not as per the SQL specification for TIME
>> datatype.
>>
>>
>> {code}
>>
>> 0: jdbc:drill:schema=dfs.tmp> SELECT columns[8] FROM `allData.csv` limit
>> 10;
>>
>> *+--------------+*
>>
>> *| **   EXPR$0   ** |*
>>
>> *+--------------+*
>>
>> *| *8:16:8.58   * |*
>>
>> *| *9:11:49.17  * |*
>>
>> *| *18:44:25.43 * |*
>>
>> *| *20:52:8.56  * |*
>>
>> *| *19:46:10.42 * |*
>>
>> *| *15:21:34.39 * |*
>>
>> *| *10:31:55.24 * |*
>>
>> *| *1:37:47.31  * |*
>>
>> *| *1:50:5.11   * |*
>>
>> *| *11:28:25.1  * |*
>>
>> *+--------------+*
>>
>> 10 rows selected (0.264 seconds)
>>
>>
>> explain plan for query that does not cast to TIME
>>
>>
>> *| *00-00    Screen
>>
>>
>> 00-01      SelectionVectorRemover
>>
>> 00-02        Limit(fetch=[10])
>>
>> 00-03          Project(EXPR$0=[ITEM($0, 8)])
>>
>> 00-04            Scan(groupscan=[EasyGroupScan
>> [selectionRoot=/tmp/allData.csv, numFiles=1, columns=[`columns`[8]],
>> files=[maprfs:///tmp/allData.csv]]])
>>
>> {code}
>>
>>
>> Postgres 9.3 results are
>>
>>
>> {code}
>>
>> postgres=# SELECT col_tm FROM all_typs_tbl limit 10;
>>
>>     col_tm
>>
>> -------------
>>
>>   08:16:08.58
>>
>>   09:11:49.17
>>
>>   18:44:25.43
>>
>>   20:52:08.56
>>
>>   19:46:10.42
>>
>>   15:21:34.39
>>
>>   10:31:55.24
>>
>>   01:37:47.31
>>
>>   01:50:05.11
>>
>>   11:28:25.1
>>
>> (10 rows)
>> {code}
>>
>> Thanks,
>> Khurram
>>
>>
>
> --
> Daniel Barclay
> MapR Technologies
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message