drill-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Volodymyr Vysotskyi (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-4722) date_add() function returns incorrect result with interval hour, minute and second
Date Tue, 27 Jun 2017 09:16:00 GMT

     [ https://issues.apache.org/jira/browse/DRILL-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Volodymyr Vysotskyi updated DRILL-4722:
---------------------------------------
    Description: 
*Issue*
The following query returns the same data for the second column as the first:
{code:sql}
select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '3' HOUR), date_add(cast('2015-01-24
07:27:05.0' as timestamp), interval '5' HOUR) from (values(1));
+------------------------+------------------------+
|         EXPR$0         |         EXPR$1         |
+------------------------+------------------------+
| 2015-01-24 10:27:05.0  | 2015-01-24 10:27:05.0  |
+------------------------+------------------------+
{code}
If each column is run separately, then it produces correct result:
{code:sql}
select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '3' HOUR) from (values(1));
+------------------------+
|         EXPR$0         |
+------------------------+
| 2015-01-24 10:27:05.0  |
+------------------------+
{code}
{code:sql}
select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '5' HOUR) from (values(1));
+------------------------+
|         EXPR$0         |
+------------------------+
| 2015-01-24 12:27:05.0  |
+------------------------+
{code}
Same problem is seen for interval of minute and second:
{code:sql}
select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '50' MINUTE), date_add(cast('2015-01-24
07:27:05.0' as timestamp), interval '40' MINUTE) from (values(1));
+------------------------+------------------------+
|         EXPR$0         |         EXPR$1         |
+------------------------+------------------------+
| 2015-01-24 08:17:05.0  | 2015-01-24 08:17:05.0  |
+------------------------+------------------------+
{code}
{code:sql}
select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '50' second), date_add(cast('2015-01-24
07:27:05.0' as timestamp), interval '40' second) from (values(1));
+------------------------+------------------------+
|         EXPR$0         |         EXPR$1         |
+------------------------+------------------------+
| 2015-01-24 07:27:55.0  | 2015-01-24 07:27:55.0  |
+------------------------+------------------------+
{code}
{code:sql}
select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '3' HOUR), date_add(cast('2015-01-24
07:27:05.0' as timestamp), interval '50' MINUTE), date_add(cast('2015-01-24 07:27:05.0' as
timestamp), interval '50' second) from (values(1));
+------------------------+------------------------+------------------------+
|         EXPR$0         |         EXPR$1         |         EXPR$2         |
+------------------------+------------------------+------------------------+
| 2015-01-24 10:27:05.0  | 2015-01-24 10:27:05.0  | 2015-01-24 10:27:05.0  |
+------------------------+------------------------+------------------------+
{code}

  was:
*Issue*
The following query returns the same data for the second column as the first:
{code:sql}
select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '3' HOUR), date_add(cast('2015-01-24
07:27:05.0' as timestamp), interval '5' HOUR) from (values(1));
+------------------------+------------------------+
|         EXPR$0         |         EXPR$1         |
+------------------------+------------------------+
| 2015-01-24 10:27:05.0  | 2015-01-24 10:27:05.0  |
+------------------------+------------------------+
{code}
If each column is run separately, then it produces correct result:
{code:sql}
select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '3' HOUR) from (values(1));
+------------------------+
|         EXPR$0         |
+------------------------+
| 2015-01-24 10:27:05.0  |
+------------------------+
{code}
{code:sql}
select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '5' HOUR) from (values(1));
+------------------------+
|         EXPR$0         |
+------------------------+
| 2015-01-24 12:27:05.0  |
+------------------------+
{code}
Same problem is seen for interval of minute and second:
{code:sql}
select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '50' MINUTE), date_add(cast('2015-01-24
07:27:05.0' as timestamp), interval '40' MINUTE) from (values(1));
+------------------------+------------------------+
|         EXPR$0         |         EXPR$1         |
+------------------------+------------------------+
| 2015-01-24 08:17:05.0  | 2015-01-24 08:17:05.0  |
+------------------------+------------------------+
{code}
{code:sql}
select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '50' second), date_add(cast('2015-01-24
07:27:05.0' as timestamp), interval '40' second) from (values(1));
+------------------------+------------------------+
|         EXPR$0         |         EXPR$1         |
+------------------------+------------------------+
| 2015-01-24 07:27:55.0  | 2015-01-24 07:27:55.0  |
+------------------------+------------------------+
{code}
{code:sql}
select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '3' HOUR), date_add(cast('2015-01-24
07:27:05.0' as timestamp), interval '50' MINUTE), date_add(cast('2015-01-24 07:27:05.0' as
timestamp), interval '50' second) from (values(1));
+------------------------+------------------------+------------------------+
|         EXPR$0         |         EXPR$1         |         EXPR$2         |
+------------------------+------------------------+------------------------+
| 2015-01-24 10:27:05.0  | 2015-01-24 10:27:05.0  | 2015-01-24 10:27:05.0  |
+------------------------+------------------------+------------------------+
{code}

*Problem description*
According to the common subexpression elimination logic (see DRILL-3912) Drill reuses the
results from previously evaluated expression trees for redundant evaluations.

Drill uses [IntervalDayExpression|https://github.com/apache/drill/blob/0dc237e3161cf284212cc63f740b229d4fee8fdf/logical/src/main/java/org/apache/drill/common/expression/ValueExpressions.java#L608]
to store days and millis. Hours, minutes and seconds are converted to the milliseconds to
be stored in this field.

The [visitIntervalDayConstant() method|https://github.com/apache/drill/blob/3e8b01d5b0d3013e3811913f0fd6028b22c1ac3f/exec/java-exec/src/main/java/org/apache/drill/exec/expr/EqualityVisitor.java#L168]
does not check the equality of  values in the IntervalDayExpression.millis field. 
Therefore Drill assumes that the second expression from the query may be replaced by the result
of the first during the code generation [stage|https://github.com/apache/drill/blob/416ec70a616e8d12b5c7fca809763b977d2f7aad/exec/java-exec/src/main/java/org/apache/drill/exec/expr/EvaluationVisitor.java#L151].


> date_add() function returns incorrect result with interval hour, minute and second  
> ------------------------------------------------------------------------------------
>
>                 Key: DRILL-4722
>                 URL: https://issues.apache.org/jira/browse/DRILL-4722
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>            Reporter: Krystal
>            Assignee: Volodymyr Vysotskyi
>             Fix For: 1.11.0
>
>
> *Issue*
> The following query returns the same data for the second column as the first:
> {code:sql}
> select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '3' HOUR), date_add(cast('2015-01-24
07:27:05.0' as timestamp), interval '5' HOUR) from (values(1));
> +------------------------+------------------------+
> |         EXPR$0         |         EXPR$1         |
> +------------------------+------------------------+
> | 2015-01-24 10:27:05.0  | 2015-01-24 10:27:05.0  |
> +------------------------+------------------------+
> {code}
> If each column is run separately, then it produces correct result:
> {code:sql}
> select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '3' HOUR) from (values(1));
> +------------------------+
> |         EXPR$0         |
> +------------------------+
> | 2015-01-24 10:27:05.0  |
> +------------------------+
> {code}
> {code:sql}
> select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '5' HOUR) from (values(1));
> +------------------------+
> |         EXPR$0         |
> +------------------------+
> | 2015-01-24 12:27:05.0  |
> +------------------------+
> {code}
> Same problem is seen for interval of minute and second:
> {code:sql}
> select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '50' MINUTE), date_add(cast('2015-01-24
07:27:05.0' as timestamp), interval '40' MINUTE) from (values(1));
> +------------------------+------------------------+
> |         EXPR$0         |         EXPR$1         |
> +------------------------+------------------------+
> | 2015-01-24 08:17:05.0  | 2015-01-24 08:17:05.0  |
> +------------------------+------------------------+
> {code}
> {code:sql}
> select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '50' second), date_add(cast('2015-01-24
07:27:05.0' as timestamp), interval '40' second) from (values(1));
> +------------------------+------------------------+
> |         EXPR$0         |         EXPR$1         |
> +------------------------+------------------------+
> | 2015-01-24 07:27:55.0  | 2015-01-24 07:27:55.0  |
> +------------------------+------------------------+
> {code}
> {code:sql}
> select date_add(cast('2015-01-24 07:27:05.0' as timestamp), interval '3' HOUR), date_add(cast('2015-01-24
07:27:05.0' as timestamp), interval '50' MINUTE), date_add(cast('2015-01-24 07:27:05.0' as
timestamp), interval '50' second) from (values(1));
> +------------------------+------------------------+------------------------+
> |         EXPR$0         |         EXPR$1         |         EXPR$2         |
> +------------------------+------------------------+------------------------+
> | 2015-01-24 10:27:05.0  | 2015-01-24 10:27:05.0  | 2015-01-24 10:27:05.0  |
> +------------------------+------------------------+------------------------+
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message