impala-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthew Jacobs (JIRA)" <j...@apache.org>
Subject [jira] [Created] (IMPALA-5121) avg() on timestamp col is wrong with -use_local_tz_for_unix_timestamp_conversions
Date Tue, 28 Mar 2017 02:03:42 GMT
Matthew Jacobs created IMPALA-5121:
--------------------------------------

             Summary: avg() on timestamp col is wrong with -use_local_tz_for_unix_timestamp_conversions
                 Key: IMPALA-5121
                 URL: https://issues.apache.org/jira/browse/IMPALA-5121
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 2.5.0, Impala 2.2.10, Impala 2.3.4
            Reporter: Matthew Jacobs
            Priority: Critical


The flag '-use_local_tz_for_unix_timestamp_conversions' was added for IMPALA-97. Enabling
it results in timestamps sometimes being converted into localtime, but unfortunately this
doesn't seem to be well defined when/where this conversion will happen.

I've noticed that its use seems to break the avg() aggregate function on timestamp types (despite
being an odd function on timestamps, it should still work).

Impala by default, i.e. not enabling this flag:
{code}
[localhost:21000] > select timestamp_col from functional.alltypestiny;
Query: select timestamp_col from functional.alltypestiny
Query submitted at: 2017-03-27 18:50:57 (Coordinator: http://mj-desktop.ca.cloudera.com:25000)
Query progress can be monitored at: http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=8242bb6012948f06:143961ed00000000
+---------------------+
| timestamp_col       |
+---------------------+
| 2009-01-01 00:00:00 |
| 2009-01-01 00:01:00 |
| 2009-02-01 00:00:00 |
| 2009-02-01 00:01:00 |
| 2009-03-01 00:00:00 |
| 2009-03-01 00:01:00 |
| 2009-04-01 00:00:00 |
| 2009-04-01 00:01:00 |
+---------------------+
Fetched 8 row(s) in 0.02s
[localhost:21000] > select avg(timestamp_col) from functional.alltypestiny;
Query: select avg(timestamp_col) from functional.alltypestiny
Query submitted at: 2017-03-27 18:50:59 (Coordinator: http://mj-desktop.ca.cloudera.com:25000)
Query progress can be monitored at: http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=534f6ab59b201b5e:40e2a86d00000000
+---------------------+
| avg(timestamp_col)  |
+---------------------+
| 2009-02-14 23:45:30 |
+---------------------+
{code}

Then enabling the flag results in the same timestamps returned when scanning, but evaluating
them in avg() results in them being converted:
{code}
[localhost:21000] > select timestamp_col from functional.alltypestiny;
Query: select timestamp_col from functional.alltypestiny
Query submitted at: 2017-03-27 18:51:17 (Coordinator: http://mj-desktop.ca.cloudera.com:25000)
Query progress can be monitored at: http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=ac4ab8fd8caf4be9:ebb0834d00000000
+---------------------+
| timestamp_col       |
+---------------------+
| 2009-01-01 00:00:00 |
| 2009-01-01 00:01:00 |
| 2009-02-01 00:00:00 |
| 2009-02-01 00:01:00 |
| 2009-03-01 00:00:00 |
| 2009-03-01 00:01:00 |
| 2009-04-01 00:00:00 |
| 2009-04-01 00:01:00 |
+---------------------+
Fetched 8 row(s) in 0.30s
[localhost:21000] > select avg(timestamp_col) from functional.alltypestiny;
Query: select avg(timestamp_col) from functional.alltypestiny
Query submitted at: 2017-03-27 18:51:25 (Coordinator: http://mj-desktop.ca.cloudera.com:25000)
Query progress can be monitored at: http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=9e4e2c16896090f7:8922c4f200000000
+---------------------+
| avg(timestamp_col)  |
+---------------------+
| 2009-02-15 00:00:30 |
+---------------------+
Fetched 1 row(s) in 0.12s
{code}

This behavior seems inconsistent and I'm pretty sure is not intentional. There are two misleading
functions on TimestampValue that will do this conversion when the flag is set: ToUnixTime()
and ToSubsecondUnixTime(). avg() seems to have started using ToSubsecondUnixTime() after IMPALA-2914.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message