Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 90729200C24 for ; Thu, 23 Feb 2017 15:55:51 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 8F086160B62; Thu, 23 Feb 2017 14:55:51 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id D344E160B50 for ; Thu, 23 Feb 2017 15:55:50 +0100 (CET) Received: (qmail 28535 invoked by uid 500); 23 Feb 2017 14:55:50 -0000 Mailing-List: contact issues-help@drill.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@drill.apache.org Delivered-To: mailing list issues@drill.apache.org Received: (qmail 28524 invoked by uid 99); 23 Feb 2017 14:55:50 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 23 Feb 2017 14:55:50 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 986F71A07BD for ; Thu, 23 Feb 2017 14:55:49 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -1.199 X-Spam-Level: X-Spam-Status: No, score=-1.199 tagged_above=-999 required=6.31 tests=[KAM_ASCII_DIVIDERS=0.8, KAM_LAZY_DOMAIN_SECURITY=1, RP_MATCHES_RCVD=-2.999] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id bOmwQMNG9oEt for ; Thu, 23 Feb 2017 14:55:48 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id 31306618A4 for ; Thu, 23 Feb 2017 14:55:48 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 9B61DE08C2 for ; Thu, 23 Feb 2017 14:55:45 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 243D124138 for ; Thu, 23 Feb 2017 14:55:44 +0000 (UTC) Date: Thu, 23 Feb 2017 14:55:44 +0000 (UTC) From: "ASF GitHub Bot (JIRA)" To: issues@drill.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (DRILL-5034) Select timestamp from hive generated parquet always return in UTC MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Thu, 23 Feb 2017 14:55:51 -0000 [ https://issues.apache.org/jira/browse/DRILL-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15880589#comment-15880589 ] ASF GitHub Bot commented on DRILL-5034: --------------------------------------- Github user vdiravka commented on the issue: https://github.com/apache/drill/pull/656 @bitblender Parquet files with INT96 TIMESTAMP values usually are generated with hive. Those TIMESTAMP values represent the local timezone of the host where the data was written. To read that values hive considers the local timezone (in fact shift between local and UTC timezones is adding to the timestamp values). The aim of this patch to make the same behaviour like in HIVE while reading parquet INT96 TIMESTAMP vals. As the result in different timezones we have different data after query or in other words this test depends from the local timezone and `baselineValues` for different timezones will be different. However I applied the logic of converting timestamps to the local timezone for the test's `baselineValues`. So now this test works on the every timezone properly. The last commit is updated. > Select timestamp from hive generated parquet always return in UTC > ----------------------------------------------------------------- > > Key: DRILL-5034 > URL: https://issues.apache.org/jira/browse/DRILL-5034 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Parquet > Affects Versions: 1.9.0 > Reporter: Krystal > Assignee: Vitalii Diravka > > commit id: 5cea9afa6278e21574c6a982ae5c3d82085ef904 > Reading timestamp data against a hive parquet table from drill automatically converts the timestamp data to UTC. > {code} > SELECT TIMEOFDAY() FROM (VALUES(1)); > +----------------------------------------------+ > | EXPR$0 | > +----------------------------------------------+ > | 2016-11-10 12:33:26.547 America/Los_Angeles | > +----------------------------------------------+ > {code} > data schema: > {code} > message hive_schema { > optional int32 voter_id; > optional binary name (UTF8); > optional int32 age; > optional binary registration (UTF8); > optional fixed_len_byte_array(3) contributions (DECIMAL(6,2)); > optional int32 voterzone; > optional int96 create_timestamp; > optional int32 create_date (DATE); > } > {code} > Using drill-1.8, the returned timestamps match the table data: > {code} > select convert_from(create_timestamp, 'TIMESTAMP_IMPALA') from `/user/hive/warehouse/voter_hive_parquet` limit 5; > +------------------------+ > | EXPR$0 | > +------------------------+ > | 2016-10-23 20:03:58.0 | > | null | > | 2016-09-09 12:01:18.0 | > | 2017-03-06 20:35:55.0 | > | 2017-01-20 22:32:43.0 | > +------------------------+ > 5 rows selected (1.032 seconds) > {code} > If the user timzone is changed to UTC, then the timestamp data is returned in UTC time. > Using drill-1.9, the returned timestamps got converted to UTC eventhough the user timezone is in PST. > {code} > select convert_from(create_timestamp, 'TIMESTAMP_IMPALA') from dfs.`/user/hive/warehouse/voter_hive_parquet` limit 5; > +------------------------+ > | EXPR$0 | > +------------------------+ > | 2016-10-24 03:03:58.0 | > | null | > | 2016-09-09 19:01:18.0 | > | 2017-03-07 04:35:55.0 | > | 2017-01-21 06:32:43.0 | > +------------------------+ > {code} > {code} > alter session set `store.parquet.reader.int96_as_timestamp`=true; > +-------+---------------------------------------------------+ > | ok | summary | > +-------+---------------------------------------------------+ > | true | store.parquet.reader.int96_as_timestamp updated. | > +-------+---------------------------------------------------+ > select create_timestamp from dfs.`/user/hive/warehouse/voter_hive_parquet` limit 5; > +------------------------+ > | create_timestamp | > +------------------------+ > | 2016-10-24 03:03:58.0 | > | null | > | 2016-09-09 19:01:18.0 | > | 2017-03-07 04:35:55.0 | > | 2017-01-21 06:32:43.0 | > +------------------------+ > {code} > -- This message was sent by Atlassian JIRA (v6.3.15#6346)