Date: Tue, 24 May 2016 23:21:12 +0000 (UTC)
From: "Jason Dere (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-13837) current_timestamp() output format is different in some cases

    [ https://issues.apache.org/jira/browse/HIVE-13837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15299137#comment-15299137 ]

Jason Dere commented on HIVE-13837:
-----------------------------------

+1 if tests look good

> current_timestamp() output format is different in some cases
> ------------------------------------------------------------
>
>                 Key: HIVE-13837
>                 URL: https://issues.apache.org/jira/browse/HIVE-13837
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-13837.01.patch
>
>
> As [~jdere] reports:
> {code}
> The current_timestamp() udf returns results in different formats in some cases.
> select current_timestamp() returns a result with fractional-second precision:
> {noformat}
> hive> select current_timestamp();
> OK
> 2016-04-14 18:26:58.875
> Time taken: 0.077 seconds, Fetched: 1 row(s)
> {noformat}
> But the output format is different for select current_timestamp() from all100k union select current_timestamp() from over100k limit 5;
> {noformat}
> hive> select current_timestamp() from all100k union select current_timestamp() from over100k limit 5;
> Query ID = hrt_qa_20160414182956_c4ed48f2-9913-4b3b-8f09-668ebf55b3e3
> Total jobs = 1
> Launching Job 1 out of 1
> Tez session was closed. Reopening...
> Session re-established.
> Status: Running (Executing on YARN cluster with App id application_1460611908643_0624)
> ----------------------------------------------------------------------------------------------
>         VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> ----------------------------------------------------------------------------------------------
> Map 1 ..........      llap     SUCCEEDED      1          1        0        0       0       0
> Map 4 ..........      llap     SUCCEEDED      1          1        0        0       0       0
> Reducer 3 ......      llap     SUCCEEDED      1          1        0        0       0       0
> ----------------------------------------------------------------------------------------------
> VERTICES: 03/03  [==========================>>] 100%  ELAPSED TIME: 0.92 s
> ----------------------------------------------------------------------------------------------
> OK
> 2016-04-14 18:29:56
> Time taken: 10.558 seconds, Fetched: 1 row(s)
> {noformat}
> explain plan for select current_timestamp();
> {noformat}
> hive> explain extended select current_timestamp();
> OK
> ABSTRACT SYNTAX TREE:
>
> TOK_QUERY
>    TOK_INSERT
>       TOK_DESTINATION
>          TOK_DIR
>             TOK_TMP_FILE
>       TOK_SELECT
>          TOK_SELEXPR
>             TOK_FUNCTION
>                current_timestamp
>
> STAGE DEPENDENCIES:
>   Stage-0 is a root stage
>
> STAGE PLANS:
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         TableScan
>           alias: _dummy_table
>           Row Limit Per Split: 1
>           GatherStats: false
>           Select Operator
>             expressions: 2016-04-14 18:30:57.206 (type: timestamp)
>             outputColumnNames: _col0
>             ListSink
>
> Time taken: 0.062 seconds, Fetched: 30 row(s)
> {noformat}
> explain plan for select current_timestamp() from all100k union select current_timestamp() from over100k limit 5;
> {noformat}
> hive> explain extended select current_timestamp() from all100k union select current_timestamp() from over100k limit 5;
> OK
> ABSTRACT SYNTAX TREE:
>
> TOK_QUERY
>    TOK_FROM
>       TOK_SUBQUERY
>          TOK_QUERY
>             TOK_FROM
>                TOK_SUBQUERY
>                   TOK_UNIONALL
>                      TOK_QUERY
>                         TOK_FROM
>                            TOK_TABREF
>                               TOK_TABNAME
>                                  all100k
>                         TOK_INSERT
>                            TOK_DESTINATION
>                               TOK_DIR
>                                  TOK_TMP_FILE
>                            TOK_SELECT
>                               TOK_SELEXPR
>                                  TOK_FUNCTION
>                                     current_timestamp
>                      TOK_QUERY
>                         TOK_FROM
>                            TOK_TABREF
>                               TOK_TABNAME
>                                  over100k
>                         TOK_INSERT
>                            TOK_DESTINATION
>                               TOK_DIR
>                                  TOK_TMP_FILE
>                            TOK_SELECT
>                               TOK_SELEXPR
>                                  TOK_FUNCTION
>                                     current_timestamp
>                   _u1
>             TOK_INSERT
>                TOK_DESTINATION
>                   TOK_DIR
>                      TOK_TMP_FILE
>                TOK_SELECTDI
>                   TOK_SELEXPR
>                      TOK_ALLCOLREF
>          _u2
>    TOK_INSERT
>       TOK_DESTINATION
>          TOK_DIR
>             TOK_TMP_FILE
>       TOK_SELECT
>          TOK_SELEXPR
>             TOK_ALLCOLREF
>       TOK_LIMIT
>          5
>
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
>
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
>       DagId: hrt_qa_20160414183119_ec8e109e-8975-4799-a142-4a2289f85910:7
>       Edges:
>         Map 1 <- Union 2 (CONTAINS)
>         Map 4 <- Union 2 (CONTAINS)
>         Reducer 3 <- Union 2 (SIMPLE_EDGE)
>       DagName:
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: all100k
>                   Statistics: Num rows: 100000 Data size: 15801336 Basic stats: COMPLETE Column stats: COMPLETE
>                   GatherStats: false
>                   Select Operator
>                     Statistics: Num rows: 100000 Data size: 4000000 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: 2016-04-14 18:31:19.0 (type: timestamp)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 200000 Data size: 8000000 Basic stats: COMPLETE Column stats: COMPLETE
>                       Group By Operator
>                         keys: _col0 (type: timestamp)
>                         mode: hash
>                         outputColumnNames: _col0
>                         Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
>                         Reduce Output Operator
>                           key expressions: _col0 (type: timestamp)
>                           null sort order: a
>                           sort order: +
>                           Map-reduce partition columns: _col0 (type: timestamp)
>                           Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
>                           tag: -1
>                           TopN: 5
>                           TopN Hash Memory Usage: 0.04
>                           auto parallelism: true
>             Execution mode: llap
>             LLAP IO: no inputs
>             Path -> Alias:
>               hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/all100k [all100k]
>             Path -> Partition:
>               hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/all100k
>                 Partition
>                   base file name: all100k
>                   input format: org.apache.hadoop.mapred.TextInputFormat
>                   output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                   properties:
>                     COLUMN_STATS_ACCURATE {"BASIC_STATS":"true","COLUMN_STATS":{"t":"true","si":"true","i":"true","b":"true","f":"true","d":"true","s":"true","dc":"true","bo":"true","v":"true","c":"true","ts":"true"}}
>                     EXTERNAL TRUE
>                     bucket_count -1
>                     columns t,si,i,b,f,d,s,dc,bo,v,c,ts,dt
>                     columns.comments
>                     columns.types tinyint:smallint:int:bigint:float:double:string:decimal(38,18):boolean:varchar(25):char(25):timestamp:date
>                     field.delim |
>                     file.inputformat org.apache.hadoop.mapred.TextInputFormat
>                     file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                     location hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/all100k
>                     name default.all100k
>                     numFiles 1
>                     numRows 100000
>                     rawDataSize 15801336
>                     serialization.ddl struct all100k { byte t, i16 si, i32 i, i64 b, float f, double d, string s, decimal(38,18) dc, bool bo, varchar(25) v, char(25) c, timestamp ts, date dt}
>                     serialization.format |
>                     serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                     totalSize 15901336
>                     transient_lastDdlTime 1460612683
>                   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
>                     input format: org.apache.hadoop.mapred.TextInputFormat
>                     output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                     properties:
>                       COLUMN_STATS_ACCURATE {"BASIC_STATS":"true","COLUMN_STATS":{"t":"true","si":"true","i":"true","b":"true","f":"true","d":"true","s":"true","dc":"true","bo":"true","v":"true","c":"true","ts":"true"}}
>                       EXTERNAL TRUE
>                       bucket_count -1
>                       columns t,si,i,b,f,d,s,dc,bo,v,c,ts,dt
>                       columns.comments
>                       columns.types tinyint:smallint:int:bigint:float:double:string:decimal(38,18):boolean:varchar(25):char(25):timestamp:date
>                       field.delim |
>                       file.inputformat org.apache.hadoop.mapred.TextInputFormat
>                       file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                       location hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/all100k
>                       name default.all100k
>                       numFiles 1
>                       numRows 100000
>                       rawDataSize 15801336
>                       serialization.ddl struct all100k { byte t, i16 si, i32 i, i64 b, float f, double d, string s, decimal(38,18) dc, bool bo, varchar(25) v, char(25) c, timestamp ts, date dt}
>                       serialization.format |
>                       serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                       totalSize 15901336
>                       transient_lastDdlTime 1460612683
>                     serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                     name: default.all100k
>                   name: default.all100k
>             Truncated Path -> Alias:
>               hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/all100k [all100k]
>         Map 4
>             Map Operator Tree:
>                 TableScan
>                   alias: over100k
>                   Statistics: Num rows: 100000 Data size: 6631229 Basic stats: COMPLETE Column stats: COMPLETE
>                   GatherStats: false
>                   Select Operator
>                     Statistics: Num rows: 100000 Data size: 4000000 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: 2016-04-14 18:31:19.0 (type: timestamp)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 200000 Data size: 8000000 Basic stats: COMPLETE Column stats: COMPLETE
>                       Group By Operator
>                         keys: _col0 (type: timestamp)
>                         mode: hash
>                         outputColumnNames: _col0
>                         Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
>                         Reduce Output Operator
>                           key expressions: _col0 (type: timestamp)
>                           null sort order: a
>                           sort order: +
>                           Map-reduce partition columns: _col0 (type: timestamp)
>                           Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
>                           tag: -1
>                           TopN: 5
>                           TopN Hash Memory Usage: 0.04
>                           auto parallelism: true
>             Execution mode: llap
>             LLAP IO: no inputs
>             Path -> Alias:
>               hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/over100k [over100k]
>             Path -> Partition:
>               hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/over100k
>                 Partition
>                   base file name: over100k
>                   input format: org.apache.hadoop.mapred.TextInputFormat
>                   output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                   properties:
>                     COLUMN_STATS_ACCURATE {"BASIC_STATS":"true","COLUMN_STATS":{"t":"true","si":"true","i":"true","b":"true","f":"true","d":"true","bo":"true","s":"true","bin":"true"}}
>                     EXTERNAL TRUE
>                     bucket_count -1
>                     columns t,si,i,b,f,d,bo,s,bin
>                     columns.comments
>                     columns.types tinyint:smallint:int:bigint:float:double:boolean:string:binary
>                     field.delim :
>                     file.inputformat org.apache.hadoop.mapred.TextInputFormat
>                     file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                     location hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/over100k
>                     name default.over100k
>                     numFiles 1
>                     numRows 100000
>                     rawDataSize 6631229
>                     serialization.ddl struct over100k { byte t, i16 si, i32 i, i64 b, float f, double d, bool bo, string s, binary bin}
>                     serialization.format :
>                     serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                     totalSize 6731229
>                     transient_lastDdlTime 1460612798
>                   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
>                     input format: org.apache.hadoop.mapred.TextInputFormat
>                     output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                     properties:
>                       COLUMN_STATS_ACCURATE {"BASIC_STATS":"true","COLUMN_STATS":{"t":"true","si":"true","i":"true","b":"true","f":"true","d":"true","bo":"true","s":"true","bin":"true"}}
>                       EXTERNAL TRUE
>                       bucket_count -1
>                       columns t,si,i,b,f,d,bo,s,bin
>                       columns.comments
>                       columns.types tinyint:smallint:int:bigint:float:double:boolean:string:binary
>                       field.delim :
>                       file.inputformat org.apache.hadoop.mapred.TextInputFormat
>                       file.outputformat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                       location hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/over100k
>                       name default.over100k
>                       numFiles 1
>                       numRows 100000
>                       rawDataSize 6631229
>                       serialization.ddl struct over100k { byte t, i16 si, i32 i, i64 b, float f, double d, bool bo, string s, binary bin}
>                       serialization.format :
>                       serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                       totalSize 6731229
>                       transient_lastDdlTime 1460612798
>                     serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                     name: default.over100k
>                   name: default.over100k
>             Truncated Path -> Alias:
>               hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/user/hcat/tests/data/over100k [over100k]
>         Reducer 3
>             Execution mode: vectorized, llap
>             Needs Tagging: false
>             Reduce Operator Tree:
>               Group By Operator
>                 keys: KEY._col0 (type: timestamp)
>                 mode: mergepartial
>                 outputColumnNames: _col0
>                 Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
>                 Limit
>                   Number of rows: 5
>                   Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
>                   File Output Operator
>                     compressed: false
>                     GlobalTableId: 0
>                     directory: hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/tmp/hive/hrt_qa/ec0773d7-0ac2-45c7-b9cb-568bbed2c49c/hive_2016-04-14_18-31-19_532_3480081382837900888-1/-mr-10001/.hive-staging_hive_2016-04-14_18-31-19_532_3480081382837900888-1/-ext-10002
>                     NumFilesPerFileSink: 1
>                     Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: COMPLETE
>                     Stats Publishing Key Prefix: hdfs://os-r6-qugztu-hive-1-5.novalocal:8020/tmp/hive/hrt_qa/ec0773d7-0ac2-45c7-b9cb-568bbed2c49c/hive_2016-04-14_18-31-19_532_3480081382837900888-1/-mr-10001/.hive-staging_hive_2016-04-14_18-31-19_532_3480081382837900888-1/-ext-10002/
>                     table:
>                         input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                         output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                         properties:
>                           columns _col0
>                           columns.types timestamp
>                           escape.delim \
>                           hive.serialization.extend.additional.nesting.levels true
>                           serialization.escape.crlf true
>                           serialization.format 1
>                           serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                         serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                     TotalFiles: 1
>                     GatherStats: false
>                     MultiFileSpray: false
>         Union 2
>             Vertex: Union 2
>
>   Stage: Stage-0
>     Fetch Operator
>       limit: 5
>       Processor Tree:
>         ListSink
>
> Time taken: 0.301 seconds, Fetched: 284 row(s)
> {noformat}
> In past releases, both of these queries returned timestamps in the YYYY-MM-DD HH:MM:SS.fff format.
> {code}


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
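The three output shapes seen above (".875", seconds-only, and the ".0" folded constant in the union plan) are consistent with how java.sql.Timestamp values render in Java. The sketch below is illustrative only, with a hypothetical class name; it is not Hive's actual code path, just a demonstration of how the same instant can print in each of the three shapes depending on whether fractional seconds survive formatting:

```java
import java.sql.Timestamp;
import java.text.SimpleDateFormat;

// Hypothetical demo class -- not part of Hive. Shows how one instant can
// render with milliseconds, without them, or with a trailing ".0".
public class TimestampFormatDemo {
    public static void main(String[] args) {
        // The instant from the fetch-only query, with millisecond precision.
        Timestamp ts = Timestamp.valueOf("2016-04-14 18:26:58.875");

        // Timestamp.toString() preserves fractional seconds.
        System.out.println(ts);  // 2016-04-14 18:26:58.875

        // A seconds-only pattern silently drops them, matching the
        // "2016-04-14 18:29:56"-style output of the union query.
        SimpleDateFormat secondsOnly = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        System.out.println(secondsOnly.format(ts));  // 2016-04-14 18:26:58

        // If the value is truncated to whole seconds before being rendered,
        // toString() prints a ".0" suffix -- the shape of the folded constant
        // in the union plan (2016-04-14 18:31:19.0).
        Timestamp truncated = new Timestamp((ts.getTime() / 1000) * 1000);
        System.out.println(truncated);  // 2016-04-14 18:26:58.0
    }
}
```

This suggests checking whether the constant-folding path for the union query drops sub-second precision before the value is baked into the plan.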