Date: Wed, 15 Jan 2020 22:39:00 +0000 (UTC)
From: "Hive QA (Jira)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-22489) Reduce Sink operator should order nulls by parameter

    [ https://issues.apache.org/jira/browse/HIVE-22489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17016381#comment-17016381 ]

Hive QA commented on HIVE-22489:
--------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m 28s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 42s{color} | {color:blue} serde in master has 198 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 4m 32s{color} | {color:blue} ql in master has 1531 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 35s{color} | {color:blue} accumulo-handler in master has 20 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 24s{color} | {color:blue} contrib in master has 11 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 37s{color} | {color:blue} hbase-handler in master has 15 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 25s{color} | {color:blue} kudu-handler in master has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 35s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} serde: The patch generated 0 new + 564 unchanged - 2 fixed = 564 total (was 566) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 2s{color} | {color:green} ql: The patch generated 0 new + 794 unchanged - 1 fixed = 794 total (was 795) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} The patch accumulo-handler passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} The patch contrib passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s{color} | {color:green} The patch hbase-handler passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s{color} | {color:green} The patch kudu-handler passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} The patch hive-blobstore passed checkstyle {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 1s{color} | {color:red} The patch has 121 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 5s{color} | {color:red} The patch 25483 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 51s{color} | {color:green} serde generated 0 new + 197 unchanged - 1 fixed = 197 total (was 198) {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 4m 41s{color} | {color:red} ql generated 1 new + 1531 unchanged - 0 fixed = 1532 total (was 1531) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 39s{color} | {color:green} accumulo-handler in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 32s{color} | {color:green} contrib in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 41s{color} | {color:green} hbase-handler in the patch passed. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 31s{color} | {color:green} kudu-handler in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 35s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 17s{color} | {color:red} The patch generated 11 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 71m 38s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
| | The field org.apache.hadoop.hive.ql.exec.CommonMergeJoinOperator.nullOrdering is transient but isn't set by deserialization. In CommonMergeJoinOperator.java |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests | asflicense javac javadoc findbugs checkstyle compile |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-20192/dev-support/hive-personality.sh |
| git revision | master / 3b1138b |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-20192/yetus/whitespace-eol.txt |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-20192/yetus/whitespace-tabs.txt |
| findbugs | http://104.198.109.242/logs//PreCommit-HIVE-Build-20192/yetus/new-findbugs-ql.html |
| asflicense | http://104.198.109.242/logs//PreCommit-HIVE-Build-20192/yetus/patch-asflicense-problems.txt |
| modules | C: serde ql accumulo-handler contrib hbase-handler kudu-handler itests/hive-blobstore U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-20192/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |

This message was automatically generated.

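For readers unfamiliar with the FindBugs finding reported above (a transient field that is not set by deserialization), the following is a minimal sketch of the general pattern and one common way to address it. It assumes plain `java.io` serialization, which may not be how Hive actually ships operators. `MergeJoinSketch`, `nullOrderSpec`, and `nullsLast` are hypothetical names for illustration only, not the real `CommonMergeJoinOperator` code or the fix taken in the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for an operator class; not the real CommonMergeJoinOperator.
class MergeJoinSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Serialized normally: one character per key column ('a' or 'z').
    private final String nullOrderSpec;

    // Derived state: skipped by Java serialization, so it would be null
    // after deserialization unless readObject rebuilds it.
    private transient boolean[] nullsLast;

    MergeJoinSketch(String nullOrderSpec) {
        this.nullOrderSpec = nullOrderSpec;
        this.nullsLast = parse(nullOrderSpec);
    }

    // Assumed convention for this sketch: 'z' means nulls last, anything else nulls first.
    private static boolean[] parse(String spec) {
        boolean[] result = new boolean[spec.length()];
        for (int i = 0; i < spec.length(); i++) {
            result[i] = spec.charAt(i) == 'z';
        }
        return result;
    }

    // Restores the transient field once the default fields have been read.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        this.nullsLast = parse(nullOrderSpec);
    }

    boolean isNullsLast(int keyIndex) {
        return nullsLast[keyIndex];
    }

    // Serializes and deserializes the object in memory.
    static MergeJoinSketch roundTrip(MergeJoinSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (MergeJoinSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MergeJoinSketch copy = roundTrip(new MergeJoinSketch("az"));
        System.out.println(copy.isNullsLast(0) + " " + copy.isNullsLast(1)); // false true
    }
}
```

Without the `readObject` method, `isNullsLast` would throw a `NullPointerException` on the deserialized copy. Other options, such as making the field non-transient or initializing it lazily, may fit better depending on how the class is really serialized.
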
> Reduce Sink operator should order nulls by parameter
> -----------------------------------------------------
>
> Key: HIVE-22489
> URL: https://issues.apache.org/jira/browse/HIVE-22489
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Reporter: Krisztian Kasa
> Assignee: Krisztian Kasa
> Priority: Major
> Attachments: HIVE-22489.1.patch, HIVE-22489.10.patch, HIVE-22489.10.patch, HIVE-22489.11.patch, HIVE-22489.12.patch, HIVE-22489.13.patch, HIVE-22489.2.patch, HIVE-22489.3.patch, HIVE-22489.3.patch, HIVE-22489.4.patch, HIVE-22489.5.patch, HIVE-22489.6.patch, HIVE-22489.7.patch, HIVE-22489.8.patch, HIVE-22489.9.patch, HIVE-22489.9.patch
>
>
> When the property hive.default.nulls.last is set to true and no null order is explicitly specified in the ORDER BY clause of the query, null ordering should be NULLS LAST.
> But some of the Reduce Sink operators still order nulls first.
> {code}
> SET hive.default.nulls.last=true;
> EXPLAIN EXTENDED
> SELECT src1.key, src2.value FROM src src1 JOIN src src2 ON (src1.key = src2.key) ORDER BY src1.key LIMIT 5;
> {code}
> {code}
> PREHOOK: query: EXPLAIN EXTENDED
> SELECT src1.key, src2.value FROM src src1 JOIN src src2 ON (src1.key = src2.key) ORDER BY src1.key
> PREHOOK: type: QUERY
> PREHOOK: Input: default@src
> #### A masked pattern was here ####
> POSTHOOK: query: EXPLAIN EXTENDED
> SELECT src1.key, src2.value FROM src src1 JOIN src src2 ON (src1.key = src2.key) ORDER BY src1.key
> POSTHOOK: type: QUERY
> POSTHOOK: Input: default@src
> #### A masked pattern was here ####
> OPTIMIZED SQL: SELECT `t0`.`key`, `t2`.`value`
> FROM (SELECT `key`
> FROM `default`.`src`
> WHERE `key` IS NOT NULL) AS `t0`
> INNER JOIN (SELECT `key`, `value`
> FROM `default`.`src`
> WHERE `key` IS NOT NULL) AS `t2` ON `t0`.`key` = `t2`.`key`
> ORDER BY `t0`.`key`
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
> #### A masked pattern was here ####
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
>         Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> #### A masked pattern was here ####
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: src1
>                   filterExpr: key is not null (type: boolean)
>                   Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
>                   GatherStats: false
>                   Filter Operator
>                     isSamplingPred: false
>                     predicate: key is not null (type: boolean)
>                     Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: key (type: string)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: string)
>                         null sort order: a
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: string)
>                         Statistics: Num rows: 500 Data size: 43500 Basic stats: COMPLETE Column stats: COMPLETE
>                         tag: 0
>                         auto parallelism: true
>             Execution mode: vectorized, llap
>             LLAP IO: no inputs
>             Path -> Alias:
> #### A masked pattern was here ####
>             Path -> Partition:
> #### A masked pattern was here ####
>                 Partition
>                   base file name: src
>                   input format: org.apache.hadoop.mapred.TextInputFormat
>                   output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                   properties:
>                     COLUMN_STATS_ACCURATE {"BASIC_STATS":"true","COLUMN_STATS":{"key":"true","value":"true"}}
>                     bucket_count -1
>                     bucketing_version 2
>                     column.name.delimiter ,
>                     columns key,value
>                     columns.comments 'default','default'
>                     columns.types string:string
> #### A masked pattern was here ####
>                     name default.src
>                     numFiles 1
>                     numRows 500
>                     rawDataSize 5312
>                     serialization.ddl struct src { string key, string value}
>                     serialization.format 1
>                     serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                     totalSize 5812
> #### A masked pattern was here ####
>                   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
>                     input format: org.apache.hadoop.mapred.TextInputFormat
>                     output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                     properties:
>                       COLUMN_STATS_ACCURATE {"BASIC_STATS":"true","COLUMN_STATS":{"key":"true","value":"true"}}
>                       bucket_count -1
>                       bucketing_version 2
>                       column.name.delimiter ,
>                       columns key,value
>                       columns.comments 'default','default'
>                       columns.types string:string
> #### A masked pattern was here ####
>                       name default.src
>                       numFiles 1
>                       numRows 500
>                       rawDataSize 5312
>                       serialization.ddl struct src { string key, string value}
>                       serialization.format 1
>                       serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                       totalSize 5812
> #### A masked pattern was here ####
>                     serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                     name: default.src
>                   name: default.src
>             Truncated Path -> Alias:
>               /src [src1]
>         Map 4
>             Map Operator Tree:
>                 TableScan
>                   alias: src2
>                   filterExpr: key is not null (type: boolean)
>                   Statistics: Num rows: 500 Data size: 89000 Basic stats: COMPLETE Column stats: COMPLETE
>                   GatherStats: false
>                   Filter Operator
>                     isSamplingPred: false
>                     predicate: key is not null (type: boolean)
>                     Statistics: Num rows: 500 Data size: 89000 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: key (type: string), value (type: string)
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 500 Data size: 89000 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: string)
>                         null sort order: a
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: string)
>                         Statistics: Num rows: 500 Data size: 89000 Basic stats: COMPLETE Column stats: COMPLETE
>                         tag: 1
>                         value expressions: _col1 (type: string)
>                         auto parallelism: true
>             Execution mode: vectorized, llap
>             LLAP IO: no inputs
>             Path -> Alias:
> #### A masked pattern was here ####
>             Path -> Partition:
> #### A masked pattern was here ####
>                 Partition
>                   base file name: src
>                   input format: org.apache.hadoop.mapred.TextInputFormat
>                   output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                   properties:
>                     COLUMN_STATS_ACCURATE {"BASIC_STATS":"true","COLUMN_STATS":{"key":"true","value":"true"}}
>                     bucket_count -1
>                     bucketing_version 2
>                     column.name.delimiter ,
>                     columns key,value
>                     columns.comments 'default','default'
>                     columns.types string:string
> #### A masked pattern was here ####
>                     name default.src
>                     numFiles 1
>                     numRows 500
>                     rawDataSize 5312
>                     serialization.ddl struct src { string key, string value}
>                     serialization.format 1
>                     serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                     totalSize 5812
> #### A masked pattern was here ####
>                   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>
>                     input format: org.apache.hadoop.mapred.TextInputFormat
>                     output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                     properties:
>                       COLUMN_STATS_ACCURATE {"BASIC_STATS":"true","COLUMN_STATS":{"key":"true","value":"true"}}
>                       bucket_count -1
>                       bucketing_version 2
>                       column.name.delimiter ,
>                       columns key,value
>                       columns.comments 'default','default'
>                       columns.types string:string
> #### A masked pattern was here ####
>                       name default.src
>                       numFiles 1
>                       numRows 500
>                       rawDataSize 5312
>                       serialization.ddl struct src { string key, string value}
>                       serialization.format 1
>                       serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                       totalSize 5812
> #### A masked pattern was here ####
>                     serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                     name: default.src
>                   name: default.src
>             Truncated Path -> Alias:
>               /src [src2]
>         Reducer 2
>             Execution mode: llap
>             Needs Tagging: false
>             Reduce Operator Tree:
>               Merge Join Operator
>                 condition map:
>                   Inner Join 0 to 1
>                 keys:
>                   0 _col0 (type: string)
>                   1 _col0 (type: string)
>                 outputColumnNames: _col0, _col2
>                 Position of Big Table: 1
>                 Statistics: Num rows: 791 Data size: 140798 Basic stats: COMPLETE Column stats: COMPLETE
>                 Select Operator
>                   expressions: _col0 (type: string), _col2 (type: string)
>                   outputColumnNames: _col0, _col1
>                   Statistics: Num rows: 791 Data size: 140798 Basic stats: COMPLETE Column stats: COMPLETE
>                   Reduce Output Operator
>                     key expressions: _col0 (type: string)
>                     null sort order: z
>                     sort order: +
>                     Statistics: Num rows: 791 Data size: 140798 Basic stats: COMPLETE Column stats: COMPLETE
>                     tag: -1
>                     value expressions: _col1 (type: string)
>                     auto parallelism: false
>         Reducer 3
>             Execution mode: vectorized, llap
>             Needs Tagging: false
>             Reduce Operator Tree:
>               Select Operator
>                 expressions: KEY.reducesinkkey0 (type: string), VALUE._col0 (type: string)
>                 outputColumnNames: _col0, _col1
>                 Statistics: Num rows: 791 Data size: 140798 Basic stats: COMPLETE Column stats: COMPLETE
>                 File Output Operator
>                   compressed: false
>                   GlobalTableId: 0
> #### A masked pattern was here ####
>                   NumFilesPerFileSink: 1
>                   Statistics: Num rows: 791 Data size: 140798 Basic stats: COMPLETE Column stats: COMPLETE
> #### A masked pattern was here ####
>                   table:
>                       input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                       output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                       properties:
>                         columns _col0,_col1
>                         columns.types string:string
>                         escape.delim \
>                         hive.serialization.extend.additional.nesting.levels true
>                         serialization.escape.crlf true
>                         serialization.format 1
>                         serialization.lib org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                       serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>                   TotalFiles: 1
>                   GatherStats: false
>                   MultiFileSpray: false
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
> {code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)
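
In the quoted plan, the Reduce Output Operators in Map 1 and Map 4 report `null sort order: a` while the one in Reducer 2 reports `null sort order: z`. As a rough illustration of what those two markers mean for an ascending sort, here is a small sketch. It assumes `a` denotes nulls first and `z` denotes nulls last, and that an explicitly specified order takes precedence over the parameter. `NullOrderSketch`, `resolveNullOrder`, and `sort` are hypothetical names for illustration and are not Hive APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative only: mimics the 'a' / 'z' null sort order markers seen in the plan.
class NullOrderSketch {

    // An explicit order wins; otherwise the default comes from the boolean
    // parameter (standing in for hive.default.nulls.last).
    static char resolveNullOrder(Character explicit, boolean defaultNullsLast) {
        if (explicit != null) {
            return explicit;
        }
        return defaultNullsLast ? 'z' : 'a';
    }

    // Ascending comparator that places nulls according to the marker.
    static Comparator<String> ascending(char nullOrder) {
        Comparator<String> natural = Comparator.naturalOrder();
        return nullOrder == 'z'
            ? Comparator.nullsLast(natural)
            : Comparator.nullsFirst(natural);
    }

    // Returns a sorted copy so the input list is left untouched.
    static List<String> sort(List<String> keys, char nullOrder) {
        List<String> copy = new ArrayList<>(keys);
        copy.sort(ascending(nullOrder));
        return copy;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("b", null, "a");
        System.out.println(sort(keys, resolveNullOrder(null, true)));  // [a, b, null]
        System.out.println(sort(keys, resolveNullOrder(null, false))); // [null, a, b]
    }
}
```

Under these assumptions, an operator that ignores the parameter behaves like the second call even when the first is intended, which is the kind of mismatch the issue describes.
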