Date: Thu, 12 Jul 2018 17:29:00 +0000 (UTC)
From: "Hive QA (JIRA)"
To: issues@hive.apache.org
Subject: [jira] [Commented] (HIVE-19940) Push predicates with deterministic UDFs with RBO

    [ https://issues.apache.org/jira/browse/HIVE-19940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541995#comment-16541995 ]

Hive QA commented on HIVE-19940:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12931239/HIVE-19940.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/12561/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/12561/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-12561/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2018-07-12 17:27:41.500
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-12561/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2018-07-12 17:27:41.504
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 5ade740 HIVE-20088: Beeline config location path is assembled incorrectly (Denes Bodo via Zoltan Haindrich)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 5ade740 HIVE-20088: Beeline config location path is assembled incorrectly (Denes Bodo via Zoltan Haindrich)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2018-07-12 17:27:42.170
+ rm -rf ../yetus_PreCommit-HIVE-Build-12561
+ mkdir ../yetus_PreCommit-HIVE-Build-12561
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-12561
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-12561/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch
error: patch failed: ql/src/test/results/clientpositive/masking_disablecbo_2.q.out:555
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/masking_disablecbo_2.q.out' with conflicts.
Going to apply patch with: git apply -p0
/data/hiveptest/working/scratch/build.patch:420: trailing whitespace.
            sort order: 
/data/hiveptest/working/scratch/build.patch:635: trailing whitespace.
            sort order: 
/data/hiveptest/working/scratch/build.patch:1715: trailing whitespace.
        Reducer 11 
error: patch failed: ql/src/test/results/clientpositive/masking_disablecbo_2.q.out:555
Falling back to three-way merge...
Applied patch to 'ql/src/test/results/clientpositive/masking_disablecbo_2.q.out' with conflicts.
/data/hiveptest/working/scratch/build.patch:3290: new blank line at EOF.
+
U ql/src/test/results/clientpositive/masking_disablecbo_2.q.out
warning: 4 lines add whitespace errors.
+ result=1
+ '[' 1 -ne 0 ']'
+ rm -rf yetus_PreCommit-HIVE-Build-12561
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12931239 - PreCommit-HIVE-Build

> Push predicates with deterministic UDFs with RBO
> ------------------------------------------------
>
>                 Key: HIVE-19940
>                 URL: https://issues.apache.org/jira/browse/HIVE-19940
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Janaki Lahorani
>            Assignee: Janaki Lahorani
>            Priority: Major
>         Attachments: HIVE-19940.1.patch, HIVE-19940.2.patch, HIVE-19940.3.patch
>
>
> With RBO, predicates that contain any UDF don't get pushed down. It makes sense not to push down predicates with non-deterministic functions, because the meaning of the query changes after the predicate is resolved to use the function. But pushing down a predicate with a deterministic function is beneficial.
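The reasoning in the quoted description can be illustrated with a small Python sketch. It is not taken from the patch or from Hive's optimizer: the row values, the helper names, and the counter-based function that stands in for a non-deterministic UDF are all assumptions made for illustration. It models a filter evaluated above a projection (the view as written) against the same filter pushed below it, where the function is evaluated once for the filter and again for the output.

```python
import itertools

# Hypothetical partition values standing in for rows of testa.
ROWS = ["US", "UK", "CA", "US"]


def cast_char2(value):
    """Deterministic stand-in for CAST(part1 AS CHAR(2))."""
    return value[:2]


def make_counter_udf():
    """Stand-in for a non-deterministic UDF: every call returns a new value."""
    counter = itertools.count()
    return lambda _value: next(counter)


def filter_above_projection(rows, udf, predicate):
    """Unpushed plan: evaluate the UDF once per row, then filter its output."""
    projected = [udf(row) for row in rows]
    return [value for value in projected if predicate(value)]


def filter_pushed_down(rows, udf, predicate):
    """Pushed plan: evaluate the UDF in the filter, then again in the projection."""
    kept = [row for row in rows if predicate(udf(row))]
    return [udf(row) for row in kept]


def is_us(value):
    return value == "US"


def is_even(value):
    return value % 2 == 0


# Deterministic UDF: both plans return the same rows.
det_above = filter_above_projection(ROWS, cast_char2, is_us)
det_pushed = filter_pushed_down(ROWS, cast_char2, is_us)

# Non-deterministic UDF: the pushed plan re-evaluates the function, so its
# output can contain values that no longer satisfy the predicate.
nondet_above = filter_above_projection(ROWS, make_counter_udf(), is_even)
nondet_pushed = filter_pushed_down(ROWS, make_counter_udf(), is_even)

print(det_above, det_pushed)
print(nondet_above, nondet_pushed)
```

With the deterministic function the two plans agree, which is why pushing the predicate is safe. With the counter-based function the pushed plan emits values the filter never saw, which is the change in query meaning the description refers to.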
> Test Case:
> {code}
> set hive.cbo.enable=false;
> CREATE TABLE `testb`(
>   `cola` string COMMENT '',
>   `colb` string COMMENT '',
>   `colc` string COMMENT '')
> PARTITIONED BY (
>   `part1` string,
>   `part2` string,
>   `part3` string)
> STORED AS AVRO;
> CREATE TABLE `testa`(
>   `col1` string COMMENT '',
>   `col2` string COMMENT '',
>   `col3` string COMMENT '',
>   `col4` string COMMENT '',
>   `col5` string COMMENT '')
> PARTITIONED BY (
>   `part1` string,
>   `part2` string,
>   `part3` string)
> STORED AS AVRO;
> insert into testA partition (part1='US', part2='ABC', part3='123')
> values ('12.34', '100', '200', '300', 'abc'),
> ('12.341', '1001', '2001', '3001', 'abcd');
> insert into testA partition (part1='UK', part2='DEF', part3='123')
> values ('12.34', '100', '200', '300', 'abc'),
> ('12.341', '1001', '2001', '3001', 'abcd');
> insert into testA partition (part1='US', part2='DEF', part3='200')
> values ('12.34', '100', '200', '300', 'abc'),
> ('12.341', '1001', '2001', '3001', 'abcd');
> insert into testA partition (part1='CA', part2='ABC', part3='300')
> values ('12.34', '100', '200', '300', 'abc'),
> ('12.341', '1001', '2001', '3001', 'abcd');
> insert into testB partition (part1='CA', part2='ABC', part3='300')
> values ('600', '700', 'abc'), ('601', '701', 'abcd');
> insert into testB partition (part1='CA', part2='ABC', part3='400')
> values ('600', '700', 'abc'), ('601', '701', 'abcd');
> insert into testB partition (part1='UK', part2='PQR', part3='500')
> values ('600', '700', 'abc'), ('601', '701', 'abcd');
> insert into testB partition (part1='US', part2='DEF', part3='200')
> values ('600', '700', 'abc'), ('601', '701', 'abcd');
> insert into testB partition (part1='US', part2='PQR', part3='123')
> values ('600', '700', 'abc'), ('601', '701', 'abcd');
> -- views with deterministic functions
> create view viewDeterministicUDFA partitioned on (vpart1, vpart2, vpart3) as select
>   cast(col1 as decimal(38,18)) as vcol1,
>   cast(col2 as decimal(38,18)) as vcol2,
>   cast(col3 as decimal(38,18)) as vcol3,
>   cast(col4 as decimal(38,18)) as vcol4,
>   cast(col5 as char(10)) as vcol5,
>   cast(part1 as char(2)) as vpart1,
>   cast(part2 as char(3)) as vpart2,
>   cast(part3 as char(3)) as vpart3
> from testa
> where part1 in ('US', 'CA');
> create view viewDeterministicUDFB partitioned on (vpart1, vpart2, vpart3) as select
>   cast(cola as decimal(38,18)) as vcolA,
>   cast(colb as decimal(38,18)) as vcolB,
>   cast(colc as char(10)) as vcolC,
>   cast(part1 as char(2)) as vpart1,
>   cast(part2 as char(3)) as vpart2,
>   cast(part3 as char(3)) as vpart3
> from testb
> where part1 in ('US', 'CA');
> explain
> select vcol1, vcol2, vcol3, vcola, vcolb
> from viewDeterministicUDFA a inner join viewDeterministicUDFB b
> on a.vpart1 = b.vpart1
> and a.vpart2 = b.vpart2
> and a.vpart3 = b.vpart3
> and a.vpart1 = 'US'
> and a.vpart2 = 'DEF'
> and a.vpart3 = '200';
> {code}
> Plan where the CAST is not pushed down.
> {code}
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: testa
>             filterExpr: (part1) IN ('US', 'CA') (type: boolean)
>             Statistics: Num rows: 6 Data size: 13740 Basic stats: COMPLETE Column stats: NONE
>             Select Operator
>               expressions: CAST( col1 AS decimal(38,18)) (type: decimal(38,18)), CAST( col2 AS decimal(38,18)) (type: decimal(38,18)), CAST( col3 AS decimal(38,18)) (type: decimal(38,18)), CAST( part1 AS CHAR(2)) (type: char(2)), CAST( part2 AS CHAR(3)) (type: char(3)), CAST( part3 AS CHAR(3)) (type: char(3))
>               outputColumnNames: _col0, _col1, _col2, _col5, _col6, _col7
>               Statistics: Num rows: 6 Data size: 13740 Basic stats: COMPLETE Column stats: NONE
>               Filter Operator
>                 predicate: ((_col5 = 'US') and (_col6 = 'DEF') and (_col7 = '200')) (type: boolean)
>                 Statistics: Num rows: 1 Data size: 2290 Basic stats: COMPLETE Column stats: NONE
>                 Reduce Output Operator
>                   key expressions: 'US' (type: char(2)), 'DEF' (type: char(3)), '200' (type: char(3))
>                   sort order: +++
>                   Map-reduce partition columns: 'US' (type: char(2)), 'DEF' (type: char(3)), '200' (type: char(3))
>                   Statistics: Num rows: 1 Data size: 2290 Basic stats: COMPLETE Column stats: NONE
>                   value expressions: _col0 (type: decimal(38,18)), _col1 (type: decimal(38,18)), _col2 (type: decimal(38,18))
>           TableScan
>             alias: testb
>             filterExpr: (part1) IN ('US', 'CA') (type: boolean)
>             Statistics: Num rows: 8 Data size: 12720 Basic stats: COMPLETE Column stats: NONE
>             Select Operator
>               expressions: CAST( cola AS decimal(38,18)) (type: decimal(38,18)), CAST( colb AS decimal(38,18)) (type: decimal(38,18)), CAST( part1 AS CHAR(2)) (type: char(2)), CAST( part2 AS CHAR(3)) (type: char(3)), CAST( part3 AS CHAR(3)) (type: char(3))
>               outputColumnNames: _col0, _col1, _col3, _col4, _col5
>               Statistics: Num rows: 8 Data size: 12720 Basic stats: COMPLETE Column stats: NONE
>               Filter Operator
>                 predicate: ((_col5 = '200') and _col3 is not null and _col4 is not null) (type: boolean)
>                 Statistics: Num rows: 4 Data size: 6360 Basic stats: COMPLETE Column stats: NONE
>                 Reduce Output Operator
>                   key expressions: _col3 (type: char(2)), _col4 (type: char(3)), '200' (type: char(3))
>                   sort order: +++
>                   Map-reduce partition columns: _col3 (type: char(2)), _col4 (type: char(3)), '200' (type: char(3))
>                   Statistics: Num rows: 4 Data size: 6360 Basic stats: COMPLETE Column stats: NONE
>                   value expressions: _col0 (type: decimal(38,18)), _col1 (type: decimal(38,18))
>       Reduce Operator Tree:
>         Join Operator
>           condition map:
>                Inner Join 0 to 1
>           keys:
>             0 _col5 (type: char(2)), _col6 (type: char(3)), _col7 (type: char(3))
>             1 _col3 (type: char(2)), _col4 (type: char(3)), _col5 (type: char(3))
>           outputColumnNames: _col0, _col1, _col2, _col8, _col9
>           Statistics: Num rows: 4 Data size: 6996 Basic stats: COMPLETE Column stats: NONE
>           Select Operator
>             expressions: _col0 (type: decimal(38,18)), _col1 (type: decimal(38,18)), _col2 (type: decimal(38,18)), _col8 (type: decimal(38,18)), _col9 (type: decimal(38,18))
>             outputColumnNames: _col0, _col1, _col2, _col3, _col4
>             Statistics: Num rows: 4 Data size: 6996 Basic stats: COMPLETE Column stats: NONE
>             File Output Operator
>               compressed: false
>               Statistics: Num rows: 4 Data size: 6996 Basic stats: COMPLETE Column stats: NONE
>               table:
>                   input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                   output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                   serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)