Date: Tue, 6 Oct 2015 13:55:26 +0000 (UTC)
From: "Hive QA (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-9695) Redundant filter operator in reducer Vertex when CBO is disabled

    [ https://issues.apache.org/jira/browse/HIVE-9695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945051#comment-14945051 ]

Hive QA commented on HIVE-9695:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12765164/HIVE-9695.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9639 tests executed

*Failed tests:*
{noformat}
TestCliDriver-udf_bitmap_empty.q-multigroupby_singlemr.q-quotedid_basic.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5546/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5546/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5546/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12765164 - PreCommit-HIVE-TRUNK-Build

> Redundant filter operator in reducer Vertex when CBO is disabled
> ----------------------------------------------------------------
>
>                 Key: HIVE-9695
>                 URL: https://issues.apache.org/jira/browse/HIVE-9695
>             Project: Hive
>          Issue Type: Improvement
>          Components: Physical Optimizer
>   Affects Versions: 2.0.0
>            Reporter: Mostafa Mokhtar
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-9695.01.patch, HIVE-9695.patch
>
>
> There is a redundant filter operator in reducer Vertex when CBO is disabled.
> Query
> {code}
> select
>     ss_item_sk, ss_ticket_number, ss_store_sk
> from
>     store_sales a, store_returns b, store
> where
>     a.ss_item_sk = b.sr_item_sk
>     and a.ss_ticket_number = b.sr_ticket_number
>     and ss_sold_date_sk between 2450816 and 2451500
>     and sr_returned_date_sk between 2450816 and 2451500
>     and s_store_sk = ss_store_sk;
> {code}
> Plan snippet
> {code}
>   Statistics: Num rows: 57439344 Data size: 1838059008 Basic stats: COMPLETE Column stats: COMPLETE
>   Filter Operator
>     predicate: (((((_col1 = _col27) and (_col8 = _col34)) and _col22 BETWEEN 2450816 AND 2451500) and _col45 BETWEEN 2450816 AND 2451500) and (_col49 = _col6)) (type: boolean)
> {code}
> Full plan with CBO disabled
> {code}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE), Map 3 (BROADCAST_EDGE), Map 4 (SIMPLE_EDGE)
>       DagName: mmokhtar_20150214182626_ad6820c7-b667-4652-ab25-cb60deed1a6d:13
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: b
>                   filterExpr: ((sr_item_sk is not null and sr_ticket_number is not null) and sr_returned_date_sk BETWEEN 2450816 AND 2451500) (type: boolean)
>                   Statistics: Num rows: 2370038095 Data size: 170506118656 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (sr_item_sk is not null and sr_ticket_number is not null) (type: boolean)
>                     Statistics: Num rows: 706893063 Data size: 6498502768 Basic stats: COMPLETE Column stats: COMPLETE
>                     Reduce Output Operator
>                       key expressions: sr_item_sk (type: int), sr_ticket_number (type: int)
>                       sort order: ++
>                       Map-reduce partition columns: sr_item_sk (type: int), sr_ticket_number (type: int)
>                       Statistics: Num rows: 706893063 Data size: 6498502768 Basic stats: COMPLETE Column stats: COMPLETE
>                       value expressions: sr_returned_date_sk (type: int)
>             Execution mode: vectorized
>         Map 3
>             Map Operator Tree:
>                 TableScan
>                   alias: store
>                   filterExpr: s_store_sk is not null (type: boolean)
>                   Statistics: Num rows: 1704 Data size: 3256276 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: s_store_sk is not null (type: boolean)
>                     Statistics: Num rows: 1704 Data size: 6816 Basic stats: COMPLETE Column stats: COMPLETE
>                     Reduce Output Operator
>                       key expressions: s_store_sk (type: int)
>                       sort order: +
>                       Map-reduce partition columns: s_store_sk (type: int)
>                       Statistics: Num rows: 1704 Data size: 6816 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 4
>             Map Operator Tree:
>                 TableScan
>                   alias: a
>                   filterExpr: (((ss_item_sk is not null and ss_ticket_number is not null) and ss_store_sk is not null) and ss_sold_date_sk BETWEEN 2450816 AND 2451500) (type: boolean)
>                   Statistics: Num rows: 28878719387 Data size: 2405805439460 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: ((ss_item_sk is not null and ss_ticket_number is not null) and ss_store_sk is not null) (type: boolean)
>                     Statistics: Num rows: 8405840828 Data size: 110101408700 Basic stats: COMPLETE Column stats: COMPLETE
>                     Reduce Output Operator
>                       key expressions: ss_item_sk (type: int), ss_ticket_number (type: int)
>                       sort order: ++
>                       Map-reduce partition columns: ss_item_sk (type: int), ss_ticket_number (type: int)
>                       Statistics: Num rows: 8405840828 Data size: 110101408700 Basic stats: COMPLETE Column stats: COMPLETE
>                       value expressions: ss_store_sk (type: int), ss_sold_date_sk (type: int)
>             Execution mode: vectorized
>         Reducer 2
>             Reduce Operator Tree:
>               Merge Join Operator
>                 condition map:
>                      Inner Join 0 to 1
>                 condition expressions:
>                   0 {KEY.reducesinkkey0} {VALUE._col5} {KEY.reducesinkkey1} {VALUE._col20}
>                   1 {KEY.reducesinkkey0} {KEY.reducesinkkey1} {VALUE._col17}
>                 outputColumnNames: _col1, _col6, _col8, _col22, _col27, _col34, _col45
>                 Statistics: Num rows: 57439343 Data size: 1148786860 Basic stats: COMPLETE Column stats: COMPLETE
>                 Map Join Operator
>                   condition map:
>                        Inner Join 0 to 1
>                   condition expressions:
>                     0 {_col1} {_col6} {_col8} {_col22} {_col27} {_col34} {_col45}
>                     1 {s_store_sk}
>                   keys:
>                     0 _col6 (type: int)
>                     1 s_store_sk (type: int)
>                   outputColumnNames: _col1, _col6, _col8, _col22, _col27, _col34, _col45, _col49
>                   input vertices:
>                     1 Map 3
>                   Statistics: Num rows: 57439344 Data size: 1838059008 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (((((_col1 = _col27) and (_col8 = _col34)) and _col22 BETWEEN 2450816 AND 2451500) and _col45 BETWEEN 2450816 AND 2451500) and (_col49 = _col6)) (type: boolean)
>                     Statistics: Num rows: 1794979 Data size: 57439328 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: _col1 (type: int), _col8 (type: int), _col6 (type: int)
>                       outputColumnNames: _col0, _col1, _col2
>                       Statistics: Num rows: 1794979 Data size: 21539748 Basic stats: COMPLETE Column stats: COMPLETE
>                       File Output Operator
>                         compressed: false
>                         Statistics: Num rows: 1794979 Data size: 21539748 Basic stats: COMPLETE Column stats: COMPLETE
>                         table:
>                             input format: org.apache.hadoop.mapred.TextInputFormat
>                             output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                             serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
> {code}
> Full plan with CBO enabled
> {code}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
>       Edges:
>         Map 4 <- Map 1 (BROADCAST_EDGE)
>         Reducer 3 <- Map 2 (SIMPLE_EDGE), Map 4 (SIMPLE_EDGE)
>       DagName: mmokhtar_20150214182525_63a9838f-db9f-40e9-8ae1-77c77143dccf:12
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: store
>                   filterExpr: s_store_sk is not null (type: boolean)
>                   Statistics: Num rows: 1704 Data size: 3256276 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: s_store_sk is not null (type: boolean)
>                     Statistics: Num rows: 1704 Data size: 6816 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: s_store_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 1704 Data size: 6816 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 1704 Data size: 6816 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 2
>             Map Operator Tree:
>                 TableScan
>                   alias: b
>                   filterExpr: (sr_item_sk is not null and sr_ticket_number is not null) (type: boolean)
>                   Statistics: Num rows: 2370038095 Data size: 170506118656 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (sr_item_sk is not null and sr_ticket_number is not null) (type: boolean)
>                     Statistics: Num rows: 706893063 Data size: 3670930516 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: sr_item_sk (type: int), sr_ticket_number (type: int)
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 706893063 Data size: 3670930516 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int), _col1 (type: int)
>                         sort order: ++
>                         Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
>                         Statistics: Num rows: 706893063 Data size: 3670930516 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 4
>             Map Operator Tree:
>                 TableScan
>                   alias: a
>                   filterExpr: ((ss_store_sk is not null and ss_item_sk is not null) and ss_ticket_number is not null) (type: boolean)
>                   Statistics: Num rows: 28878719387 Data size: 2405805439460 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: ((ss_store_sk is not null and ss_item_sk is not null) and ss_ticket_number is not null) (type: boolean)
>                     Statistics: Num rows: 8405840828 Data size: 76478045388 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: ss_item_sk (type: int), ss_store_sk (type: int), ss_ticket_number (type: int)
>                       outputColumnNames: _col0, _col1, _col2
>                       Statistics: Num rows: 8405840828 Data size: 76478045388 Basic stats: COMPLETE Column stats: COMPLETE
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         condition expressions:
>                           0 {_col0} {_col1} {_col2}
>                           1
>                         keys:
>                           0 _col1 (type: int)
>                           1 _col0 (type: int)
>                         outputColumnNames: _col0, _col1, _col2
>                         input vertices:
>                           1 Map 1
>                         Statistics: Num rows: 8405840896 Data size: 100870090752 Basic stats: COMPLETE Column stats: COMPLETE
>                         Reduce Output Operator
>                           key expressions: _col0 (type: int), _col2 (type: int)
>                           sort order: ++
>                           Map-reduce partition columns: _col0 (type: int), _col2 (type: int)
>                           Statistics: Num rows: 8405840896 Data size: 100870090752 Basic stats: COMPLETE Column stats: COMPLETE
>                           value expressions: _col1 (type: int)
>             Execution mode: vectorized
>         Reducer 3
>             Reduce Operator Tree:
>               Merge Join Operator
>                 condition map:
>                      Inner Join 0 to 1
>                 condition expressions:
>                   0 {KEY.reducesinkkey0} {VALUE._col0} {KEY.reducesinkkey1}
>                   1
>                 outputColumnNames: _col0, _col1, _col2
>                 Statistics: Num rows: 75912751 Data size: 910953012 Basic stats: COMPLETE Column stats: COMPLETE
>                 Select Operator
>                   expressions: _col0 (type: int), _col2 (type: int), _col1 (type: int)
>                   outputColumnNames: _col0, _col1, _col2
>                   Statistics: Num rows: 75912751 Data size: 910953012 Basic stats: COMPLETE Column stats: COMPLETE
>                   File Output Operator
>                     compressed: false
>                     Statistics: Num rows: 75912751 Data size: 910953012 Basic stats: COMPLETE Column stats: COMPLETE
>                     table:
>                         input format: org.apache.hadoop.mapred.TextInputFormat
>                         output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                         serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)