Date: Sun, 24 May 2015 15:12:17 +0000 (UTC)
From: "Hive QA (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-9069) Simplify filter predicates for CBO

[ https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557755#comment-14557755 ]

Hive QA commented on HIVE-9069:
-------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12735066/HIVE-9069.13.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8974 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby3_map_multi_distinct
{noformat}

Test results:
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4029/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4029/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4029/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12735066 - PreCommit-HIVE-TRUNK-Build

> Simplify filter predicates for CBO
> ----------------------------------
>
> Key: HIVE-9069
> URL: https://issues.apache.org/jira/browse/HIVE-9069
> Project: Hive
> Issue Type: Bug
> Components: CBO
> Affects Versions: 0.14.0
> Reporter: Mostafa Mokhtar
> Assignee: Jesus Camacho Rodriguez
> Fix For: 0.14.1
>
> Attachments: HIVE-9069.01.patch, HIVE-9069.02.patch, HIVE-9069.03.patch, HIVE-9069.04.patch, HIVE-9069.05.patch, HIVE-9069.06.patch, HIVE-9069.07.patch, HIVE-9069.08.patch, HIVE-9069.08.patch, HIVE-9069.09.patch, HIVE-9069.10.patch, HIVE-9069.11.patch, HIVE-9069.12.patch, HIVE-9069.13.patch, HIVE-9069.patch
>
>
> Simplify disjunctive predicates so that they can get pushed down to the scan.
> Looks like this is still an issue: some of the filters can be pushed down to the scan.
> {code}
> set hive.cbo.enable=true
> set hive.stats.fetch.column.stats=true
> set hive.exec.dynamic.partition.mode=nonstrict
> set hive.tez.auto.reducer.parallelism=true
> set hive.auto.convert.join.noconditionaltask.size=320000000
> set hive.exec.reducers.bytes.per.reducer=100000000
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager
> set hive.support.concurrency=false
> set hive.tez.exec.print.summary=true
> explain
> select substr(r_reason_desc,1,20) as r
>        ,avg(ws_quantity) wq
>        ,avg(wr_refunded_cash) ref
>        ,avg(wr_fee) fee
> from web_sales, web_returns, web_page, customer_demographics cd1,
>      customer_demographics cd2, customer_address, date_dim, reason
> where web_sales.ws_web_page_sk = web_page.wp_web_page_sk
>   and web_sales.ws_item_sk = web_returns.wr_item_sk
>   and web_sales.ws_order_number = web_returns.wr_order_number
>   and web_sales.ws_sold_date_sk = date_dim.d_date_sk and d_year = 1998
>   and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk
>   and cd2.cd_demo_sk = web_returns.wr_returning_cdemo_sk
>   and customer_address.ca_address_sk = web_returns.wr_refunded_addr_sk
>   and reason.r_reason_sk = web_returns.wr_reason_sk
>   and
>   (
>    (
>     cd1.cd_marital_status = 'M'
>     and
>     cd1.cd_marital_status = cd2.cd_marital_status
>     and
>     cd1.cd_education_status = '4 yr Degree'
>     and
>     cd1.cd_education_status = cd2.cd_education_status
>     and
>     ws_sales_price between 100.00 and 150.00
>    )
>   or
>    (
>     cd1.cd_marital_status = 'D'
>     and
>     cd1.cd_marital_status = cd2.cd_marital_status
>     and
>     cd1.cd_education_status = 'Primary'
>     and
>     cd1.cd_education_status = cd2.cd_education_status
>     and
>     ws_sales_price between 50.00 and 100.00
>    )
>   or
>    (
>     cd1.cd_marital_status = 'U'
>     and
>     cd1.cd_marital_status = cd2.cd_marital_status
>     and
>     cd1.cd_education_status = 'Advanced Degree'
>     and
>     cd1.cd_education_status = cd2.cd_education_status
>     and
>     ws_sales_price between 150.00 and 200.00
>    )
>   )
>   and
>   (
>    (
>     ca_country = 'United States'
>     and
>     ca_state in
('KY', 'GA', 'NM') > and ws_net_profit between 100 and 200 > ) > or > ( > ca_country = 'United States' > and > ca_state in ('MT', 'OR', 'IN') > and ws_net_profit between 150 and 300 > ) > or > ( > ca_country = 'United States' > and > ca_state in ('WI', 'MO', 'WV') > and ws_net_profit between 50 and 250 > ) > ) > group by r_reason_desc > order by r, wq, ref, fee > limit 100 > OK > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > Edges: > Map 9 <- Map 1 (BROADCAST_EDGE) > Reducer 3 <- Map 13 (SIMPLE_EDGE), Map 2 (SIMPLE_EDGE) > Reducer 4 <- Map 9 (SIMPLE_EDGE), Reducer 3 (SIMPLE_EDGE) > Reducer 5 <- Map 14 (SIMPLE_EDGE), Reducer 4 (SIMPLE_EDGE) > Reducer 6 <- Map 10 (SIMPLE_EDGE), Map 11 (BROADCAST_EDGE), Map 12 (BROADCAST_EDGE), Reducer 5 (SIMPLE_EDGE) > Reducer 7 <- Reducer 6 (SIMPLE_EDGE) > Reducer 8 <- Reducer 7 (SIMPLE_EDGE) > DagName: mmokhtar_20141111161818_f5fd23ba-d783-4b13-8507-7faa65851798:1 > Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: web_page > filterExpr: wp_web_page_sk is not null (type: boolean) > Statistics: Num rows: 4602 Data size: 2696178 Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: wp_web_page_sk is not null (type: boolean) > Statistics: Num rows: 4602 Data size: 18408 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: wp_web_page_sk (type: int) > outputColumnNames: _col0 > Statistics: Num rows: 4602 Data size: 18408 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 4602 Data size: 18408 Basic stats: COMPLETE Column stats: COMPLETE > Execution mode: vectorized > Map 10 > Map Operator Tree: > TableScan > alias: customer_address > filterExpr: ((ca_country = 'United States') and ca_address_sk is not null) (type: boolean) > Statistics: Num rows: 40000000 Data 
size: 40595195284 Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((ca_country = 'United States') and ca_address_sk is not null) (type: boolean) > Statistics: Num rows: 20000000 Data size: 3740000000 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: ca_address_sk (type: int), ca_state (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 20000000 Data size: 1800000000 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 20000000 Data size: 1800000000 Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col1 (type: string) > Execution mode: vectorized > Map 11 > Map Operator Tree: > TableScan > alias: date_dim > filterExpr: ((d_year = 1998) and d_date_sk is not null) (type: boolean) > Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((d_year = 1998) and d_date_sk is not null) (type: boolean) > Statistics: Num rows: 652 Data size: 5216 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: d_date_sk (type: int) > outputColumnNames: _col0 > Statistics: Num rows: 652 Data size: 2608 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 652 Data size: 2608 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: _col0 (type: int) > outputColumnNames: _col0 > Statistics: Num rows: 652 Data size: 2608 Basic stats: COMPLETE Column stats: COMPLETE > Group By Operator > keys: _col0 (type: int) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 326 Data size: 1304 Basic stats: COMPLETE Column stats: COMPLETE > Dynamic Partitioning Event Operator > Target Input: web_sales > 
Partition key expr: ws_sold_date_sk > Statistics: Num rows: 326 Data size: 1304 Basic stats: COMPLETE Column stats: COMPLETE > Target column: ws_sold_date_sk > Target Vertex: Map 9 > Execution mode: vectorized > Map 12 > Map Operator Tree: > TableScan > alias: reason > filterExpr: r_reason_sk is not null (type: boolean) > Statistics: Num rows: 72 Data size: 14400 Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: r_reason_sk is not null (type: boolean) > Statistics: Num rows: 72 Data size: 7272 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: r_reason_sk (type: int), r_reason_desc (type: string) > outputColumnNames: _col0, _col1 > Statistics: Num rows: 72 Data size: 7272 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 72 Data size: 7272 Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col1 (type: string) > Execution mode: vectorized > Map 13 > Map Operator Tree: > TableScan > alias: web_returns > filterExpr: (((((wr_refunded_cdemo_sk is not null and wr_item_sk is not null) and wr_order_number is not null) and wr_returning_cdemo_sk is not null) and wr_refunded_addr_sk is not null) and wr_reason_sk is not null) (type: boolean) > Statistics: Num rows: 2062802370 Data size: 185695406284 Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (((((wr_refunded_cdemo_sk is not null and wr_item_sk is not null) and wr_order_number is not null) and wr_returning_cdemo_sk is not null) and wr_refunded_addr_sk is not null) and wr_reason_sk is not null) (type: boolean) > Statistics: Num rows: 1875154722 Data size: 58944640412 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: wr_item_sk (type: int), wr_refunded_cdemo_sk (type: int), wr_refunded_addr_sk (type: int), wr_returning_cdemo_sk (type: int), wr_reason_sk 
(type: int), wr_order_number (type: int), wr_fee (type: float), wr_refunded_cash (type: float) > outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7 > Statistics: Num rows: 1875154722 Data size: 58944640412 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col1 (type: int) > sort order: + > Map-reduce partition columns: _col1 (type: int) > Statistics: Num rows: 1875154722 Data size: 58944640412 Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col0 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: int), _col5 (type: int), _col6 (type: float), _col7 (type: float) > Execution mode: vectorized > Map 14 > Map Operator Tree: > TableScan > alias: cd1 > filterExpr: ((cd_demo_sk is not null and cd_marital_status is not null) and cd_education_status is not null) (type: boolean) > Statistics: Num rows: 1920800 Data size: 718379200 Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((cd_demo_sk is not null and cd_marital_status is not null) and cd_education_status is not null) (type: boolean) > Statistics: Num rows: 1920800 Data size: 351506400 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: cd_demo_sk (type: int), cd_marital_status (type: string), cd_education_status (type: string) > outputColumnNames: _col0, _col1, _col2 > Statistics: Num rows: 1920800 Data size: 351506400 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int), _col1 (type: string), _col2 (type: string) > sort order: +++ > Map-reduce partition columns: _col0 (type: int), _col1 (type: string), _col2 (type: string) > Statistics: Num rows: 1920800 Data size: 351506400 Basic stats: COMPLETE Column stats: COMPLETE > Execution mode: vectorized > Map 2 > Map Operator Tree: > TableScan > alias: cd1 > filterExpr: ((cd_demo_sk is not null and cd_marital_status is not null) and cd_education_status is not null) (type: 
boolean) > Statistics: Num rows: 1920800 Data size: 718379200 Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((cd_demo_sk is not null and cd_marital_status is not null) and cd_education_status is not null) (type: boolean) > Statistics: Num rows: 1920800 Data size: 351506400 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: cd_demo_sk (type: int), cd_marital_status (type: string), cd_education_status (type: string) > outputColumnNames: _col0, _col1, _col2 > Statistics: Num rows: 1920800 Data size: 351506400 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 1920800 Data size: 351506400 Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col1 (type: string), _col2 (type: string) > Execution mode: vectorized > Map 9 > Map Operator Tree: > TableScan > alias: web_sales > filterExpr: ((ws_web_page_sk is not null and ws_item_sk is not null) and ws_order_number is not null) (type: boolean) > Statistics: Num rows: 21594638446 Data size: 2850189889652 Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: ((ws_web_page_sk is not null and ws_item_sk is not null) and ws_order_number is not null) (type: boolean) > Statistics: Num rows: 21591939929 Data size: 604541956128 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: ws_item_sk (type: int), ws_web_page_sk (type: int), ws_order_number (type: int), ws_quantity (type: int), ws_sales_price (type: float), ws_net_profit (type: float), ws_sold_date_sk (type: int) > outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 > Statistics: Num rows: 21591939929 Data size: 604541956128 Basic stats: COMPLETE Column stats: COMPLETE > Map Join Operator > condition map: > Inner Join 0 to 1 > condition expressions: > 0 {_col0} {_col2} {_col3} {_col4} {_col5} {_col6} > 1 
> keys: > 0 _col1 (type: int) > 1 _col0 (type: int) > outputColumnNames: _col0, _col2, _col3, _col4, _col5, _col6 > input vertices: > 1 Map 1 > Statistics: Num rows: 21591939072 Data size: 518206537728 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int), _col2 (type: int) > sort order: ++ > Map-reduce partition columns: _col0 (type: int), _col2 (type: int) > Statistics: Num rows: 21591939072 Data size: 518206537728 Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col3 (type: int), _col4 (type: float), _col5 (type: float), _col6 (type: int) > Execution mode: vectorized > Reducer 3 > Reduce Operator Tree: > Merge Join Operator > condition map: > Inner Join 0 to 1 > condition expressions: > 0 {VALUE._col0} {VALUE._col1} {VALUE._col2} {VALUE._col3} {VALUE._col4} {VALUE._col5} {VALUE._col6} > 1 {VALUE._col0} {VALUE._col1} > outputColumnNames: _col0, _col2, _col3, _col4, _col5, _col6, _col7, _col9, _col10 > Statistics: Num rows: 1875154688 Data size: 373155782912 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: _col0 (type: int), _col10 (type: string), _col2 (type: int), _col3 (type: int), _col4 (type: int), _col5 (type: int), _col6 (type: float), _col7 (type: float), _col9 (type: string) > outputColumnNames: _col0, _col10, _col2, _col3, _col4, _col5, _col6, _col7, _col9 > Statistics: Num rows: 1875154688 Data size: 373155782912 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int), _col5 (type: int) > sort order: ++ > Map-reduce partition columns: _col0 (type: int), _col5 (type: int) > Statistics: Num rows: 1875154688 Data size: 373155782912 Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col2 (type: int), _col3 (type: int), _col4 (type: int), _col6 (type: float), _col7 (type: float), _col9 (type: string), _col10 (type: string) > Reducer 4 > Reduce Operator Tree: > Merge Join Operator > condition 
map: > Inner Join 0 to 1 > condition expressions: > 0 {VALUE._col1} {VALUE._col2} {VALUE._col3} {VALUE._col4} > 1 {VALUE._col1} {VALUE._col2} {VALUE._col3} {VALUE._col4} {VALUE._col5} {VALUE._col7} {VALUE._col8} > outputColumnNames: _col3, _col4, _col5, _col6, _col10, _col11, _col12, _col14, _col15, _col17, _col18 > Statistics: Num rows: 57653145 Data size: 11472975855 Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (((_col17 = 'M') and ((_col18 = '4 yr Degree') and _col4 BETWEEN 100.0 AND 150.0)) or (((_col17 = 'D') and ((_col18 = 'Primary') and _col4 BETWEEN 50.0 AND 100.0)) or ((_col17 = 'U') and ((_col18 = 'Advanced Degree') and _col4 BETWEEN 150.0 AND 200.0)))) (type: boolean) > Statistics: Num rows: 57653145 Data size: 11472975855 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: _col11 (type: int), _col12 (type: int), _col14 (type: float), _col15 (type: float), _col17 (type: string), _col18 (type: string), _col3 (type: int), _col5 (type: float), _col6 (type: int), _col10 (type: int) > outputColumnNames: _col10, _col11, _col13, _col14, _col17, _col18, _col3, _col5, _col6, _col9 > Statistics: Num rows: 57653145 Data size: 11472975855 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col10 (type: int), _col17 (type: string), _col18 (type: string) > sort order: +++ > Map-reduce partition columns: _col10 (type: int), _col17 (type: string), _col18 (type: string) > Statistics: Num rows: 57653145 Data size: 11472975855 Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col3 (type: int), _col5 (type: float), _col6 (type: int), _col9 (type: int), _col11 (type: int), _col13 (type: float), _col14 (type: float) > Reducer 5 > Reduce Operator Tree: > Merge Join Operator > condition map: > Inner Join 0 to 1 > condition expressions: > 0 > 1 {VALUE._col3} {VALUE._col5} {VALUE._col6} {VALUE._col9} {VALUE._col10} {VALUE._col12} {VALUE._col13} > outputColumnNames: 
_col6, _col8, _col9, _col12, _col14, _col16, _col17 > Statistics: Num rows: 3187317548 Data size: 50997080768 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: _col12 (type: int), _col14 (type: int), _col16 (type: float), _col17 (type: float), _col6 (type: int), _col8 (type: float), _col9 (type: int) > outputColumnNames: _col12, _col14, _col16, _col17, _col6, _col8, _col9 > Statistics: Num rows: 3187317548 Data size: 50997080768 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col12 (type: int) > sort order: + > Map-reduce partition columns: _col12 (type: int) > Statistics: Num rows: 3187317548 Data size: 50997080768 Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col6 (type: int), _col8 (type: float), _col9 (type: int), _col14 (type: int), _col16 (type: float), _col17 (type: float) > Reducer 6 > Reduce Operator Tree: > Merge Join Operator > condition map: > Inner Join 0 to 1 > condition expressions: > 0 {VALUE._col0} > 1 {VALUE._col6} {VALUE._col8} {VALUE._col9} {VALUE._col13} {VALUE._col15} {VALUE._col16} > outputColumnNames: _col1, _col9, _col11, _col12, _col17, _col19, _col20 > Statistics: Num rows: 1593658752 Data size: 156178557696 Basic stats: COMPLETE Column stats: COMPLETE > Filter Operator > predicate: (((_col1) IN ('KY', 'GA', 'NM') and _col11 BETWEEN 100 AND 200) or (((_col1) IN ('MT', 'OR', 'IN') and _col11 BETWEEN 150 AND 300) or ((_col1) IN ('WI', 'MO', 'WV') and _col11 BETWEEN 50 AND 250))) (type: boolean) > Statistics: Num rows: 1195244064 Data size: 117133918272 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: _col17 (type: int), _col19 (type: float), _col20 (type: float), _col9 (type: int), _col12 (type: int) > outputColumnNames: _col11, _col13, _col14, _col3, _col6 > Statistics: Num rows: 1195244064 Data size: 14342928768 Basic stats: COMPLETE Column stats: COMPLETE > Map Join Operator > condition map: > Inner Join 0 to 1 > 
condition expressions: > 0 > 1 {_col3} {_col11} {_col13} {_col14} > keys: > 0 _col0 (type: int) > 1 _col6 (type: int) > outputColumnNames: _col5, _col13, _col15, _col16 > input vertices: > 0 Map 11 > Statistics: Num rows: 1334416318 Data size: 16012995816 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: _col13 (type: int), _col15 (type: float), _col16 (type: float), _col5 (type: int) > outputColumnNames: _col13, _col15, _col16, _col5 > Statistics: Num rows: 1334416318 Data size: 16012995816 Basic stats: COMPLETE Column stats: COMPLETE > Map Join Operator > condition map: > Inner Join 0 to 1 > condition expressions: > 0 {_col1} > 1 {_col5} {_col15} {_col16} > keys: > 0 _col0 (type: int) > 1 _col13 (type: int) > outputColumnNames: _col1, _col7, _col17, _col18 > input vertices: > 0 Map 12 > Statistics: Num rows: 1334416256 Data size: 140113706880 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: _col1 (type: string), _col7 (type: int), _col18 (type: float), _col17 (type: float) > outputColumnNames: _col0, _col1, _col2, _col3 > Statistics: Num rows: 1334416256 Data size: 140113706880 Basic stats: COMPLETE Column stats: COMPLETE > Group By Operator > aggregations: avg(_col1), avg(_col2), avg(_col3) > keys: _col0 (type: string) > mode: hash > outputColumnNames: _col0, _col1, _col2, _col3 > Statistics: Num rows: 157024 Data size: 15231328 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: string) > sort order: + > Map-reduce partition columns: _col0 (type: string) > Statistics: Num rows: 157024 Data size: 15231328 Basic stats: COMPLETE Column stats: COMPLETE > value expressions: _col1 (type: struct), _col2 (type: struct), _col3 (type: struct) > Reducer 7 > Reduce Operator Tree: > Group By Operator > aggregations: avg(VALUE._col0), avg(VALUE._col1), avg(VALUE._col2) > keys: KEY._col0 (type: string) > mode: mergepartial > outputColumnNames: _col0, _col1, _col2, _col3 
> Statistics: Num rows: 112 Data size: 13552 Basic stats: COMPLETE Column stats: COMPLETE > Select Operator > expressions: substr(_col0, 1, 20) (type: string), _col1 (type: double), _col2 (type: double), _col3 (type: double) > outputColumnNames: _col0, _col1, _col2, _col3 > Statistics: Num rows: 112 Data size: 23296 Basic stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: string), _col1 (type: double), _col2 (type: double), _col3 (type: double) > sort order: ++++ > Statistics: Num rows: 112 Data size: 23296 Basic stats: COMPLETE Column stats: COMPLETE > TopN Hash Memory Usage: 0.04 > Reducer 8 > Reduce Operator Tree: > Select Operator > expressions: KEY.reducesinkkey0 (type: string), KEY.reducesinkkey1 (type: double), KEY.reducesinkkey2 (type: double), KEY.reducesinkkey3 (type: double) > outputColumnNames: _col0, _col1, _col2, _col3 > Statistics: Num rows: 112 Data size: 23296 Basic stats: COMPLETE Column stats: COMPLETE > Limit > Number of rows: 100 > Statistics: Num rows: 100 Data size: 20800 Basic stats: COMPLETE Column stats: COMPLETE > File Output Operator > compressed: false > Statistics: Num rows: 100 Data size: 20800 Basic stats: COMPLETE Column stats: COMPLETE > table: > input format: org.apache.hadoop.mapred.TextInputFormat > output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe > Execution mode: vectorized > Stage: Stage-0 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
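
The kind of simplification requested in the issue description can be illustrated with a small sketch. It is not Hive's or Calcite's implementation: the dict-based branch representation and the helper names `implied_filters` and `render` are hypothetical, chosen only for this example. The idea is that when every branch of an OR constrains the same column, the union of the allowed values is a weaker single-table filter that could be evaluated at the scan, while the original disjunction still has to be applied after the join.

```python
def implied_filters(branches):
    """Derive single-table filters implied by an OR of AND-ed branches.

    Each branch is a dict mapping a column name to the set of values that
    branch allows for it (an equality is a one-element set). A column yields
    an implied filter only if every branch constrains it; the filter is the
    union of the per-branch value sets.
    """
    shared = set(branches[0])
    for branch in branches[1:]:
        shared &= set(branch)
    return {col: set().union(*(b[col] for b in branches)) for col in sorted(shared)}


def render(filters):
    """Format implied filters as SQL-like text, for display only."""
    out = []
    for col, values in filters.items():
        quoted = ", ".join(f"'{v}'" for v in sorted(values))
        out.append(f"{col} = {quoted}" if len(values) == 1 else f"{col} IN ({quoted})")
    return out


# The customer_address disjunction from the query in this report.
address_branches = [
    {"ca_country": {"United States"}, "ca_state": {"KY", "GA", "NM"}},
    {"ca_country": {"United States"}, "ca_state": {"MT", "OR", "IN"}},
    {"ca_country": {"United States"}, "ca_state": {"WI", "MO", "WV"}},
]

# The cd1 constant comparisons from the first disjunction of the query.
cd1_branches = [
    {"cd_marital_status": {"M"}, "cd_education_status": {"4 yr Degree"}},
    {"cd_marital_status": {"D"}, "cd_education_status": {"Primary"}},
    {"cd_marital_status": {"U"}, "cd_education_status": {"Advanced Degree"}},
]

for line in render(implied_filters(address_branches)) + render(implied_filters(cd1_branches)):
    print(line)
```

In the plan above, `ca_country = 'United States'` already reaches the `customer_address` scan, but nothing on `ca_state` or on the `cd1` columns does; the sketch shows the extra filters that are implied. Range conditions such as `ws_sales_price between ...` are left out of the sketch and would need interval handling.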