hive-dev mailing list archives

From "Gunther Hagleitner (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-8260) CBO : Query query has date_dim d1,date_dim d2 and date_dim d3 but the explain has d1, d1 and d1
Date Fri, 26 Sep 2014 01:01:34 GMT

     [ https://issues.apache.org/jira/browse/HIVE-8260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gunther Hagleitner updated HIVE-8260:
-------------------------------------
    Assignee: Laljo John Pullokkaran  (was: Gunther Hagleitner)

> CBO : Query query has date_dim d1,date_dim d2 and date_dim d3 but the explain has d1, d1 and d1
> ------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-8260
>                 URL: https://issues.apache.org/jira/browse/HIVE-8260
>             Project: Hive
>          Issue Type: Bug
>          Components: Physical Optimizer
>    Affects Versions: 0.14.0
>            Reporter: Mostafa Mokhtar
>            Assignee: Laljo John Pullokkaran
>             Fix For: 0.14.0
>
>
> For TPC-DS Q64, the query joins date_dim three times under the aliases d1, d2 and d3, but the explain output shows the alias d1 for all three scans.
> This is a simplified version of query 64 that demonstrates the same issue:
> {code}
> select count(*)
>   FROM   store_sales
>         JOIN store_returns ON store_sales.ss_item_sk = store_returns.sr_item_sk and store_sales.ss_ticket_number = store_returns.sr_ticket_number
>         JOIN customer ON store_sales.ss_customer_sk = customer.c_customer_sk
>         JOIN date_dim d1 ON store_sales.ss_sold_date_sk = d1.d_date_sk
>         JOIN date_dim d2 ON customer.c_first_sales_date_sk = d2.d_date_sk 
>         JOIN date_dim d3 ON customer.c_first_shipto_date_sk = d3.d_date_sk
>         JOIN store ON store_sales.ss_store_sk = store.s_store_sk
>         JOIN customer_demographics cd1 ON store_sales.ss_cdemo_sk= cd1.cd_demo_sk
>         JOIN customer_demographics cd2 ON customer.c_current_cdemo_sk = cd2.cd_demo_sk
>         JOIN promotion ON store_sales.ss_promo_sk = promotion.p_promo_sk
>         JOIN household_demographics hd1 ON store_sales.ss_hdemo_sk = hd1.hd_demo_sk
>         JOIN household_demographics hd2 ON customer.c_current_hdemo_sk = hd2.hd_demo_sk
>         JOIN customer_address ad1 ON store_sales.ss_addr_sk = ad1.ca_address_sk
>         JOIN customer_address ad2 ON customer.c_current_addr_sk = ad2.ca_address_sk
>         JOIN income_band ib1 ON hd1.hd_income_band_sk = ib1.ib_income_band_sk
>         JOIN income_band ib2 ON hd2.hd_income_band_sk = ib2.ib_income_band_sk
>         JOIN item ON store_sales.ss_item_sk = item.i_item_sk
> {code}
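The alias mismatch can be checked mechanically rather than by reading the plan by eye. The following is a minimal Python sketch, assuming the plain-text EXPLAIN layout shown in this report (a vertex header such as `Map 3` followed by a `TableScan` with an `alias:` line). The function name `duplicate_scan_aliases` and the sample text are illustrative only and are not part of Hive.

```python
import re
from collections import defaultdict


def duplicate_scan_aliases(plan_text):
    """Return {alias: [vertices]} for aliases scanned under more than one vertex.

    Assumes the textual EXPLAIN layout from this report; leading '>' quote
    markers and indentation are ignored.
    """
    # A vertex header is a line holding only "Map N" or "Reducer N".
    vertex_re = re.compile(r"^\W*((?:Map|Reducer) \d+)\s*$")
    # A TableScan alias line looks like "alias: d1".
    alias_re = re.compile(r"^\W*alias:\s*(\S+)\s*$")

    vertices_by_alias = defaultdict(list)
    current_vertex = None
    for line in plan_text.splitlines():
        match = vertex_re.match(line)
        if match:
            current_vertex = match.group(1)
            continue
        match = alias_re.match(line)
        if match and current_vertex is not None:
            vertices_by_alias[match.group(1)].append(current_vertex)

    # Keep only aliases that show up under two or more vertices.
    return {a: v for a, v in vertices_by_alias.items() if len(v) > 1}


# Hypothetical excerpt in the same shape as the plan in this report.
sample = """\
        Map 3
            Map Operator Tree:
                TableScan
                  alias: d1
        Map 4
            Map Operator Tree:
                TableScan
                  alias: d1
        Map 15
            Map Operator Tree:
                TableScan
                  alias: store
"""

print(duplicate_scan_aliases(sample))  # {'d1': ['Map 3', 'Map 4']}
```

A repeated alias is only a hint: the result still has to be compared against the aliases declared in the query (here d1, d2 and d3) to confirm that distinct aliases were collapsed into one.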
> The generated plan:
> {code}
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
>       Edges:
>         Map 13 <- Map 10 (BROADCAST_EDGE), Map 11 (BROADCAST_EDGE), Map 12 (BROADCAST_EDGE), Map 15 (BROADCAST_EDGE), Map 16 (BROADCAST_EDGE), Map 18 (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE), Map 3 (BROADCAST_EDGE), Map 8 (BROADCAST_EDGE)
>         Map 16 <- Map 7 (BROADCAST_EDGE)
>         Map 18 <- Map 1 (BROADCAST_EDGE), Map 17 (BROADCAST_EDGE), Map 4 (BROADCAST_EDGE), Map 5 (BROADCAST_EDGE), Map 9 (BROADCAST_EDGE)
>         Map 5 <- Map 6 (BROADCAST_EDGE)
>         Reducer 14 <- Map 13 (SIMPLE_EDGE)
>       DagName: mmokhtar_20140925180101_9c3b1d6b-61d3-44bc-a881-2beaf2ab143f:2
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: cd1
>                   filterExpr: cd_demo_sk is not null (type: boolean)
>                   Statistics: Num rows: 1920800 Data size: 718379200 Basic stats: COMPLETE
Column stats: COMPLETE
>                   Filter Operator
>                     predicate: cd_demo_sk is not null (type: boolean)
>                     Statistics: Num rows: 1920800 Data size: 7683200 Basic stats: COMPLETE
Column stats: COMPLETE
>                     Select Operator
>                       expressions: cd_demo_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 1920800 Data size: 7683200 Basic stats: COMPLETE
Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 1920800 Data size: 7683200 Basic stats:
COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 10
>             Map Operator Tree:
>                 TableScan
>                   alias: item
>                   filterExpr: i_item_sk is not null (type: boolean)
>                   Statistics: Num rows: 48000 Data size: 68732712 Basic stats: COMPLETE
Column stats: COMPLETE
>                   Filter Operator
>                     predicate: i_item_sk is not null (type: boolean)
>                     Statistics: Num rows: 48000 Data size: 192000 Basic stats: COMPLETE
Column stats: COMPLETE
>                     Select Operator
>                       expressions: i_item_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 48000 Data size: 192000 Basic stats: COMPLETE
Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 48000 Data size: 192000 Basic stats: COMPLETE
Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 11
>             Map Operator Tree:
>                 TableScan
>                   alias: promotion
>                   filterExpr: p_promo_sk is not null (type: boolean)
>                   Statistics: Num rows: 450 Data size: 530848 Basic stats: COMPLETE Column
stats: COMPLETE
>                   Filter Operator
>                     predicate: p_promo_sk is not null (type: boolean)
>                     Statistics: Num rows: 450 Data size: 1800 Basic stats: COMPLETE Column
stats: COMPLETE
>                     Select Operator
>                       expressions: p_promo_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 450 Data size: 1800 Basic stats: COMPLETE
Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 450 Data size: 1800 Basic stats: COMPLETE
Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 12
>             Map Operator Tree:
>                 TableScan
>                   alias: cd1
>                   filterExpr: cd_demo_sk is not null (type: boolean)
>                   Statistics: Num rows: 1920800 Data size: 718379200 Basic stats: COMPLETE
Column stats: COMPLETE
>                   Filter Operator
>                     predicate: cd_demo_sk is not null (type: boolean)
>                     Statistics: Num rows: 1920800 Data size: 7683200 Basic stats: COMPLETE
Column stats: COMPLETE
>                     Select Operator
>                       expressions: cd_demo_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 1920800 Data size: 7683200 Basic stats: COMPLETE
Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 1920800 Data size: 7683200 Basic stats:
COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 13
>             Map Operator Tree:
>                 TableScan
>                   alias: store_sales
>                   filterExpr: ((((((((ss_hdemo_sk is not null and ss_item_sk is not null)
and ss_cdemo_sk is not null) and ss_sold_date_sk is not null) and ss_addr_sk is not null)
and ss_store_sk is not null) and ss_promo_sk is not null) and ss_customer_sk is not null)
and ss_ticket_number is not null) (type: boolean)
>                   Statistics: Num rows: 550076554 Data size: 24008004411 Basic stats:
COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: ((((((((ss_hdemo_sk is not null and ss_item_sk is not
null) and ss_cdemo_sk is not null) and ss_sold_date_sk is not null) and ss_addr_sk is not
null) and ss_store_sk is not null) and ss_promo_sk is not null) and ss_customer_sk is not
null) and ss_ticket_number is not null) (type: boolean)
>                     Statistics: Num rows: 476766966 Data size: 16894069044 Basic stats:
COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: ss_sold_date_sk (type: int), ss_item_sk (type: int),
ss_customer_sk (type: int), ss_cdemo_sk (type: int), ss_hdemo_sk (type: int), ss_addr_sk (type:
int), ss_store_sk (type: int), ss_promo_sk (type: int), ss_ticket_number (type: int)
>                       outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6,
_col7, _col8
>                       Statistics: Num rows: 476766966 Data size: 16894069044 Basic stats:
COMPLETE Column stats: COMPLETE
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         condition expressions:
>                           0 {_col0} {_col1} {_col2} {_col3} {_col5} {_col6} {_col7} {_col8}
>                           1
>                         keys:
>                           0 _col4 (type: int)
>                           1 _col0 (type: int)
>                         outputColumnNames: _col0, _col1, _col2, _col3, _col5, _col6,
_col7, _col8
>                         input vertices:
>                           1 Map 16
>                         Statistics: Num rows: 410166225 Data size: 13125319200 Basic
stats: COMPLETE Column stats: COMPLETE
>                         Map Join Operator
>                           condition map:
>                                Inner Join 0 to 1
>                           condition expressions:
>                             0 {_col0} {_col1} {_col2} {_col3} {_col5} {_col6} {_col7}
{_col8}
>                             1
>                           keys:
>                             0 _col1 (type: int)
>                             1 _col0 (type: int)
>                           outputColumnNames: _col0, _col1, _col2, _col3, _col5, _col6,
_col7, _col8
>                           input vertices:
>                             1 Map 10
>                           Statistics: Num rows: 314695482 Data size: 10070255424 Basic
stats: COMPLETE Column stats: COMPLETE
>                           Map Join Operator
>                             condition map:
>                                  Inner Join 0 to 1
>                             condition expressions:
>                               0 {_col0} {_col1} {_col2} {_col5} {_col6} {_col7} {_col8}
>                               1
>                             keys:
>                               0 _col3 (type: int)
>                               1 _col0 (type: int)
>                             outputColumnNames: _col0, _col1, _col2, _col5, _col6, _col7,
_col8
>                             input vertices:
>                               1 Map 12
>                             Statistics: Num rows: 329259309 Data size: 9219260652 Basic
stats: COMPLETE Column stats: COMPLETE
>                             Map Join Operator
>                               condition map:
>                                    Inner Join 0 to 1
>                               condition expressions:
>                                 0 {_col1} {_col2} {_col5} {_col6} {_col7} {_col8}
>                                 1
>                               keys:
>                                 0 _col0 (type: int)
>                                 1 _col0 (type: int)
>                               outputColumnNames: _col1, _col2, _col5, _col6, _col7, _col8
>                               input vertices:
>                                 1 Map 3
>                               Statistics: Num rows: 368151338 Data size: 8835632112 Basic
stats: COMPLETE Column stats: COMPLETE
>                               Map Join Operator
>                                 condition map:
>                                      Inner Join 0 to 1
>                                 condition expressions:
>                                   0 {_col1} {_col2} {_col6} {_col7} {_col8}
>                                   1
>                                 keys:
>                                   0 _col5 (type: int)
>                                   1 _col0 (type: int)
>                                 outputColumnNames: _col1, _col2, _col6, _col7, _col8
>                                 input vertices:
>                                   1 Map 2
>                                 Statistics: Num rows: 416100702 Data size: 8322014040
Basic stats: COMPLETE Column stats: COMPLETE
>                                 Map Join Operator
>                                   condition map:
>                                        Inner Join 0 to 1
>                                   condition expressions:
>                                     0 {_col1} {_col2} {_col7} {_col8}
>                                     1
>                                   keys:
>                                     0 _col6 (type: int)
>                                     1 _col0 (type: int)
>                                   outputColumnNames: _col1, _col2, _col7, _col8
>                                   input vertices:
>                                     1 Map 15
>                                   Statistics: Num rows: 512868307 Data size: 8205892912
Basic stats: COMPLETE Column stats: COMPLETE
>                                   Map Join Operator
>                                     condition map:
>                                          Inner Join 0 to 1
>                                     condition expressions:
>                                       0 {_col1} {_col2} {_col8}
>                                       1
>                                     keys:
>                                       0 _col7 (type: int)
>                                       1 _col0 (type: int)
>                                     outputColumnNames: _col1, _col2, _col8
>                                     input vertices:
>                                       1 Map 11
>                                     Statistics: Num rows: 1030315795 Data size: 12363789540
Basic stats: COMPLETE Column stats: COMPLETE
>                                     Map Join Operator
>                                       condition map:
>                                            Inner Join 0 to 1
>                                       condition expressions:
>                                         0 {_col1} {_col8}
>                                         1
>                                       keys:
>                                         0 _col2 (type: int)
>                                         1 _col1 (type: int)
>                                       outputColumnNames: _col1, _col8
>                                       input vertices:
>                                         1 Map 18
>                                       Statistics: Num rows: 2999114775 Data size: 23992918200
Basic stats: COMPLETE Column stats: COMPLETE
>                                       Map Join Operator
>                                         condition map:
>                                              Inner Join 0 to 1
>                                         condition expressions:
>                                           0
>                                           1
>                                         keys:
>                                           0 _col1 (type: int), _col8 (type: int)
>                                           1 _col0 (type: int), _col1 (type: int)
>                                         input vertices:
>                                           1 Map 8
>                                         Statistics: Num rows: 60227570 Data size: 0 Basic
stats: PARTIAL Column stats: COMPLETE
>                                         Select Operator
>                                           Statistics: Num rows: 60227570 Data size: 0
Basic stats: PARTIAL Column stats: COMPLETE
>                                           Group By Operator
>                                             aggregations: count()
>                                             mode: hash
>                                             outputColumnNames: _col0
>                                             Statistics: Num rows: 1 Data size: 8 Basic
stats: COMPLETE Column stats: COMPLETE
>                                             Reduce Output Operator
>                                               sort order:
>                                               Statistics: Num rows: 1 Data size: 8 Basic
stats: COMPLETE Column stats: COMPLETE
>                                               value expressions: _col0 (type: bigint)
>             Execution mode: vectorized
>         Map 15
>             Map Operator Tree:
>                 TableScan
>                   alias: store
>                   filterExpr: s_store_sk is not null (type: boolean)
>                   Statistics: Num rows: 212 Data size: 405680 Basic stats: COMPLETE Column
stats: COMPLETE
>                   Filter Operator
>                     predicate: s_store_sk is not null (type: boolean)
>                     Statistics: Num rows: 212 Data size: 848 Basic stats: COMPLETE Column
stats: COMPLETE
>                     Select Operator
>                       expressions: s_store_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 212 Data size: 848 Basic stats: COMPLETE
Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 212 Data size: 848 Basic stats: COMPLETE
Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 16
>             Map Operator Tree:
>                 TableScan
>                   alias: hd1
>                   filterExpr: (hd_income_band_sk is not null and hd_demo_sk is not null)
(type: boolean)
>                   Statistics: Num rows: 7200 Data size: 799 Basic stats: COMPLETE Column
stats: COMPLETE
>                   Filter Operator
>                     predicate: (hd_income_band_sk is not null and hd_demo_sk is not null)
(type: boolean)
>                     Statistics: Num rows: 7200 Data size: 57600 Basic stats: COMPLETE
Column stats: COMPLETE
>                     Select Operator
>                       expressions: hd_demo_sk (type: int), hd_income_band_sk (type: int)
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 7200 Data size: 57600 Basic stats: COMPLETE
Column stats: COMPLETE
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         condition expressions:
>                           0 {_col0}
>                           1
>                         keys:
>                           0 _col1 (type: int)
>                           1 _col0 (type: int)
>                         outputColumnNames: _col0
>                         input vertices:
>                           1 Map 7
>                         Statistics: Num rows: 8000 Data size: 32000 Basic stats: COMPLETE
Column stats: COMPLETE
>                         Select Operator
>                           expressions: _col0 (type: int)
>                           outputColumnNames: _col0
>                           Statistics: Num rows: 8000 Data size: 32000 Basic stats: COMPLETE
Column stats: COMPLETE
>                           Reduce Output Operator
>                             key expressions: _col0 (type: int)
>                             sort order: +
>                             Map-reduce partition columns: _col0 (type: int)
>                             Statistics: Num rows: 8000 Data size: 32000 Basic stats:
COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 17
>             Map Operator Tree:
>                 TableScan
>                   alias: d1
>                   filterExpr: d_date_sk is not null (type: boolean)
>                   Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE
Column stats: COMPLETE
>                   Filter Operator
>                     predicate: d_date_sk is not null (type: boolean)
>                     Statistics: Num rows: 73049 Data size: 292196 Basic stats: COMPLETE
Column stats: COMPLETE
>                     Select Operator
>                       expressions: d_date_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 73049 Data size: 292196 Basic stats: COMPLETE
Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 73049 Data size: 292196 Basic stats: COMPLETE
Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 18
>             Map Operator Tree:
>                 TableScan
>                   alias: customer
>                   filterExpr: (((((c_current_hdemo_sk is not null and c_current_cdemo_sk
is not null) and c_first_sales_date_sk is not null) and c_first_shipto_date_sk is not null)
and c_current_addr_sk is not null) and c_customer_sk is not null) (type: boolean)
>                   Statistics: Num rows: 1600000 Data size: 1376033128 Basic stats: COMPLETE
Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (((((c_current_hdemo_sk is not null and c_current_cdemo_sk
is not null) and c_first_sales_date_sk is not null) and c_first_shipto_date_sk is not null)
and c_current_addr_sk is not null) and c_customer_sk is not null) (type: boolean)
>                     Statistics: Num rows: 1387731 Data size: 32529348 Basic stats: COMPLETE
Column stats: COMPLETE
>                     Select Operator
>                       expressions: c_customer_sk (type: int), c_current_cdemo_sk (type:
int), c_current_hdemo_sk (type: int), c_current_addr_sk (type: int), c_first_shipto_date_sk
(type: int), c_first_sales_date_sk (type: int)
>                       outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
>                       Statistics: Num rows: 1387731 Data size: 32529348 Basic stats:
COMPLETE Column stats: COMPLETE
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         condition expressions:
>                           0 {_col0} {_col1} {_col3} {_col4} {_col5}
>                           1
>                         keys:
>                           0 _col2 (type: int)
>                           1 _col0 (type: int)
>                         outputColumnNames: _col0, _col1, _col3, _col4, _col5
>                         input vertices:
>                           1 Map 5
>                         Statistics: Num rows: 1193875 Data size: 23877500 Basic stats:
COMPLETE Column stats: COMPLETE
>                         Select Operator
>                           expressions: _col0 (type: int), _col1 (type: int), _col3 (type:
int), _col4 (type: int), _col5 (type: int)
>                           outputColumnNames: _col0, _col1, _col3, _col4, _col5
>                           Statistics: Num rows: 1193875 Data size: 23877500 Basic stats:
COMPLETE Column stats: COMPLETE
>                           Map Join Operator
>                             condition map:
>                                  Inner Join 0 to 1
>                             condition expressions:
>                               0
>                               1 {_col0} {_col3} {_col4} {_col5}
>                             keys:
>                               0 _col0 (type: int)
>                               1 _col1 (type: int)
>                             outputColumnNames: _col1, _col4, _col5, _col6
>                             input vertices:
>                               0 Map 1
>                             Statistics: Num rows: 2529344 Data size: 40469504 Basic stats:
COMPLETE Column stats: COMPLETE
>                             Map Join Operator
>                               condition map:
>                                    Inner Join 0 to 1
>                               condition expressions:
>                                 0 {_col1} {_col4} {_col5}
>                                 1
>                               keys:
>                                 0 _col6 (type: int)
>                                 1 _col0 (type: int)
>                               outputColumnNames: _col1, _col4, _col5
>                               input vertices:
>                                 1 Map 17
>                               Statistics: Num rows: 2828109 Data size: 33937308 Basic
stats: COMPLETE Column stats: COMPLETE
>                               Map Join Operator
>                                 condition map:
>                                      Inner Join 0 to 1
>                                 condition expressions:
>                                   0 {_col1} {_col4}
>                                   1
>                                 keys:
>                                   0 _col5 (type: int)
>                                   1 _col0 (type: int)
>                                 outputColumnNames: _col1, _col4
>                                 input vertices:
>                                   1 Map 4
>                                 Statistics: Num rows: 3162164 Data size: 25297312 Basic
stats: COMPLETE Column stats: COMPLETE
>                                 Map Join Operator
>                                   condition map:
>                                        Inner Join 0 to 1
>                                   condition expressions:
>                                     0 {_col1}
>                                     1
>                                   keys:
>                                     0 _col4 (type: int)
>                                     1 _col0 (type: int)
>                                   outputColumnNames: _col1
>                                   input vertices:
>                                     1 Map 9
>                                   Statistics: Num rows: 3574015 Data size: 14296060 Basic
stats: COMPLETE Column stats: COMPLETE
>                                   Select Operator
>                                     expressions: _col1 (type: int)
>                                     outputColumnNames: _col1
>                                     Statistics: Num rows: 3574015 Data size: 14296060
Basic stats: COMPLETE Column stats: COMPLETE
>                                     Reduce Output Operator
>                                       key expressions: _col1 (type: int)
>                                       sort order: +
>                                       Map-reduce partition columns: _col1 (type: int)
>                                       Statistics: Num rows: 3574015 Data size: 14296060
Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 2
>             Map Operator Tree:
>                 TableScan
>                   alias: ad1
>                   filterExpr: ca_address_sk is not null (type: boolean)
>                   Statistics: Num rows: 800000 Data size: 811903688 Basic stats: COMPLETE
Column stats: COMPLETE
>                   Filter Operator
>                     predicate: ca_address_sk is not null (type: boolean)
>                     Statistics: Num rows: 800000 Data size: 3200000 Basic stats: COMPLETE
Column stats: COMPLETE
>                     Select Operator
>                       expressions: ca_address_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 800000 Data size: 3200000 Basic stats: COMPLETE
Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 800000 Data size: 3200000 Basic stats:
COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 3
>             Map Operator Tree:
>                 TableScan
>                   alias: d1
>                   filterExpr: d_date_sk is not null (type: boolean)
>                   Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE
Column stats: COMPLETE
>                   Filter Operator
>                     predicate: d_date_sk is not null (type: boolean)
>                     Statistics: Num rows: 73049 Data size: 292196 Basic stats: COMPLETE
Column stats: COMPLETE
>                     Select Operator
>                       expressions: d_date_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 73049 Data size: 292196 Basic stats: COMPLETE
Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 73049 Data size: 292196 Basic stats: COMPLETE
Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 4
>             Map Operator Tree:
>                 TableScan
>                   alias: d1
>                   filterExpr: d_date_sk is not null (type: boolean)
>                   Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE
Column stats: COMPLETE
>                   Filter Operator
>                     predicate: d_date_sk is not null (type: boolean)
>                     Statistics: Num rows: 73049 Data size: 292196 Basic stats: COMPLETE
Column stats: COMPLETE
>                     Select Operator
>                       expressions: d_date_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 73049 Data size: 292196 Basic stats: COMPLETE
Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 73049 Data size: 292196 Basic stats: COMPLETE
Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 5
>             Map Operator Tree:
>                 TableScan
>                   alias: hd1
>                   filterExpr: (hd_income_band_sk is not null and hd_demo_sk is not null) (type: boolean)
>                   Statistics: Num rows: 7200 Data size: 799 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (hd_income_band_sk is not null and hd_demo_sk is not null) (type: boolean)
>                     Statistics: Num rows: 7200 Data size: 57600 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: hd_demo_sk (type: int), hd_income_band_sk (type: int)
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 7200 Data size: 57600 Basic stats: COMPLETE Column stats: COMPLETE
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         condition expressions:
>                           0 {_col0}
>                           1
>                         keys:
>                           0 _col1 (type: int)
>                           1 _col0 (type: int)
>                         outputColumnNames: _col0
>                         input vertices:
>                           1 Map 6
>                         Statistics: Num rows: 8000 Data size: 32000 Basic stats: COMPLETE Column stats: COMPLETE
>                         Select Operator
>                           expressions: _col0 (type: int)
>                           outputColumnNames: _col0
>                           Statistics: Num rows: 8000 Data size: 32000 Basic stats: COMPLETE Column stats: COMPLETE
>                           Reduce Output Operator
>                             key expressions: _col0 (type: int)
>                             sort order: +
>                             Map-reduce partition columns: _col0 (type: int)
>                             Statistics: Num rows: 8000 Data size: 32000 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 6
>             Map Operator Tree:
>                 TableScan
>                   alias: ib1
>                   filterExpr: ib_income_band_sk is not null (type: boolean)
>                   Statistics: Num rows: 20 Data size: 240 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: ib_income_band_sk is not null (type: boolean)
>                     Statistics: Num rows: 20 Data size: 80 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: ib_income_band_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 20 Data size: 80 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 20 Data size: 80 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 7
>             Map Operator Tree:
>                 TableScan
>                   alias: ib1
>                   filterExpr: ib_income_band_sk is not null (type: boolean)
>                   Statistics: Num rows: 20 Data size: 240 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: ib_income_band_sk is not null (type: boolean)
>                     Statistics: Num rows: 20 Data size: 80 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: ib_income_band_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 20 Data size: 80 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 20 Data size: 80 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 8
>             Map Operator Tree:
>                 TableScan
>                   alias: store_returns
>                   filterExpr: (sr_item_sk is not null and sr_ticket_number is not null) (type: boolean)
>                   Statistics: Num rows: 55578005 Data size: 4377627636 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (sr_item_sk is not null and sr_ticket_number is not null) (type: boolean)
>                     Statistics: Num rows: 55578005 Data size: 444624040 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: sr_item_sk (type: int), sr_ticket_number (type: int)
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 55578005 Data size: 444624040 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int), _col1 (type: int)
>                         sort order: ++
>                         Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
>                         Statistics: Num rows: 55578005 Data size: 444624040 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 9
>             Map Operator Tree:
>                 TableScan
>                   alias: ad1
>                   filterExpr: ca_address_sk is not null (type: boolean)
>                   Statistics: Num rows: 800000 Data size: 811903688 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: ca_address_sk is not null (type: boolean)
>                     Statistics: Num rows: 800000 Data size: 3200000 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: ca_address_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 800000 Data size: 3200000 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 800000 Data size: 3200000 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Reducer 14
>             Reduce Operator Tree:
>               Group By Operator
>                 aggregations: count(VALUE._col0)
>                 mode: mergepartial
>                 outputColumnNames: _col0
>                 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
>                 Select Operator
>                   expressions: _col0 (type: bigint)
>                   outputColumnNames: _col0
>                   Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
>                   File Output Operator
>                     compressed: false
>                     Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
>                     table:
>                         input format: org.apache.hadoop.mapred.TextInputFormat
>                         output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                         serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>             Execution mode: vectorized
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
> {code}
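
The repeated aliases in the plan above (for example, both Map 6 and Map 7 show "alias: ib1") are easy to miss by eye. A minimal sketch for listing them, assuming the plan is available as plain text in the layout shown above; the helper names aliases_by_vertex and repeated_aliases are hypothetical and not part of Hive, and a repeated alias is only a hint to check against the query, not proof of a bug:

```python
import re
from collections import defaultdict


def aliases_by_vertex(plan_text):
    """Map each 'alias:' value to the list of Map vertices whose TableScan reports it."""
    seen = defaultdict(list)
    vertex = None
    for raw in plan_text.splitlines():
        # Drop the mail quote prefix ("> ") and the plan indentation.
        line = raw.lstrip("> ").strip()
        m = re.fullmatch(r"Map \d+", line)
        if m:
            vertex = m.group(0)
            continue
        m = re.fullmatch(r"alias: (\S+)", line)
        if m and vertex is not None:
            seen[m.group(1)].append(vertex)
    return dict(seen)


def repeated_aliases(plan_text):
    """Keep only the aliases that appear under more than one Map vertex."""
    return {a: v for a, v in aliases_by_vertex(plan_text).items() if len(v) > 1}


# Small excerpt in the same shape as the plan above.
sample = """\
>         Map 6
>             Map Operator Tree:
>                 TableScan
>                   alias: ib1
>         Map 7
>             Map Operator Tree:
>                 TableScan
>                   alias: ib1
>         Map 9
>             Map Operator Tree:
>                 TableScan
>                   alias: ad1
"""
print(repeated_aliases(sample))
```

Running it over the full explain output and comparing the result with the aliases declared in the query (d1, d2, d3 and so on) shows which table scans were reported under the wrong alias.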



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
