hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mostafa Mokhtar (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-9604) CBO : Presence of hybrid join condition sets of join order optimizations
Date Fri, 06 Feb 2015 23:11:35 GMT

     [ https://issues.apache.org/jira/browse/HIVE-9604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mostafa Mokhtar updated HIVE-9604:
----------------------------------
          Component/s: CBO
    Affects Version/s: 0.14.0

> CBO : Presence of hybrid join condition sets of join order optimizations
> ------------------------------------------------------------------------
>
>                 Key: HIVE-9604
>                 URL: https://issues.apache.org/jira/browse/HIVE-9604
>             Project: Hive
>          Issue Type: New Feature
>          Components: CBO
>    Affects Versions: 0.14.0
>            Reporter: Mostafa Mokhtar
>            Assignee: Laljo John Pullokkaran
>
> When a query has a join between two tables on hybrid join condition (conjunction + disjunction) the resulting join order from CBO is suboptimal.
> Re-writing the query results in 9x performance improvement. 
> This was observed in several TPC-DS queries like Q72 and Q64.
> The culprit join conditions are :
> {code}
> catalog_sales.cs_item_sk = inventory.inv_item_sk
>   and inv_quantity_on_hand < cs_quantity 
> {code}
> This is a simplified version of Q72.
> {code}
> select  count(*) total_cnt
> from catalog_sales
> join inventory on (catalog_sales.cs_item_sk = inventory.inv_item_sk)
> join warehouse on (warehouse.w_warehouse_sk=inventory.inv_warehouse_sk)
> join item on (item.i_item_sk = catalog_sales.cs_item_sk)
> join customer_demographics on (catalog_sales.cs_bill_cdemo_sk = customer_demographics.cd_demo_sk)
> join household_demographics on (catalog_sales.cs_bill_hdemo_sk = household_demographics.hd_demo_sk)
> join date_dim d1 on (catalog_sales.cs_sold_date_sk = d1.d_date_sk)
> join date_dim d2 on (inventory.inv_date_sk = d2.d_date_sk)
> join date_dim d3 on (catalog_sales.cs_ship_date_sk = d3.d_date_sk)
> where d1.d_week_seq = d2.d_week_seq
>   and inv_quantity_on_hand < cs_quantity 
>   and d3.d_date > d1.d_date + 5
>   and hd_buy_potential = '1001-5000'
>   and d1.d_year = 2001
>   and hd_buy_potential = '1001-5000'
>   and cd_marital_status = 'M'
>   and d1.d_year = 2001;
> {code}
> If the inventory table is moved down in the join order a more efficient plan is generated 
> Modified query
> {code}
> select  count(*) total_cnt
> from catalog_sales
> join item on (item.i_item_sk = catalog_sales.cs_item_sk)
> join customer_demographics on (catalog_sales.cs_bill_cdemo_sk = customer_demographics.cd_demo_sk)
> join household_demographics on (catalog_sales.cs_bill_hdemo_sk = household_demographics.hd_demo_sk)
> join date_dim d1 on (catalog_sales.cs_sold_date_sk = d1.d_date_sk)
> join date_dim d3 on (catalog_sales.cs_ship_date_sk = d3.d_date_sk)
> join inventory on (catalog_sales.cs_item_sk = inventory.inv_item_sk)
> join warehouse on (warehouse.w_warehouse_sk=inventory.inv_warehouse_sk)
> join date_dim d2 on (inventory.inv_date_sk = d2.d_date_sk)
> where d1.d_week_seq = d2.d_week_seq
>   and inv_quantity_on_hand < cs_quantity 
>   and d3.d_date > d1.d_date + 5
>   and hd_buy_potential = '1001-5000'
>   and d1.d_year = 2001
>   and hd_buy_potential = '1001-5000'
>   and cd_marital_status = 'M'
>   and d1.d_year = 2001;
> {code}
> Plan with base query notice how catalog_sales joins with inventory first
> {code}
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
>       Edges:
>         Map 3 <- Map 1 (BROADCAST_EDGE), Map 10 (BROADCAST_EDGE), Map 11 (BROADCAST_EDGE), Map 12 (BROADCAST_EDGE), Map 13 (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE), Map 8 (BROADCAST_EDGE), Map 9 (BROADCAST_EDGE)
>         Reducer 4 <- Map 3 (SIMPLE_EDGE)
>         Reducer 5 <- Reducer 4 (SIMPLE_EDGE)
>       DagName: mmokhtar_20141015151414_a08eae06-7250-4833-9e1d-8e58eb69780e:1
>       Vertices:
>         Map 1 
>             Map Operator Tree:
>                 TableScan
>                   alias: d1
>                   filterExpr: (d_date_sk is not null and d_week_seq is not null) (type: boolean)
>                   Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (d_date_sk is not null and d_week_seq is not null) (type: boolean)
>                     Statistics: Num rows: 73049 Data size: 584392 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: d_date_sk (type: int), d_week_seq (type: int)
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 73049 Data size: 584392 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int), _col1 (type: int)
>                         sort order: ++
>                         Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
>                         Statistics: Num rows: 73049 Data size: 584392 Basic stats: COMPLETE Column stats: COMPLETE
>                       Select Operator
>                         expressions: _col0 (type: int)
>                         outputColumnNames: _col0
>                         Statistics: Num rows: 73049 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
>                         Group By Operator
>                           keys: _col0 (type: int)
>                           mode: hash
>                           outputColumnNames: _col0
>                           Statistics: Num rows: 73049 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
>                           Dynamic Partitioning Event Operator
>                             Target Input: inventory
>                             Partition key expr: inv_date_sk
>                             Statistics: Num rows: 73049 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
>                             Target column: inv_date_sk
>                             Target Vertex: Map 7
>             Execution mode: vectorized
>         Map 10 
>             Map Operator Tree:
>                 TableScan
>                   alias: catalog_returns
>                   Statistics: Num rows: 28798881 Data size: 2942039156 Basic stats: COMPLETE Column stats: COMPLETE
>                   Select Operator
>                     expressions: cr_item_sk (type: int), cr_order_number (type: int)
>                     outputColumnNames: _col0, _col1
>                     Statistics: Num rows: 28798881 Data size: 230391048 Basic stats: COMPLETE Column stats: COMPLETE
>                     Reduce Output Operator
>                       key expressions: _col0 (type: int), _col1 (type: int)
>                       sort order: ++
>                       Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
>                       Statistics: Num rows: 28798881 Data size: 230391048 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 11 
>             Map Operator Tree:
>                 TableScan
>                   alias: customer_demographics
>                   filterExpr: ((cd_marital_status = 'M') and cd_demo_sk is not null) (type: boolean)
>                   Statistics: Num rows: 1920800 Data size: 718379200 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: ((cd_marital_status = 'M') and cd_demo_sk is not null) (type: boolean)
>                     Statistics: Num rows: 274400 Data size: 24421600 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: cd_demo_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 274400 Data size: 1097600 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 274400 Data size: 1097600 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 12 
>             Map Operator Tree:
>                 TableScan
>                   alias: promotion
>                   Statistics: Num rows: 450 Data size: 530848 Basic stats: COMPLETE Column stats: COMPLETE
>                   Select Operator
>                     expressions: p_promo_sk (type: int)
>                     outputColumnNames: _col0
>                     Statistics: Num rows: 450 Data size: 1800 Basic stats: COMPLETE Column stats: COMPLETE
>                     Reduce Output Operator
>                       key expressions: _col0 (type: int)
>                       sort order: +
>                       Map-reduce partition columns: _col0 (type: int)
>                       Statistics: Num rows: 450 Data size: 1800 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 13 
>             Map Operator Tree:
>                 TableScan
>                   alias: household_demographics
>                   filterExpr: ((hd_buy_potential = '1001-5000') and hd_demo_sk is not null) (type: boolean)
>                   Statistics: Num rows: 7200 Data size: 770400 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: ((hd_buy_potential = '1001-5000') and hd_demo_sk is not null) (type: boolean)
>                     Statistics: Num rows: 1440 Data size: 138240 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: hd_demo_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 1440 Data size: 5760 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 1440 Data size: 5760 Basic stats: COMPLETE Column stats: COMPLETE
>             Execution mode: vectorized
>         Map 2 
>             Map Operator Tree:
>                 TableScan
>                   alias: d1
>                   filterExpr: d_date_sk is not null (type: boolean)
>                   Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: d_date_sk is not null (type: boolean)
>                     Statistics: Num rows: 73049 Data size: 7158802 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: d_date_sk (type: int), d_date (type: string)
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 73049 Data size: 7158802 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 73049 Data size: 7158802 Basic stats: COMPLETE Column stats: COMPLETE
>                         value expressions: _col1 (type: string)
>             Execution mode: vectorized
>         Map 3 
>             Map Operator Tree:
>                 TableScan
>                   alias: catalog_sales
>                   filterExpr: (((cs_item_sk is not null and cs_bill_hdemo_sk is not null) and cs_bill_cdemo_sk is not null) and cs_ship_date_sk is not null) (type: boolean)
>                   Statistics: Num rows: 286549727 Data size: 37743959324 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (((cs_item_sk is not null and cs_bill_hdemo_sk is not null) and cs_bill_cdemo_sk is not null) and cs_ship_date_sk is not null) (type: boolean)
>                     Statistics: Num rows: 284396955 Data size: 7948828532 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: cs_ship_date_sk (type: int), cs_bill_cdemo_sk (type: int), cs_bill_hdemo_sk (type: int), cs_item_sk (type: int), cs_promo_sk (type: int), cs_order_number (type: int), cs_quantity (type: int), cs_sold_date_sk (type: int)
>                       outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7
>                       Statistics: Num rows: 284396955 Data size: 9086416352 Basic stats: COMPLETE Column stats: COMPLETE
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         condition expressions:
>                           0 {_col0} {_col1} {_col2} {_col3} {_col4} {_col5} {_col6} {_col7}
>                           1 {_col1} {_col2} {_col3}
>                         keys:
>                           0 _col3 (type: int)
>                           1 _col0 (type: int)
>                         outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col9, _col10, _col11
>                         input vertices:
>                           1 Map 7
>                         Statistics: Num rows: 275157677926 Data size: 8805045693632 Basic stats: COMPLETE Column stats: COMPLETE
>                         Filter Operator
>                           predicate: (_col10 < _col6) (type: boolean)
>                           Statistics: Num rows: 91719225975 Data size: 2935015231200 Basic stats: COMPLETE Column stats: COMPLETE
>                           Select Operator
>                             expressions: _col0 (type: int), _col1 (type: int), _col11 (type: int), _col2 (type: int), _col3 (type: int), _col4 (type: int), _col5 (type: int), _col7 (type: int), _col9 (type: int)
>                             outputColumnNames: _col0, _col1, _col11, _col2, _col3, _col4, _col5, _col7, _col9
>                             Statistics: Num rows: 91719225975 Data size: 2201261423400 Basic stats: COMPLETE Column stats: COMPLETE
>                             Map Join Operator
>                               condition map:
>                                    Inner Join 0 to 1
>                               condition expressions:
>                                 0 {_col0} {_col1} {_col2} {_col3} {_col4} {_col5} {_col9} {_col11}
>                                 1 {_col1} {_col2}
>                               keys:
>                                 0 _col7 (type: int)
>                                 1 _col0 (type: int)
>                               outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col9, _col11, _col13, _col14
>                               input vertices:
>                                 1 Map 8
>                               Statistics: Num rows: 102398861876 Data size: 12083065701368 Basic stats: COMPLETE Column stats: COMPLETE
>                               Map Join Operator
>                                 condition map:
>                                      Inner Join 0 to 1
>                                 condition expressions:
>                                   0 {_col0} {_col1} {_col2} {_col3} {_col4} {_col5} {_col9} {_col13} {_col14}
>                                   1 
>                                 keys:
>                                   0 _col11 (type: int), _col14 (type: int)
>                                   1 _col0 (type: int), _col1 (type: int)
>                                 outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col9, _col13, _col14
>                                 input vertices:
>                                   1 Map 1
>                                 Statistics: Num rows: 8705458 Data size: 992422212 Basic stats: COMPLETE Column stats: COMPLETE
>                                 Map Join Operator
>                                   condition map:
>                                        Inner Join 0 to 1
>                                   condition expressions:
>                                     0 {_col0} {_col1} {_col3} {_col4} {_col5} {_col9} {_col13} {_col14}
>                                     1 
>                                   keys:
>                                     0 _col2 (type: int)
>                                     1 _col0 (type: int)
>                                   outputColumnNames: _col0, _col1, _col3, _col4, _col5, _col9, _col13, _col14
>                                   input vertices:
>                                     1 Map 13
>                                   Statistics: Num rows: 6739709 Data size: 768326826 Basic stats: COMPLETE Column stats: COMPLETE
>                                   Select Operator
>                                     expressions: _col0 (type: int), _col1 (type: int), _col13 (type: string), _col14 (type: int), _col3 (type: int), _col4 (type: int), _col5 (type: int), _col9 (type: int)
>                                     outputColumnNames: _col0, _col1, _col13, _col14, _col3, _col4, _col5, _col9
>                                     Statistics: Num rows: 6739709 Data size: 768326826 Basic stats: COMPLETE Column stats: COMPLETE
>                                     Map Join Operator
>                                       condition map:
>                                            Inner Join 0 to 1
>                                       condition expressions:
>                                         0 
>                                         1 {_col0} {_col3} {_col4} {_col5} {_col9} {_col13} {_col14}
>                                       keys:
>                                         0 _col0 (type: int)
>                                         1 _col1 (type: int)
>                                       outputColumnNames: _col2, _col5, _col6, _col7, _col11, _col15, _col16
>                                       input vertices:
>                                         0 Map 11
>                                       Statistics: Num rows: 7051609 Data size: 803883426 Basic stats: COMPLETE Column stats: COMPLETE
>                                       Select Operator
>                                         expressions: _col11 (type: int), _col15 (type: string), _col16 (type: int), _col2 (type: int), _col5 (type: int), _col6 (type: int), _col7 (type: int)
>                                         outputColumnNames: _col11, _col15, _col16, _col2, _col5, _col6, _col7
>                                         Statistics: Num rows: 7051609 Data size: 803883426 Basic stats: COMPLETE Column stats: COMPLETE
>                                         Map Join Operator
>                                           condition map:
>                                                Inner Join 0 to 1
>                                           condition expressions:
>                                             0 {_col1}
>                                             1 {_col5} {_col6} {_col7} {_col11} {_col15} {_col16}
>                                           keys:
>                                             0 _col0 (type: int)
>                                             1 _col2 (type: int)
>                                           outputColumnNames: _col1, _col7, _col8, _col9, _col13, _col17, _col18
>                                           input vertices:
>                                             0 Map 2
>                                           Statistics: Num rows: 7884543 Data size: 1639984944 Basic stats: COMPLETE Column stats: COMPLETE
>                                           Filter Operator
>                                             predicate: (UDFToDouble(_col1) > (UDFToDouble(_col17) + UDFToDouble(5))) (type: boolean)
>                                             Statistics: Num rows: 2628181 Data size: 546661648 Basic stats: COMPLETE Column stats: COMPLETE
>                                             Select Operator
>                                               expressions: _col13 (type: int), _col18 (type: int), _col7 (type: int), _col8 (type: int), _col9 (type: int)
>                                               outputColumnNames: _col13, _col18, _col7, _col8, _col9
>                                               Statistics: Num rows: 2628181 Data size: 52563620 Basic stats: COMPLETE Column stats: COMPLETE
>                                               Map Join Operator
>                                                 condition map:
>                                                      Inner Join 0 to 1
>                                                 condition expressions:
>                                                   0 {_col1}
>                                                   1 {_col7} {_col8} {_col9} {_col13} {_col18}
>                                                 keys:
>                                                   0 _col0 (type: int)
>                                                   1 _col7 (type: int)
>                                                 outputColumnNames: _col1, _col9, _col10, _col11, _col15, _col20
>                                                 input vertices:
>                                                   0 Map 6
>                                                 Statistics: Num rows: 29222304 Data size: 5961350016 Basic stats: COMPLETE Column stats: COMPLETE
>                                                 Map Join Operator
>                                                   condition map:
>                                                        Inner Join 0 to 1
>                                                   condition expressions:
>                                                     0 {_col1} {_col9} {_col10} {_col11} {_col20}
>                                                     1 {_col1}
>                                                   keys:
>                                                     0 _col15 (type: int)
>                                                     1 _col0 (type: int)
>                                                   outputColumnNames: _col1, _col9, _col10, _col11, _col20, _col27
>                                                   input vertices:
>                                                     1 Map 9
>                                                   Statistics: Num rows: 35066764 Data size: 10484962436 Basic stats: COMPLETE Column stats: COMPLETE
>                                                   Select Operator
>                                                     expressions: _col27 (type: string), _col1 (type: string), _col20 (type: int), _col9 (type: int), _col10 (type: int), _col11 (type: int)
>                                                     outputColumnNames: _col13, _col15, _col22, _col3, _col4, _col5
>                                                     Statistics: Num rows: 35066764 Data size: 10484962436 Basic stats: COMPLETE Column stats: COMPLETE
>                                                     Map Join Operator
>                                                       condition map:
>                                                            Right Outer Join0 to 1
>                                                       condition expressions:
>                                                         0 {_col0}
>                                                         1 {_col3} {_col5} {_col13} {_col15} {_col22}
>                                                       keys:
>                                                         0 _col0 (type: int)
>                                                         1 _col4 (type: int)
>                                                       outputColumnNames: _col0, _col4, _col6, _col14, _col16, _col23
>                                                       input vertices:
>                                                         0 Map 12
>                                                       Statistics: Num rows: 631201752 Data size: 188729323848 Basic stats: COMPLETE Column stats: COMPLETE
>                                                       Select Operator
>                                                         expressions: _col0 (type: int), _col14 (type: string), _col16 (type: string), _col23 (type: int), _col4 (type: int), _col6 (type: int)
>                                                         outputColumnNames: _col0, _col14, _col16, _col23, _col4, _col6
>                                                         Statistics: Num rows: 631201752 Data size: 188729323848 Basic stats: COMPLETE Column stats: COMPLETE
>                                                         Map Join Operator
>                                                           condition map:
>                                                                Right Outer Join0 to 1
>                                                           condition expressions:
>                                                             0 
>                                                             1 {_col0} {_col14} {_col16} {_col23}
>                                                           keys:
>                                                             0 _col0 (type: int), _col1 (type: int)
>                                                             1 _col4 (type: int), _col6 (type: int)
>                                                           outputColumnNames: _col2, _col16, _col18, _col25
>                                                           input vertices:
>                                                             0 Map 10
>                                                           Statistics: Num rows: 610256859 Data size: 177584745969 Basic stats: COMPLETE Column stats: COMPLETE
>                                                           Select Operator
>                                                             expressions: _col18 (type: string), _col16 (type: string), _col25 (type: int), CASE WHEN (_col2 is null) THEN (1) ELSE (0) END (type: int), CASE WHEN (_col2 is not null) THEN (1) ELSE (0) END (type: int)
>                                                             outputColumnNames: _col0, _col1, _col2, _col3, _col4
>                                                             Statistics: Num rows: 610256859 Data size: 177584745969 Basic stats: COMPLETE Column stats: COMPLETE
>                                                             Group By Operator
>                                                               aggregations: count(_col3), count(_col4), count()
>                                                               keys: _col0 (type: string), _col1 (type: string), _col2 (type: int)
>                                                               mode: hash
>                                                               outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
>                                                               Statistics: Num rows: 44400 Data size: 1243200 Basic stats: COMPLETE Column stats: COMPLETE
>                                                               Reduce Output Operator
>                                                                 key expressions: _col0 (type: string), _col1 (type: string), _col2 (type: int)
>                                                                 sort order: +++
>                                                                 Map-reduce partition columns: _col0 (type: string), _col1 (type: string), _col2 (type: int)
>                                                                 Statistics: Num rows: 44400 Data size: 1243200 Basic stats: COMPLETE Column stats: COMPLETE
>                                                                 value expressions: _col3 (type: bigint), _col4 (type: bigint), _col5 (type: bigint)
>             Execution mode: vectorized
>         Map 6 
>             Map Operator Tree:
>                 TableScan
>                   alias: item
>                   filterExpr: i_item_sk is not null (type: boolean)
>                   Statistics: Num rows: 48000 Data size: 68732712 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: i_item_sk is not null (type: boolean)
>                     Statistics: Num rows: 48000 Data size: 9024000 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: i_item_sk (type: int), i_item_desc (type: string)
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 48000 Data size: 9024000 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 48000 Data size: 9024000 Basic stats: COMPLETE Column stats: COMPLETE
>                         value expressions: _col1 (type: string)
>             Execution mode: vectorized
>         Map 7 
>             Map Operator Tree:
>                 TableScan
>                   alias: inventory
>                   filterExpr: (inv_item_sk is not null and inv_warehouse_sk is not null) (type: boolean)
>                   Statistics: Num rows: 37584000 Data size: 443485104 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (inv_item_sk is not null and inv_warehouse_sk is not null) (type: boolean)
>                     Statistics: Num rows: 37584000 Data size: 443485104 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: inv_item_sk (type: int), inv_warehouse_sk (type: int), inv_quantity_on_hand (type: int), inv_date_sk (type: int)
>                       outputColumnNames: _col0, _col1, _col2, _col3
>                       Statistics: Num rows: 37584000 Data size: 593821104 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 37584000 Data size: 593821104 Basic stats: COMPLETE Column stats: COMPLETE
>                         value expressions: _col1 (type: int), _col2 (type: int), _col3 (type: int)
>             Execution mode: vectorized
>         Map 8 
>             Map Operator Tree:
>                 TableScan
>                   alias: d1
>                   filterExpr: (((d_year = 2001) and d_date_sk is not null) and d_week_seq is not null) (type: boolean)
>                   Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (((d_year = 2001) and d_date_sk is not null) and d_week_seq is not null) (type: boolean)
>                     Statistics: Num rows: 652 Data size: 69112 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: d_date_sk (type: int), d_date (type: string), d_week_seq (type: int)
>                       outputColumnNames: _col0, _col1, _col2
>                       Statistics: Num rows: 652 Data size: 66504 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 652 Data size: 66504 Basic stats: COMPLETE Column stats: COMPLETE
>                         value expressions: _col1 (type: string), _col2 (type: int)
>                       Select Operator
>                         expressions: _col0 (type: int)
>                         outputColumnNames: _col0
>                         Statistics: Num rows: 652 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
>                         Group By Operator
>                           keys: _col0 (type: int)
>                           mode: hash
>                           outputColumnNames: _col0
>                           Statistics: Num rows: 652 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
>                           Dynamic Partitioning Event Operator
>                             Target Input: catalog_sales
>                             Partition key expr: cs_sold_date_sk
>                             Statistics: Num rows: 652 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
>                             Target column: cs_sold_date_sk
>                             Target Vertex: Map 3
>             Execution mode: vectorized
>         Map 9 
>             Map Operator Tree:
>                 TableScan
>                   alias: warehouse
>                   filterExpr: w_warehouse_sk is not null (type: boolean)
>                   Statistics: Num rows: 6 Data size: 6166 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: w_warehouse_sk is not null (type: boolean)
>                     Statistics: Num rows: 6 Data size: 618 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: w_warehouse_sk (type: int), w_warehouse_name (type: string)
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 6 Data size: 618 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 6 Data size: 618 Basic stats: COMPLETE Column stats: COMPLETE
>                         value expressions: _col1 (type: string)
>             Execution mode: vectorized
>         Reducer 4 
>             Reduce Operator Tree:
>               Group By Operator
>                 aggregations: count(VALUE._col0), count(VALUE._col1), count(VALUE._col2)
>                 keys: KEY._col0 (type: string), KEY._col1 (type: string), KEY._col2 (type: int)
>                 mode: mergepartial
>                 outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
>                 Statistics: Num rows: 44400 Data size: 2131200 Basic stats: COMPLETE Column stats: COMPLETE
>                 Select Operator
>                   expressions: _col0 (type: string), _col1 (type: string), _col2 (type: int), _col3 (type: bigint), _col4 (type: bigint), _col5 (type: bigint)
>                   outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
>                   Statistics: Num rows: 44400 Data size: 2131200 Basic stats: COMPLETE Column stats: COMPLETE
>                   Reduce Output Operator
>                     key expressions: _col5 (type: bigint), _col0 (type: string), _col1 (type: string), _col2 (type: int)
>                     sort order: -+++
>                     Statistics: Num rows: 44400 Data size: 2131200 Basic stats: COMPLETE Column stats: COMPLETE
>                     TopN Hash Memory Usage: 0.04
>                     value expressions: _col3 (type: bigint), _col4 (type: bigint)
>             Execution mode: vectorized
>         Reducer 5 
>             Reduce Operator Tree:
>               Select Operator
>                 expressions: KEY.reducesinkkey1 (type: string), KEY.reducesinkkey2 (type: string), KEY.reducesinkkey3 (type: int), VALUE._col0 (type: bigint), VALUE._col1 (type: bigint), KEY.reducesinkkey0 (type: bigint)
>                 outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
>                 Statistics: Num rows: 44400 Data size: 2131200 Basic stats: COMPLETE Column stats: COMPLETE
>                 Limit
>                   Number of rows: 100
>                   Statistics: Num rows: 100 Data size: 4800 Basic stats: COMPLETE Column stats: COMPLETE
>                   File Output Operator
>                     compressed: false
>                     Statistics: Num rows: 100 Data size: 4800 Basic stats: COMPLETE Column stats: COMPLETE
>                     table:
>                         input format: org.apache.hadoop.mapred.TextInputFormat
>                         output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                         serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>             Execution mode: vectorized
>   Stage: Stage-0
>     Fetch Operator
>       limit: 100
>       Processor Tree:
>         ListSink
> Time taken: 16.504 seconds, Fetched: 463 row(s)
> {code}
> Plan with modified query 
> {code}
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
>       Edges:
>         Map 2 <- Map 1 (BROADCAST_EDGE), Map 10 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE)
>         Map 7 <- Map 4 (BROADCAST_EDGE), Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Map 8 (BROADCAST_EDGE), Map 9 (BROADCAST_EDGE)
>         Reducer 3 <- Map 2 (SIMPLE_EDGE)
>       DagName: mmokhtar_20150206180606_2f891814-bf41-4888-ab80-dddf4425205f:1
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: d1
>                   filterExpr: (d_date_sk is not null and d_week_seq is not null) (type: boolean)
>                   Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (d_date_sk is not null and d_week_seq is not null) (type: boolean)
>                     Statistics: Num rows: 73049 Data size: 584392 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: d_date_sk (type: int), d_week_seq (type: int)
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 73049 Data size: 584392 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int), _col1 (type: int)
>                         sort order: ++
>                         Map-reduce partition columns: _col0 (type: int), _col1 (type: int)
>                         Statistics: Num rows: 73049 Data size: 584392 Basic stats: COMPLETE Column stats: COMPLETE
>                       Select Operator
>                         expressions: _col0 (type: int)
>                         outputColumnNames: _col0
>                         Statistics: Num rows: 73049 Data size: 584392 Basic stats: COMPLETE Column stats: COMPLETE
>                         Group By Operator
>                           keys: _col0 (type: int)
>                           mode: hash
>                           outputColumnNames: _col0
>                           Statistics: Num rows: 73049 Data size: 584392 Basic stats: COMPLETE Column stats: COMPLETE
>                           Dynamic Partitioning Event Operator
>                             Target Input: inventory
>                             Partition key expr: inv_date_sk
>                             Statistics: Num rows: 73049 Data size: 584392 Basic stats: COMPLETE Column stats: COMPLETE
>                             Target column: inv_date_sk
>                             Target Vertex: Map 2
>         Map 10
>             Map Operator Tree:
>                 TableScan
>                   alias: warehouse
>                   filterExpr: w_warehouse_sk is not null (type: boolean)
>                   Statistics: Num rows: 6 Data size: 6166 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: w_warehouse_sk is not null (type: boolean)
>                     Statistics: Num rows: 6 Data size: 24 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: w_warehouse_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 6 Data size: 24 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 6 Data size: 24 Basic stats: COMPLETE Column stats: COMPLETE
>         Map 2
>             Map Operator Tree:
>                 TableScan
>                   alias: inventory
>                   filterExpr: (inv_item_sk is not null and inv_warehouse_sk is not null) (type: boolean)
>                   Statistics: Num rows: 37584000 Data size: 443485104 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (inv_item_sk is not null and inv_warehouse_sk is not null) (type: boolean)
>                     Statistics: Num rows: 37584000 Data size: 593821104 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: inv_item_sk (type: int), inv_warehouse_sk (type: int), inv_quantity_on_hand (type: int), inv_date_sk (type: int)
>                       outputColumnNames: _col0, _col1, _col2, _col3
>                       Statistics: Num rows: 37584000 Data size: 593821104 Basic stats: COMPLETE Column stats: COMPLETE
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         keys:
>                           0 _col0 (type: int)
>                           1 _col3 (type: int)
>                         outputColumnNames: _col1, _col2, _col3, _col8, _col17
>                         input vertices:
>                           1 Map 7
>                         Statistics: Num rows: 937245073 Data size: 14995921168 Basic stats: COMPLETE Column stats: COMPLETE
>                         Filter Operator
>                           predicate: (_col2 < _col8) (type: boolean)
>                           Statistics: Num rows: 312415024 Data size: 4998640384 Basic stats: COMPLETE Column stats: COMPLETE
>                           Select Operator
>                             expressions: _col17 (type: int), _col1 (type: int), _col3 (type: int)
>                             outputColumnNames: _col13, _col18, _col20
>                             Statistics: Num rows: 312415024 Data size: 2499320192 Basic stats: COMPLETE Column stats: COMPLETE
>                             Map Join Operator
>                               condition map:
>                                    Inner Join 0 to 1
>                               keys:
>                                 0 _col0 (type: int), _col1 (type: int)
>                                 1 _col20 (type: int), _col13 (type: int)
>                               outputColumnNames: _col20
>                               input vertices:
>                                 0 Map 1
>                               Statistics: Num rows: 26560 Data size: 106240 Basic stats: COMPLETE Column stats: COMPLETE
>                               Map Join Operator
>                                 condition map:
>                                      Inner Join 0 to 1
>                                 keys:
>                                   0 _col20 (type: int)
>                                   1 _col0 (type: int)
>                                 input vertices:
>                                   1 Map 10
>                                 Statistics: Num rows: 26560 Data size: 0 Basic stats: PARTIAL Column stats: COMPLETE
>                                 Group By Operator
>                                   aggregations: count()
>                                   mode: hash
>                                   outputColumnNames: _col0
>                                   Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
>                                   Reduce Output Operator
>                                     sort order:
>                                     Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
>                                     value expressions: _col0 (type: bigint)
>         Map 4
>             Map Operator Tree:
>                 TableScan
>                   alias: d1
>                   filterExpr: d_date_sk is not null (type: boolean)
>                   Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: d_date_sk is not null (type: boolean)
>                     Statistics: Num rows: 73049 Data size: 7158802 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: d_date_sk (type: int), d_date (type: string)
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 73049 Data size: 7158802 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 73049 Data size: 7158802 Basic stats: COMPLETE Column stats: COMPLETE
>                         value expressions: _col1 (type: string)
>         Map 5
>             Map Operator Tree:
>                 TableScan
>                   alias: item
>                   filterExpr: i_item_sk is not null (type: boolean)
>                   Statistics: Num rows: 48000 Data size: 68732712 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: i_item_sk is not null (type: boolean)
>                     Statistics: Num rows: 48000 Data size: 192000 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: i_item_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 48000 Data size: 192000 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 48000 Data size: 192000 Basic stats: COMPLETE Column stats: COMPLETE
>         Map 6
>             Map Operator Tree:
>                 TableScan
>                   alias: customer_demographics
>                   filterExpr: ((cd_marital_status = 'M') and cd_demo_sk is not null) (type: boolean)
>                   Statistics: Num rows: 1920800 Data size: 718379200 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: ((cd_marital_status = 'M') and cd_demo_sk is not null) (type: boolean)
>                     Statistics: Num rows: 274400 Data size: 24421600 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: cd_demo_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 274400 Data size: 1097600 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 274400 Data size: 1097600 Basic stats: COMPLETE Column stats: COMPLETE
>         Map 7
>             Map Operator Tree:
>                 TableScan
>                   alias: catalog_sales
>                   filterExpr: (((cs_bill_hdemo_sk is not null and cs_bill_cdemo_sk is not null) and cs_item_sk is not null) and cs_ship_date_sk is not null) (type: boolean)
>                   Statistics: Num rows: 286549727 Data size: 37743959324 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (((cs_bill_hdemo_sk is not null and cs_bill_cdemo_sk is not null) and cs_item_sk is not null) and cs_ship_date_sk is not null) (type: boolean)
>                     Statistics: Num rows: 284396955 Data size: 6814100200 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: cs_ship_date_sk (type: int), cs_bill_cdemo_sk (type: int), cs_bill_hdemo_sk (type: int), cs_item_sk (type: int), cs_quantity (type: int), cs_sold_date_sk (type: int)
>                       outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
>                       Statistics: Num rows: 284396955 Data size: 6814100200 Basic stats: COMPLETE Column stats: COMPLETE
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         keys:
>                           0 _col5 (type: int)
>                           1 _col0 (type: int)
>                         outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col7, _col8
>                         input vertices:
>                           1 Map 8
>                         Statistics: Num rows: 101715202 Data size: 12002393836 Basic stats: COMPLETE Column stats: COMPLETE
>                         Map Join Operator
>                           condition map:
>                                Inner Join 0 to 1
>                           keys:
>                             0 _col2 (type: int)
>                             1 _col0 (type: int)
>                           outputColumnNames: _col0, _col1, _col3, _col4, _col7, _col8
>                           input vertices:
>                             1 Map 9
>                           Statistics: Num rows: 20343040 Data size: 2319106560 Basic stats: COMPLETE Column stats: COMPLETE
>                           Map Join Operator
>                             condition map:
>                                  Inner Join 0 to 1
>                             keys:
>                               0 _col0 (type: int)
>                               1 _col1 (type: int)
>                             outputColumnNames: _col2, _col5, _col6, _col9, _col10
>                             input vertices:
>                               0 Map 6
>                             Statistics: Num rows: 2906149 Data size: 34873788 Basic stats: COMPLETE Column stats: COMPLETE
>                             Select Operator
>                               expressions: _col10 (type: int), _col2 (type: int), _col5 (type: int), _col6 (type: int), _col9 (type: string)
>                               outputColumnNames: _col10, _col2, _col5, _col6, _col9
>                               Statistics: Num rows: 2906149 Data size: 34873788 Basic stats: COMPLETE Column stats: COMPLETE
>                               Map Join Operator
>                                 condition map:
>                                      Inner Join 0 to 1
>                                 keys:
>                                   0 _col0 (type: int)
>                                   1 _col5 (type: int)
>                                 outputColumnNames: _col3, _col6, _col7, _col10, _col11
>                                 input vertices:
>                                   0 Map 5
>                                 Statistics: Num rows: 2906149 Data size: 34873788 Basic stats: COMPLETE Column stats: COMPLETE
>                                 Select Operator
>                                   expressions: _col10 (type: string), _col11 (type: int), _col3 (type: int), _col6 (type: int), _col7 (type: int)
>                                   outputColumnNames: _col10, _col11, _col3, _col6, _col7
>                                   Statistics: Num rows: 2906149 Data size: 34873788 Basic stats: COMPLETE Column stats: COMPLETE
>                                   Map Join Operator
>                                     condition map:
>                                          Inner Join 0 to 1
>                                     keys:
>                                       0 _col0 (type: int)
>                                       1 _col3 (type: int)
>                                     outputColumnNames: _col1, _col8, _col9, _col12, _col13
>                                     input vertices:
>                                       0 Map 4
>                                     Statistics: Num rows: 2906149 Data size: 296427198 Basic stats: COMPLETE Column stats: COMPLETE
>                                     Filter Operator
>                                       predicate: (UDFToDouble(_col1) > (UDFToDouble(_col12) + 5.0)) (type: boolean)
>                                       Statistics: Num rows: 968716 Data size: 98809032 Basic stats: COMPLETE Column stats: COMPLETE
>                                       Select Operator
>                                         expressions: _col13 (type: int), _col8 (type: int), _col9 (type: int)
>                                         outputColumnNames: _col13, _col3, _col4
>                                         Statistics: Num rows: 968716 Data size: 7749728 Basic stats: COMPLETE Column stats: COMPLETE
>                                         Reduce Output Operator
>                                           key expressions: _col3 (type: int)
>                                           sort order: +
>                                           Map-reduce partition columns: _col3 (type: int)
>                                           Statistics: Num rows: 968716 Data size: 7749728 Basic stats: COMPLETE Column stats: COMPLETE
>                                           value expressions: _col4 (type: int), _col13 (type: int)
>         Map 8
>             Map Operator Tree:
>                 TableScan
>                   alias: d1
>                   filterExpr: (((d_year = 2001) and d_date_sk is not null) and d_week_seq is not null) (type: boolean)
>                   Statistics: Num rows: 73049 Data size: 81741831 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (((d_year = 2001) and d_date_sk is not null) and d_week_seq is not null) (type: boolean)
>                     Statistics: Num rows: 652 Data size: 69112 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: d_date_sk (type: int), d_date (type: string), d_week_seq (type: int)
>                       outputColumnNames: _col0, _col1, _col2
>                       Statistics: Num rows: 652 Data size: 66504 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 652 Data size: 66504 Basic stats: COMPLETE Column stats: COMPLETE
>                         value expressions: _col1 (type: string), _col2 (type: int)
>                       Select Operator
>                         expressions: _col0 (type: int)
>                         outputColumnNames: _col0
>                         Statistics: Num rows: 652 Data size: 66504 Basic stats: COMPLETE Column stats: COMPLETE
>                         Group By Operator
>                           keys: _col0 (type: int)
>                           mode: hash
>                           outputColumnNames: _col0
>                           Statistics: Num rows: 652 Data size: 66504 Basic stats: COMPLETE Column stats: COMPLETE
>                           Dynamic Partitioning Event Operator
>                             Target Input: catalog_sales
>                             Partition key expr: cs_sold_date_sk
>                             Statistics: Num rows: 652 Data size: 66504 Basic stats: COMPLETE Column stats: COMPLETE
>                             Target column: cs_sold_date_sk
>                             Target Vertex: Map 7
>         Map 9
>             Map Operator Tree:
>                 TableScan
>                   alias: household_demographics
>                   filterExpr: ((hd_buy_potential = '1001-5000') and hd_demo_sk is not null) (type: boolean)
>                   Statistics: Num rows: 7200 Data size: 770400 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: ((hd_buy_potential = '1001-5000') and hd_demo_sk is not null) (type: boolean)
>                     Statistics: Num rows: 1440 Data size: 138240 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: hd_demo_sk (type: int)
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 1440 Data size: 5760 Basic stats: COMPLETE Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 1440 Data size: 5760 Basic stats: COMPLETE Column stats: COMPLETE
>         Reducer 3
>             Reduce Operator Tree:
>               Group By Operator
>                 aggregations: count(VALUE._col0)
>                 mode: mergepartial
>                 outputColumnNames: _col0
>                 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
>                 File Output Operator
>                   compressed: false
>                   Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
>                   table:
>                       input format: org.apache.hadoop.mapred.TextInputFormat
>                       output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                       serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message