hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-12477) CBO: Left Semijoins are incompatible with a cross-product
Date Fri, 20 Nov 2015 03:36:10 GMT
Gopal V created HIVE-12477:
------------------------------

             Summary: CBO: Left Semijoins are incompatible with a cross-product
                 Key: HIVE-12477
                 URL: https://issues.apache.org/jira/browse/HIVE-12477
             Project: Hive
          Issue Type: Bug
          Components: CBO
    Affects Versions: 2.0.0
            Reporter: Gopal V
            Assignee: Jesus Camacho Rodriguez


with HIVE-12017 in place, a few queries generate left sem-joins without a key.

This is an invalid plan and can be produced by doing.

{code}
explain logical select count(1) from store_sales where ss_sold_date_sk in (select d_date_sk
from date_dim where d_date_sk = 1);

LOGICAL PLAN:  
$hdt$_0:$hdt$_0:$hdt$_0:store_sales
  TableScan (TS_0)
    alias: store_sales
    filterExpr: (ss_sold_date_sk = 1) (type: boolean)
    Filter Operator (FIL_20)
      predicate: (ss_sold_date_sk = 1) (type: boolean)
      Select Operator (SEL_2)
        Reduce Output Operator (RS_9)
          sort order: 
          Join Operator (JOIN_11)
            condition map:
                 Left Semi Join 0 to 1
            keys:
              0 
              1 
            Group By Operator (GBY_14)
              aggregations: count(1)
              mode: hash
{code}

without CBO

{code}
sq_1:date_dim
  TableScan (TS_1)
    alias: date_dim
    filterExpr: ((1) IN (RS[6]) and (d_date_sk = 1)) (type: boolean)
    Filter Operator (FIL_21)
      predicate: ((1) IN (RS[6]) and (d_date_sk = 1)) (type: boolean)
      Select Operator (SEL_3)
        expressions: 1 (type: int)
        outputColumnNames: _col0
        Group By Operator (GBY_5)
          keys: _col0 (type: int)
          mode: hash
          outputColumnNames: _col0
          Reduce Output Operator (RS_8)
            key expressions: _col0 (type: int)
            sort order: +
            Map-reduce partition columns: _col0 (type: int)
            Join Operator (JOIN_9)
              condition map:
                   Left Semi Join 0 to 1
              keys:
                0 ss_sold_date_sk (type: int)
                1 _col0 (type: int)
              Group By Operator (GBY_12)
                aggregations: count(1)
                mode: hash
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message