hive-issues mailing list archives

From "Jesus Camacho Rodriguez (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-12477) Left Semijoins are incompatible with a cross-product
Date Thu, 02 Jun 2016 10:18:02 GMT

     [ https://issues.apache.org/jira/browse/HIVE-12477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-12477:
-------------------------------------------
    Fix Version/s:     (was: 2.1.0)
                       (was: 1.3.0)

> Left Semijoins are incompatible with a cross-product
> ----------------------------------------------------
>
>                 Key: HIVE-12477
>                 URL: https://issues.apache.org/jira/browse/HIVE-12477
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>    Affects Versions: 2.0.0
>            Reporter: Gopal V
>            Assignee: Jesus Camacho Rodriguez
>             Fix For: 2.0.0
>
>         Attachments: HIVE-12477.01.patch, HIVE-12477.01.patch, HIVE-12477.02.patch, HIVE-12477.patch
>
>
> With HIVE-12017 in place, a few queries generate left semijoins without a key.
> This is an invalid plan and can be reproduced by running:
> {code}
> explain logical select count(1) from store_sales where ss_sold_date_sk in (select d_date_sk from date_dim where d_date_sk = 1);
> LOGICAL PLAN:  
> $hdt$_0:$hdt$_0:$hdt$_0:store_sales
>   TableScan (TS_0)
>     alias: store_sales
>     filterExpr: (ss_sold_date_sk = 1) (type: boolean)
>     Filter Operator (FIL_20)
>       predicate: (ss_sold_date_sk = 1) (type: boolean)
>       Select Operator (SEL_2)
>         Reduce Output Operator (RS_9)
>           sort order: 
>           Join Operator (JOIN_11)
>             condition map:
>                  Left Semi Join 0 to 1
>             keys:
>               0 
>               1 
>             Group By Operator (GBY_14)
>               aggregations: count(1)
>               mode: hash
> {code}
> Without CBO, the semijoin keeps its keys:
> {code}
> sq_1:date_dim
>   TableScan (TS_1)
>     alias: date_dim
>     filterExpr: ((1) IN (RS[6]) and (d_date_sk = 1)) (type: boolean)
>     Filter Operator (FIL_21)
>       predicate: ((1) IN (RS[6]) and (d_date_sk = 1)) (type: boolean)
>       Select Operator (SEL_3)
>         expressions: 1 (type: int)
>         outputColumnNames: _col0
>         Group By Operator (GBY_5)
>           keys: _col0 (type: int)
>           mode: hash
>           outputColumnNames: _col0
>           Reduce Output Operator (RS_8)
>             key expressions: _col0 (type: int)
>             sort order: +
>             Map-reduce partition columns: _col0 (type: int)
>             Join Operator (JOIN_9)
>               condition map:
>                    Left Semi Join 0 to 1
>               keys:
>                 0 ss_sold_date_sk (type: int)
>                 1 _col0 (type: int)
>               Group By Operator (GBY_12)
>                 aggregations: count(1)
>                 mode: hash
> {code}
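
To illustrate the difference between the two plans above, here is a toy model in Python. It is only a sketch, not Hive's operator code: `left_semi_join`, the sample rows, and the key-extractor lambdas are all made up for illustration. It shows that a semijoin with empty key lists degenerates into a cross-product style existence check: every left row is kept as long as the right side has at least one row.

```python
def left_semi_join(left, right, left_key, right_key):
    """Keep each left row that has at least one right row with an equal key."""
    right_keys = {right_key(r) for r in right}
    return [l for l in left if left_key(l) in right_keys]


# Hypothetical sample data standing in for the tables in the report.
store_sales = [
    {"ss_sold_date_sk": 1},
    {"ss_sold_date_sk": 2},
    {"ss_sold_date_sk": 1},
]
date_dim = [{"d_date_sk": 1}, {"d_date_sk": 3}]

# Subquery side: select d_date_sk from date_dim where d_date_sk = 1
subquery = [d for d in date_dim if d["d_date_sk"] == 1]

# Keyed semijoin, shaped like the non-CBO plan (ss_sold_date_sk = _col0).
keyed = left_semi_join(
    store_sales,
    subquery,
    lambda l: (l["ss_sold_date_sk"],),
    lambda r: (r["d_date_sk"],),
)

# Keyless semijoin, shaped like the CBO plan: the constant filter has been
# pushed to store_sales and both key lists are empty.
filtered_sales = [s for s in store_sales if s["ss_sold_date_sk"] == 1]
keyless = left_semi_join(filtered_sales, subquery, lambda l: (), lambda r: ())

# With an empty right side the keyless form drops every left row.
keyless_empty = left_semi_join(filtered_sales, [], lambda l: (), lambda r: ())

print(len(keyed), len(keyless), len(keyless_empty))
```

In this toy data the keyed and keyless forms return the same rows, because the pushed-down filter already enforces the equality. The keyless form has no key to match on, though, so it is a cross-product in shape, which is the combination the issue title describes as incompatible with a left semijoin.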



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
