hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-20739) CBO: Rewrite INTERSECTS queries as EXISTS semi-joins
Date Fri, 12 Oct 2018 22:43:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-20739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Gopal V updated HIVE-20739:
---------------------------
    Component/s: CBO

> CBO: Rewrite INTERSECTS queries as EXISTS semi-joins
> ----------------------------------------------------
>
>                 Key: HIVE-20739
>                 URL: https://issues.apache.org/jira/browse/HIVE-20739
>             Project: Hive
>          Issue Type: Improvement
>          Components: CBO
>            Reporter: Gopal V
>            Priority: Major
>
> INTERSECTS clause currently materializes both sides before removing entries from each
other.
> This means that if the intersects has filters on one side which apply to the other side,
it is not transitively pushed to the other side.
> Here's the snippet from the TPC-DS Query8 where that is visible.
> {code}
>     SELECT substr(ca_zip,1,5) ca_zip
>       FROM customer_address
>       WHERE substr(ca_zip,1,5) IN (
>                           '89436', /* ... */ ','15492'))
>      intersect
>      (select ca_zip
>       from (SELECT substr(ca_zip,1,5) ca_zip,count(*) cnt
>             FROM customer_address, customer
>             WHERE ca_address_sk = c_current_addr_sk and
>                   c_preferred_cust_flag='Y'
>             group by ca_zip
>             having count(*) > 10)A1))A2
> {code}
> https://github.com/hortonworks/hive-testbench/blob/hdp3/sample-queries-tpcds/query8.sql#L92
> where the entire join output of (customer_address x customer) is produced where the transitive
filter inference could be applied to push the filter to the customer_address in the 2nd set
in intersection.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message