spark-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From GitBox <...@apache.org>
Subject [GitHub] [spark] wangyum opened a new pull request #31193: [SPARK-34117][SQL] Disable LeftSemi/LeftAnti push down over Aggregate
Date Fri, 15 Jan 2021 06:36:12 GMT

wangyum opened a new pull request #31193:
URL: https://github.com/apache/spark/pull/31193


   ### What changes were proposed in this pull request?
   
   This pr disable LeftSemi/LeftAnti push down over Aggregate.
   
   
   ### Why are the changes needed?
   
   LeftSemi/LeftAnti push down over Aggregate may affect performance. for example:
   ```sql
   SELECT i_item_sk ss_item_sk
     FROM item,
       (SELECT
         distinct
         iss.i_brand_id brand_id,
         iss.i_class_id class_id,
         iss.i_category_id category_id
       FROM store_sales, item iss, date_dim d1
       WHERE ss_item_sk = iss.i_item_sk
         AND ss_sold_date_sk = d1.d_date_sk
         AND d1.d_year BETWEEN 1999 AND 1999 + 2
       INTERSECT
       SELECT
       distinct
         ics.i_brand_id,
         ics.i_class_id,
         ics.i_category_id
       FROM catalog_sales, item ics, date_dim d2
       WHERE cs_item_sk = ics.i_item_sk
         AND cs_sold_date_sk = d2.d_date_sk
         AND d2.d_year BETWEEN 1999 AND 1999 + 2
       INTERSECT
       SELECT
       distinct
         iws.i_brand_id,
         iws.i_class_id,
         iws.i_category_id
       FROM web_sales, item iws, date_dim d3
       WHERE ws_item_sk = iws.i_item_sk
         AND ws_sold_date_sk = d3.d_date_sk
         AND d3.d_year BETWEEN 1999 AND 1999 + 2) x
     WHERE i_brand_id = brand_id
       AND i_class_id = class_id
       AND i_category_id = category_id;
   ```
   
   This query is rewritten from [q14b](https://github.com/apache/spark/blob/a78d6ce376edf2a8836e01f47b9dff5371058d4c/sql/core/src/test/resources/tpcds/q14b.sql#L2-L32).
   
   
   CBO enabled | CBO disabled
   -- | --
   The statistics are more accurate, so it will be converted to broadcast join, and then push
down aggregate. | It will not push down aggregate.
   ![image](https://issues.apache.org/jira/secure/attachment/13018801/current.jpg) | ![image](https://issues.apache.org/jira/secure/attachment/13018802/disable_pushdown.jpg)
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   
   ### How was this patch tested?
   
   Unit test.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Mime
View raw message