spark-reviews mailing list archives

From wzhfy <...@git.apache.org>
Subject [GitHub] spark issue #17286: [SPARK-19915][SQL] Exclude cartesian product candidates ...
Date Thu, 16 Mar 2017 08:49:05 GMT
Github user wzhfy commented on the issue:

    https://github.com/apache/spark/pull/17286
  
    Having cross joins (joins without a condition) coexist with inner joins makes the reordering
procedure cumbersome. On one hand, putting cross join candidates into the memo hurts both
search performance and memory consumption. On the other hand, adding cross joins only after
inner join exploration is also problematic when there are multiple unjoinable groups: simply
putting them at the end of the plan is not correct.
    
    So I have decided to reorder only consecutive inner joins, i.e. runs of inner joins that are
separated from each other by other plans (including cross joins). For bushy trees, this means we
reorder joins within each inner-joinable group. For left-deep trees, the `ReorderJoin` rule moves
cross joins to the end, so all the preceding inner joins will be reordered. Since join reordering is
a complicated problem, we may need several levels of optimization; we can regard `ReorderJoin` as a
heuristic algorithm.
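
    To make the grouping concrete, here is a rough sketch in Python rather than Catalyst code.
It is a toy model only: `Relation`, `Join`, `flatten_inner` and `inner_join_groups` are
hypothetical names invented for illustration, and a join with `condition=None` stands in for a
cross join. It only shows how a plan could be split into runs of consecutive inner joins, each of
which would then be reordered independently.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Relation:
    name: str


@dataclass
class Join:
    left: "Plan"
    right: "Plan"
    condition: Optional[str] = None  # assumption: None models a cross join


Plan = Union[Relation, Join]


def is_inner(plan: Plan) -> bool:
    """A join with a condition is treated as an inner join in this toy model."""
    return isinstance(plan, Join) and plan.condition is not None


def flatten_inner(plan: Plan) -> List[Plan]:
    """Collect the inputs of a run of consecutive inner joins rooted at `plan`.

    The walk stops at anything that is not an inner join (a relation or a
    cross join), which becomes an opaque item of the group.
    """
    if is_inner(plan):
        return flatten_inner(plan.left) + flatten_inner(plan.right)
    return [plan]


def label(plan: Plan) -> str:
    return plan.name if isinstance(plan, Relation) else "<subplan>"


def inner_join_groups(plan: Plan) -> List[List[str]]:
    """Return every maximal group of consecutive inner joins in the plan."""
    groups: List[List[str]] = []

    def visit(node: Plan, parent_is_inner: bool) -> None:
        if isinstance(node, Relation):
            return
        # A new group starts at an inner join whose parent is not an inner join.
        if is_inner(node) and not parent_is_inner:
            groups.append([label(item) for item in flatten_inner(node)])
        visit(node.left, is_inner(node))
        visit(node.right, is_inner(node))

    visit(plan, False)
    return groups


a, b, c, d = (Relation(n) for n in "ABCD")

# Bushy tree: two inner-joinable groups separated by a cross join.
bushy = Join(Join(a, b, "a.k = b.k"), Join(c, d, "c.k = d.k"))
print(inner_join_groups(bushy))      # [['A', 'B'], ['C', 'D']]

# Left-deep tree with the cross join at the end: one group below it.
left_deep = Join(Join(Join(a, b, "a.k = b.k"), c, "b.k = c.k"), d)
print(inner_join_groups(left_deep))  # [['A', 'B', 'C']]
```

    In this sketch each returned group is the unit that a reordering step would work on; the
cross joins themselves are left where they are.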
    
    For the common case where there is no cross join, this algorithm behaves just as before.




