spark-reviews mailing list archives

From antonoal <...@git.apache.org>
Subject [GitHub] spark pull request: Added transitive closure transformation to Cat...
Date Thu, 17 Mar 2016 08:48:24 GMT
GitHub user antonoal opened a pull request:

    https://github.com/apache/spark/pull/11777

    Added transitive closure transformation to Catalyst

    ## What changes were proposed in this pull request?
    A relatively simple transformation is missing from Catalyst's arsenal: the generation of
    transitive predicates. For instance, given the following query:
    `select * 
    from   table1 t1
    join   table2 t2
    on     t1.a = t2.b
    where  t1.a = 42`
    then t2.b must also equal 42 in every joined row, so an additional predicate (t2.b = 42)
    can be generated. That predicate can in turn be pushed down through the join, improving the
    performance of the whole query by filtering the data before it is joined.
    Such a transformation exists in Oracle DB.
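
    The inference described above can be sketched as a toy model. This is not the code from
    the PR and does not use Catalyst's API: predicates are plain tuples, and the function name
    `infer_transitive` is hypothetical. It assumes every filter compares a single attribute
    to a constant and every join condition is an equality between two attributes.

```python
# Toy model only: NOT Catalyst code. Names and data shapes are assumptions.
from typing import List, Tuple

ConstPredicate = Tuple[str, str, object]  # (attribute, operator, constant)
Equality = Tuple[str, str]                # (left attribute, right attribute)


def infer_transitive(equalities: List[Equality],
                     filters: List[ConstPredicate]) -> List[ConstPredicate]:
    """Return the constant predicates implied by the equalities but not yet present."""
    known = set(filters)
    inferred = []
    changed = True
    # Repeat until no new predicate appears, so chains such as a = b, b = c are covered.
    while changed:
        changed = False
        for left, right in equalities:
            for attr, op, value in list(known):
                # Equality is symmetric: substitute in whichever direction applies.
                if attr == left:
                    target = right
                elif attr == right:
                    target = left
                else:
                    continue
                candidate = (target, op, value)
                if candidate not in known:
                    known.add(candidate)
                    inferred.append(candidate)
                    changed = True
    return inferred


# The query from the description: join on t1.a = t2.b, filter t1.a = 42.
print(infer_transitive([("t1.a", "t2.b")], [("t1.a", "=", 42)]))
# prints [('t2.b', '=', 42)]
```

    The derived predicate refers only to t2, which is what lets an optimizer place it below
    the join on the t2 side.
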
    Note that in this PR a transitive predicate is created for the following operations:

    * a BinaryComparison (=, >=, etc.) against a foldable expression
    * In (1, 2, 3), where all the values in the list are foldable
    * Not of any of the above
    * Or of any of the above
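
    The four predicate shapes listed above can be illustrated with a small toy model. It is
    a sketch under stated assumptions, not the PR's implementation: the classes `Compare`,
    `In`, `Not` and `Or` and the function `substitute` are hypothetical stand-ins, and plain
    Python constants play the role of foldable expressions. One further assumption: a
    predicate is rewritten only if every leaf refers to the attribute being replaced, so the
    result mentions the target attribute alone.

```python
# Toy model only: NOT Catalyst classes. All names here are hypothetical.
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Compare:
    """attr <op> constant, e.g. t1.a >= 10."""
    attr: str
    op: str
    value: object


@dataclass(frozen=True)
class In:
    """attr IN (constants...)."""
    attr: str
    values: Tuple[object, ...]


@dataclass(frozen=True)
class Not:
    child: "Pred"


@dataclass(frozen=True)
class Or:
    left: "Pred"
    right: "Pred"


Pred = Union[Compare, In, Not, Or]


def substitute(pred: Pred, source: str, target: str) -> Optional[Pred]:
    """Rewrite pred to refer to target instead of source.

    Returns None when any leaf refers to an attribute other than source.
    """
    if isinstance(pred, Compare):
        return Compare(target, pred.op, pred.value) if pred.attr == source else None
    if isinstance(pred, In):
        return In(target, pred.values) if pred.attr == source else None
    if isinstance(pred, Not):
        child = substitute(pred.child, source, target)
        return Not(child) if child is not None else None
    if isinstance(pred, Or):
        left = substitute(pred.left, source, target)
        right = substitute(pred.right, source, target)
        if left is None or right is None:
            return None
        return Or(left, right)
    return None


# Given t1.a = t2.b, derive a t2.b predicate from: t1.a = 1 OR NOT (t1.a IN (2, 3))
original = Or(Compare("t1.a", "=", 1), Not(In("t1.a", (2, 3))))
derived = substitute(original, "t1.a", "t2.b")
```

    Here `derived` is the same tree with every `t1.a` replaced by `t2.b`; a predicate that
    mentions any other attribute, such as `Compare("t1.c", ">", 5)`, yields `None`.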
    
    ## How was this patch tested?
    I've added a new TransitiveClosureSuite with a series of unit tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/antonoal/spark transitive-closure

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11777.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11777
    
----
commit 7df4117749f7afc2e5e95190cf93a961b9c6ed3a
Author: Alex Antonov <3091455@gmail.com>
Date:   2016-03-16T21:53:38Z

    Added transitive closure transformation to Catalyst

----


