spark-reviews mailing list archives

From maryannxue <...@git.apache.org>
Subject [GitHub] spark pull request #22447: [SPARK-25450][SQL] PushProjectThroughUnion rule u...
Date Tue, 18 Sep 2018 02:59:30 GMT
GitHub user maryannxue opened a pull request:

    https://github.com/apache/spark/pull/22447

    [SPARK-25450][SQL] PushProjectThroughUnion rule uses the same exprId for project expressions
in each Union child, causing mistakes in constant propagation

    ## What changes were proposed in this pull request?
    
    The problem was caused by the PushProjectThroughUnion rule, which, when creating a new Project
for each child of Union, uses the same exprId for expressions in the same position. This is
wrong because the expressions in each child of Union are independent of one another, and it can lead
to a wrong result if other rules such as FoldablePropagation kick in and treat two different expressions
as the same.
    
    The fix is to create new expressions in the new Project for each child of Union.
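
To illustrate the failure mode, here is a minimal toy model in Python. It is not Spark's Catalyst code: `Alias`, `push_through_union`, and `propagate_constants` are hypothetical stand-ins, and the constant-propagation step is simplified to a lookup keyed by expression id. It only sketches why sharing one id across the Union children lets one child's constant leak into another, and why a fresh id per child avoids it.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical id generator, standing in for an expression-id allocator.
_next_id = count(1)


@dataclass(frozen=True)
class Alias:
    name: str
    literal: int   # the constant this child projects
    expr_id: int   # identity that later rules use to match expressions


def push_through_union(child_literals, fresh_ids):
    """Build one projected alias per Union child.

    fresh_ids=False models the bug: every child reuses the same expr_id.
    fresh_ids=True models the fix: every child gets its own expr_id.
    """
    shared_id = next(_next_id)
    return [
        Alias("a", lit, next(_next_id) if fresh_ids else shared_id)
        for lit in child_literals
    ]


def propagate_constants(projects):
    """Simplified constant propagation that trusts expr_id to be unique."""
    known = {}
    for alias in projects:
        # The first constant seen for an id wins; duplicate ids collide here.
        known.setdefault(alias.expr_id, alias.literal)
    # Resolve each child's alias through the id-keyed map.
    return [known[alias.expr_id] for alias in projects]


buggy = propagate_constants(push_through_union([1, 2], fresh_ids=False))
fixed = propagate_constants(push_through_union([1, 2], fresh_ids=True))
print(buggy)  # [1, 1]  (second child wrongly sees the first child's constant)
print(fixed)  # [1, 2]
```

In this sketch the two children project the constants 1 and 2. With a shared id the second child resolves to 1, which mirrors the kind of wrong result described above; with distinct ids each child keeps its own constant.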
    
    ## How was this patch tested?
    
    Added a unit test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maryannxue/spark push-project-thru-union-bug

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22447.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22447
    
----
commit 7193de3ad8675229eef131214ed62f2ece5cd416
Author: maryannxue <maryannxue@...>
Date:   2018-09-18T02:56:07Z

    fix

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

