spark-issues mailing list archives

From "Reynold Xin (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-12616) Improve union logical plan efficiency
Date Mon, 04 Jan 2016 06:34:39 GMT

     [ https://issues.apache.org/jira/browse/SPARK-12616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reynold Xin updated SPARK-12616:
--------------------------------
    Description: 
Union logical plan is a binary node. However, a typical use case for union is to union a very
large number of input sources (DataFrames, RDDs, or files). In this case, our optimizer can
become very slow due to the large number of logical unions. We should change the Union logical
plan to support an arbitrary number of children, and add a single rule in the optimizer (or
analyzer?) to collapse all adjacent Unions into one.

Note that this problem doesn't exist in the physical plan, because the physical Union already
supports an arbitrary number of children.




  was:
Union logical plan is a binary node. However, a typical use case for union is to union a very
large number of input sources (DataFrames, RDDs, or files). In this case, our optimizer can
become very slow due to the large number of logical unions. We should change the Union logical
plan to support an arbitrary number of children, and add a single rule in the optimizer (or
analyzer?) to collapse all adjacent Unions into one.





> Improve union logical plan efficiency
> -------------------------------------
>
>                 Key: SPARK-12616
>                 URL: https://issues.apache.org/jira/browse/SPARK-12616
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Reynold Xin
>
> Union logical plan is a binary node. However, a typical use case for union is to union a very
> large number of input sources (DataFrames, RDDs, or files). In this case, our optimizer can
> become very slow due to the large number of logical unions. We should change the Union logical
> plan to support an arbitrary number of children, and add a single rule in the optimizer (or
> analyzer?) to collapse all adjacent Unions into one.
>
> Note that this problem doesn't exist in the physical plan, because the physical Union already
> supports an arbitrary number of children.
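
The proposed change can be illustrated with a minimal sketch on a toy plan tree. This is not Catalyst code: Relation, Union, binary_union, collapse_unions, and depth are hypothetical names made up for this example. It assumes UNION ALL semantics (duplicates are kept), under which nested unions can be flattened without changing the result.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Tuple


@dataclass(frozen=True)
class Relation:
    """Hypothetical leaf node standing in for one input source."""
    name: str


@dataclass(frozen=True)
class Union:
    """Hypothetical union node that may hold any number of children."""
    children: Tuple[object, ...]


def binary_union(left, right):
    # Models the current behaviour: each union call adds one binary node.
    return Union((left, right))


def collapse_unions(plan):
    # Bottom-up rewrite: splice the children of any child Union into its
    # parent, so a run of adjacent Unions becomes a single n-ary Union.
    if isinstance(plan, Relation):
        return plan
    flat = []
    for child in plan.children:
        child = collapse_unions(child)
        if isinstance(child, Union):
            flat.extend(child.children)
        else:
            flat.append(child)
    return Union(tuple(flat))


def depth(plan):
    # Number of nodes on the longest root-to-leaf path.
    if isinstance(plan, Relation):
        return 1
    return 1 + max(depth(c) for c in plan.children)


# Unioning 100 sources pairwise yields a left-deep chain of 99 binary nodes.
sources = [Relation(f"t{i}") for i in range(100)]
chained = reduce(binary_union, sources)

# After collapsing, one Union node holds all 100 inputs in their original order.
collapsed = collapse_unions(chained)
```

In this toy model the chained plan has one level per input, while the collapsed plan has a single Union over all the inputs, so any rule that walks the tree has far fewer nodes to visit.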



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

