spark-reviews mailing list archives

From yhuai <...@git.apache.org>
Subject [GitHub] spark pull request: SPARK-13827[SQL] Can't add subquery to an oper...
Date Wed, 16 Mar 2016 02:07:15 GMT
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11658#discussion_r56272714
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
    @@ -316,31 +319,55 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
             // `Aggregate`s.
             CollapseProject),
           Batch("Recover Scoping Info", Once,
    -        // Used to handle other auxiliary `Project`s added by analyzer (e.g.
    -        // `ResolveAggregateFunctions` rule)
    -        AddSubquery,
    -        // Previous rule will add extra sub-queries, this rule is used to re-propagate and update
    -        // the qualifiers bottom up, e.g.:
    -        //
    -        // Sort
    -        //   ordering = t1.a
    -        //   Project
    -        //     projectList = [t1.a, t1.b]
    -        //     Subquery gen_subquery
    -        //       child ...
    -        //
    -        // will be transformed to:
    -        //
    -        // Sort
    -        //   ordering = gen_subquery.a
    -        //   Project
    -        //     projectList = [gen_subquery.a, gen_subquery.b]
    -        //     Subquery gen_subquery
    -        //       child ...
    -        UpdateQualifiers
    +        // Remove all sub queries, as we will insert new ones when it's necessary.
    +        EliminateSubqueryAliases,
    +        // A logical plan is allowed to have same-name outputs with different qualifiers(e.g. the
    +        // `Join` operator). However, this kind of plan can't be put under a sub query as we will
    +        // erase and assign a new qualifier to all outputs and make it impossible to distinguish
    +        // same-name outputs. This rule renames all attributes, to guarantee different
    +        // attributes(with different exprId) always have different names. It also removes all
    +        // qualifiers, as attributes have unique names now and we don't need qualifiers to resolve
    +        // ambiguity.
    --- End diff --
    
    Where do we reassign the needed column names?
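
    For context, the renaming described in the added comment, and the question of where the original names come back, can be sketched with a toy model. This is only an illustration under stated assumptions: `Attr`, `NormalizeAttributes`, `normalize`, `restoreNames`, and the `gen_attr_` prefix are hypothetical names chosen here, not Spark's Catalyst API and not necessarily what this pull request does.

```scala
// Toy stand-in for an output attribute. Hypothetical; not Spark's Attribute class.
final case class Attr(name: String, exprId: Long, qualifier: Option[String])

object NormalizeAttributes {
  // Rename every attribute after its exprId so that attributes with different
  // exprIds always get different names, and drop the qualifier because the
  // generated name is already unambiguous.
  def normalize(attrs: Seq[Attr]): Seq[Attr] =
    attrs.map(a => a.copy(name = s"gen_attr_${a.exprId}", qualifier = None))

  // One possible way to reassign the user-facing column names (an assumption,
  // not taken from the diff): alias each generated name back to its original
  // name in the outermost select list.
  def restoreNames(original: Seq[Attr], normalized: Seq[Attr]): Seq[String] =
    original.zip(normalized).map { case (o, n) => s"${n.name} AS ${o.name}" }
}
```

    In this sketch, two join outputs that are both named `a` but carry different qualifiers and exprIds end up with distinct generated names, and the original names are only reintroduced as aliases at the top level.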

