spark-reviews mailing list archives

From rxin <...@git.apache.org>
Subject [GitHub] spark pull request: [SPARK-12719][SQL] SQL generation support for ...
Date Tue, 15 Mar 2016 06:33:58 GMT
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11696#discussion_r56120392
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
    @@ -316,31 +345,73 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
             // `Aggregate`s.
             CollapseProject),
           Batch("Recover Scoping Info", Once,
    -        // Used to handle other auxiliary `Project`s added by analyzer (e.g.
    -        // `ResolveAggregateFunctions` rule)
    -        AddSubquery,
    -        // Previous rule will add extra sub-queries, this rule is used to re-propagate and update
    -        // the qualifiers bottom up, e.g.:
    -        //
    -        // Sort
    -        //   ordering = t1.a
    -        //   Project
    -        //     projectList = [t1.a, t1.b]
    -        //     Subquery gen_subquery
    -        //       child ...
    -        //
    -        // will be transformed to:
    -        //
    -        // Sort
    -        //   ordering = gen_subquery.a
    -        //   Project
    -        //     projectList = [gen_subquery.a, gen_subquery.b]
    -        //     Subquery gen_subquery
    -        //       child ...
    -        UpdateQualifiers
    +        // Remove all sub queries, as we will insert new ones when it's necessary.
    +        EliminateSubqueryAliases,
    +        // A logical plan is allowed to have same-name outputs with different qualifiers(e.g. the
    +        // `Join` operator). However, this kind of plan can't be put under a sub query as we will
    +        // erase and assign a new qualifier to all outputs and make it impossible to distinguish
    +        // same-name outputs. This rule renames all attributes, to guarantee different
    +        // attributes(with different exprId) always have different names. It also removes all
    +        // qualifiers, as attributes have unique names now and we don't need qualifiers to resolve
    +        // ambiguity.
    +        NormalizedAttribute,
    +        // Wraps table information with SQLTable, and combine `Sample` operator if there are any.
    +        ResolveSQLTable,
    +        // Re-order operators to let them generate legal SQL string, e.g. we should push down
    +        // `Generate` through `Filter`. We should enrich this rule in the future.
    +        ReOrderOperators,
    --- End diff --
    
    Let's put your comment in the code itself, so we know why we need to re-order the operators.
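
For readers following the diff, the renaming step described in the comment above `NormalizedAttribute` can be sketched with toy types. This is only an illustration under stated assumptions: `Attr`, `NormalizeSketch`, and the `gen_attr_` naming scheme are hypothetical stand-ins, not Catalyst classes or the code in this pull request.

```scala
// Toy stand-in for an attribute reference; NOT the real Catalyst class.
final case class Attr(name: String, exprId: Long, qualifier: Option[String])

object NormalizeSketch {
  // Derive each name from the attribute's exprId and drop the qualifier, so
  // two attributes end up with the same name only if they share an exprId.
  // The "gen_attr_" prefix is an assumed naming scheme for this sketch.
  def normalize(attrs: Seq[Attr]): Seq[Attr] =
    attrs.map(a => a.copy(name = s"gen_attr_${a.exprId}", qualifier = None))

  def main(args: Array[String]): Unit = {
    // Output of a join: two columns both named "a", told apart only by qualifier.
    val joined = Seq(Attr("a", 1L, Some("t1")), Attr("a", 2L, Some("t2")))
    normalize(joined).foreach(println)
  }
}
```

After normalization the two `a` columns carry distinct names, so wrapping the plan in a sub query and assigning a new qualifier no longer makes them ambiguous.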

