From: rxin <git@git.apache.org>
To: reviews@spark.apache.org
Reply-To: reviews@spark.apache.org
References: <git-pr-11696-spark@git.apache.org>
In-Reply-To: <git-pr-11696-spark@git.apache.org>
Subject: [GitHub] spark pull request: [SPARK-12719][SQL] SQL generation
 support for ...
Content-Type: text/plain
Message-Id: <20160315063358.71C4CE098D@git1-us-west.apache.org>
Date: Tue, 15 Mar 2016 06:33:58 +0000 (UTC)

Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11696#discussion_r56120392
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/SQLBuilder.scala ---
    @@ -316,31 +345,73 @@ class SQLBuilder(logicalPlan: LogicalPlan, sqlContext: SQLContext) extends Loggi
             // `Aggregate`s.
             CollapseProject),
           Batch("Recover Scoping Info", Once,
    -        // Used to handle other auxiliary `Project`s added by analyzer (e.g.
    -        // `ResolveAggregateFunctions` rule)
    -        AddSubquery,
    -        // Previous rule will add extra sub-queries, this rule is used to re-propagate and update
    -        // the qualifiers bottom up, e.g.:
    -        //
    -        // Sort
    -        //   ordering = t1.a
    -        //   Project
    -        //     projectList = [t1.a, t1.b]
    -        //     Subquery gen_subquery
    -        //       child ...
    -        //
    -        // will be transformed to:
    -        //
    -        // Sort
    -        //   ordering = gen_subquery.a
    -        //   Project
    -        //     projectList = [gen_subquery.a, gen_subquery.b]
    -        //     Subquery gen_subquery
    -        //       child ...
    -        UpdateQualifiers
    +        // Remove all sub queries, as we will insert new ones when it's necessary.
    +        EliminateSubqueryAliases,
    +        // A logical plan is allowed to have same-name outputs with different qualifiers(e.g. the
    +        // `Join` operator). However, this kind of plan can't be put under a sub query as we will
    +        // erase and assign a new qualifier to all outputs and make it impossible to distinguish
    +        // same-name outputs. This rule renames all attributes, to guarantee different
    +        // attributes(with different exprId) always have different names. It also removes all
    +        // qualifiers, as attributes have unique names now and we don't need qualifiers to resolve
    +        // ambiguity.
    +        NormalizedAttribute,
    +        // Wraps table information with SQLTable, and combine `Sample` operator if there are any.
    +        ResolveSQLTable,
    +        // Re-order operators to let them generate legal SQL string, e.g. we should push down
    +        // `Generate` through `Filter`. We should enrich this rule in the future.
    +        ReOrderOperators,
    --- End diff --
    
    Let's put your comment in the code itself, so we know why we need to re-order the operators.
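
A rough sketch of what that could look like, with the rationale written next to the rule. This is not the code from the pull request: `Plan`, `Table`, `Filter`, `Generate`, and `ReOrderOperatorsSketch` are hypothetical toy stand-ins for the Catalyst classes, and the stated rationale (that `LATERAL VIEW` is written before `WHERE` in a `SELECT`) is an assumption about why the re-ordering is needed.

```scala
// Toy plan model. These types are hypothetical and only mimic the shape of
// Catalyst operators; they are not Spark's classes.
sealed trait Plan
case class Table(name: String) extends Plan
case class Filter(condition: String, child: Plan) extends Plan
case class Generate(generator: String, child: Plan) extends Plan

object ReOrderOperatorsSketch {
  // Why we re-order (assumed rationale, stated here so it lives in the code):
  // in a SELECT statement the LATERAL VIEW clause is written before WHERE, so
  // a `Generate` sitting above a `Filter` has no direct SQL form without an
  // extra sub-query. Moving the `Generate` below the `Filter` yields a plan
  // shaped like `FROM t LATERAL VIEW ... WHERE ...`.
  // This toy ignores whether the filter references generated columns.
  def apply(plan: Plan): Plan = plan match {
    case Generate(g, child) =>
      // Re-order the child first, then sink this Generate below any Filter
      // that ended up directly underneath it.
      apply(child) match {
        case Filter(c, grandChild) => Filter(c, apply(Generate(g, grandChild)))
        case other => Generate(g, other)
      }
    case Filter(c, child) => Filter(c, apply(child))
    case t: Table => t
  }
}
```

With these toy types, `ReOrderOperatorsSketch(Generate("explode(arr)", Filter("a > 1", Table("t"))))` returns `Filter("a > 1", Generate("explode(arr)", Table("t")))`.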

