From: holdenk
To: reviews@spark.apache.org
Reply-To: reviews@spark.apache.org
Subject: [GitHub] spark pull request: [SPARK-11275][SQL][WIP] Rollup and Cube Genera...
Date: Mon, 2 Nov 2015 21:06:12 +0000 (UTC)

Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9419#discussion_r43682031

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -240,10 +240,52 @@ class Analyzer(
               x.child
             }

    +        // We will insert another Projection if the GROUP BY keys are contained in the
    +        // aggregation, so the top operators can reference those keys by their aliases.
    +        // e.g. SELECT a, b, sum(a) FROM src GROUP BY a, b with rollup ==>
    +        //      SELECT a, b, sum(a1) FROM (SELECT a, b, a AS a1 FROM src) GROUP BY a, b with rollup
    +
    +        // Collect all the distinct attributes that appear in both the aggregation
    +        // functions and the GROUP BY clauses.
    +        val attrInAggregatedFuncAndGroupBy = aggregation.collect {
    +          case aggFunc: Alias => aggFunc.collect {
    +            case a: Attribute if newGroupByExprs.contains(a) => a }
    --- End diff --

    We're doing a contains here on a sequence; this could get slow with a large
    number of aggregates / grouping expressions.
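
To make the concern concrete, below is a minimal, self-contained Scala sketch, not the Analyzer code itself; the object and value names are invented for the example. It shows why hoisting the grouping expressions into a set before the membership checks helps: Seq.contains is a linear scan, so the nested lookup is roughly O(n * m), while building a Set once makes each lookup effectively constant time (in the real Analyzer, which works on catalyst attributes, a plain Set of expressions or something like Spark's AttributeSet could play the same role).

    // Standalone sketch: compares Seq.contains lookups against a pre-built Set.
    // The names groupByExprs / aggAttributes are stand-ins for illustration only.
    object SeqVsSetContains {
      def main(args: Array[String]): Unit = {
        // Stand-ins for the grouping expressions and the attributes referenced
        // by aggregate functions.
        val groupByExprs: Seq[String] = (1 to 2000).map(i => s"col$i")
        val aggAttributes: Seq[String] = (1 to 2000).map(i => s"col${i * 2}")

        // Seq.contains walks the whole sequence on every call: O(n * m) overall.
        val viaSeq = aggAttributes.filter(a => groupByExprs.contains(a))

        // Build the set once, then each membership check is effectively
        // constant time: O(n + m) overall.
        val groupBySet = groupByExprs.toSet
        val viaSet = aggAttributes.filter(groupBySet.contains)

        // Both approaches select the same attributes; only the cost differs.
        assert(viaSeq == viaSet)
        println(s"${viaSet.size} attributes appear in both lists")
      }
    }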