spark-commits mailing list archives

From yh...@apache.org
Subject spark git commit: [SPARK-12077][SQL] change the default plan for single distinct
Date Wed, 02 Dec 2015 04:17:48 GMT
Repository: spark
Updated Branches:
  refs/heads/branch-1.6 84c44b500 -> a5743affc


[SPARK-12077][SQL] change the default plan for single distinct

We tried to match Spark 1.5's behavior for single distinct aggregation, but that plan is not
scalable. We should use the robust plan by default, and keep a flag to address the performance
regression for low-cardinality aggregation.

cc yhuai nongli

Author: Davies Liu <davies@databricks.com>

Closes #10075 from davies/agg_15.

(cherry picked from commit 96691feae0229fd693c29475620be2c4059dd080)
Signed-off-by: Yin Huai <yhuai@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a5743aff
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a5743aff
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a5743aff

Branch: refs/heads/branch-1.6
Commit: a5743affcf73f7bf71517171583cbddc44cc9368
Parents: 84c44b5
Author: Davies Liu <davies@databricks.com>
Authored: Tue Dec 1 20:17:12 2015 -0800
Committer: Yin Huai <yhuai@databricks.com>
Committed: Tue Dec 1 20:17:44 2015 -0800

----------------------------------------------------------------------
 sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala       | 2 +-
 .../test/scala/org/apache/spark/sql/execution/PlannerSuite.scala | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/a5743aff/sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala b/sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala
index 5ef3a48..58adf64 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala
@@ -451,7 +451,7 @@ private[spark] object SQLConf {
 
   val SPECIALIZE_SINGLE_DISTINCT_AGG_PLANNING =
     booleanConf("spark.sql.specializeSingleDistinctAggPlanning",
-      defaultValue = Some(true),
+      defaultValue = Some(false),
       isPublic = false,
       doc = "When true, if a query only has a single distinct column and it has " +
         "grouping expressions, we will use our planner rule to handle this distinct " +

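For reference, the internal flag whose default is flipped above can still be toggled per session. A minimal sketch, assuming a Spark 1.6 `SQLContext` named `sqlContext` is already in scope; note the property is marked `isPublic = false`, so this is for experimentation only, not a supported API:

```scala
// Hypothetical session snippet: restore the Spark 1.5-style specialized
// plan for single distinct aggregations (faster on low-cardinality data,
// but not robust for high-cardinality distinct columns).
sqlContext.setConf("spark.sql.specializeSingleDistinctAggPlanning", "true")
```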
http://git-wip-us.apache.org/repos/asf/spark/blob/a5743aff/sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala
index dfec139..a462625 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala
@@ -44,10 +44,10 @@ class PlannerSuite extends SharedSQLContext {
         fail(s"Could query play aggregation query $query. Is it an aggregation query?"))
     val aggregations = planned.collect { case n if n.nodeName contains "Aggregate" => n }
 
-    // For the new aggregation code path, there will be three aggregate operator for
+    // For the new aggregation code path, there will be four aggregate operator for
     // distinct aggregations.
     assert(
-      aggregations.size == 2 || aggregations.size == 3,
+      aggregations.size == 2 || aggregations.size == 4,
       s"The plan of query $query does not have partial aggregations.")
   }
 
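The robust multi-phase plan that the test now expects (four aggregate operators instead of three) can be modeled, very roughly, as de-duplicating (group, distinct) pairs before counting. This is a plain-Scala sketch of the idea for `COUNT(DISTINCT value) ... GROUP BY key`, not Spark planner code; the object and method names are hypothetical:

```scala
// Conceptual sketch of multi-phase distinct aggregation:
// phase 1 de-duplicates (key, value) pairs, later phases count per key.
object DistinctAggSketch {
  def countDistinctByKey(rows: Seq[(String, Int)]): Map[String, Int] =
    rows.distinct                           // de-duplicate (key, value) pairs
      .groupBy(_._1)                        // regroup the distinct pairs by key
      .map { case (k, vs) => k -> vs.size } // count distinct values per key

  def main(args: Array[String]): Unit = {
    val data = Seq(("a", 1), ("a", 1), ("a", 2), ("b", 3))
    println(countDistinctByKey(data))
  }
}
```

In the real planner each of these logical steps becomes partial and final physical aggregate operators, which is why distinct aggregations now produce four `Aggregate` nodes in the plan.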


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org

