spark-commits mailing list archives

From wenc...@apache.org
Subject spark git commit: [SPARK-17182][SQL] Mark Collect as non-deterministic
Date Tue, 23 Aug 2016 01:14:54 GMT
Repository: spark
Updated Branches:
  refs/heads/branch-2.0 225898961 -> eaea1c86b


[SPARK-17182][SQL] Mark Collect as non-deterministic

## What changes were proposed in this pull request?

This PR marks the abstract class `Collect` as non-deterministic since the results of `CollectList`
and `CollectSet` depend on the actual order of input rows.
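
To illustrate the order dependence, here is a minimal sketch in plain Scala, without Spark. `CollectOrderDemo` and its `collectList` helper are hypothetical stand-ins for a list-collecting aggregate, not the actual `CollectList` implementation; the sketch assumes only that values are appended in arrival order.

```scala
import scala.collection.mutable.ArrayBuffer

object CollectOrderDemo {
  // Hypothetical stand-in for a collect_list-style aggregate: appends each
  // input value to a buffer in arrival order. Not Spark's implementation.
  def collectList[A](rows: Iterator[A]): List[A] = {
    val buffer = ArrayBuffer.empty[A]
    rows.foreach(buffer += _)
    buffer.toList
  }

  // The same three rows arriving in two different orders, as could happen
  // when the input is read or shuffled differently between runs.
  val runA: List[Int] = collectList(Iterator(1, 2, 3))
  val runB: List[Int] = collectList(Iterator(3, 1, 2))

  // Both runs hold the same elements, yet the collected results differ.
  val sameElements: Boolean = runA.sorted == runB.sorted
  val sameResult: Boolean = runA == runB

  def main(args: Array[String]): Unit = {
    println(s"runA = $runA, runB = $runB")
    println(s"same elements: $sameElements, same result: $sameResult")
  }
}
```

Because the same input can produce different outputs depending on row order, an optimizer should not assume that two evaluations of such an expression are interchangeable.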

## How was this patch tested?

Existing test cases should be enough.

Author: Cheng Lian <lian@databricks.com>

Closes #14749 from liancheng/spark-17182-non-deterministic-collect.

(cherry picked from commit 2cdd92a7cd6f85186c846635b422b977bdafbcdd)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/eaea1c86
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/eaea1c86
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/eaea1c86

Branch: refs/heads/branch-2.0
Commit: eaea1c86b897d302107a9b6833a27a2b24ca31a0
Parents: 2258989
Author: Cheng Lian <lian@databricks.com>
Authored: Tue Aug 23 09:11:47 2016 +0800
Committer: Wenchen Fan <wenchen@databricks.com>
Committed: Tue Aug 23 09:14:47 2016 +0800

----------------------------------------------------------------------
 .../spark/sql/catalyst/expressions/aggregate/collect.scala       | 4 ++++
 1 file changed, 4 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/eaea1c86/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala
index ac2cefa..896ff61 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/collect.scala
@@ -54,6 +54,10 @@ abstract class Collect extends ImperativeAggregate {
 
   override def inputAggBufferAttributes: Seq[AttributeReference] = Nil
 
+  // Both `CollectList` and `CollectSet` are non-deterministic since their results depend on the
+  // actual order of input rows.
+  override def deterministic: Boolean = false
+
   protected[this] val buffer: Growable[Any] with Iterable[Any]
 
   override def initialize(b: MutableRow): Unit = {

