spark-commits mailing list archives

From: wenc...@apache.org
Subject: spark git commit: [SPARK-14863][SQL] Cache TreeNode's hashCode by default
Date: Sat, 23 Apr 2016 05:42:54 GMT
Repository: spark
Updated Branches:
  refs/heads/master 39a77e156 -> bdde010ed


[SPARK-14863][SQL] Cache TreeNode's hashCode by default

Caching TreeNode's `hashCode` can lead to orders-of-magnitude performance improvement in certain
optimizer rules when operating on huge/complex schemas.
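
As a rough illustration of the idea (not Spark's actual code), the sketch below caches a structural hash on a small tree type. `Node`, `Leaf`, `Branch`, and `cachedHash` are hypothetical names chosen for this example; only `scala.util.hashing.MurmurHash3.productHash` comes from the change itself. Because the base class defines `hashCode` concretely, the case classes do not generate their own, so every subclass picks up the cached version.

```scala
import scala.util.hashing.MurmurHash3

// Hypothetical stand-in for a Catalyst TreeNode; not Spark's actual class.
abstract class Node extends Product {
  def children: Seq[Node]

  // Computed at most once per instance, on first use. productHash hashes
  // each product element, so child nodes contribute their own cached hashes.
  private lazy val cachedHash: Int = MurmurHash3.productHash(this)

  // Case classes skip generating hashCode when a concrete override is
  // inherited from a base class, so Leaf and Branch both use this one.
  override def hashCode(): Int = cachedHash
}

case class Leaf(name: String) extends Node {
  def children: Seq[Node] = Nil
}

case class Branch(left: Node, right: Node) extends Node {
  def children: Seq[Node] = Seq(left, right)
}
```

With this in place, repeated `hashCode` calls on the same `Branch` (for example when nodes are used as keys in a hash set or map) no longer re-walk the whole subtree. The approach assumes nodes are immutable after construction; a node whose fields change would keep returning a stale hash.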

Author: Josh Rosen <joshrosen@databricks.com>

Closes #12626 from JoshRosen/cache-treenode-hashcode.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bdde010e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bdde010e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bdde010e

Branch: refs/heads/master
Commit: bdde010edbc79e506e183e2b9a2b9b19f7b226fb
Parents: 39a77e1
Author: Josh Rosen <joshrosen@databricks.com>
Authored: Sat Apr 23 13:42:44 2016 +0800
Committer: Wenchen Fan <wenchen@databricks.com>
Committed: Sat Apr 23 13:42:44 2016 +0800

----------------------------------------------------------------------
 .../scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala    | 5 +++++
 1 file changed, 5 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/bdde010e/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
----------------------------------------------------------------------
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
index 3d0e016..5eb8fdf 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
@@ -71,7 +71,9 @@ object CurrentOrigin {
   }
 }
 
+// scalastyle:off
 abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
+// scalastyle:on
   self: BaseType =>
 
   val origin: Origin = CurrentOrigin.get
@@ -84,6 +86,9 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
 
   lazy val containsChild: Set[TreeNode[_]] = children.toSet
 
+  private lazy val _hashCode: Int = scala.util.hashing.MurmurHash3.productHash(this)
+  override def hashCode(): Int = _hashCode
+
   /**
    * Faster version of equality which short-circuits when two treeNodes are the same instance.
    * We don't just override Object.equals, as doing so prevents the scala compiler from



