spark-commits mailing list archives

From r...@apache.org
Subject spark git commit: [SPARK-7957] Preserve partitioning when using randomSplit
Date Sat, 30 May 2015 05:19:25 GMT
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 400e6dbce -> 1513cffa3


[SPARK-7957] Preserve partitioning when using randomSplit

cc JoshRosen
Thanks for noticing this!

Author: Burak Yavuz <brkyvz@gmail.com>

Closes #6509 from brkyvz/sample-perf-reg and squashes the following commits:

497465d [Burak Yavuz] addressed code review
293f95f [Burak Yavuz] [SPARK-7957] Preserve partitioning when using randomSplit

(cherry picked from commit 7ed06c39922ac90acab3a78ce0f2f21184ed68a5)
Signed-off-by: Reynold Xin <rxin@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1513cffa
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1513cffa
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1513cffa

Branch: refs/heads/branch-1.4
Commit: 1513cffa35d520c2d4b620399944b19888d88fc2
Parents: 400e6db
Author: Burak Yavuz <brkyvz@gmail.com>
Authored: Fri May 29 22:19:15 2015 -0700
Committer: Reynold Xin <rxin@databricks.com>
Committed: Fri May 29 22:19:23 2015 -0700

----------------------------------------------------------------------
 core/src/main/scala/org/apache/spark/rdd/RDD.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/1513cffa/core/src/main/scala/org/apache/spark/rdd/RDD.scala
----------------------------------------------------------------------
diff --git a/core/src/main/scala/org/apache/spark/rdd/RDD.scala b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
index 5fcef25..10610f4 100644
--- a/core/src/main/scala/org/apache/spark/rdd/RDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/RDD.scala
@@ -434,11 +434,11 @@ abstract class RDD[T: ClassTag](
    * @return A random sub-sample of the RDD without replacement.
    */
   private[spark] def randomSampleWithRange(lb: Double, ub: Double, seed: Long): RDD[T] = {
-    this.mapPartitionsWithIndex { case (index, partition) =>
+    this.mapPartitionsWithIndex( { (index, partition) =>
       val sampler = new BernoulliCellSampler[T](lb, ub)
       sampler.setSeed(seed + index)
       sampler.sample(partition)
-    }
+    }, preservesPartitioning = true)
   }
 
   /**
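
For readers of the archive, the effect of the `preservesPartitioning` flag used in the hunk above can be sketched as follows. This is an illustrative sketch, not part of the commit: the object name, the `keepHalf` function, the data, and the local master setting are assumptions, and a simple seeded filter stands in for `BernoulliCellSampler`. It relies on the fact that sampling only drops elements and never changes keys, so the parent's partitioner is still valid for the result.

```scala
import scala.util.Random

import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

// Hypothetical demo object; names and data are illustrative only.
object PreservePartitioningSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("preserve-partitioning-sketch")
      .setMaster("local[2]") // assumption: run locally
    val sc = new SparkContext(conf)
    try {
      // A pair RDD with an explicit partitioner.
      val pairs = sc.parallelize(1 to 1000)
        .map(i => (i, i.toString))
        .partitionBy(new HashPartitioner(4))

      val seed = 42L

      // Stand-in for the sampler: keeps roughly half of each partition,
      // seeded per partition as in the patch (seed + index).
      // It filters elements but never modifies keys.
      val keepHalf = (index: Int, partition: Iterator[(Int, String)]) => {
        val rng = new Random(seed + index)
        partition.filter(_ => rng.nextDouble() < 0.5)
      }

      // Default (preservesPartitioning = false): the partitioner is discarded.
      val dropped = pairs.mapPartitionsWithIndex(keepHalf)

      // With the flag set: the parent's partitioner is carried over.
      val kept = pairs.mapPartitionsWithIndex(keepHalf, preservesPartitioning = true)

      assert(dropped.partitioner.isEmpty)
      assert(kept.partitioner == pairs.partitioner)
    } finally {
      sc.stop()
    }
  }
}
```

The partitioner is metadata on the RDD, so the checks above do not trigger a job. Keeping it matters for later key-based operations, which can skip a shuffle when the partitioner is already known.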

