spark-dev mailing list archives

From Erik Erlandson <...@redhat.com>
Subject RFC: Supporting the Scala drop Method for Spark RDDs
Date Mon, 21 Jul 2014 15:24:11 GMT
A few weeks ago I submitted a PR for supporting rdd.drop(n), under SPARK-2315:
https://issues.apache.org/jira/browse/SPARK-2315

Supporting the drop method would make some operations more convenient. However, it forces
computation of at least one partition of the parent RDD, so it would behave like a "partial
action" that returns an RDD as its result.
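
To illustrate why drop cannot be fully lazy, here is a toy sketch in plain Scala. It is not
Spark code and not the implementation in the PR: partitions are modeled as thunks, and
DropSketch and dropFirst are hypothetical names used only for this example. The point is that
finding where the first n elements end requires computing partitions from the front until n
elements are covered.

```scala
object DropSketch {
  // Toy model (assumption, not Spark): a "partition" is a thunk whose data is
  // computed only when the thunk is called.
  type Partition[T] = () => Vector[T]

  // Hypothetical helper: drops the first n elements across the partitions.
  // Returns the remaining partitions and the number of parent partitions
  // that had to be computed to locate the cut point.
  def dropFirst[T](parts: Vector[Partition[T]], n: Int): (Vector[Partition[T]], Int) = {
    var remaining = math.max(n, 0) // elements still to be dropped
    var forced = 0                 // parent partitions computed so far
    var i = 0
    var boundary: Option[Vector[T]] = None

    // Walk from the front, computing each partition until one holds
    // more elements than are still to be dropped.
    while (i < parts.length && boundary.isEmpty) {
      val data = parts(i)()
      forced += 1
      if (data.length > remaining) boundary = Some(data.drop(remaining))
      else remaining -= data.length
      i += 1
    }

    // Partitions after the boundary are passed through untouched (still lazy).
    val rest = boundary match {
      case Some(kept) => (() => kept) +: parts.drop(i)
      case None       => Vector.empty[Partition[T]]
    }
    (rest, forced)
  }

  def main(args: Array[String]): Unit = {
    val parts: Vector[Partition[Int]] =
      Vector(() => Vector(1, 2, 3), () => Vector(4, 5, 6), () => Vector(7, 8, 9))
    val (rest, forced) = dropFirst(parts, 4)
    println(s"partitions computed eagerly: $forced")
    println(s"remaining elements: ${rest.flatMap(p => p())}")
  }
}
```

In this model, dropping 4 elements from three partitions of three elements each computes the
first two partitions up front and leaves elements 5 through 9. Even with n = 0, the first
partition is still computed, which matches the "at least one partition" behavior described above.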

I wrote up a discussion of these trade-offs here:
http://erikerlandson.github.io/blog/2014/07/20/some-implications-of-supporting-the-scala-drop-method-for-spark-rdds/
