spark-user mailing list archives

From Jacek Laskowski <ja...@japila.pl>
Subject How StorageLevel, CacheManager and checkpointing influence computing RDD partitions?
Date Sat, 10 Oct 2015 14:37:39 GMT
Hi,

I've been reviewing the Spark code and noticed that the `iterator`
method of RDD [1] checks whether the RDD has a non-NONE storage level
and, if it does not, calls the private `computeOrReadCheckpoint`
method [2], which in turn checks whether the RDD is checkpointed.
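
As a rough illustration of the control flow described above, here is a
simplified Python model of it. This is a sketch, not Spark's actual code:
the `ToyRDD` class, its fields, and the returned strings are hypothetical
and exist only to show which branch is taken for each combination of
storage level and checkpointing.

```python
from dataclasses import dataclass

NONE = "NONE"  # stands in for StorageLevel.NONE


@dataclass
class ToyRDD:
    """Hypothetical stand-in for an RDD; not part of Spark's API."""
    storage_level: str = NONE
    checkpointed: bool = False

    def iterator(self, partition: int) -> str:
        # A non-NONE storage level routes the request to the cache manager.
        if self.storage_level != NONE:
            return f"cache manager: get or compute partition {partition}"
        # Otherwise fall through to the checkpoint check.
        return self.compute_or_read_checkpoint(partition)

    def compute_or_read_checkpoint(self, partition: int) -> str:
        # A checkpointed RDD reads saved data instead of recomputing it.
        if self.checkpointed:
            return f"read partition {partition} from checkpoint"
        return f"compute partition {partition}"


if __name__ == "__main__":
    print(ToyRDD().iterator(0))
    print(ToyRDD(checkpointed=True).iterator(1))
    print(ToyRDD(storage_level="MEMORY_ONLY", checkpointed=True).iterator(2))
```

The second call is the combination asked about below: storage level NONE
with checkpointing enabled, where this model reads from the checkpoint and
never involves the cache manager.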

Is there a doc on how StorageLevel, CacheManager and checkpointing
influence partition computation?

Specifically, why would an RDD have StorageLevel NONE with
checkpointing enabled? What is the use case for such a configuration?
What about the other combinations?

Any pointers are greatly appreciated, including blog posts,
StackOverflow or Quora answers, and mailing list archives.

[1] https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L260-L266
[2] https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L292-L298

Regards,
Jacek

--
Jacek Laskowski | http://blog.japila.pl | http://blog.jaceklaskowski.pl
Follow me at https://twitter.com/jaceklaskowski
Upvote at http://stackoverflow.com/users/1305344/jacek-laskowski
