spark-issues mailing list archives

From "Bogdan Ghit (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-6112) Provide external block store support through HDFS RAM_DISK
Date Sat, 20 Jun 2015 18:09:01 GMT

    [ https://issues.apache.org/jira/browse/SPARK-6112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594708#comment-14594708 ]

Bogdan Ghit commented on SPARK-6112:
------------------------------------

Is it possible to share an RDD across different jobs? I tried to save an RDD under
/tmp/spark-hdfs, a directory whose storage policy is set to LAZY_PERSIST, but the blocks go
to disk rather than to tmpfs. Also, all spark.offHeapStore.* properties should be renamed to
spark.externalBlockStore.*.
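
For reference, a minimal sketch (Scala, against the Hadoop client API) of setting the policy
programmatically; it assumes a DataNode configured with a [RAM_DISK] volume in
dfs.datanode.data.dir and a non-zero dfs.datanode.max.locked.memory, since without those
LAZY_PERSIST writes silently fall back to DISK:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}
    import org.apache.hadoop.hdfs.DistributedFileSystem

    object LazyPersistSetup {
      def main(args: Array[String]): Unit = {
        // DistributedFileSystem exposes setStoragePolicy from Hadoop 2.6 on;
        // the generic FileSystem class only gains it in later releases.
        val fs = FileSystem.get(new Configuration())
          .asInstanceOf[DistributedFileSystem]

        // LAZY_PERSIST only lands blocks in DataNode memory when a RAM_DISK
        // volume exists, e.g. dfs.datanode.data.dir = [RAM_DISK]file:///mnt/dn-ramdisk,
        // and dfs.datanode.max.locked.memory is large enough to hold them;
        // otherwise writes fall back to DISK, as reported above.
        fs.setStoragePolicy(new Path("/tmp/spark-hdfs"), "LAZY_PERSIST")
      }
    }

The same policy can be set from the shell with
hdfs storagepolicies -setStoragePolicy -path /tmp/spark-hdfs -policy LAZY_PERSIST.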



> Provide external block store support through HDFS RAM_DISK
> ----------------------------------------------------------
>
>                 Key: SPARK-6112
>                 URL: https://issues.apache.org/jira/browse/SPARK-6112
>             Project: Spark
>          Issue Type: New Feature
>          Components: Block Manager
>            Reporter: Zhan Zhang
>         Attachments: SparkOffheapsupportbyHDFS.pdf
>
>
> The HDFS LAZY_PERSIST policy makes it possible to cache RDD blocks off-heap in HDFS. We
> may want to provide a capability similar to Tachyon by leveraging the HDFS RAM_DISK
> feature, for user environments that do not have Tachyon deployed.
> With this feature, an RDD could potentially be shared in memory across different jobs,
> even with jobs other than Spark, and RDD recomputation could be avoided when executors
> crash.
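
A sketch of how an application might use such a store, assuming the spark.externalBlockStore.*
property names requested in the comment above; the block-manager class name and base-directory
property are illustrative placeholders, not a settled API:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    object HdfsRamDiskCacheSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("HdfsRamDiskCacheSketch")
          // Hypothetical property names following the proposed
          // spark.externalBlockStore.* convention.
          .set("spark.externalBlockStore.blockManager",
               "org.apache.spark.storage.HdfsOffHeapBlockManager") // assumed class
          .set("spark.externalBlockStore.baseDir", "hdfs:///tmp/spark-hdfs")

        val sc = new SparkContext(conf)
        val words = sc.textFile("hdfs:///data/input.txt").flatMap(_.split("\\s+"))

        // OFF_HEAP directs cached blocks to the external block store; backed by
        // a RAM_DISK directory they would live in DataNode memory and could
        // outlive a crashed executor, letting another job read them back.
        words.persist(StorageLevel.OFF_HEAP)
        println("word count: " + words.count())
        sc.stop()
      }
    }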



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

