spark-dev mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: Make off-heap store pluggable
Date Tue, 21 Jul 2015 06:52:16 GMT
(A related, minor comment: it would also be nice to separate the Tachyon
dependency from core. It is conceptually pluggable, but it is still
hard-coded in several places in the code and in many of the comments and
docs.)

On Tue, Jul 21, 2015 at 5:40 AM, Reynold Xin <rxin@databricks.com> wrote:

> I sent it prematurely.
>
> They are already pluggable, or at least in the process of becoming more
> pluggable. In 1.4, instead of calling the external system's API directly,
> we added an API for that. There is a patch to add support for the HDFS
> in-memory cache.
>
> Somewhat orthogonal to this, longer term, I am not sure whether it makes
> sense to have the current off-heap API, because there is no namespacing and
> the benefit to end users is actually not very substantial (at least I can
> think of much simpler ways to achieve exactly the same gains), and yet it
> introduces quite a bit of complexity to the codebase.
>
>
>
>
> On Mon, Jul 20, 2015 at 9:34 PM, Reynold Xin <rxin@databricks.com> wrote:
>
>> They are already pluggable.
>>
>>
>> On Mon, Jul 20, 2015 at 9:32 PM, Prashant Sharma <scrapcodes@gmail.com>
>> wrote:
>>
>>> +1. Looks like a nice idea (I do not see any harm). Would you like to
>>> work on the patch to support it?
>>>
>>> Prashant Sharma
>>>
>>>
>>>
>>> On Tue, Jul 21, 2015 at 2:46 AM, Alexey Goncharuk <
>>> alexey.goncharuk@gmail.com> wrote:
>>>
>>>> Hello Spark community,
>>>>
>>>> I was looking through the code to better understand how an RDD is
>>>> persisted to the Tachyon off-heap filesystem. It looks like the Tachyon
>>>> filesystem is hard-coded and there is no way to switch to another
>>>> in-memory filesystem. I think it would be great if the BlockManager and
>>>> BlockStore implementations could plug in another filesystem.
>>>>
>>>> For example, Apache Ignite also has an implementation of an in-memory
>>>> filesystem that can store data in on-heap and off-heap formats. It would
>>>> be great if it could integrate with Spark.
>>>>
>>>> I have filed a ticket in Jira:
>>>> https://issues.apache.org/jira/browse/SPARK-9203
>>>>
>>>> If it makes sense, I will be happy to contribute to it.
>>>>
>>>> Thoughts?
>>>>
>>>> -Alexey (Apache Ignite PMC)
>>>>
>>>
>>>
>>
>
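
To make the "pluggable" idea discussed in this thread concrete, here is a minimal sketch of the general pattern: the block manager talks to an abstract store, and a configuration key selects the backend. It is written in Python for brevity and is not Spark's actual API. The names `OffHeapStore`, `InMemoryStore`, `create_store`, and the key `offheap.store.impl` are all hypothetical, chosen only for illustration.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Optional


class OffHeapStore(ABC):
    """Hypothetical contract the block manager would program against."""

    @abstractmethod
    def put(self, block_id: str, data: bytes) -> None:
        """Store the bytes of one block under its id."""

    @abstractmethod
    def get(self, block_id: str) -> Optional[bytes]:
        """Return the block's bytes, or None if it is not stored."""

    @abstractmethod
    def remove(self, block_id: str) -> bool:
        """Delete the block; return True if it existed."""


class InMemoryStore(OffHeapStore):
    """Stand-in backend. A real one would wrap an external system."""

    def __init__(self) -> None:
        self._blocks: Dict[str, bytes] = {}

    def put(self, block_id: str, data: bytes) -> None:
        self._blocks[block_id] = data

    def get(self, block_id: str) -> Optional[bytes]:
        return self._blocks.get(block_id)

    def remove(self, block_id: str) -> bool:
        return self._blocks.pop(block_id, None) is not None


# Hypothetical registry: backend name -> zero-argument factory.
_REGISTRY: Dict[str, Callable[[], OffHeapStore]] = {"memory": InMemoryStore}


def create_store(conf: Dict[str, str]) -> OffHeapStore:
    """Pick the backend named in the config instead of hard-coding one."""
    name = conf.get("offheap.store.impl", "memory")
    try:
        factory = _REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown off-heap store: {name!r}") from None
    return factory()
```

With this shape, supporting another in-memory filesystem means registering one more factory. The calling code only ever sees `OffHeapStore`, so no backend name needs to appear in core.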
