spark-dev mailing list archives

From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: ephemeral storage level in spark ?
Date Mon, 07 Apr 2014 02:59:23 GMT
The off-heap storage level is currently tied to Tachyon, though it might support other forms of
off-heap storage later. However, it’s not really designed to be mixed with the other storage levels.
For this use case you may want to rely on memory locality and have some custom code push
the data to the accelerator. If you can think of a general way to extend the storage level
concept to handle this, though, do send a proposal.

Matei
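
A minimal sketch of the "custom code" approach suggested above, assuming a per-worker cache keyed by (RDD id, partition index). Everything here is hypothetical and for illustration only: `DeviceCache`, `compute_partition`, and the list copy standing in for a host-to-device transfer are not part of Spark or of any accelerator API.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

BlockKey = Tuple[int, int]  # (rdd id, partition index); an assumed keying scheme


class DeviceCache:
    """Hypothetical stand-in for memory on an accelerator card."""

    def __init__(self) -> None:
        self.resident: Dict[BlockKey, List[float]] = {}
        self.copies = 0  # number of host-to-device copies performed

    def get_or_copy(self, key: BlockKey, rows: Iterable[float]) -> List[float]:
        buf = self.resident.get(key)
        if buf is None:
            # Placeholder for the real host-to-device transfer.
            buf = list(rows)
            self.resident[key] = buf
            self.copies += 1
        return buf

    def evict(self, key: BlockKey) -> None:
        # The device copy is ephemeral; Spark stays the source of truth.
        self.resident.pop(key, None)


# One cache per worker process (assumption: the process outlives a single task).
DEVICE = DeviceCache()


def compute_partition(
    rdd_id: int,
    index: int,
    rows: Iterable[float],
    cache: DeviceCache = DEVICE,
) -> Iterator[float]:
    # Copy the block to the device only if it is not already resident there.
    buf = cache.get_or_copy((rdd_id, index), rows)
    # Placeholder for the real device computation.
    yield sum(buf)
```

In PySpark, a function like this could plausibly be applied with `rdd.mapPartitionsWithIndex`, after capturing `rdd.id()` in a local variable on the driver. The cache only pays off when a later task for the same partition is scheduled on the same worker process, which is the memory locality mentioned above; when it is not, the data is simply copied again.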

On Apr 5, 2014, at 5:14 PM, Mridul Muralidharan <mridul@gmail.com> wrote:

> No, I am thinking along the lines of writing to an accelerator card or
> a dedicated card with its own memory.
> 
> Regards,
> Mridul
> On Apr 6, 2014 5:19 AM, "Haoyuan Li" <haoyuan.li@gmail.com> wrote:
> 
>> Hi Mridul,
>> 
>> Do you mean the scenario where different Spark applications need to read the
>> same raw data, which is stored in a remote cluster or on remote machines, and the
>> goal is to load the remote raw data only once?
>> 
>> Haoyuan
>> 
>> 
>> On Sat, Apr 5, 2014 at 4:30 PM, Mridul Muralidharan <mridul@gmail.com> wrote:
>> 
>>> Hi,
>>> 
>>>  We have a requirement to use (potentially) ephemeral storage that
>>> is not within the VM and is strongly tied to a worker node. The
>>> source of truth for a block would still be within Spark, but to
>>> actually do computation we would need to copy data to an external device
>>> (where it might lie around for a while, so data locality really
>>> helps: we can avoid a subsequent copy if the data is already
>>> present when computing on the same block again).
>>> 
>>> I was wondering if the recently added storage level for Tachyon would
>>> help in this case (note, Tachyon won't help; just the storage level
>>> might).
>>> What sort of guarantees does it provide? How extensible is it? Or is
>>> it strongly tied to Tachyon with only a generic name?
>>> 
>>> 
>>> Thanks,
>>> Mridul
>>> 
>> 
>> 
>> 
>> --
>> Haoyuan Li
>> Algorithms, Machines, People Lab, EECS, UC Berkeley
>> http://www.cs.berkeley.edu/~haoyuan/
>> 

