reef-dev mailing list archives

From "Dhruv Mahajan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (REEF-1339) Adding IInputPartition.Cache() for data download and cache
Date Mon, 25 Apr 2016 20:42:12 GMT

    [ https://issues.apache.org/jira/browse/REEF-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257007#comment-15257007 ]

Dhruv Mahajan commented on REEF-1339:
-------------------------------------

But I assume that at the IInputDataset level we provide certain default cache options that
the user can configure without worrying about the details? For example, loading into memory,
onto disk, etc. If they want to do something more complicated, for example distributing the
cache among multiple disks on the same machine, then they will resort to their own customized
implementation.

> Adding IInputPartition.Cache() for data download and cache
> ----------------------------------------------------------
>
>                 Key: REEF-1339
>                 URL: https://issues.apache.org/jira/browse/REEF-1339
>             Project: REEF
>          Issue Type: Task
>            Reporter: Julia
>            Assignee: Andrew Chung
>              Labels: FT
>
> Currently, in FileSystemInputPartition, data downloading is implemented in Initialize()
> and called from GetPartitionHandle(). This does not give the client the flexibility to
> decide when to download data. Besides, if the client wants to cache data in advance, they
> need to call GetPartitionHandle() and iterate over the data.
> We would like to expose a new API, Cache(), in IInputPartition that downloads data
> to RAM, SSD, HDD, etc. based on the client's configuration.
> The method should be called in ContextStartHandler in IMRU scenarios.
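
To make the proposal concrete, here is a rough sketch of how an explicit cache step could be separated from reading the partition. It is written in Python only for brevity and is not REEF's .NET API: the class names InputPartition, InMemoryPartition and LocalDiskPartition, the download callable, and the newline-delimited on-disk layout are all assumptions made for illustration. The two concrete classes stand in for the kind of default cache options discussed in the comment above; a custom strategy would be one more subclass.

```python
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Callable, Iterator, List, Optional


class InputPartition(ABC):
    """Hypothetical stand-in for IInputPartition (not the REEF API)."""

    @abstractmethod
    def cache(self) -> None:
        """Download the data now and keep it where the configuration says."""

    @abstractmethod
    def get_partition_handle(self) -> Iterator[bytes]:
        """Return an iterator over the partition's records."""


class InMemoryPartition(InputPartition):
    """Assumed default option: keep the downloaded records in RAM."""

    def __init__(self, download: Callable[[], List[bytes]]) -> None:
        self._download = download
        self._records: Optional[List[bytes]] = None

    def cache(self) -> None:
        # Download at most once; later calls are no-ops.
        if self._records is None:
            self._records = self._download()

    def get_partition_handle(self) -> Iterator[bytes]:
        self.cache()  # fall back to lazy download if cache() was never called
        return iter(self._records)


class LocalDiskPartition(InputPartition):
    """Assumed default option: spill the downloaded records to a local file."""

    def __init__(self, download: Callable[[], List[bytes]], directory: str) -> None:
        self._download = download
        self._directory = directory
        self._path: Optional[str] = None

    def cache(self) -> None:
        if self._path is None:
            fd, path = tempfile.mkstemp(dir=self._directory)
            with os.fdopen(fd, "wb") as out:
                # Records are assumed to contain no newline bytes.
                for record in self._download():
                    out.write(record + b"\n")
            self._path = path

    def get_partition_handle(self) -> Iterator[bytes]:
        self.cache()
        with open(self._path, "rb") as src:
            return iter([line.rstrip(b"\n") for line in src])
```

In this sketch a context start handler would call cache() up front, so that a later get_partition_handle() call only reads data that is already local.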



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
