reef-dev mailing list archives

From "Joo Seong (Jason) Jeong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (REEF-1479) Define interface for distributed dataset
Date Thu, 30 Jun 2016 20:22:10 GMT

    [ https://issues.apache.org/jira/browse/REEF-1479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357785#comment-15357785 ]

Joo Seong (Jason) Jeong commented on REEF-1479:
-----------------------------------------------

In fact, many classes from the Org.Apache.REEF.IO package look reusable, such as {{IPartitionedInputDataSet}}
and {{IInputPartition}}.

> Define interface for distributed dataset 
> -----------------------------------------
>
>                 Key: REEF-1479
>                 URL: https://issues.apache.org/jira/browse/REEF-1479
>             Project: REEF
>          Issue Type: Sub-task
>          Components: REEF.NET
>            Reporter: Joo Seong (Jason) Jeong
>
> As a first step toward [REEF-1477|https://issues.apache.org/jira/browse/REEF-1477], we'd
> like to define an interface for the distributed dataset that we will work with. This dataset
> interface serves as an abstraction over many dataset partitions, one on each Evaluator. In some
> sense, the class {{IPartitionedInputDataSet}} is very similar to what we want, except that
> the new interface will also contain action methods like {{RunIMRU}} or {{RunTransform}}.
> {code}
> interface IDataSet<T> {
>   // apply a transform to this dataset
>   IDataSet<TOut> RunTransform<TOut>(Transform<T, TOut> transform);
>   // run an IMRU job on this dataset and get some results
>   TResult RunIMRU<TResult>(IMRUConfiguration<T, TResult> imruConfiguration);
>   // store this dataset to some destination
>   void Store(Uri uri);
> }
> interface IDataSetLoader<T> {
>   // generate a dataset from some source
>   IDataSet<T> Load(Uri uri);
> }
> {code}
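
To make the shape of the proposed interface concrete, here is a minimal single-process sketch, not REEF code. The second type parameter of the proposal (T') is written as a generic method parameter, since T' is not a valid C# identifier. Everything here is an assumption for illustration: {{Transform}} and {{IMRUConfiguration}} are hypothetical stand-ins (the issue does not define them), {{InMemoryDataSet}} is a made-up class whose inner lists play the role of per-Evaluator partitions, {{RunIMRU}} performs only one map-and-reduce pass with no iterate or update step, and {{Store}}/{{Load}} are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Transform type named in the proposal.
public delegate TOut Transform<TIn, TOut>(TIn input);

// Hypothetical stand-in for IMRUConfiguration: just a map and a reduce function.
public sealed class IMRUConfiguration<TIn, TResult>
{
    public Func<TIn, TResult> Map { get; }
    public Func<TResult, TResult, TResult> Reduce { get; }

    public IMRUConfiguration(Func<TIn, TResult> map,
                             Func<TResult, TResult, TResult> reduce)
    {
        Map = map;
        Reduce = reduce;
    }
}

// The proposed interface, minus Store, with T' as a method type parameter.
public interface IDataSet<T>
{
    IDataSet<TOut> RunTransform<TOut>(Transform<T, TOut> transform);
    TResult RunIMRU<TResult>(IMRUConfiguration<T, TResult> imruConfiguration);
}

// Assumption: each inner list stands in for one partition on one Evaluator.
public sealed class InMemoryDataSet<T> : IDataSet<T>
{
    private readonly IReadOnlyList<IReadOnlyList<T>> _partitions;

    public InMemoryDataSet(IEnumerable<IEnumerable<T>> partitions)
    {
        // Materialize every partition so later calls see a stable snapshot.
        _partitions = partitions
            .Select(p => (IReadOnlyList<T>)p.ToList())
            .ToList();
    }

    public int PartitionCount => _partitions.Count;

    // Applies the transform element by element, keeping the partitioning.
    public IDataSet<TOut> RunTransform<TOut>(Transform<T, TOut> transform)
    {
        return new InMemoryDataSet<TOut>(
            _partitions.Select(p => p.Select(x => transform(x))));
    }

    // One map pass over all elements followed by a pairwise reduce.
    // Throws InvalidOperationException if the dataset has no elements.
    public TResult RunIMRU<TResult>(IMRUConfiguration<T, TResult> imruConfiguration)
    {
        return _partitions
            .SelectMany(p => p)
            .Select(imruConfiguration.Map)
            .Aggregate(imruConfiguration.Reduce);
    }
}
```

With this sketch, {{RunTransform}} returns a new dataset with the same number of partitions, and {{RunIMRU}} collapses the dataset into a single value. In a real implementation both would presumably be submitted as jobs to the Evaluators holding the partitions rather than run in the calling process.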



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
