reef-dev mailing list archives

From "Julia (JIRA)" <>
Subject [jira] [Commented] (REEF-1206) Update Deserializer API to allow deserialize from remote files directly
Date Sat, 20 Feb 2016 01:30:18 GMT


Julia commented on REEF-1206:

I have removed the flag from the API parameters. It is up to the implementer to use
FileSystem to read the data.

BTW, in the scenario where we need to copy the entire file from remote to local, we use the
IFileSystem to download the data, like HadoopFileSystem in IInputPartition. In Deserializer,
as the data is already local, we always treat it as a local file. If we wanted to use
an IFileSystem in this case, that would be a different IFileSystem for downloading the file in

> Update Deserializer API to allow deserialize from remote files directly
> -----------------------------------------------------------------------
>                 Key: REEF-1206
>                 URL:
>             Project: REEF
>          Issue Type: Task
>            Reporter: Julia
>            Assignee: Julia
> Currently the Deserialize() API accepts a folder as its input parameter. This only works for local
files, because only in the local case can we guarantee that all the files are copied to the same folder
and that the folder contains only the files that need to be deserialized.
> For the remote file case, the file paths are passed in from upstream, and the folder may
contain other, irrelevant files. In this case, passing a set of individual file paths
would make more sense.
> The deserializer also needs to know whether the file paths are local or remote. For remote
files, the deserializer must use IFileSystem to access the files. For local files, the normal .NET
file system can simply be used.
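
To make the proposal concrete, here is a minimal sketch of a deserializer that takes an explicit set of file paths instead of a folder. It is written in Python purely for illustration and is not the REEF.NET API: the names deserialize, local_opener, and make_in_memory_opener are hypothetical, and the "opener" callable is only an assumed stand-in for the role IFileSystem plays for remote files. The record format (one UTF-8 record per line) is also an assumption.

```python
import io
import os
import tempfile
from typing import BinaryIO, Callable, Dict, Iterable, Iterator, List

# Hypothetical type: any callable that opens a path (local or remote) for reading.
Opener = Callable[[str], BinaryIO]


def local_opener(path: str) -> BinaryIO:
    """Open a file through the ordinary local file API."""
    return open(path, "rb")


def make_in_memory_opener(blobs: Dict[str, bytes]) -> Opener:
    """Build an opener that simulates a remote store with an in-memory dict."""
    def opener(path: str) -> BinaryIO:
        return io.BytesIO(blobs[path])
    return opener


def deserialize(file_paths: Iterable[str],
                opener: Opener = local_opener) -> Iterator[str]:
    """Yield one record per line, reading exactly the listed files.

    The caller passes individual paths, so other files that happen to
    sit in the same folder are never touched.
    """
    for path in file_paths:
        with opener(path) as stream:
            for raw in stream:
                yield raw.decode("utf-8").rstrip("\n")


def demo() -> List[str]:
    """Write three files to a temp folder and deserialize only two of them."""
    with tempfile.TemporaryDirectory() as folder:
        contents = {
            "a.txt": b"1\n2\n",
            "b.txt": b"3\n",
            "unrelated.log": b"noise\n",  # must be ignored
        }
        for name, data in contents.items():
            with open(os.path.join(folder, name), "wb") as f:
                f.write(data)
        wanted = [os.path.join(folder, "a.txt"), os.path.join(folder, "b.txt")]
        return list(deserialize(wanted))
```

In this sketch the caller decides how files are opened, so the deserializer itself needs no local/remote flag; whether that matches the final REEF design is not confirmed by this thread.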

This message was sent by Atlassian JIRA
