reef-dev mailing list archives

From "Julia (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (REEF-873) Fix DataSet id issue
Date Wed, 20 Jan 2016 21:38:39 GMT

    [ https://issues.apache.org/jira/browse/REEF-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109485#comment-15109485 ]

Julia commented on REEF-873:
----------------------------

Yes, we currently derive the id from the input file path, which meets the requirement of this Jira.

However, many of the file names/paths we receive are auto-generated from GUIDs, so they may not
be meaningful anyway.

> Fix DataSet id issue
> --------------------
>
>                 Key: REEF-873
>                 URL: https://issues.apache.org/jira/browse/REEF-873
>             Project: REEF
>          Issue Type: Bug
>            Reporter: Julia
>            Assignee: Julia
>
> Currently the id of the dataset is formed by extracting a string from the input file
> name. In reality, the input file name can itself be generated with random numbers,
> so the file name has no meaning at all. And since the file name can be very long,
> it can also break the current way of forming the id.
> We should remove the id generation's dependency on input file names and instead come
> up with an id made of a prefix like "FileSystemDataSet-" plus a guid, for example, as
> long as it is unique.
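
The id scheme proposed in the description could look roughly like the sketch below. It is
illustrative Python, not REEF code: the names DATASET_ID_PREFIX and new_dataset_id are
hypothetical, and only the "FileSystemDataSet-" prefix comes from the issue text. A random
(version 4) UUID stands in for the guid.

```python
import uuid

# Prefix taken from the example in the issue description; the constant name is hypothetical.
DATASET_ID_PREFIX = "FileSystemDataSet-"


def new_dataset_id(prefix: str = DATASET_ID_PREFIX) -> str:
    """Build a dataset id from a fixed prefix plus a random UUID.

    The id does not depend on the input file name or path, so long or
    auto-generated file names cannot affect it.
    """
    return f"{prefix}{uuid.uuid4()}"


if __name__ == "__main__":
    # Each call yields a different id with the same prefix.
    print(new_dataset_id())
    print(new_dataset_id())
```

Because the UUID part has a fixed length of 36 characters, the id length is bounded
regardless of how long the input file name is.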



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
