hadoop-hdfs-issues mailing list archives

From "Tsz Wo Nicholas Sze (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7878) API - expose an unique file identifier
Date Fri, 10 Apr 2015 01:02:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14488682#comment-14488682 ]

Tsz Wo Nicholas Sze commented on HDFS-7878:

> 1-2) I can replace with nullable, but it seems that it's cleaner to have hasFile/getFile.

Are users supposed to call hasFileID before calling getFileID every time?  For public APIs,
we need to design carefully since we cannot easily change them later on.
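
To make the trade-off concrete, here is a minimal sketch of the two API shapes under discussion. StatusWithFlag, StatusWithOptional and FileIdApiShapes are hypothetical stand-ins, not the classes in the patch, and the use of java.util.OptionalLong (Java 8) in place of a nullable return is an assumption made only for illustration.

```java
import java.util.OptionalLong;

// Hypothetical shape 1: a has/get pair. Callers must remember to check first.
class StatusWithFlag {
    private final boolean hasId;
    private final long fileId;

    StatusWithFlag(boolean hasId, long fileId) {
        this.hasId = hasId;
        this.fileId = fileId;
    }

    boolean hasFileId() {
        return hasId;
    }

    long getFileId() {
        // Calling get without checking has first fails loudly.
        if (!hasId) {
            throw new IllegalStateException("file ID not available");
        }
        return fileId;
    }
}

// Hypothetical shape 2: one accessor whose return type carries the "absent" case.
class StatusWithOptional {
    private final OptionalLong fileId;

    StatusWithOptional(OptionalLong fileId) {
        this.fileId = fileId;
    }

    OptionalLong getFileId() {
        return fileId;
    }
}

class FileIdApiShapes {
    // Shape 1 needs two calls; returns -1 when the ID is absent.
    static long idOrDefault(StatusWithFlag s) {
        return s.hasFileId() ? s.getFileId() : -1L;
    }

    // Shape 2 needs one call; returns -1 when the ID is absent.
    static long idOrDefault(StatusWithOptional s) {
        return s.getFileId().orElse(-1L);
    }
}
```

With the has/get pair, forgetting the check surfaces only at run time; with a single accessor, the absent case is visible in the return type.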

> It is a public API. ... the intent is to get file ID and open files by file ID ...

Then why not add a FileSystem.open(fileID) method instead?  Yes, we need a public API,
but not necessarily makePathFromFileId(..).  So I suggest adding it to DFSUtil for the moment.
We can move it to HdfsUtils once we have decided it is the right API.
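
A rough sketch of the difference between the two choices. It assumes file IDs are resolved through a reserved inode path of the form /.reserved/.inodes/<id>. TinyFs and both of its open methods are illustrative stand-ins, not Hadoop's FileSystem and not the signatures in the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a file system; not Hadoop's FileSystem.
class TinyFs {
    // Assumption: inode IDs can be addressed through a reserved path prefix.
    static final String INODE_PREFIX = "/.reserved/.inodes/";

    private final Map<Long, String> contentById = new HashMap<>();

    void put(long fileId, String content) {
        contentById.put(fileId, content);
    }

    // Option A: expose a path builder; callers pass the result to open(path).
    static String makePathFromFileId(long fileId) {
        return INODE_PREFIX + fileId;
    }

    String open(String path) {
        if (!path.startsWith(INODE_PREFIX)) {
            throw new IllegalArgumentException(
                "only inode paths are supported in this sketch: " + path);
        }
        long id = Long.parseLong(path.substring(INODE_PREFIX.length()));
        return open(id);
    }

    // Option B: open by ID directly; the path encoding stays an internal detail.
    String open(long fileId) {
        String content = contentById.get(fileId);
        if (content == null) {
            throw new IllegalStateException("no file with ID " + fileId);
        }
        return content;
    }
}
```

Option A makes the path encoding part of the public contract; Option B leaves room to change it later.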

> 4) What kind of testing? It adds a field.

I would say every FileSystem test that uses a regular Path should have a fileID Path counterpart;
see FileSystemContractBaseTest.  What use cases do you have in mind?  You should also test them.
This patch also exposes the fileID feature to users, so we should make sure that the feature works.

> API - expose an unique file identifier
> --------------------------------------
>                 Key: HDFS-7878
>                 URL: https://issues.apache.org/jira/browse/HDFS-7878
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HDFS-7878.01.patch, HDFS-7878.02.patch, HDFS-7878.03.patch, HDFS-7878.04.patch, HDFS-7878.05.patch, HDFS-7878.patch
> See HDFS-487.
> Even though that is resolved as duplicate, the ID is actually not exposed by the JIRA it supposedly duplicates.
> INode ID for the file should be easy to expose; alternatively ID could be derived from block IDs, to account for appends...
> This is useful e.g. for cache key by file, to make sure cache stays correct when file is overwritten.
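
The cache use case in the description can be sketched as follows. FileIdCache is a hypothetical class, and the sketch assumes that overwriting a file yields a new file ID while the path stays the same.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache keyed by file ID rather than by path.
class FileIdCache {
    private final Map<Long, String> byFileId = new HashMap<>();

    void put(long fileId, String data) {
        byFileId.put(fileId, data);
    }

    // Returns null on a miss.
    String get(long fileId) {
        return byFileId.get(fileId);
    }
}
```

Under that assumption, a lookup made with the ID of the overwritten file misses the cache instead of returning stale data, which a path-keyed cache could not guarantee.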

This message was sent by Atlassian JIRA
