hadoop-hdfs-issues mailing list archives

From "Lei (Eddy) Xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7878) API - expose an unique file identifier
Date Thu, 22 Sep 2016 22:13:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15514629#comment-15514629 ]

Lei (Eddy) Xu commented on HDFS-7878:

[~chris.douglas] Thanks a lot for working on it. 

I'd prefer to use {{FileStatus}} with a {{file id}} instead of a new {{FileHandler}} or {{InodeId}}.
It is more familiar to users, and people who want to use the {{FileId}} alone to save
memory (e.g., as a cache key) still have the choice of calling {{FileStatus#getFileId()}}.
I think this would make the API easier for downstream projects to adopt. Additionally, {{InodeId}}
looks implementation-specific to me, which could make the API less useful to, or harder to support
natively on, other backends (e.g., Azure or S3). One additional point is that {{stat(2)}} returns the
inode number ({{stat.st_ino}}) as well, so it should not be too surprising for end users.
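
To illustrate the cache use case, here is a minimal sketch in plain Java with no Hadoop dependency. {{StatusWithId}} and {{FileIdCache}} are hypothetical stand-ins invented for this example, the {{getFileId()}} accessor only mirrors the one proposed above, and the id values are made up.

```java
import java.util.HashMap;
import java.util.Map;

public class FileIdCacheSketch {

    /** Hypothetical stand-in for a FileStatus that also carries a file id. */
    static final class StatusWithId {
        private final String path;
        private final long length;
        private final long fileId;

        StatusWithId(String path, long length, long fileId) {
            this.path = path;
            this.length = length;
            this.fileId = fileId;
        }

        String getPath() { return path; }
        long getLength() { return length; }
        long getFileId() { return fileId; }
    }

    /** Cache keyed by the file id alone, so only a long is retained per entry. */
    static final class FileIdCache<V> {
        private final Map<Long, V> entries = new HashMap<>();

        void put(StatusWithId status, V value) {
            entries.put(status.getFileId(), value);
        }

        /** Returns the cached value, or null if this file id has not been seen. */
        V get(StatusWithId status) {
            return entries.get(status.getFileId());
        }
    }

    public static void main(String[] args) {
        FileIdCache<String> cache = new FileIdCache<>();

        StatusWithId original = new StatusWithId("/data/part-0", 128L, 16389L);
        cache.put(original, "metadata v1");

        // Same path, but the file was overwritten; this sketch assumes the
        // backend assigns a fresh id to the new file.
        StatusWithId overwritten = new StatusWithId("/data/part-0", 256L, 16402L);

        System.out.println(cache.get(original));    // prints "metadata v1"
        System.out.println(cache.get(overwritten)); // prints "null"
    }
}
```

Callers who need the full status keep using the status object, while callers who only need identity keep the {{long}}. A path-keyed cache would have returned the stale entry for the overwritten file.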

And it might be worthwhile to take this chance to finally make {{FileStatus}} serializable
via Protobuf (HDFS-6984) from Hadoop 3 onward.


> API - expose an unique file identifier
> --------------------------------------
>                 Key: HDFS-7878
>                 URL: https://issues.apache.org/jira/browse/HDFS-7878
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-7878.01.patch, HDFS-7878.02.patch, HDFS-7878.03.patch, HDFS-7878.04.patch, HDFS-7878.05.patch, HDFS-7878.06.patch, HDFS-7878.patch
> See HDFS-487.
> Even though that is resolved as duplicate, the ID is actually not exposed by the JIRA it supposedly duplicates.
> INode ID for the file should be easy to expose; alternatively ID could be derived from block IDs, to account for appends...
> This is useful e.g. for cache key by file, to make sure cache stays correct when file is overwritten.

This message was sent by Atlassian JIRA

