hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4213) When the client calls hsync, allows the client to update the file length in the NameNode
Date Wed, 28 Nov 2012 02:42:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505179#comment-13505179 ]

Aaron T. Myers commented on HDFS-4213:
--------------------------------------

bq. Why don't you change the enum to SyncFlag?

Good thinking, Nicholas. I agree that SyncFlag is better.

bq. Aaron, the sync with UPDATE_LENGTH feature is quite specific to HDFS. I am not sure if
it makes sense to other file systems. Do you have examples that it does make sense? A follow
up question is that how should the default implementation be?

I agree with you 100% that the UPDATE_LENGTH flag in particular seems highly HDFS-specific.
What doesn't seem so HDFS-specific to me, however, is an {{hsync(...)}} method which takes
an EnumSet as an argument of extra flags. Given that we currently have {{hsync()}} (no args)
as a method in FSDataOutputStream, I think we should consider also adding {{hsync(EnumSet<SyncFlag>)}}
to FSDataOutputStream. FS implementations which don't do anything with a particular SyncFlag
can just ignore it. This is similar to how {{create(... EnumSet<CreateFlag> ...)}} works
in the existing FileSystem interface. Not every FS may implement every CreateFlag, but that's
fine.
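
To make the suggestion concrete, here is a minimal, self-contained sketch of the shape I have in mind. It does not use the real Hadoop classes: {{GenericStream}} and {{HdfsLikeStream}} are hypothetical stand-ins for FSDataOutputStream and an HDFS-backed stream, the fields are invented for illustration, and the default behavior shown (ignore the flags and fall back to a plain {{hsync()}}) is only one possible answer to the question about the default implementation.

```java
import java.util.EnumSet;

public class SyncFlagSketch {
    /** Flags for hsync; UPDATE_LENGTH is the only one discussed in this issue. */
    enum SyncFlag { UPDATE_LENGTH }

    /** Hypothetical stand-in for FSDataOutputStream (not the real class). */
    static class GenericStream {
        int syncCount = 0;

        void hsync() {
            syncCount++; // pretend the data was synced to the underlying store
        }

        // Assumed default: a file system that knows nothing about a flag
        // ignores it and performs an ordinary hsync().
        void hsync(EnumSet<SyncFlag> flags) {
            hsync();
        }
    }

    /** Hypothetical stand-in for an HDFS stream that honors UPDATE_LENGTH. */
    static class HdfsLikeStream extends GenericStream {
        long bytesWritten = 0;
        long lengthAtNameNode = 0; // illustrative field, not a real HDFS member

        void write(byte[] data) {
            bytesWritten += data.length;
        }

        @Override
        void hsync(EnumSet<SyncFlag> flags) {
            hsync();
            if (flags.contains(SyncFlag.UPDATE_LENGTH)) {
                // In HDFS this would be a NameNode RPC; here it is just a copy.
                lengthAtNameNode = bytesWritten;
            }
        }
    }

    public static void main(String[] args) {
        HdfsLikeStream out = new HdfsLikeStream();
        out.write(new byte[128]);
        out.hsync(EnumSet.of(SyncFlag.UPDATE_LENGTH));
        System.out.println("length recorded: " + out.lengthAtNameNode);
    }
}
```

A caller written against {{GenericStream}} can pass {{EnumSet.of(SyncFlag.UPDATE_LENGTH)}} without knowing which file system is underneath, which is the same trade-off {{create(... EnumSet<CreateFlag> ...)}} already makes.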

Note that I'm not wedded to this suggestion. I agree that this isn't very pertinent right
now since SyncFlag contains just UPDATE_LENGTH, but if we ever add a SyncFlag that is relevant
to multiple FileSystem implementations then it will make sense to have SyncFlag in Common
and the {{hsync(EnumSet<SyncFlag>)}} method as part of FSDataOutputStream.
                
> When the client calls hsync, allows the client to update the file length in the NameNode
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-4213
>                 URL: https://issues.apache.org/jira/browse/HDFS-4213
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 3.0.0
>            Reporter: Jing Zhao
>            Assignee: Jing Zhao
>         Attachments: HDFS-4213.001.patch, HDFS-4213.002.patch, HDFS-4213.003.patch, HDFS-4213.004.patch, HDFS-4213.005.patch, HDFS-4213.006.patch
>
>
> As per discussion in HDFS-3960 and HDFS-2802, when clients that need strong consistency
> update the file length at the NameNode, a special sync/flush is required to get the length
> of files that are being written when snapshots are taken for these files. This jira implements
> this sync-with-updating-length by 1) calling ClientProtocol#fsync(), and 2) adding a new field
> to ClientProtocol#fsync() to indicate the length information.

