hadoop-hdfs-issues mailing list archives

From "Manoj Govindassamy (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-12191) Provide option to not capture the accessTime change of a file to snapshot if no other modification has been done
Date Tue, 25 Jul 2017 18:58:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16100531#comment-16100531
] 

Manoj Govindassamy edited comment on HDFS-12191 at 7/25/17 6:57 PM:
--------------------------------------------------------------------

[~jingzhao], [~zhz], Nicholas,
I would like to understand the rationale behind an "atime"-only or "mtime"-only change being part
of recordModification(). Technically speaking, it makes sense to record a modification for any
metadata change, and since setTime() changes the metadata, it records a
modification. So I would like to know what we would be missing if we skip recording
a modification for "atime" changes. Your thoughts, please?

Also, if we want to skip recording "atime"-only changes, the existing access time
precision configuration sounds like a good approach. Adding one more configuration to
skip atime in the snapshot diffs, which would override the existing access time precision,
doesn't fit well.


was (Author: manojg):
[~jingzhao], [~zhz], Nicholas,
I would like to understand the rationale behind an "atime"-only or "mtime"-only change being part
of recordModification(). Technically speaking, it makes sense to record a modification for any
metadata change, and since setTime() changes the metadata, it records a
modification. So I would like to know what we would be missing if we skip recording
a modification for "atime" changes.

Also, if we want to skip recording "atime"-only changes, the existing access time
precision configuration sounds like a good approach. Adding one more configuration to
skip atime in the snapshot diffs, which would override the existing access time precision,
doesn't fit well. Your thoughts, please?

> Provide option to not capture the accessTime change of a file to snapshot if no other
modification has been done
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-12191
>                 URL: https://issues.apache.org/jira/browse/HDFS-12191
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs, namenode
>    Affects Versions: 3.0.0-beta1
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-12191.001.patch
>
>
> Currently, if the accessTime of a file changes before a snapshot is taken, this accessTime
will be captured in the snapshot, even if no other modification has been made to this file.
> Because of this, when we calculate snapshotDiff, more work needs to be done for this file,
e.g., the metadataEquals method will be called, even if no modification has been made (and thus
nothing is recorded in snapshotDiff). This can slow snapshotDiff down considerably when there
are a lot of files to be examined.
> This jira is to provide an option to skip capturing accessTime-only changes in the snapshot,
so that snapshotDiff can be done faster.
> When the accessTime of a file changes and there is another modification to the file, the access
time will still be captured in the snapshot.
> Sometimes we want accessTime to be captured in the snapshot, so that when restoring from the
snapshot, we know the accessTime of this snapshot. So this new feature is optional and is
controlled by a config property.
> It is worth mentioning that how accurately the accessTime is captured depends on the following
config, which has a default value of 1 hour, meaning a new access within an hour of the previous
access will not be captured.
> {code}
> public static final String  DFS_NAMENODE_ACCESSTIME_PRECISION_KEY =
>       HdfsClientConfigKeys.DeprecatedKeys.DFS_NAMENODE_ACCESSTIME_PRECISION_KEY;
> public static final long    DFS_NAMENODE_ACCESSTIME_PRECISION_DEFAULT = 3600000;
> {code}
> .
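
As a rough illustration of the two decisions discussed in this thread, here is a minimal Java sketch. It is not the HDFS implementation: the class name, the method names, and the skipAtimeOnlyChange flag are all hypothetical, and only the 3600000 ms default is taken from the snippet above. It assumes that a precision of 0 disables access time tracking and that an access is recorded only when it falls more than one precision interval after the stored access time.

```java
// Illustrative sketch only; none of these names exist in HDFS.
class AccessTimeSketch {
    // Default from the quoted config: 1 hour, in milliseconds.
    static final long ACCESSTIME_PRECISION_DEFAULT_MS = 3600000L;

    /**
     * Returns true when a new access is far enough past the stored atime
     * to be recorded. Assumption: precision <= 0 means atime is not tracked.
     */
    static boolean shouldUpdateAccessTime(long storedAtimeMs, long nowMs,
                                          long precisionMs) {
        if (precisionMs <= 0) {
            return false;
        }
        return nowMs > storedAtimeMs + precisionMs;
    }

    /**
     * Hypothetical version of the proposed option: record a time change to
     * the snapshot unless it is an atime-only change and skipping is enabled.
     */
    static boolean shouldRecordToSnapshot(boolean mtimeChanged,
                                          boolean atimeChanged,
                                          boolean skipAtimeOnlyChange) {
        if (mtimeChanged) {
            return true; // any mtime change is always recorded
        }
        return atimeChanged && !skipAtimeOnlyChange;
    }
}
```

With the default precision, a read 30 minutes after the stored access time would not update it, so there would be no atime-only change for the snapshot to capture in the first place. The flag only matters for accesses that pass the precision check.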



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

