hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1869) access times of HDFS files
Date Fri, 29 Aug 2008 00:23:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626806#action_12626806 ]

dhruba borthakur commented on HADOOP-1869:

OK, from Raghu's and Konstantin's comments, I gather nobody is stuck on what the API looks like.
Whether it is setAccessTime() or utimes(), everybody is OK with it. It appears that both Raghu
and Konstantin are +1 on this one. Please let me know if this is not the case.

The point under discussion is whether utimes/setAccessTime allows setting the time
to any user-specified value or whether it sets it to the current time on the namenode. I still vote
for allowing a user to set any access time: this is what POSIX does, and it allows restore
utilities to use a standard API. From Raghu's comments, it appears to me that he is +1 on
it too.
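
A minimal sketch of the two semantics being compared, in case it helps the discussion. Everything here is hypothetical: AccessTimeSketch, touchAccessTime() and the in-memory map are stand-ins for illustration and are not part of any patch on this issue. The clock is injected so that "current time on the namenode" is explicit.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical in-memory stand-in for namenode metadata; not Hadoop code.
final class AccessTimeSketch {
    private final Map<String, Long> accessTimes = new HashMap<>();
    private final LongSupplier namenodeClock;

    AccessTimeSketch(LongSupplier namenodeClock) {
        this.namenodeClock = namenodeClock;
    }

    // Option A: utimes()-style. The caller supplies any access time,
    // so a restore utility can put back the value it saved earlier.
    void setAccessTime(String path, long atimeMillis) {
        if (atimeMillis < 0) {
            throw new IllegalArgumentException("negative access time: " + atimeMillis);
        }
        accessTimes.put(path, atimeMillis);
    }

    // Option B: the server stamps its own clock. The caller cannot
    // choose the value, so a saved access time cannot be restored.
    void touchAccessTime(String path) {
        accessTimes.put(path, namenodeClock.getAsLong());
    }

    long getAccessTime(String path) {
        Long t = accessTimes.get(path);
        if (t == null) {
            throw new IllegalArgumentException("no access time recorded for " + path);
        }
        return t;
    }
}
```

With option A, a restore tool would call setAccessTime(path, savedValue) after recreating the file; with option B the restored file always carries the time of the restore.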

> So to me it makes more sense to introduce a new create method with FileStatus of the existing
> file as a parameter or a copy method with an option to replicate FileStatus fields.

I do not like the idea of having a custom API as described above.

> access times of HDFS files
> --------------------------
>                 Key: HADOOP-1869
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1869
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.19.0
>         Attachments: accessTime1.patch, accessTime4.patch, accessTime5.patch
> HDFS should support some type of statistics that allows an administrator to determine
> when a file was last accessed.
> Since HDFS does not have quotas yet, it is likely that users keep on accumulating files
> in their home directories without much regard to the amount of space they are occupying. This
> causes memory-related problems with the namenode.
> Access times are costly to maintain. AFS does not maintain access times. I think DCE-DFS
> does maintain access times with a coarse granularity.
> One proposal for HDFS would be to implement something like an "access bit".
> 1. This access-bit is set when a file is accessed. If the access bit is already set,
> then this call does not result in a transaction.
> 2. A FileSystem.clearAccessBits() indicates that the access bits of all files need to
> be cleared.
> An administrator can effectively use the above mechanism (maybe a daily cron job) to
> determine files that are recently used.
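
The "access bit" proposal quoted above can be sketched roughly as follows. This is an illustration only: AccessBitSketch, recordAccess(), recentlyUsed() and the transaction counter are hypothetical names, and counting clearAccessBits() as a single transaction is an assumption the issue description does not state.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of the access-bit idea; not Hadoop code.
final class AccessBitSketch {
    private final Set<String> accessed = new HashSet<>();
    private long transactions = 0;

    // Step 1: set the bit on access. If the bit is already set,
    // nothing changes and no transaction is recorded.
    boolean recordAccess(String path) {
        if (accessed.add(path)) {
            transactions++;
            return true;
        }
        return false;
    }

    // Step 2: clear the bits of all files.
    // Assumption: this is logged as one transaction.
    void clearAccessBits() {
        accessed.clear();
        transactions++;
    }

    // Files accessed since the last clear, in sorted order.
    Set<String> recentlyUsed() {
        return new TreeSet<>(accessed);
    }

    long transactionCount() {
        return transactions;
    }
}
```

A daily job in this model would read recentlyUsed() and then call clearAccessBits(), so each file costs at most one transaction per day no matter how often it is read.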

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
