hadoop-hdfs-issues mailing list archives

From "Zesheng Wu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6382) HDFS File/Directory TTL
Date Wed, 11 Jun 2014 01:48:01 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14027325#comment-14027325 ]

Zesheng Wu commented on HDFS-6382:
----------------------------------

bq. Even if it's not implemented at first, we should think about the configuration required
here. I think we want the ability to email the admins when things go wrong. Possibly the notifier
could be pluggable or have several policies. There was nothing in the doc about configuration
in general, which I think we need to fix. For example, how is rate limiting configurable?
How do we notify admins that the rate is too slow to finish in the time given?
OK, I will update the document and post a new version soon.

bq. You can't delete a file in HDFS unless you have write permission on the containing directory.
Whether you have write permission on the file itself is not relevant. So I would expect the
same semantics here (probably enforced by setfacl itself).
That's reasonable; I'll spell this out clearly in the document.
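
The delete semantics quoted above can be illustrated with a minimal sketch. It assumes a simplified owner/group/other permission model and ignores superusers, ACL entries, and the sticky bit. `Inode`, `has_write`, and `can_ttl_delete` are hypothetical names used only for illustration; they are not HDFS APIs and are not taken from the design document.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Inode:
    """Hypothetical, simplified stand-in for a file or directory entry."""
    owner: str
    group: str
    mode: int  # POSIX-style permission bits, e.g. 0o755


def has_write(user: str, groups: FrozenSet[str], inode: Inode) -> bool:
    """Check the write bit of the single class (owner, group, other) that applies."""
    if user == inode.owner:
        return bool(inode.mode & 0o200)
    if inode.group in groups:
        return bool(inode.mode & 0o020)
    return bool(inode.mode & 0o002)


def can_ttl_delete(user: str, groups: FrozenSet[str],
                   parent: Inode, target: Inode) -> bool:
    """A TTL delete is allowed only if the user could delete the target directly.

    Only the containing directory's write permission matters; the target's
    own permission bits are deliberately not consulted.
    """
    return has_write(user, groups, parent)
```

In this model, a user who owns a writable directory may expire a read-only file owned by someone else inside it, while the file's owner may not if they lack write permission on the directory.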

> HDFS File/Directory TTL
> -----------------------
>
>                 Key: HDFS-6382
>                 URL: https://issues.apache.org/jira/browse/HDFS-6382
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client, namenode
>    Affects Versions: 2.4.0
>            Reporter: Zesheng Wu
>            Assignee: Zesheng Wu
>         Attachments: HDFS-TTL-Design.pdf
>
>
> In production environments we often want to keep backup files on HDFS for a fixed period
> and then have them deleted automatically. For example, we keep only 1 day's logs on local
> disk due to limited disk space, but we need about 1 month's logs in order to debug program
> bugs, so we keep all the logs on HDFS and delete those older than 1 month. This is a typical
> use case for an HDFS TTL, so we propose that HDFS support TTLs.
> The details of this proposal are as follows:
> 1. HDFS can support a TTL on a specified file or directory.
> 2. If a TTL is set on a file, the file will be deleted automatically after the TTL expires.
> 3. If a TTL is set on a directory, its child files and directories will be deleted
> automatically after the TTL expires.
> 4. A child file's or directory's TTL configuration should override its parent directory's.
> 5. A global configuration option is needed to control whether deleted files/directories
> go to the trash.
> 6. A global configuration option is needed to control whether a directory with a TTL is
> itself deleted once the TTL mechanism has emptied it.
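
Items 2 to 4 of the proposal can be sketched as follows. This is a minimal illustration, not the design from the attached document: TTLs are assumed to be stored in a plain mapping from path to seconds, and expiry is assumed to be measured from a file's modification time, which the proposal does not specify. `effective_ttl` and `is_expired` are hypothetical names.

```python
import posixpath
from typing import Dict, Optional


def effective_ttl(path: str, ttls: Dict[str, int]) -> Optional[int]:
    """Return the TTL in seconds that governs `path`, or None if there is none.

    The path's own TTL wins; otherwise the nearest ancestor directory's TTL
    applies, so a child setting overrides its parent's (item 4).
    """
    current = posixpath.normpath(path)
    while True:
        if current in ttls:
            return ttls[current]
        parent = posixpath.dirname(current)
        if parent == current:  # reached the root without finding a TTL
            return None
        current = parent


def is_expired(path: str, mtime: int, now: int, ttls: Dict[str, int]) -> bool:
    """Assumption: a file expires once `now - mtime` reaches its effective TTL."""
    ttl = effective_ttl(path, ttls)
    return ttl is not None and now - mtime >= ttl
```

For example, with a 30-day TTL on `/logs` and a 1-day TTL on `/logs/debug`, a file under `/logs/debug` expires after one day, while other files under `/logs` are kept for 30 days. Files outside `/logs` have no TTL and are never expired.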



--
This message was sent by Atlassian JIRA
(v6.2#6252)
