hadoop-common-dev mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4058) Transparent archival and restore of files from HDFS
Date Wed, 03 Sep 2008 18:09:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628079#action_12628079 ]

dhruba borthakur commented on HADOOP-4058:

I agree with Allen to a certain extent. The design has to be such that the "retention-policy"
is part of a layer written above the HDFS file system. I would like the design to be similar
to the approach adopted by the block-rebalancing code. The block-rebalancer is not in the namenode.
It uses a few primitives in the namenode, but most of the logic of what-to-move, when-to-move,
etc. is handled outside the namenode.

I would like the "transparent-archiving-and-restoring" code to be designed similarly. I think
the implementation would be more than a "bunch of scripts"! It would involve one or more servers
that continuously monitor the access patterns in a cluster and take appropriate action.
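
To make the layering concrete, here is a rough sketch of one pass of such an external policy
server. It is only an illustration under stated assumptions: the names FileInfo,
StoragePrimitives, listFiles and moveToArchive are hypothetical and are not existing HDFS or
namenode APIs, and "idle for longer than a threshold" is just one possible retention policy.

```java
import java.util.ArrayList;
import java.util.List;

final class ArchivalPolicySketch {

    /** Hypothetical view of one file's metadata, as an external server might see it. */
    static final class FileInfo {
        final String path;
        final long lastAccessMillis;

        FileInfo(String path, long lastAccessMillis) {
            this.path = path;
            this.lastAccessMillis = lastAccessMillis;
        }
    }

    /** Hypothetical minimal primitives the file system would expose (assumption, not a real API). */
    interface StoragePrimitives {
        List<FileInfo> listFiles();
        void moveToArchive(String path);
    }

    /** The what-to-move decision, made entirely outside the namenode. */
    static List<String> selectForArchival(List<FileInfo> files, long nowMillis, long maxIdleMillis) {
        List<String> selected = new ArrayList<>();
        for (FileInfo f : files) {
            // A file qualifies when it has not been accessed for longer than the threshold.
            if (nowMillis - f.lastAccessMillis > maxIdleMillis) {
                selected.add(f.path);
            }
        }
        return selected;
    }

    /** One pass of the monitoring loop; returns how many files were handed to the archive primitive. */
    static int runOnce(StoragePrimitives fs, long nowMillis, long maxIdleMillis) {
        List<String> toMove = selectForArchival(fs.listFiles(), nowMillis, maxIdleMillis);
        for (String path : toMove) {
            fs.moveToArchive(path);
        }
        return toMove.size();
    }
}
```

The point of the split is that selectForArchival holds all of the policy, while the file system
only has to provide the two small primitives, much as the rebalancer relies on a few namenode
primitives.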

I like the idea of having a volume-abstraction. But I think this idea of auto-archiving-and-restore
is orthogonal to "volumes". Could you please explain how they are related?

> Transparent archival and restore of files from HDFS
> ---------------------------------------------------
>                 Key: HADOOP-4058
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4058
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
> There should be a facility to migrate old files away from a production cluster. Access
> to those files from applications should continue to work transparently, without changing application
> code, but maybe with reduced performance. The policy engine that does this could be layered
> on HDFS rather than being built into HDFS itself.

