hadoop-hdfs-issues mailing list archives

From "Kai Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7343) A comprehensive and flexible storage policy engine
Date Tue, 04 Nov 2014 06:07:33 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195760#comment-14195760 ]

Kai Zheng commented on HDFS-7343:

Excerpted from [~Andrew.wang]'s [meeting notes|https://issues.apache.org/jira/browse/HDFS-7285?focusedCommentId=14192480&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14192480]:
* Agreement that StoragePolicy is currently inflexible: a hardcoded specification
of which StorageTypes to use.
* Need some higher-level blob of code that looks at file attributes, access patterns, other
higher-level metadata, and then based on that chooses the StorageType for the data.
* Ideally, flow looks like (file attributes and metadata) -> (policy engine) -> (data
temperature) -> (storage types / EC / compression to use based on what's present in cluster).
* User would not be allowed to manually set the data temperature, but could query it. This
prevents users and the policy engine from fighting each other.
* Users could possibly set some kind of "force" xattr though, which the policy engine would
* Issue of keeping the policy consistent. Things like the balancer and mover need to be aware
of the policy so they don't fight it. How is the policy distributed if it's not hardcoded,
and is some code blob? Ultimately would be good to move these responsibilities back into the
* Question of what, if anything, needs to be changed in branch-2.6. Since custom StoragePolicies
are not allowed, we should be good as long as whatever future policy engine respects the current
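
To make the flow in the notes above concrete, here is a minimal sketch of (file attributes and metadata) -> (policy engine) -> (data temperature) -> (storage type, based on what is present in the cluster). Everything in it is an assumption for illustration: Temperature, Medium, FileInfo, PolicyEngineSketch, the single last-access attribute, and the 7/90-day thresholds are hypothetical and are not HDFS classes or agreed design. EC and compression choices are left out.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical types for illustration only; none of these are HDFS classes.
enum Temperature { HOT, WARM, COLD }

// Stand-in for the storage media a cluster may offer.
enum Medium { SSD, DISK, ARCHIVE }

// The file attributes and metadata the engine looks at (here, just one).
final class FileInfo {
    final String path;
    final long daysSinceLastAccess;

    FileInfo(String path, long daysSinceLastAccess) {
        this.path = path;
        this.daysSinceLastAccess = daysSinceLastAccess;
    }
}

final class PolicyEngineSketch {
    // Step 1: (file attributes and metadata) -> (data temperature).
    // The 7- and 90-day thresholds are made-up placeholders.
    static Temperature classify(FileInfo file) {
        if (file.daysSinceLastAccess <= 7) return Temperature.HOT;
        if (file.daysSinceLastAccess <= 90) return Temperature.WARM;
        return Temperature.COLD;
    }

    // Step 2: (data temperature) -> (storage medium), limited to what the cluster has.
    static Medium choose(Temperature temperature, Set<Medium> available) {
        List<Medium> preference;
        switch (temperature) {
            case HOT:
                preference = Arrays.asList(Medium.SSD, Medium.DISK, Medium.ARCHIVE);
                break;
            case WARM:
                preference = Arrays.asList(Medium.DISK, Medium.ARCHIVE, Medium.SSD);
                break;
            default: // COLD
                preference = Arrays.asList(Medium.ARCHIVE, Medium.DISK, Medium.SSD);
                break;
        }
        // Fall back down the preference list until something present is found.
        for (Medium medium : preference) {
            if (available.contains(medium)) return medium;
        }
        throw new IllegalStateException("no storage medium available");
    }

    public static void main(String[] args) {
        Set<Medium> cluster = EnumSet.of(Medium.DISK, Medium.ARCHIVE); // no SSD here
        FileInfo file = new FileInfo("/data/example", 3);
        Temperature t = classify(file);
        System.out.println(file.path + " -> " + t + " -> " + choose(t, cluster));
    }
}
```

In this sketch the temperature is only ever produced by classify(), which mirrors the point that users could query the temperature but not set it. A "force" override or a way to distribute the policy to the balancer and mover would need additional design that the notes leave open.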

> A comprehensive and flexible storage policy engine
> --------------------------------------------------
>                 Key: HDFS-7343
>                 URL: https://issues.apache.org/jira/browse/HDFS-7343
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Kai Zheng
> As discussed in HDFS-7285, it would be better to have a comprehensive and flexible storage
policy engine considering file attributes, metadata, data temperature, storage type, EC codec,
available hardware capabilities, user/application preferences, etc.

