hadoop-hdfs-issues mailing list archives

From "Xiao Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7343) HDFS smart storage management
Date Mon, 17 Oct 2016 20:06:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15583292#comment-15583292 ]

Xiao Chen commented on HDFS-7343:

Thanks all for the great documentation and discussions. It will be an interesting undertaking.

It may be too early to ask: in order to do HDFS management work, the SSM has to run as the hdfs superuser,

Related to Andrew's question on performance-based decisions: are those decisions manual, automatic,
or both?
The doc says {{SSM can make prediction on a file’s read based on read historical information
and cache the file automatically before the read operation happens}}, and later gives an example
of a similar rule ({{every 1d at 0:00 | age lt 30d | cache}}). I read that as both: the
description covers the automatic part, and the rule shows the same kind of action under manual
control. Is that right?
If the query is not latency-sensitive, automatic caching and uncaching may be unnecessary.
Is it possible to turn off the automatic behavior for some workloads? I can
think of similar cases where converting between EC <-> replica may not be necessary.
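
As an illustration of how the quoted rule might be read, here is a minimal Python sketch. It assumes the rule is three `|`-separated fields (a trigger, a condition, and an action) and that `age lt 30d` means "file age is less than 30 days". Both are guesses based only on the single example quoted above, not the SSM rule grammar from the design doc, and the names `Rule`, `parse_rule` and `age_condition_matches` are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Rule:
    trigger: str    # when the rule is checked, e.g. "every 1d at 0:00"
    condition: str  # which files it applies to, e.g. "age lt 30d"
    action: str     # what to do with matching files, e.g. "cache"


def parse_rule(text: str) -> Rule:
    """Split a rule into its three assumed '|'-separated fields."""
    parts = [p.strip() for p in text.split("|")]
    if len(parts) != 3:
        raise ValueError(f"expected 3 '|'-separated fields, got {len(parts)}")
    return Rule(*parts)


def age_condition_matches(condition: str, file_age: timedelta) -> bool:
    """Evaluate only the 'age lt <N>d' form; any other condition is rejected."""
    field, op, value = condition.split()
    if field != "age" or op != "lt" or not value.endswith("d"):
        raise ValueError(f"unsupported condition: {condition!r}")
    return file_age < timedelta(days=int(value[:-1]))


rule = parse_rule("every 1d at 0:00 | age lt 30d | cache")
```

Under this reading, a file that is 5 days old would match the condition and be cached at the next trigger, while a file that is 45 days old would not. Whether the automatic, prediction-based path can be switched off separately from such rules is the open question above.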

> HDFS smart storage management
> -----------------------------
>                 Key: HDFS-7343
>                 URL: https://issues.apache.org/jira/browse/HDFS-7343
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kai Zheng
>            Assignee: Wei Zhou
>         Attachments: HDFS-Smart-Storage-Management.pdf
> As discussed in HDFS-7285, it would be better to have a comprehensive and flexible storage
> policy engine considering file attributes, metadata, data temperature, storage type, EC codec,
> available hardware capabilities, user/application preference and etc.
> Modified the title for re-purpose.
> We'd extend this effort some bit and aim to work on a comprehensive solution to provide
> smart storage management service in order for convenient, intelligent and effective utilizing
> of erasure coding or replicas, HDFS cache facility, HSM offering, and all kinds of tools (balancer,
> mover, disk balancer and so on) in a large cluster.

This message was sent by Atlassian JIRA

