hadoop-hdfs-issues mailing list archives

From "Kai Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7343) HDFS smart storage management
Date Wed, 26 Oct 2016 23:19:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15610007#comment-15610007 ]

Kai Zheng commented on HDFS-7343:

Hi [~andrew.wang],

Regarding the SSD cases, I recall that one original input for the motivation came from a large
Hadoop deployment in China. The user was evaluating how to deploy a certain number of SSDs via
HSM to see whether they would help speed up some workloads. The main pain point they mentioned
was that they don't want their operators to manually maintain which data should be kept on SSDs
and when to move it out, whether as needed or in response to changing conditions.

I agree fixed SLOs are important, but I'm not sure they cover all the cases. In interactive
queries, for example, data miners may try different queries, adjusting the conditions,
combinations or the like, against the same data sets. We would expect the later runs to be
faster, while understanding that the earlier runs are slow. For repeatedly run queries such as
daily jobs, it may be natural to expect them to be faster, given that enough SSDs are available
at that time.

> HDFS smart storage management
> -----------------------------
>                 Key: HDFS-7343
>                 URL: https://issues.apache.org/jira/browse/HDFS-7343
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kai Zheng
>            Assignee: Wei Zhou
>         Attachments: HDFS-Smart-Storage-Management.pdf
> As discussed in HDFS-7285, it would be better to have a comprehensive and flexible storage
> policy engine considering file attributes, metadata, data temperature, storage type, EC codec,
> available hardware capabilities, user/application preference and etc.
> Modified the title for re-purpose.
> We'd extend this effort some bit and aim to work on a comprehensive solution to provide
> smart storage management service in order for convenient, intelligent and effective utilizing
> of erasure coding or replicas, HDFS cache facility, HSM offering, and all kinds of tools (balancer,
> mover, disk balancer and so on) in a large cluster.

