hadoop-hdfs-issues mailing list archives

From "Rakesh R (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7343) HDFS smart storage management
Date Sun, 16 Apr 2017 06:29:41 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970253#comment-15970253 ]

Rakesh R commented on HDFS-7343:
--------------------------------

bq. It also can bring performance issue. For example, if a rule only cares the last 5s data,
then the whole large table will be scanned in order to filter out the records needed.
Thanks [~zhouwei] for the example. I understand that SSM will have a data aggregation function
to keep data volume under control, and that the aggregation window plays an important role in
this module. Since this is a performance-sensitive area, I'd suggest keeping the database module
interface (store/retrieve/aggregation functions) pluggable during implementation, so that
changes can be made later based on the test results.
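
To make the suggestion concrete, here is a minimal sketch of what a pluggable store/retrieve/aggregation boundary might look like. Every name in it (AccessRecord, MetricStore, InMemoryMetricStore) is hypothetical and is not taken from the SSM design documents; the in-memory backend is only a stand-in for whatever database the implementation ends up using.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// All names below are hypothetical; they illustrate the idea, not the SSM design.

/** One access-count sample for a file at a given time (milliseconds, non-negative). */
final class AccessRecord {
    final String path;
    final long timestampMs;
    final int count;

    AccessRecord(String path, long timestampMs, int count) {
        this.path = path;
        this.timestampMs = timestampMs;
        this.count = count;
    }
}

/** The pluggable boundary: rule evaluation would depend only on this interface. */
interface MetricStore {
    void store(AccessRecord record);

    /** Returns records with fromMs <= timestamp < toMs. */
    List<AccessRecord> retrieve(long fromMs, long toMs);

    /** Merges records older than cutoffMs into one record per path per window. */
    void aggregate(long cutoffMs, long windowMs);
}

/** Simple in-memory backend; another backend could replace it without touching callers. */
final class InMemoryMetricStore implements MetricStore {
    private final List<AccessRecord> records = new ArrayList<>();

    @Override
    public void store(AccessRecord record) {
        records.add(record);
    }

    @Override
    public List<AccessRecord> retrieve(long fromMs, long toMs) {
        List<AccessRecord> out = new ArrayList<>();
        for (AccessRecord r : records) {
            if (r.timestampMs >= fromMs && r.timestampMs < toMs) {
                out.add(r);
            }
        }
        return out;
    }

    @Override
    public void aggregate(long cutoffMs, long windowMs) {
        Map<String, AccessRecord> merged = new LinkedHashMap<>();
        List<AccessRecord> recent = new ArrayList<>();
        for (AccessRecord r : records) {
            if (r.timestampMs >= cutoffMs) {
                recent.add(r); // recent data keeps its full resolution
                continue;
            }
            // Round the timestamp down to the start of its aggregation window.
            long windowStart = (r.timestampMs / windowMs) * windowMs;
            String key = r.path + "\n" + windowStart;
            AccessRecord prev = merged.get(key);
            int total = (prev == null ? 0 : prev.count) + r.count;
            merged.put(key, new AccessRecord(r.path, windowStart, total));
        }
        records.clear();
        records.addAll(merged.values());
        records.addAll(recent);
    }
}
```

With a boundary like this, a rule that only cares about the last 5s of data calls retrieve() with a narrow time range, and how efficiently that range is served (index, partitioned table, in-memory buffer) becomes a backend detail that can be changed after testing.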

> HDFS smart storage management
> -----------------------------
>
>                 Key: HDFS-7343
>                 URL: https://issues.apache.org/jira/browse/HDFS-7343
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kai Zheng
>            Assignee: Wei Zhou
>         Attachments: access_count_tables.jpg, HDFSSmartStorageManagement-General-20170315.pdf,
HDFS-Smart-Storage-Management.pdf, HDFSSmartStorageManagement-Phase1-20170315.pdf, HDFS-Smart-Storage-Management-update.pdf,
move.jpg, tables_in_ssm.xlsx
>
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and flexible storage
policy engine that considers file attributes, metadata, data temperature, storage type, EC codec,
available hardware capabilities, user/application preferences, etc.
> Modified the title to reflect the new purpose.
> We'd extend this effort a bit and aim for a comprehensive solution that provides a
smart storage management service for convenient, intelligent, and effective use
of erasure coding or replicas, the HDFS cache facility, the HSM offering, and all kinds of tools
(balancer, mover, disk balancer and so on) in a large cluster.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

