hadoop-hdfs-issues mailing list archives

From "Wei Zhou (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7343) HDFS smart storage management
Date Tue, 10 Jan 2017 09:35:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15814478#comment-15814478 ]

Wei Zhou commented on HDFS-7343:
--------------------------------

Thanks [~anu] for these insightful questions!
{quote}
Would you be able to count how many times a particular rule was triggered in a given time
window ? 
{quote}
Sure, it's a very useful feature; we will implement it.
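
As a rough illustration of counting rule triggers in a time window, here is a minimal sketch. It is not the SSM implementation: the class {{RuleTriggerCounter}}, its method names, and the string rule IDs are all hypothetical, and it assumes timestamps for a given rule arrive in non-decreasing order and that queries are made at or after the latest recorded time.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of SSM): counts rule firings in a sliding time window. */
class RuleTriggerCounter {
    private final long windowMillis;
    // Per-rule queue of firing timestamps, oldest first.
    private final Map<String, Deque<Long>> firings = new HashMap<>();

    RuleTriggerCounter(long windowMillis) {
        if (windowMillis <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        this.windowMillis = windowMillis;
    }

    /** Records that the rule fired at nowMillis (assumed non-decreasing per rule). */
    synchronized void recordTrigger(String ruleId, long nowMillis) {
        firings.computeIfAbsent(ruleId, k -> new ArrayDeque<>()).addLast(nowMillis);
    }

    /** Returns the number of firings in the window (nowMillis - windowMillis, nowMillis]. */
    synchronized int countInWindow(String ruleId, long nowMillis) {
        Deque<Long> timestamps = firings.get(ruleId);
        if (timestamps == null) {
            return 0;
        }
        long cutoff = nowMillis - windowMillis;
        // Drop firings that have fallen out of the window.
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= cutoff) {
            timestamps.removeFirst();
        }
        return timestamps.size();
    }
}
```

Evicting old timestamps at query time keeps memory bounded by the number of firings inside one window, at the cost of holding one timestamp per firing; a bucketed counter would be cheaper if only approximate counts are needed.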

{quote}
store them in SSM instead of storing it in NN, or feel free to store it as a file on HDFS.
{quote}
Yes, storing the rules in HDFS also makes supporting HA much easier. We will implement it
this way. Thanks!

{quote}
Are you now saying we will support HA ?
{quote}
Sorry for not making it clear. Phase 1 supports HA.

{quote}
when the rule gets written may be there is no issue, but as the file count increases this
becomes a problem.
{quote}
Yes, I agree. That is why it is followed by a {{Second}}. We continuously track the state of
the rules and give feedback when it becomes a problem.

{quote}
I thought we did not want to restrict the number of times a rule fires since that would introduce
uncertainty.
{quote}
Agreed. I am inclined not to implement it either, even though it could serve as a
quick-and-dirty guard against rules that run out of control.

{quote}
Why not just rely on background SSM logic and rely on the rules doing the right thing ?
{quote}
The HDFS client talks to SSM only (at least for now) when it wants to query a recommended file
storage policy before creating a file. The storage policy that SSM returns is controlled by
rules. The HDFS client then explicitly sets the file's storage policy to the recommended one.
If the HDFS client cannot connect to SSM, it simply creates the file with the system default
policy. So there is no connection between SSM and the file-write I/O path. Sorry, I'm not sure
I fully understood your question.
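
To make the fallback behavior concrete, here is a minimal sketch of the client-side decision. It is only an illustration under stated assumptions: {{PolicyAdvisor}} and {{PolicyLookup}} are hypothetical names standing in for the SSM query, not an actual SSM or HDFS interface, and the policy names passed in are plain strings chosen by the caller.

```java
import java.io.IOException;
import java.util.Optional;

/** Hypothetical stand-in for the SSM query; not an actual SSM or HDFS interface. */
interface PolicyAdvisor {
    /** Returns the rule-driven recommendation, or throws if SSM cannot be reached. */
    Optional<String> recommendPolicy(String path) throws IOException;
}

class PolicyLookup {
    /**
     * Picks the storage policy to apply before creating a file.
     * Falls back to the system default when SSM is unreachable or has no recommendation.
     */
    static String choosePolicy(String path, PolicyAdvisor advisor, String defaultPolicy) {
        try {
            return advisor.recommendPolicy(path).orElse(defaultPolicy);
        } catch (IOException e) {
            // SSM is unavailable: file creation proceeds with the default policy.
            return defaultPolicy;
        }
    }
}
```

The lookup happens once, before the file is created, so a failed or slow SSM only affects which policy is chosen and never sits in the write path itself.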

Thanks again!

> HDFS smart storage management
> -----------------------------
>
>                 Key: HDFS-7343
>                 URL: https://issues.apache.org/jira/browse/HDFS-7343
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kai Zheng
>            Assignee: Wei Zhou
>         Attachments: HDFS-Smart-Storage-Management-update.pdf, HDFS-Smart-Storage-Management.pdf, move.jpg
>
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and flexible storage
> policy engine that considers file attributes, metadata, data temperature, storage type, EC codec,
> available hardware capabilities, user/application preferences, etc.
> Modified the title to repurpose the issue.
> We'd extend this effort somewhat and aim to work on a comprehensive solution that provides
> a smart storage management service for convenient, intelligent, and effective use
> of erasure coding or replicas, the HDFS cache facility, HSM offerings, and all kinds of tools
> (balancer, mover, disk balancer and so on) in a large cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

