hadoop-hdfs-issues mailing list archives

From "Yiqun Lin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11464) Improve the selection in choosing storage for blocks
Date Fri, 05 May 2017 08:16:05 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997916#comment-15997916
] 

Yiqun Lin commented on HDFS-11464:
----------------------------------

After the work in HDFS-9807, the storageID chosen by the NameNode is passed to the DataNode
and can be used in VolumeChoosingPolicy. However, the existing VolumeChoosingPolicies
usually ignore the chosen storageID. If we implement a new policy that respects
the storageID, then the way storage is chosen for blocks in block placement should also
be improved.
So I'd like to add a new boolean config like {{dfs.datanode.consider.storage}} to make the BlockPlacementPolicy
on the NameNode and the VolumeChoosingPolicy consistent in the way volumes are chosen.
I don't plan to implement a new storageID-respecting VolumeChoosingPolicy now, but that doesn't
affect the improvement made in this JIRA.
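
For illustration only, a storageID-respecting policy of the kind mentioned above might look roughly like the following. This is a minimal sketch and not part of the patch: the class {{StorageIdAwareVolumeChooserSketch}}, the {{Volume}} type and the round-robin fallback are all hypothetical stand-ins, and returning null instead of throwing an exception is a simplification.

```java
import java.util.List;

/** Hypothetical sketch; these are stand-in types, not the real HDFS classes. */
final class StorageIdAwareVolumeChooserSketch {

  /** Minimal stand-in for a DataNode volume. */
  static final class Volume {
    final String storageId;
    final long availableBytes;

    Volume(String storageId, long availableBytes) {
      this.storageId = storageId;
      this.availableBytes = availableBytes;
    }
  }

  // Index where the round-robin fallback starts its next search.
  private int next = 0;

  /**
   * Prefer the volume whose ID matches the storageID hinted by the NameNode.
   * Fall back to round-robin when there is no hint, no matching volume,
   * or the matching volume lacks space. Returns null if nothing fits.
   */
  Volume chooseVolume(List<Volume> volumes, long blockSize, String storageId) {
    if (storageId != null) {
      for (Volume v : volumes) {
        if (v.storageId.equals(storageId) && v.availableBytes >= blockSize) {
          return v; // the hint is respected
        }
      }
    }
    // Assumed fallback: plain round-robin over volumes with enough space.
    for (int i = 0; i < volumes.size(); i++) {
      int idx = (next + i) % volumes.size();
      Volume v = volumes.get(idx);
      if (v.availableBytes >= blockSize) {
        next = (idx + 1) % volumes.size();
        return v;
      }
    }
    return null;
  }
}
```

With a policy like this on the DataNode, the storage picked on the NameNode side actually determines the volume written to, which is why the two sides would need to choose consistently.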

Attaching the updated patch and reopening this JIRA. Any comments are welcome. Thanks.

> Improve the selection in choosing storage for blocks
> ----------------------------------------------------
>
>                 Key: HDFS-11464
>                 URL: https://issues.apache.org/jira/browse/HDFS-11464
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>         Attachments: HDFS-11464.001.patch
>
>
> Currently the logic for choosing storage for blocks is not ideal. It always uses
the first valid storage of a given StorageType (see {{DataNodeDescriptor#chooseStorage4Block}}).
This is not a good selection: blocks will always be written to the same
volume (the first volume), and other valid volumes never get chosen. This problem was brought up
in this comment ( https://issues.apache.org/jira/browse/HDFS-9807?focusedCommentId=15878382&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15878382
)
> Here is one solution from me:
> * First, based on the existing storages in one node, extract all the valid storages into
a collection.
> * Then, shuffle these valid storages to get a new collection.
> * Finally, take the first storage from the new collection.
> These steps would be executed in {{DataNodeDescriptor#chooseStorage4Block}} and replace
the current logic. I think this improvement can be done as a subtask under HDFS-11419. Any further
comments are welcome.
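
The three steps in the description above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the attached patch: {{StorageChooserSketch}}, its nested {{StorageInfo}} and {{StorageType}} types, and the validity check (healthy, matching type, enough remaining space) are hypothetical stand-ins for the real NameNode classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch; these are stand-in types, not the real HDFS classes. */
final class StorageChooserSketch {

  enum StorageType { DISK, SSD, ARCHIVE }

  /** Minimal stand-in for the per-storage state kept for a DataNode. */
  static final class StorageInfo {
    final String storageId;
    final StorageType type;
    final long remainingBytes;
    final boolean healthy;

    StorageInfo(String storageId, StorageType type, long remainingBytes, boolean healthy) {
      this.storageId = storageId;
      this.type = type;
      this.remainingBytes = remainingBytes;
      this.healthy = healthy;
    }
  }

  /**
   * Picks a storage of the requested type at random among the valid ones,
   * instead of always returning the first valid one. Returns null if no
   * storage qualifies.
   */
  static StorageInfo chooseStorage4Block(List<StorageInfo> storages,
      StorageType type, long blockSize, Random random) {
    // Step 1: extract all valid storages into a new collection.
    // The validity criteria here are an assumption made for this sketch.
    List<StorageInfo> valid = new ArrayList<>();
    for (StorageInfo s : storages) {
      if (s.healthy && s.type == type && s.remainingBytes >= blockSize) {
        valid.add(s);
      }
    }
    if (valid.isEmpty()) {
      return null;
    }
    // Step 2: shuffle the copy, so the caller's list keeps its order.
    Collections.shuffle(valid, random);
    // Step 3: take the first storage of the shuffled collection.
    return valid.get(0);
  }
}
```

Picking a random index into the valid list would give the same distribution without the full shuffle; the shuffle is kept here only to mirror the steps as described.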



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

