Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Date: Fri, 5 May 2017 08:16:05 +0000 (UTC)
From: "Yiqun Lin (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.13046663.1488208811000.134295.1493972165005@Atlassian.JIRA>
In-Reply-To: <JIRA.13046663.1488208811000@Atlassian.JIRA>
References: <JIRA.13046663.1488208811000@Atlassian.JIRA> <JIRA.13046663.1488208811706@jira-lw-us.apache.org>
Subject: [jira] [Commented] (HDFS-11464) Improve the selection in choosing
 storage for blocks
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Fri, 05 May 2017 08:16:14 -0000


    [ https://issues.apache.org/jira/browse/HDFS-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997916#comment-15997916 ] 

Yiqun Lin commented on HDFS-11464:
----------------------------------

After the work in HDFS-9807, the storageID chosen by the NameNode is passed to the DataNode and can be used in the VolumeChoosingPolicy. However, the existing VolumeChoosingPolicies usually ignore the chosen storageID. If we implement a new policy that respects the storageID, then the way BlockPlacement chooses storage for blocks should also be improved.
So I'd like to add a new boolean config such as {{dfs.datanode.consider.storage}} to make the BlockPlacementPolicy on the NameNode and the VolumeChoosingPolicy consistent in the way volumes are chosen. I don't plan to implement a new storageID-respecting VolumeChoosingPolicy now, but that doesn't affect the improvement made in this JIRA.

Attaching the updated patch and reopening this JIRA. Any comments are welcome. Thanks.

> Improve the selection in choosing storage for blocks
> ----------------------------------------------------
>
>                 Key: HDFS-11464
>                 URL: https://issues.apache.org/jira/browse/HDFS-11464
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>         Attachments: HDFS-11464.001.patch
>
>
> Currently the logic for choosing storage for blocks is not good. It always uses the first valid storage of a given StorageType (see {{DataNodeDescriptor#chooseStorage4Block}}). This is a poor selection: blocks will always be written to the same volume (the first volume), and the other valid volumes never get chosen. This problem was brought up in this comment ( https://issues.apache.org/jira/browse/HDFS-9807?focusedCommentId=15878382&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15878382 )
> Here is one solution I propose:
> * First, based on existing storages in one node, extract all the valid storages into a collection.
> * Then, shuffle these valid storages to get a new, randomly ordered collection.
> * Finally, take the first storage from the new collection.
> These steps will be executed in {{DataNodeDescriptor#chooseStorage4Block}} and replace the current logic. I think this improvement can be done as a subtask under HDFS-11419. Any further comments are welcome.
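
As a rough illustration only (this is not the attached patch), the three steps in the description above could be sketched in Java as follows. The Storage class, its fields, the validity check, and the Random parameter are hypothetical stand-ins for the real HDFS types, assumed here just to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class StorageChooserSketch {
    enum StorageType { DISK, SSD, ARCHIVE }

    /** Hypothetical, simplified stand-in for a per-volume storage record. */
    static final class Storage {
        final String id;
        final StorageType type;
        final long remaining;   // free bytes on this storage
        final boolean healthy;  // assumed validity flag

        Storage(String id, StorageType type, long remaining, boolean healthy) {
            this.id = id;
            this.type = type;
            this.remaining = remaining;
            this.healthy = healthy;
        }
    }

    /**
     * Picks a random valid storage of the requested type instead of
     * always returning the first one. Returns null if none is valid.
     */
    static Storage chooseStorage4Block(List<Storage> storages,
                                       StorageType type,
                                       long blockSize,
                                       Random rng) {
        // Step 1: extract all valid storages into a collection.
        List<Storage> valid = new ArrayList<>();
        for (Storage s : storages) {
            if (s.healthy && s.type == type && s.remaining >= blockSize) {
                valid.add(s);
            }
        }
        if (valid.isEmpty()) {
            return null;
        }
        // Step 2: shuffle the valid storages.
        Collections.shuffle(valid, rng);
        // Step 3: take the first storage of the shuffled collection.
        return valid.get(0);
    }
}
```

Shuffling the whole list only to take its first element is the literal form of the proposal; picking a single random index from the valid list would give the same uniform choice with less work.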


--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
