Date: Fri, 5 May 2017 08:16:05 +0000 (UTC)
From: "Yiqun Lin (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-11464) Improve the selection in choosing storage for blocks

    [ https://issues.apache.org/jira/browse/HDFS-11464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997916#comment-15997916 ]

Yiqun Lin commented on HDFS-11464:
----------------------------------

After the work in HDFS-9807, the storageID chosen by the NameNode is passed to the DataNode and can be used in VolumeChoosingPolicy. However, the existing VolumeChoosingPolicies usually ignore the chosen storageID. If we implement a new policy that respects the storageID, then the way BlockPlacement chooses storage for blocks should also be improved. So I'd like to add a new boolean config like {{dfs.datanode.consider.storage}} to make the BlockPlacementPolicy on the NameNode and the VolumeChoosingPolicy consistent in the way volumes are chosen.

I don't plan to implement a new storageID-respecting VolumeChoosingPolicy now, but that doesn't affect the improvement made in this JIRA. I've attached the updated patch and reopened this JIRA. Any comments are welcome. Thanks.

> Improve the selection in choosing storage for blocks
> ----------------------------------------------------
>
>                 Key: HDFS-11464
>                 URL: https://issues.apache.org/jira/browse/HDFS-11464
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>         Attachments: HDFS-11464.001.patch
>
>
> Currently the logic for choosing storage for blocks is not good. It always uses the first valid storage of a given StorageType (see {{DataNodeDescriptor#chooseStorage4Block}}). That means blocks will always be written to the same volume (the first volume), and the other valid volumes are never chosen. This problem was brought up by this comment ( https://issues.apache.org/jira/browse/HDFS-9807?focusedCommentId=15878382&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15878382 ).
> My proposed solution:
> * First, based on the existing storages in one node, extract all the valid storages into a collection.
> * Then, shuffle the order of these valid storages to get a new collection.
> * Finally, take the first storage from the shuffled collection.
> These steps will be executed in {{DataNodeDescriptor#chooseStorage4Block}} and replace the current logic. I think this improvement can be done as a subtask under HDFS-11419. Any further comments are welcome.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
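
The three steps proposed in the issue description (collect the valid storages, shuffle them, take the first) can be sketched roughly as follows. This is a minimal, standalone illustration and not the attached patch: the `Storage` class, the `StorageType` enum, the `isValidFor` check, and the method signature are all hypothetical stand-ins, and the real validity criteria in HDFS may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class StorageChooserSketch {

    /** Hypothetical stand-in for a storage type; not the HDFS enum. */
    enum StorageType { DISK, SSD, ARCHIVE }

    /** Hypothetical stand-in for the per-storage info kept on the NameNode. */
    static final class Storage {
        final String id;
        final StorageType type;
        final long remainingBytes;
        final boolean healthy;

        Storage(String id, StorageType type, long remainingBytes, boolean healthy) {
            this.id = id;
            this.type = type;
            this.remainingBytes = remainingBytes;
            this.healthy = healthy;
        }

        /** Assumed validity rule: healthy, matching type, enough space left. */
        boolean isValidFor(StorageType wanted, long blockSize) {
            return healthy && type == wanted && remainingBytes >= blockSize;
        }
    }

    /**
     * Returns one valid storage chosen uniformly at random,
     * or null if no storage on the node is valid.
     */
    static Storage chooseStorage4Block(List<Storage> storages, StorageType wanted,
                                       long blockSize, Random rng) {
        // Step 1: extract all valid storages into a new collection.
        List<Storage> valid = new ArrayList<>();
        for (Storage s : storages) {
            if (s.isValidFor(wanted, blockSize)) {
                valid.add(s);
            }
        }
        if (valid.isEmpty()) {
            return null;
        }
        // Step 2: shuffle the valid storages.
        Collections.shuffle(valid, rng);
        // Step 3: take the first storage of the shuffled collection.
        return valid.get(0);
    }

    public static void main(String[] args) {
        List<Storage> storages = Arrays.asList(
            new Storage("s1", StorageType.DISK, 1024L, true),
            new Storage("s2", StorageType.DISK, 2048L, true),
            new Storage("s3", StorageType.SSD, 4096L, true));
        Storage chosen = chooseStorage4Block(storages, StorageType.DISK, 512L, new Random());
        System.out.println("chosen storage: " + chosen.id);
    }
}
```

Shuffling and taking the first element gives the same distribution as picking a random index into the valid list; the shuffle form is used here only because it mirrors the steps as written in the description.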