hadoop-hdfs-issues mailing list archives

From "Kai Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9411) HDFS ZoneLabel support
Date Thu, 19 Nov 2015 02:03:11 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012619#comment-15012619 ]

Kai Zheng commented on HDFS-9411:
---------------------------------

Hi [~vinayrpet],

I have gone through the design doc and it looks nice! I have some high-level questions; thanks
in advance for clarifying.

1. Would it be good to support a generic node label instead of ZoneLabel? I think it could be
useful for considerations such as cluster provisioning and management, security, and
replication/EC task scheduling, in addition to block placement. A label could specify node
attributes such as network, CPU, storage, and usage, as well as attributes for other application domains.

2. If generic node labels are used, maybe we can leverage file/directory attributes to implement
the requirement? For example, we could create and manage zones of files expressed as file
attributes, and place blocks based on flexible node label combinations.

3. So in the design, Zone or ZoneLabel will be the first factor in block placement and will
take precedence over storage policies, right?

4. How might this relate to federation and block pools?
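
To make points 1 and 2 a bit more concrete, here is a minimal sketch of the idea, not a proposal
for the actual implementation. It assumes each datanode carries a set of free-form labels and that
a directory attribute lists the labels its blocks require. All names here (DataNode,
eligible_nodes, the "placement.labels" attribute and the "key:value" label format) are
hypothetical and are not existing HDFS APIs.

```python
# Hypothetical sketch: none of these names are HDFS APIs.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataNode:
    name: str
    labels: frozenset = field(default_factory=frozenset)


def eligible_nodes(nodes, required_labels):
    """Return the nodes that carry every label the file or directory asks for."""
    required = frozenset(required_labels)
    return [n for n in nodes if required <= n.labels]


nodes = [
    DataNode("dn1", frozenset({"zone:tenantA", "storage:ssd"})),
    DataNode("dn2", frozenset({"zone:tenantA", "storage:disk"})),
    DataNode("dn3", frozenset({"zone:tenantB", "storage:ssd"})),
]

# Label requirement as it might be stored in a directory attribute (assumed format).
dir_attr = {"placement.labels": "zone:tenantA,storage:ssd"}
required = dir_attr["placement.labels"].split(",")

candidates = eligible_nodes(nodes, required)
print([n.name for n in candidates])  # prints ['dn1']
```

In this sketch a zone is just one label among others, so a ZoneLabel becomes the special case
where the required set contains a single "zone:..." entry.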

> HDFS ZoneLabel support
> ----------------------
>
>                 Key: HDFS-9411
>                 URL: https://issues.apache.org/jira/browse/HDFS-9411
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>         Attachments: HDFS_ZoneLabels-16112015.pdf
>
>
> HDFS currently stores data blocks on different datanodes chosen by the BlockPlacement Policy.
These datanodes are chosen randomly within the scope (local-rack/different-rack/nodegroup) of the
network topology.
> In a multi-tenant scenario (a tenant can be a user or a service), blocks of any tenant can be on
any datanode.
> Depending on the applications of different tenants, a datanode might sometimes get busy,
slowing down another tenant's applications. It would be better if admins had a provision
to logically divide the cluster among multiple tenants.
> ZONE_LABELS can logically divide the cluster datanodes into multiple Zones.
> High level design doc to follow soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
