hadoop-yarn-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2139) [Umbrella] Support for Disk as a Resource in YARN
Date Wed, 03 Dec 2014 22:19:13 GMT

    [ https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233605#comment-14233605 ]

Karthik Kambatla commented on YARN-2139:
----------------------------------------

bq. currently vdisks is counting the number of physical drives present on the box.
We see vdisks as a multiple of the number of physical disks on the box. Again, this is just
one way to do it, and we can add more ways to share disk resources in the future.
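
To make the "multiple of the number of physical disks" idea concrete, here is a minimal sketch. The function names and the `vdisks_per_disk` multiplier are hypothetical, chosen only for illustration; they are not existing YARN settings or APIs.

```python
# Hypothetical sketch: none of these names are real YARN settings or APIs.

def node_vdisks(physical_disks: int, vdisks_per_disk: int = 1) -> int:
    """Return the vdisk capacity a node would advertise.

    vdisks are modeled as a fixed multiple of the physical disk count.
    """
    if physical_disks < 0 or vdisks_per_disk < 1:
        raise ValueError("need physical_disks >= 0 and vdisks_per_disk >= 1")
    return physical_disks * vdisks_per_disk


def max_containers(node_capacity: int, vdisks_per_container: int) -> int:
    """How many containers fit if each one requests the same number of vdisks."""
    if vdisks_per_container < 1:
        raise ValueError("vdisks_per_container must be >= 1")
    return node_capacity // vdisks_per_container
```

With a multiplier of 1, a 6-disk node advertises 6 vdisks, so 6 containers requesting 1 vdisk each would fit. A larger multiplier gives finer-grained sharing of the same spindles.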

bq. Should we consider evaluating a change in this policy that gives a container 1 local dir
to a container with 1 vdisk. This way for a machine with 6 disks (and 6 vdisks) would have
6 tasks running, each with their own "dedicated" disk. 
Good point. We were thinking of giving the AM the option to choose the amount of disk IO parallelism
when launching the container, as part of the spindle locality work. I see AMs wanting
to either (1) pick a single local directory for guaranteed performance or (2) stripe accesses
across multiple disks for potentially higher throughput, depending on what else is running on the node.

Initially, we could provide a global config that applies to all containers: whether a
container's vdisks span the fewest or the most disks.
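
A rough sketch of what a "fewest or most disks" policy could look like. Everything here is an assumption made for illustration: the `pick_local_dirs` helper, the `"fewest"`/`"most"` policy names, and the per-directory free-vdisk map are hypothetical and do not come from the patch or from YARN.

```python
from typing import Dict

# Hypothetical sketch: the policy names and this helper are illustrative only.

def pick_local_dirs(free_vdisks: Dict[str, int], requested: int,
                    policy: str) -> Dict[str, int]:
    """Decide how many vdisks a container gets on each local dir.

    free_vdisks maps a local dir (one per physical disk) to its free vdisks.
    "fewest": pack the request onto as few disks as possible.
    "most":   spread the request one vdisk at a time over as many disks as possible.
    """
    if policy not in ("fewest", "most"):
        raise ValueError("policy must be 'fewest' or 'most'")
    if requested < 0 or requested > sum(free_vdisks.values()):
        raise ValueError("request does not fit on this node")

    remaining = dict(free_vdisks)  # copy so the caller's map is not mutated
    grant: Dict[str, int] = {}

    if policy == "fewest":
        # Greedy: taking the disks with the most free vdisks first
        # minimizes the number of disks touched.
        for d in sorted(remaining, key=lambda d: (-remaining[d], d)):
            if requested == 0:
                break
            take = min(remaining[d], requested)
            if take:
                grant[d] = take
                requested -= take
    else:
        # Round-robin: hand out one vdisk per disk per pass.
        while requested > 0:
            for d in sorted(remaining, key=lambda d: (-remaining[d], d)):
                if requested == 0:
                    break
                if remaining[d] > 0:
                    grant[d] = grant.get(d, 0) + 1
                    remaining[d] -= 1
                    requested -= 1
    return grant
```

For example, with three disks that each have 2 free vdisks, a request for 2 vdisks lands on a single local dir under "fewest" and on two local dirs under "most". A per-container choice by the AM could later reuse the same logic by passing the policy at launch time.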

> [Umbrella] Support for Disk as a Resource in YARN 
> --------------------------------------------------
>
>                 Key: YARN-2139
>                 URL: https://issues.apache.org/jira/browse/YARN-2139
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Wei Yan
>         Attachments: Disk_IO_Isolation_Scheduling_3.pdf, Disk_IO_Scheduling_Design_1.pdf, Disk_IO_Scheduling_Design_2.pdf, YARN-2139-prototype-2.patch, YARN-2139-prototype.patch
>
>
> YARN should consider disk as another resource for (1) scheduling tasks on nodes, (2) isolation at runtime, (3) spindle locality.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
