hadoop-yarn-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2139) Add support for disk IO isolation/scheduling for containers
Date Thu, 06 Nov 2014 16:24:37 GMT

    [ https://issues.apache.org/jira/browse/YARN-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14200385#comment-14200385 ]

Karthik Kambatla commented on YARN-2139:
----------------------------------------

Thanks for chiming in, Arun.

This JIRA focuses on adding disk scheduling and isolation for local-disk read I/O. HDFS short-circuit
reads are local-disk reads, so they are covered automatically.

bq. We shouldn't embed Linux or blkio specific semantics such as proportional weight division
into YARN.
The Linux aspects are only for isolation, and this needs to be pluggable. 

Wei and I are more familiar with FairScheduler, so we talk about weighted division between queues
from that standpoint. We are eager to hear your thoughts on how we should do this with CapacityScheduler,
and to augment the configs etc. if need be. I was thinking CapacityScheduler would handle disk similarly
to how it handles CPU today (more on that later).

bq. We need something generic such as bandwidth which can be understood by users, supportable
on heterogeneous nodes in the same cluster
Our initial thinking was along these lines. However, as with CPU, it is very hard for a user
to specify a bandwidth requirement. It is hard to figure out that my container *needs*
200 MBps (and 2 GHz CPU). Furthermore, it is hard to enforce bandwidth isolation. When multiple
processes access a disk, its aggregate bandwidth can go down significantly. To *guarantee*
bandwidth, I believe the scheduler has to be super pessimistic with its allocations.

Given all this, we thought we should probably handle it the way we did CPU. Each process asks
for 'n' vdisks to capture the number of disks it needs. To avoid floating point computations,
we added an NM config for the available vdisks. Heterogeneity in the number of disks
is easily handled with the vdisks-per-node knob. Heterogeneity in each disk's capacity or bandwidth
is not handled, similar to our CPU story. I propose we work on this heterogeneity as one of
the follow-up items.
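
As a rough illustration of the vdisk accounting described above, here is a minimal sketch in Java.
It is not taken from the design doc or from YARN code: the class VdiskNode and its methods are
hypothetical names, and the integer capacity stands in for the NM config for available vdisks.
It only shows that requests and capacity are whole numbers of vdisks, so no floating point is needed.

```java
// Hypothetical sketch: VdiskNode is an illustrative name, not a YARN class.
final class VdiskNode {
    // Assumed to come from the per-node (NM) config for available vdisks.
    private final int totalVdisks;
    private int usedVdisks = 0;

    VdiskNode(int totalVdisks) {
        if (totalVdisks < 0) {
            throw new IllegalArgumentException("totalVdisks must be >= 0");
        }
        this.totalVdisks = totalVdisks;
    }

    // Reserves n vdisks for a container if the node has enough left.
    // Returns false (and changes nothing) when the request does not fit.
    boolean tryAllocate(int n) {
        if (n <= 0 || n > totalVdisks - usedVdisks) {
            return false;
        }
        usedVdisks += n;
        return true;
    }

    // Returns n vdisks to the node when a container finishes.
    void release(int n) {
        if (n <= 0 || n > usedVdisks) {
            throw new IllegalArgumentException("invalid release of " + n + " vdisks");
        }
        usedVdisks -= n;
    }

    int available() {
        return totalVdisks - usedVdisks;
    }
}
```

Nodes with different numbers of disks would simply be constructed with different totals. Differences
in per-disk capacity or bandwidth are deliberately not modeled here, matching the text above.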

bq. Spindle locality or I/O parallelism is a real concern
Agreed. Is it okay if we finish this work and follow up with spindle locality? We have some
thoughts on how to handle it, but left it out of the doc to keep the design focused.

> Add support for disk IO isolation/scheduling for containers
> -----------------------------------------------------------
>
>                 Key: YARN-2139
>                 URL: https://issues.apache.org/jira/browse/YARN-2139
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Wei Yan
>            Assignee: Wei Yan
>         Attachments: Disk_IO_Scheduling_Design_1.pdf, Disk_IO_Scheduling_Design_2.pdf
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
