hadoop-yarn-issues mailing list archives

From "M. C. Srivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2791) Add Disk as a resource for scheduling
Date Mon, 12 Jan 2015 19:25:37 GMT

    [ https://issues.apache.org/jira/browse/YARN-2791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274007#comment-14274007 ]

M. C. Srivas commented on YARN-2791:

The scope of https://issues.apache.org/jira/browse/YARN-2139 is just too bloated. We are
hitting this problem right now: YARN overprovisions nodes because it doesn't take into account
how performance is affected by the number of disks on each node. We need this fix now, not later.
YARN-2139 is too elaborate and is trying to do too much. On the other hand, it doesn't
take into account how running DataNodes on the same spindles will impact shuffle performance.
I would say get this piece of work done first; YARN-2139 can follow whenever it gets done.

> Add Disk as a resource for scheduling
> -------------------------------------
>                 Key: YARN-2791
>                 URL: https://issues.apache.org/jira/browse/YARN-2791
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: scheduler
>    Affects Versions: 2.5.1
>            Reporter: Swapnil Daingade
>            Assignee: Yuliya Feldman
>         Attachments: DiskDriveAsResourceInYARN.pdf
> Currently, the number of disks present on a node is not considered when scheduling
containers on that node. Having a large amount of memory on a node can lead to a high number of
containers being launched on that node, all of which compete for I/O bandwidth. This multiplexing
of I/O across containers can lead to slower overall progress and sub-optimal resource utilization,
as containers starved for I/O bandwidth hold on to other resources such as CPU and memory. This
problem can be solved by treating disk as a resource and including it when deciding how many
containers can run concurrently on a node.
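
To make the effect described above concrete, here is a minimal sketch of how a disk
dimension could cap the number of containers on a node. It is not the YARN scheduler code
and not the design in the attached PDF; the class name, method names, and the node and
request sizes are all hypothetical, and disks are assumed to be counted in whole units.

```java
public class DiskAwareCapacity {

    /** Containers that fit when only memory and vcores are considered. */
    public static int withoutDisk(int nodeMemMb, int nodeVcores,
                                  int reqMemMb, int reqVcores) {
        return Math.min(nodeMemMb / reqMemMb, nodeVcores / reqVcores);
    }

    /** Containers that fit when disks are treated as a third resource (assumption). */
    public static int withDisk(int nodeMemMb, int nodeVcores, int nodeDisks,
                               int reqMemMb, int reqVcores, int reqDisks) {
        int byMemAndCpu = withoutDisk(nodeMemMb, nodeVcores, reqMemMb, reqVcores);
        return Math.min(byMemAndCpu, nodeDisks / reqDisks);
    }

    public static void main(String[] args) {
        // Hypothetical node: 128 GB memory, 32 vcores, 4 disks.
        // Hypothetical request per container: 2 GB memory, 1 vcore, 1 disk.
        System.out.println(withoutDisk(131072, 32, 2048, 1));     // prints 32
        System.out.println(withDisk(131072, 32, 4, 2048, 1, 1));  // prints 4
    }
}
```

With these made-up numbers, memory and CPU alone would admit 32 containers that all share
4 spindles, while counting disks limits the node to 4.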
