hadoop-yarn-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1799) Enhance LocalDirAllocator in NM to consider DiskMaxUtilization cutoff
Date Fri, 07 Mar 2014 17:23:43 GMT

    [ https://issues.apache.org/jira/browse/YARN-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924074#comment-13924074 ]

Karthik Kambatla commented on YARN-1799:
----------------------------------------

Just to make sure I understand what we are trying to address here. [~sunilg], is the following
an accurate description? When multiple tasks query the LocalDirAllocator, each allocation is
based on the capacity available at that point in time and does not take previous queries into
consideration, leading to potential over-commit of disk space.

If yes, just adding a cut-off only delays the onset of this problem. A better approach might
be to "reserve" disk-space for a duration of time. 
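
To make the "reserve" idea concrete, here is a minimal sketch of time-bounded reservations
for a single local directory. It is an illustration only: the class DiskReservations and its
methods are hypothetical names, not part of LocalDirAllocator or any Hadoop API, and the
caller is assumed to supply the free space reported by the filesystem along with a
non-decreasing clock value.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper for one local dir; not part of Hadoop's LocalDirAllocator. */
final class DiskReservations {

    /** One granted request and the time at which it stops counting. */
    private static final class Reservation {
        final long bytes;
        final long expiresAtMillis;

        Reservation(long bytes, long expiresAtMillis) {
            this.bytes = bytes;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    // Every reservation gets the same TTL, so insertion order is also expiry order
    // (assumes nowMillis never goes backwards between calls).
    private final Deque<Reservation> active = new ArrayDeque<>();
    private final long ttlMillis;
    private long reservedBytes = 0;

    DiskReservations(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /**
     * Grants the request only if it fits into the free space that earlier,
     * still-active reservations have not already claimed.
     */
    synchronized boolean tryReserve(long requestedBytes, long freeBytes, long nowMillis) {
        expire(nowMillis);
        if (requestedBytes > freeBytes - reservedBytes) {
            return false; // granting this would over-commit the directory
        }
        active.addLast(new Reservation(requestedBytes, nowMillis + ttlMillis));
        reservedBytes += requestedBytes;
        return true;
    }

    /** Bytes currently held by unexpired reservations. */
    synchronized long reservedBytes(long nowMillis) {
        expire(nowMillis);
        return reservedBytes;
    }

    /** Drops reservations whose expiry time has been reached. */
    private void expire(long nowMillis) {
        while (!active.isEmpty() && active.peekFirst().expiresAtMillis <= nowMillis) {
            reservedBytes -= active.removeFirst().bytes;
        }
    }
}
```

With 100 bytes free, a 60-byte request followed by a second 60-byte request would grant the
first and reject the second, because the second caller sees only 40 unclaimed bytes. Note
that this sketch errs on the conservative side: as a task actually writes, the reported free
space drops while its reservation is still counted, until the reservation expires.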

> Enhance LocalDirAllocator in NM to consider DiskMaxUtilization cutoff
> ---------------------------------------------------------------------
>
>                 Key: YARN-1799
>                 URL: https://issues.apache.org/jira/browse/YARN-1799
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.3.0
>            Reporter: Sunil G
>
> LocalDirAllocator provides paths to all tasks for their local writes.
> It considers the list of good directories selected by the health-check mechanism
> in LocalDirsHandlerService.
> getLocalPathForWrite() checks whether the requested "size" fits within the "capacity"
> of the "lastAccessed" directory.
> If several tasks ask LocalDirAllocator for a path, each allocation is based only on
> the disk space available at that moment.
> But the same path may already have been given to other tasks, which may still be
> writing to it.
> It would be better to also check an upper cutoff on disk utilization.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
