hadoop-yarn-issues mailing list archives

From "Eric Payne (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-4390) Do surgical preemption based on reserved container in CapacityScheduler
Date Tue, 26 Apr 2016 22:26:13 GMT

     [ https://issues.apache.org/jira/browse/YARN-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-4390:
-----------------------------
    Attachment: QueueNotHittingMax.jpg

Hi [~leftnoteasy]. I have been testing YARN-4390.6.patch. I have tried several times with
different use cases, and I see something that may be a concern.

My cluster has 3 nodes with 4GB on each node (12GB total). The queues are:
||Queue||Capacity||Preemption||
|ops|8GB|disabled|
|eng|2GB|disabled|
|default|2GB|enabled|
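
For reference, the layout above corresponds roughly to the following capacity-scheduler.xml entries. This is only a sketch: the percentage capacities are my approximation of the 8GB/2GB/2GB split on the 12GB cluster, and the maximum-capacity on default is an assumption so that the first job can grow beyond its 2GB guarantee.
{code}
<!-- Sketch of the queue setup; percentages approximate 8/2/2 GB out of 12GB. -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>ops,eng,default</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.ops.capacity</name>
  <value>67</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.ops.disable_preemption</name>
  <value>true</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.eng.capacity</name>
  <value>16.5</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.eng.disable_preemption</name>
  <value>true</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.default.capacity</name>
  <value>16.5</value>
</property>
<property>
  <!-- Assumed: lets the default-queue job grow past its 2GB guarantee. -->
  <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
  <value>100</value>
</property>
{code}
Preemption itself is turned on cluster-wide with yarn.resourcemanager.scheduler.monitor.enable in yarn-site.xml; the per-queue disable_preemption flag is what keeps the ops and eng queues from being preempted.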

I start a job on the default queue that takes up the whole queue. Each container is 0.5GB:
{code}
$HADOOP_PREFIX/bin/hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar \
    sleep -Dmapreduce.job.queuename=default -m 30 -r 0 -mt 300000
{code}
Once that starts, I start a second job on the ops queue with 1GB containers:
{code}
$HADOOP_PREFIX/bin/hadoop jar $HADOOP_PREFIX/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-${HADOOP_VERSION}-tests.jar \
    sleep -Dmapreduce.job.queuename=ops -Dmapreduce.map.memory.mb=1024 -Dmapred.child.java.opts=-Xmx512m \
    -m 30 -r 0 -mt 10000
{code}
Now, please refer to the attached screenshot (QueueNotHittingMax.jpg). The second job (0007) is never able to fill
up the ops queue. Containers from the first job (0006) just keep getting preempted and then
given right back to the first job.

It may be that the 1GB container is such a large fraction of the ops queue that some of this is
expected, but I am still somewhat concerned about these results. They are very repeatable.

> Do surgical preemption based on reserved container in CapacityScheduler
> -----------------------------------------------------------------------
>
>                 Key: YARN-4390
>                 URL: https://issues.apache.org/jira/browse/YARN-4390
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler
>    Affects Versions: 3.0.0, 2.8.0, 2.7.3
>            Reporter: Eric Payne
>            Assignee: Wangda Tan
>         Attachments: QueueNotHittingMax.jpg, YARN-4390-design.1.pdf, YARN-4390-test-results.pdf,
>                      YARN-4390.1.patch, YARN-4390.2.patch, YARN-4390.3.branch-2.patch, YARN-4390.3.patch,
>                      YARN-4390.4.patch, YARN-4390.5.patch, YARN-4390.6.patch
>
>
> There are multiple reasons why preemption could unnecessarily preempt containers. One is
> that an app could be requesting a large container (say 8GB), and the preemption monitor
> could conceivably preempt multiple containers (say eight 1GB containers) in order to fill
> the large container request. These smaller containers would then be rejected by the
> requesting AM and potentially given right back to the preempted app.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
