Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: yarn-issues@hadoop.apache.org
Date: Tue, 9 Jun 2015 21:29:01 +0000 (UTC)
From: "Ashwin Shankar (JIRA)" <jira@apache.org>
To: yarn-issues@hadoop.apache.org
Message-ID: <JIRA.12818693.1428352984000.31813.1433885341461@Atlassian.JIRA>
In-Reply-To: <JIRA.12818693.1428352984000@Atlassian.JIRA>
References: <JIRA.12818693.1428352984000@Atlassian.JIRA>
 <JIRA.12818693.1428352984302@arcas>
Subject: [jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption
 logic uses DefaultResourceCalculator even in DRF mode causing thrashing
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


    [ https://issues.apache.org/jira/browse/YARN-3453?page=3Dcom.atlassian.=
jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=3D14579=
588#comment-14579588 ]=20

Ashwin Shankar commented on YARN-3453:
--------------------------------------

hey Arun,
Thanks for working on this !
Couple more comments in addition to Karthik's comments  :
1. Why are we not using componentwisemin here ?
Resource target =3D Resources.min(calc, clusterResource,
           sched.getMinShare(), sched.getDemand());

2. FairScheduler.preemptResources() uses DefaultResourceCalculator and henc=
e would look at only memory.
This could lead to a problem in the following scenario :
Preemption round0 : toPreempt =3D (100G,10 cores)
...<we preempt 10 containers of (4G,1 core)
Preemption round10 : toPreempt =3D (60G,0 cores)

In round10, we've satisfied all the cores, the current implementation since=
 its based on DefaultResourceCalculator would continue to preempt
to satisfy the remaining 60G as well even in DRF, which means we just preem=
pted more cores than we had to.
Making this calculator DRF wouldnt solve the problem as well, since we woul=
d then be preempting less than what is necessary.
The root of this problem is YARN-2154. I'll leave it upto you to decide wha=
t you want to do about this in this jira.
{code:title=3DFairScheduler.java}
while (Resources.greaterThan(RESOURCE_CALCULATOR, clusterResource,
          toPreempt, Resources.none()))
{code}

3. Unit tests needs to be added.

@[~kasha@cloudera.com],
bq.Looking at the remaining uses of DefaultResourceCalculator in FairSchedu=
ler, we could benefit from updating all of them to DominantResourceCalculat=
or? Ashwin Shankar - do you concur?
Overall I see it would be beneficial. But I'm not so sure if the callers of=
 FairScheduler.getResourceCalculator would be okay with getting DominantRes=
ourceCalculator always? I see its mostly called my Fair Reservation System =
feature.


> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator=
 even in DRF mode causing thrashing
> -------------------------------------------------------------------------=
-----------------------------------
>
>                 Key: YARN-3453
>                 URL: https://issues.apache.org/jira/browse/YARN-3453
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.6.0
>            Reporter: Ashwin Shankar
>            Assignee: Arun Suresh
>         Attachments: YARN-3453.1.patch, YARN-3453.2.patch
>
>
> There are two places in preemption code flow where DefaultResourceCalcula=
tor is used, even in DRF mode.
> Which basically results in more resources getting preempted than needed, =
and those extra preempted containers aren=E2=80=99t even getting to the =E2=
=80=9Cstarved=E2=80=9D queue since scheduling logic is based on DRF's Calcu=
lator.
> Following are the two places :
> 1. {code:title=3DFSLeafQueue.java|borderStyle=3Dsolid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn=E2=80=99t be marked as =E2=80=9Cstarved=E2=80=9D if the d=
ominant resource usage
> is >=3D  fair/minshare.
> 2. {code:title=3DFairScheduler.java|borderStyle=3Dsolid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --------------------------------------------------------------
> One more thing that I believe needs to change in DRF mode is : during a p=
reemption round,if preempting a few containers results in satisfying needs =
of a resource type, then we should exit that preemption round, since the co=
ntainers that we just preempted should bring the dominant resource usage to=
 min/fair share.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)