hadoop-yarn-issues mailing list archives

From "Arun Suresh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3453) Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing
Date Fri, 10 Jul 2015 07:41:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14621902#comment-14621902 ]

Arun Suresh commented on YARN-3453:


bq. After reviewing the above comments, I am reminded that the case (0 GB, non-zero cores), like
(non-zero GB, 0 cores), will also cause more resources to be preempted than necessary.
I agree. But rather than fixing it here, I feel that if we have a comprehensive fix as requested
by YARN-2154 ([~kasha] and I had an offline discussion about how we should actually
break from the preemption loop once incoming requests are satisfied), then we won't even hit
this case.
Furthermore, this JIRA fixes the {{isStarved()}} method in the Queue correctly, so at the
very least the {{toPreempt}} resource object would be smaller (and would thus implicitly
result in fewer preemptions).
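
To make the "break from the preemption loop" idea concrete, here is a minimal, self-contained sketch. It is not the FairScheduler code and not the YARN-2154 patch: {{PreemptionLoopSketch}}, {{Res}} and {{pick}} are hypothetical names, and treating "satisfied" as "every dimension of {{toPreempt}} is covered" is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; Res is a simplified stand-in, not org.apache.hadoop.yarn.api.records.Resource.
final class PreemptionLoopSketch {
    static final class Res {
        final long memoryMb;
        final long vcores;

        Res(long memoryMb, long vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    /**
     * Walks the candidate containers in order and stops as soon as the
     * accumulated resources cover toPreempt in every dimension.
     * (Assumption: "satisfied" means component-wise coverage.)
     */
    static List<Res> pick(List<Res> candidates, Res toPreempt) {
        long memoryMb = 0;
        long vcores = 0;
        List<Res> picked = new ArrayList<>();
        for (Res c : candidates) {
            if (memoryMb >= toPreempt.memoryMb && vcores >= toPreempt.vcores) {
                break; // target already covered, do not preempt more
            }
            picked.add(c);
            memoryMb += c.memoryMb;
            vcores += c.vcores;
        }
        return picked;
    }

    public static void main(String[] args) {
        List<Res> candidates = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            candidates.add(new Res(1024, 1));
        }
        // The (0 GB, non-zero cores) case: only the cores dimension needs covering.
        System.out.println(pick(candidates, new Res(0, 2)).size());
    }
}
```

With a target of (0 MB, 2 cores) and one-core candidates, the loop stops after two containers instead of walking the whole candidate list.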

I also agree that finding the ratio of demand is definitely useful. But again, let us grab all the
low-hanging fruit first. I propose we create a separate JIRA for that.

> Fair Scheduler : Parts of preemption logic uses DefaultResourceCalculator even in DRF mode causing thrashing
> ------------------------------------------------------------------------------------------------------------
>                 Key: YARN-3453
>                 URL: https://issues.apache.org/jira/browse/YARN-3453
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.6.0
>            Reporter: Ashwin Shankar
>            Assignee: Arun Suresh
>         Attachments: YARN-3453.1.patch, YARN-3453.2.patch, YARN-3453.3.patch, YARN-3453.4.patch,
> There are two places in the preemption code flow where DefaultResourceCalculator is used,
> even in DRF mode.
> This results in more resources being preempted than needed, and the extra preempted
> containers don’t even reach the “starved” queue, since the scheduling logic
> is based on DRF's calculator.
> The two places are:
> 1. {code:title=FSLeafQueue.java|borderStyle=solid}
> private boolean isStarved(Resource share)
> {code}
> A queue shouldn’t be marked as “starved” if the dominant resource usage
> is >= fair/minshare.
> 2. {code:title=FairScheduler.java|borderStyle=solid}
> protected Resource resToPreempt(FSLeafQueue sched, long curTime)
> {code}
> --------------------------------------------------------------
> One more thing that I believe needs to change in DRF mode: during a preemption round, if
> preempting a few containers satisfies the need for a resource type, then we should
> exit that preemption round, since the containers that we just preempted should bring the
> dominant resource usage to min/fair share.
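
The first point in the description above (a queue should not count as “starved” when its dominant resource usage already meets its share) can be illustrated with a small, self-contained sketch. This is not the {{FSLeafQueue}} code or the attached patch: {{StarvationSketch}}, {{Res}} and the method names are hypothetical, and the dominant share is assumed here to be the larger of the memory and vcore fractions of the cluster.

```java
// Hypothetical sketch; Res is a simplified stand-in, not the Hadoop Resource class.
final class StarvationSketch {
    static final class Res {
        final long memoryMb;
        final long vcores;

        Res(long memoryMb, long vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    /** Memory-only comparison, in the spirit of a single-resource calculator. */
    static boolean isStarvedMemoryOnly(Res usage, Res share) {
        return usage.memoryMb < share.memoryMb;
    }

    /** Assumption: dominant share = max of the per-resource fractions of the cluster. */
    static double dominantShare(Res r, Res cluster) {
        double memory = (double) r.memoryMb / cluster.memoryMb;
        double cpu = (double) r.vcores / cluster.vcores;
        return Math.max(memory, cpu);
    }

    /** Starved only if the dominant usage is below the dominant share. */
    static boolean isStarvedDominant(Res usage, Res share, Res cluster) {
        return dominantShare(usage, cluster) < dominantShare(share, cluster);
    }

    public static void main(String[] args) {
        Res cluster = new Res(100_000, 100);
        Res share = new Res(20_000, 20);  // 20% of memory, 20% of cores
        Res usage = new Res(10_000, 40);  // 10% of memory, 40% of cores

        System.out.println(isStarvedMemoryOnly(usage, share));        // true
        System.out.println(isStarvedDominant(usage, share, cluster)); // false
    }
}
```

In this example the queue uses 40% of the cluster's cores, so its dominant usage exceeds its 20% share. A memory-only check still flags it as starved, which is the kind of mismatch that leads to unnecessary preemption.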

This message was sent by Atlassian JIRA