hadoop-mapreduce-issues mailing list archives

From "Thomas Graves (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3897) capacity scheduler - maxActiveApplicationsPerUser calculation can be wrong
Date Thu, 01 Mar 2012 18:53:59 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220240#comment-13220240 ]

Thomas Graves commented on MAPREDUCE-3897:
------------------------------------------

I just thought of a case where this won't work well for utilization. Suppose a queue has
a small capacity, say 1%, but a max capacity of, say, 100%. Even if we had a per-queue
configuration for the AM percentage and you set it really high, the queue might only be
allowed a couple of AMs, when in reality, if no one else is running on the cluster, it
should be allowed more so that it could use its 100% max capacity.

We might be better off leaving the maxActiveApplications computation based on maxCapacity,
but changing maxActiveApplicationsPerUser to use capacity and then letting the user limit
factor apply. I need to think about it some more.
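
To make the idea concrete, here is a minimal sketch of one possible reading of it, not the
scheduler's actual code. The function names, the cluster size, and the 10% AM resource share
are assumptions chosen for illustration, and the per-user formula is only my interpretation
of "use capacity and then allow the user limit factor to apply".

```python
import math

def queue_max_active_apps(slots, am_fraction, absolute_max_capacity):
    # Queue-wide limit stays relative to the queue's absolute MAX capacity.
    return max(math.ceil(slots * am_fraction * absolute_max_capacity), 1)

def per_user_max_active_apps(slots, am_fraction, absolute_capacity,
                             user_limit_percent, user_limit_factor):
    # Hypothetical per-user limit: start from the queue's absolute capacity
    # (not max capacity), then apply the user limit and user limit factor.
    return max(math.ceil(slots * am_fraction * absolute_capacity
                         * (user_limit_percent / 100) * user_limit_factor), 1)

# Assumed cluster: 1024000 MB of memory, 1024 MB minimum allocation.
slots = 1024000 / 1024

# Queue with 1% capacity and 100% max capacity; AMs may use 10% (assumed).
queue_limit = queue_max_active_apps(slots, 0.1, 1.0)
default_user = per_user_max_active_apps(slots, 0.1, 0.01, 100, 1)
boosted_user = per_user_max_active_apps(slots, 0.1, 0.01, 100, 100)

print(queue_limit, default_user, boosted_user)
```

Under these assumptions the queue as a whole can still grow toward its max capacity, a user
with the default user limit factor of 1 is held to what the queue's own capacity supports,
and raising the user limit factor is what lets a single user run more AMs.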
                
> capacity scheduler - maxActiveApplicationsPerUser calculation can be wrong
> --------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3897
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3897
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Thomas Graves
>            Assignee: Eric Payne
>            Priority: Critical
>         Attachments: MAPREDUCE-3897-1.txt, MAPREDUCE-3897-1.txt
>
>
> The capacity scheduler calculates the maxActiveApplications and the maxActiveApplicationsPerUser
> based on the config yarn.scheduler.capacity.maximum-applications or default 10000.
> MaxActiveApplications = max(ceil(clusterMemory/minAllocation * maxAMResource% * absoluteMaxCapacity), 1)
> MaxActiveAppsPerUser = max(ceil(maxActiveApplicationsComputedAbove * (userLimit%/100) * userLimitFactor), 1)
> maxActiveApplications is already multiplied by the queue's absolute MAXIMUM capacity. If
> max capacity > capacity, the user limit factor is 1 (the default), and only 1 user is
> running, that user will not be allowed to use more than the queue capacity, so making the
> limit relative to MAX capacity doesn't make sense. That user could easily end up in a
> deadlock with all of its space used by application masters.
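
The two formulas quoted above can be worked through with a small sketch. This is an
illustration, not the scheduler's implementation: the function names, the cluster size, the
10% AM resource share, and the simplification that each AM occupies exactly one minimum
allocation are all assumptions.

```python
import math

def max_active_applications(cluster_memory_mb, min_allocation_mb,
                            max_am_resource_fraction, absolute_max_capacity):
    # max(ceil(clusterMemory/minAllocation * maxAMResource% * absoluteMaxCapacity), 1)
    slots = cluster_memory_mb / min_allocation_mb
    return max(math.ceil(slots * max_am_resource_fraction * absolute_max_capacity), 1)

def max_active_apps_per_user(max_active_apps, user_limit_percent, user_limit_factor):
    # max(ceil(maxActiveApplications * (userLimit%/100) * userLimitFactor), 1)
    return max(math.ceil(max_active_apps * (user_limit_percent / 100)
                         * user_limit_factor), 1)

# Assumed cluster: 102400 MB of memory, 1024 MB minimum allocation (100 slots).
# Assumed queue: 10% absolute capacity, 100% absolute max capacity,
# user limit 100%, user limit factor 1.
queue_apps = max_active_applications(102400, 1024, 0.1, 1.0)
user_apps = max_active_apps_per_user(queue_apps, 100, 1)

# Slots the single user may occupy: queue capacity times user limit factor.
user_slots = (102400 / 1024) * 0.1 * 1

print(queue_apps, user_apps, user_slots)
```

With these numbers the per-user AM limit equals the number of slots the user is allowed to
occupy, so application masters alone could fill the user's whole share and leave no room
for their tasks, which is the deadlock described in the issue.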
