hadoop-mapreduce-issues mailing list archives

From "Andrew Ferguson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4327) Enhance CS to schedule accounting for both memory and cpu cores
Date Mon, 16 Jul 2012 21:22:36 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415652#comment-13415652 ]

Andrew Ferguson commented on MAPREDUCE-4327:
--------------------------------------------

Awesome, thanks for the update, Arun. I just finished reading through your commits. So far,
your patch looks a lot like mine, which is great! Hopefully that means our logic is correct.
:-)   I like that you pulled more of the division and rounding code out of CSQueueUtils and
into the ResourceComparator to keep it modular; I didn't think to do that.

I have a few suggestions for you (all of which I learned after writing test cases):

1) In ResourceMemoryCpuComparator (renamed "DefaultMultiResourceComparator" in my patch),
I found that a simple "if (lhs.equals(rhs)) return 0;" check was needed at the start -- after
dividing by the cluster resources, two identical resource requests can appear to differ due
to floating-point error.
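To illustrate the short-circuit, here is a minimal sketch (not the actual patch code; this
Resource class and the comparator are simplified stand-ins for the YARN classes):

```java
// Simplified stand-in for YARN's Resource: memory in MB plus CPU cores.
class Resource {
    final int memory;
    final int cores;
    Resource(int memory, int cores) { this.memory = memory; this.cores = cores; }
    boolean sameAs(Resource other) {
        return memory == other.memory && cores == other.cores;
    }
}

class MultiResourceComparator {
    final Resource cluster;
    MultiResourceComparator(Resource cluster) { this.cluster = cluster; }

    int compare(Resource lhs, Resource rhs) {
        // Short-circuit: identical requests must compare equal, *before*
        // dividing by cluster totals introduces floating-point noise.
        if (lhs.sameAs(rhs)) return 0;
        return Double.compare(dominantShare(lhs), dominantShare(rhs));
    }

    // Dominant share: the larger of the two normalized resource fractions.
    double dominantShare(Resource res) {
        return Math.max((double) res.memory / cluster.memory,
                        (double) res.cores / cluster.cores);
    }
}
```

Without the first line of compare(), two equal requests could come back
ordered one way or the other depending on rounding in the division.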

2) In the same class, I found that I needed to normalize the resources (by the cluster's resources)
and then sort the normalized shares, so that two resources which consume the same amount of
their most-dominant resource can be compared by their 2nd-most-dominant resource. This matters
when checking that you don't exceed a resource limit (e.g., "greaterThan(comparator, consumed,
limit)") -- I may be within the limit for CPUs (my dominant resource) while exceeding the limit
for memory (which is not my dominant resource).
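A sketch of the normalize-then-sort comparison (again hypothetical code, with cluster totals
passed in directly rather than via YARN's Resource class):

```java
class SortedShareComparator {
    final int clusterMemory;  // MB
    final int clusterCores;
    SortedShareComparator(int clusterMemory, int clusterCores) {
        this.clusterMemory = clusterMemory;
        this.clusterCores = clusterCores;
    }

    // Compare lexicographically over sorted shares: dominant share first,
    // then the 2nd-most-dominant share as the tie-breaker.
    int compare(int memA, int coresA, int memB, int coresB) {
        double[] a = shares(memA, coresA);
        double[] b = shares(memB, coresB);
        for (int i = 0; i < a.length; i++) {
            int c = Double.compare(a[i], b[i]);
            if (c != 0) return c;
        }
        return 0;
    }

    // Normalize by cluster totals, then order with the dominant share first.
    double[] shares(int mem, int cores) {
        double m = (double) mem / clusterMemory;
        double c = (double) cores / clusterCores;
        return m >= c ? new double[] { m, c } : new double[] { c, m };
    }
}
```

With a plain dominant-share comparison, two resources with equal dominant
shares would compare equal even if one consumed far more of its secondary
resource; the sorted comparison catches that in the limit check.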

3) In resourcemanager.resource.Resources, because the CPU count is an int, multiplying it by
a float needed two versions: one which rounds up and one which rounds down. Calculating
queueMaxCap was the only place I needed the round-down version. Technically, this is also needed
for memory (since it is also an int), but as long as we only allocate memory in units of at
least, say, 128 MB (as is current practice in the code), the extra bits in the int (0 bytes
to 128 MB) effectively serve as a store for the fractional part! And thus, the existing
roundUp() and roundDown() functions (from CSQueueUtils) suffice.
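The two multiply variants could look like this (a hedged sketch; the method names are mine,
not the ones in either patch):

```java
class ResourceMath {
    // Round up: used when granting capacity, so a fractional entitlement
    // never gets silently truncated to fewer cores than intended.
    static int multiplyCoresRoundUp(int cores, float by) {
        return (int) Math.ceil(cores * (double) by);
    }

    // Round down: used for hard ceilings such as queueMaxCap, so the
    // computed limit is never overstated.
    static int multiplyCoresRoundDown(int cores, float by) {
        return (int) Math.floor(cores * (double) by);
    }
}
```

For example, 25% of a 10-core cluster is 2.5 cores: the capacity version
should grant 3, while a max-cap of 25% should floor to 2.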


cheers,
Andrew
                
> Enhance CS to schedule accounting for both memory and cpu cores
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-4327
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4327
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: mrv2, resourcemanager, scheduler
>    Affects Versions: 2.0.0-alpha
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>         Attachments: MAPREDUCE-4327-v2.patch, MAPREDUCE-4327-v3.patch, MAPREDUCE-4327-v4.patch,
MAPREDUCE-4327-v5.patch, MAPREDUCE-4327.patch
>
>
> With YARN being a general purpose system, it would be useful for several applications
(MPI et al) to specify not just memory but also CPU (cores) for their resource requirements.
Thus, it would be useful to the CapacityScheduler to account for both.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
