hadoop-yarn-issues mailing list archives

From Íñigo Goiri (JIRA) <j...@apache.org>
Subject [jira] [Commented] (YARN-8827) Plumb per app, per user and per queue resource utilization from the NM to RM
Date Mon, 08 Oct 2018 22:57:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642579#comment-16642579 ]

Íñigo Goiri commented on YARN-8827:

bq. Hmmm, not sure what you meant by this. The test case uses the MockNM and MockRM classes
which I think we use throughout most CapacityScheduler tests. Not sure how we can move it
inside the class.

I was talking about making nm1, nm2, nm3, nm4 part of the class as fields.
Then, we could have a method to trigger all heartbeats.
Probably also call the drainEvents and wait for event thread, etc.
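
As a rough illustration of that suggestion, the fixture could look something like the sketch below. It is only a sketch: {{StubNode}}, {{StubRm}} and {{heartbeatAllNodes}} are hypothetical stand-ins invented here so the example compiles on its own. They are not the real MockNM/MockRM classes, and the actual test would use those instead.

```java
import java.util.Arrays;
import java.util.List;

public class HeartbeatFixtureSketch {

  /** Hypothetical stand-in for a mock node manager (not Hadoop's MockNM). */
  static final class StubNode {
    final String name;
    int heartbeats = 0;

    StubNode(String name) {
      this.name = name;
    }

    void heartbeat() {
      heartbeats++;
    }
  }

  /** Hypothetical stand-in for the RM side and its event queue (not MockRM). */
  static final class StubRm {
    int pendingEvents = 0;
    int drains = 0;

    void enqueueEvent() {
      pendingEvents++;
    }

    void drainEvents() {
      pendingEvents = 0;
      drains++;
    }
  }

  // The nodes are fields of the test class instead of locals in one test method.
  final StubRm rm = new StubRm();
  final StubNode nm1 = new StubNode("nm1");
  final StubNode nm2 = new StubNode("nm2");
  final StubNode nm3 = new StubNode("nm3");
  final StubNode nm4 = new StubNode("nm4");
  private final List<StubNode> nodes = Arrays.asList(nm1, nm2, nm3, nm4);

  /** Triggers one heartbeat on every node, then drains the pending events. */
  void heartbeatAllNodes() {
    for (StubNode nm : nodes) {
      nm.heartbeat();
      rm.enqueueEvent();
    }
    rm.drainEvents();
  }

  public static void main(String[] args) {
    HeartbeatFixtureSketch fixture = new HeartbeatFixtureSketch();
    fixture.heartbeatAllNodes();
    System.out.println("nm4 heartbeats: " + fixture.nm4.heartbeats);
    System.out.println("pending events: " + fixture.rm.pendingEvents);
  }
}
```

With a helper like this, each step of the test case becomes a single call rather than four heartbeat lines plus a drain.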

bq. I've added a javadoc before the testcase and some demarcation within the testcase - to
mark beginning and end of each step - hope that clears things up?

Looks good. I would probably quote the relevant numbers that are being used here.

bq. If you don't mind, I'd like to keep it as 'e'. The point was to reduce the typing and length
of the line. Also I don't plan to re-use it outside this testcase, so let's keep it as private.
If I do reuse it, I will create a TestUtil class and put everything there - and probably rename


For the sleep removal, this looks better.
Why do we have to run the heartbeats three times?

> Plumb per app, per user and per queue resource utilization from the NM to RM
> ----------------------------------------------------------------------------
>                 Key: YARN-8827
>                 URL: https://issues.apache.org/jira/browse/YARN-8827
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>            Priority: Major
>         Attachments: YARN-8827-YARN-1011.01.patch, YARN-8827-YARN-1011.02.patch
> Opportunistic Containers for OverAllocation need to be allocated to pending applications
in some fair manner. Rather than evaluating queue and user resource usage (allocated resource
usage) and comparing against queue and user limits to decide the allocation, it might make
more sense to use a snapshot of actual resource utilization of the queue and user.
> To facilitate this, this JIRA proposes to aggregate per user, per app (and maybe per
queue) resource utilization in addition to aggregated Container and Node Utilization and send
it along with the NM heartbeat. It should be fairly inexpensive to aggregate - since it can
be performed in the same loop of the {{ContainersMonitorImpl}}'s Monitoring thread.
> A snapshot aggregate can be made every couple of seconds in the RM. This instantaneous
resource utilization should be used to decide if Opportunistic containers can be allocated
to an App, Queue or User.
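
The single-loop aggregation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the patch: {{ContainerSample}}, {{Usage}}, {{Snapshot}} and {{aggregate}} are hypothetical names, and the sketch does not use any YARN classes or the actual {{ContainersMonitorImpl}} data structures.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UtilizationAggregationSketch {

  /** Hypothetical per-container utilization sample (not a YARN class). */
  static final class ContainerSample {
    final String appId;
    final String user;
    final long memoryMb;
    final long milliVcores;

    ContainerSample(String appId, String user, long memoryMb, long milliVcores) {
      this.appId = appId;
      this.user = user;
      this.memoryMb = memoryMb;
      this.milliVcores = milliVcores;
    }
  }

  /** Hypothetical running total of memory and CPU utilization. */
  static final class Usage {
    long memoryMb;
    long milliVcores;

    void add(ContainerSample sample) {
      memoryMb += sample.memoryMb;
      milliVcores += sample.milliVcores;
    }
  }

  /** Hypothetical snapshot that would travel with a heartbeat. */
  static final class Snapshot {
    final Usage node = new Usage();
    final Map<String, Usage> perApp = new HashMap<>();
    final Map<String, Usage> perUser = new HashMap<>();
  }

  /** One pass over the containers updates node, app and user totals together. */
  static Snapshot aggregate(List<ContainerSample> samples) {
    Snapshot snapshot = new Snapshot();
    for (ContainerSample sample : samples) {
      snapshot.node.add(sample);
      snapshot.perApp.computeIfAbsent(sample.appId, k -> new Usage()).add(sample);
      snapshot.perUser.computeIfAbsent(sample.user, k -> new Usage()).add(sample);
    }
    return snapshot;
  }

  public static void main(String[] args) {
    Snapshot snapshot = aggregate(Arrays.asList(
        new ContainerSample("app_1", "alice", 1024, 500),
        new ContainerSample("app_1", "alice", 2048, 1000),
        new ContainerSample("app_2", "bob", 512, 250)));
    System.out.println("alice memory MB: " + snapshot.perUser.get("alice").memoryMb);
    System.out.println("node memory MB: " + snapshot.node.memoryMb);
  }
}
```

The extra cost over the existing node-level aggregation is two map updates per container, which is why doing it in the same monitoring loop should stay cheap.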

This message was sent by Atlassian JIRA

