hadoop-yarn-issues mailing list archives

From "Arun Suresh (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-6920) Fix TestNMClient failure due to YARN-6706
Date Fri, 04 Aug 2017 05:47:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-6920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16113973#comment-16113973 ]

Arun Suresh edited comment on YARN-6920 at 8/4/17 5:46 AM:
-----------------------------------------------------------

Thanks for taking a look [~jianhe]
bq. ..it is possible that a different container gets started later on SCHEDULE_CONTAINER event?
It is possible, but given the following invariants:
# *Total Resources of Guaranteed containers ALLOCATED on a Node cannot exceed the Node capacity*:
The RM ensures that Guaranteed containers are never over-allocated on an NM.
# *Total (Opportunistic + Guaranteed) resources of RUNNING containers cannot exceed Node capacity*:
The ContainerScheduler enforces this.
# *Running Opportunistic containers will be preempted to make room for Guaranteed containers*:
Also enforced by the ContainerScheduler.

We don't really have to worry about whether a different container starts in the meantime. If the
new container is Guaranteed, then the Node should have had the resources for it to
begin with. If it is Opportunistic, it will probably be killed when our ReInitializing
container is restarted.
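
To make the argument concrete, here is a minimal toy model of a single node. It is only a sketch and not the actual ContainerScheduler code: the class {{NodeCapacitySketch}}, its methods and the memory figures are all made up for illustration, and it rejects a container that does not fit, where the real scheduler would likely queue it.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy model of one node; hypothetical, NOT the real ContainerScheduler. */
class NodeCapacitySketch {
    enum ExecType { GUARANTEED, OPPORTUNISTIC }

    static final class Container {
        final String id;
        final ExecType type;
        final int memoryMb;

        Container(String id, ExecType type, int memoryMb) {
            this.id = id;
            this.type = type;
            this.memoryMb = memoryMb;
        }
    }

    private final int capacityMb;
    private final List<Container> running = new ArrayList<>();

    NodeCapacitySketch(int capacityMb) {
        this.capacityMb = capacityMb;
    }

    /** Memory used by all RUNNING containers, of either type. */
    int usedMb() {
        return running.stream().mapToInt(c -> c.memoryMb).sum();
    }

    boolean isRunning(String id) {
        return running.stream().anyMatch(c -> c.id.equals(id));
    }

    /** Tries to start a container; returns true if it is now running. */
    boolean start(Container c) {
        if (c.type == ExecType.GUARANTEED) {
            // Invariant 3: kill running Opportunistic containers until the
            // Guaranteed one fits (or there is nothing left to kill).
            Iterator<Container> it = running.iterator();
            while (usedMb() + c.memoryMb > capacityMb && it.hasNext()) {
                if (it.next().type == ExecType.OPPORTUNISTIC) {
                    it.remove();
                }
            }
        }
        // Invariant 2: RUNNING containers never exceed node capacity.
        // (Simplification: this toy rejects instead of queueing.)
        if (usedMb() + c.memoryMb > capacityMb) {
            return false;
        }
        running.add(c);
        return true;
    }

    /** Stops a container, e.g. as the first half of a re-initialization. */
    void stop(String id) {
        running.removeIf(c -> c.id.equals(id));
    }

    public static void main(String[] args) {
        NodeCapacitySketch node = new NodeCapacitySketch(8192);
        Container svc = new Container("svc", ExecType.GUARANTEED, 4096);
        node.start(svc);
        node.stop("svc"); // re-initialization begins
        // A different (Opportunistic) container sneaks in meanwhile.
        node.start(new Container("opp", ExecType.OPPORTUNISTIC, 6144));
        // Restarting the Guaranteed container preempts it.
        System.out.println("svc restarted: " + node.start(svc));
        System.out.println("opp still running: " + node.isRunning("opp"));
    }
}
```

In this toy, the Opportunistic container that started during the re-initialization window is killed when the Guaranteed one comes back. Had the newcomer been Guaranteed instead, invariant 1 (assumed here to be upheld by the RM, and not modelled) would mean both still fit.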

bq.  And for service container, user should be expected to always use Guaranteed type.
Yup. There is already an {{enforceExecutionType}} field in the ResourceRequest::ExecutionTypeRequest
that an AM can use to ensure that the container it receives against this request is of the Guaranteed
type.
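
A rough sketch of what the flag buys the AM, using stand-in types rather than Hadoop's own classes. {{ExecutionTypeSketch}}, {{TypeRequest}} and {{satisfies}} are hypothetical names, and the sketch assumes that without the flag the scheduler may hand back a container of either type.

```java
/** Toy stand-in for the request/allocation check; hypothetical, not Hadoop's classes. */
class ExecutionTypeSketch {
    enum ExecType { GUARANTEED, OPPORTUNISTIC }

    /** Mirrors the idea of an execution type plus an enforce flag. */
    static final class TypeRequest {
        final ExecType type;
        final boolean enforce;

        TypeRequest(ExecType type, boolean enforce) {
            this.type = type;
            this.enforce = enforce;
        }
    }

    /**
     * True if a container of the allocated type is an acceptable answer to
     * the request. Assumption: an unenforced request accepts either type.
     */
    static boolean satisfies(TypeRequest request, ExecType allocated) {
        return !request.enforce || request.type == allocated;
    }

    public static void main(String[] args) {
        TypeRequest strict = new TypeRequest(ExecType.GUARANTEED, true);
        System.out.println(satisfies(strict, ExecType.GUARANTEED));
        System.out.println(satisfies(strict, ExecType.OPPORTUNISTIC));
    }
}
```

With the flag set, a service AM asking for Guaranteed would never have to accept an Opportunistic container for that request.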



> Fix TestNMClient failure due to YARN-6706
> -----------------------------------------
>
>                 Key: YARN-6920
>                 URL: https://issues.apache.org/jira/browse/YARN-6920
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>         Attachments: YARN-6920.001.patch, YARN-6920.002.patch, YARN-6920.003.patch, YARN-6920.004.patch
>
>
> Looks like {{TestNMClient}} has been failing for a while. Opening this JIRA to track the fix.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

