hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5290) ResourceManager can place more containers on a node than the node size allows
Date Wed, 22 Jun 2016 19:56:16 GMT

    [ https://issues.apache.org/jira/browse/YARN-5290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15345064#comment-15345064 ]

Jason Lowe commented on YARN-5290:

We could have the RM wait until it receives hard confirmation from the NM before it releases
the resources associated with a container, but that would needlessly slow down scheduling
in some cases.  For example, if a user is at the scheduler user limit but releases a container
on node A, I don't see why we have to wait until that container is confirmed dead over two
subsequent NM heartbeats (one to tell the NM to shoot it and another to confirm it's dead)
before allowing the user to allocate another container of the same size on node B.  However,
I do think it's bad for us to allocate the new container on the _same_ node as the released
one, since we can accidentally overwhelm the node if the old container isn't cleaned up fast
enough.

Therefore I propose that we go ahead and let the scheduler queues and user limit computations
update immediately so other nodes can be scheduled, but we don't release the resources in
the SchedulerNode itself until the node confirms a previously running container is dead. 
IMHO if the RM ever sees a container in the RUNNING state on a node, it should never think
that node has freed the resources for that container until the node itself says that container
has completed.
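The split accounting proposed above could be sketched roughly as follows. This is a minimal illustration with hypothetical class and method names, not the actual YARN scheduler classes: queue/user-limit usage drops as soon as a container is released, but the node's own capacity stays charged until the NM confirms the container actually exited.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not real YARN code: a node's capacity counts both
// confirmed-running containers and released-but-unconfirmed ones, while the
// queue/user-limit view drops the released container immediately.
class NodeAccountingSketch {
    private final long capacityMb;
    private long queueUsedMb = 0;  // scheduler queue / user-limit view
    private final Map<String, Long> running = new HashMap<>();
    private final Map<String, Long> releasedUnconfirmed = new HashMap<>();

    NodeAccountingSketch(long capacityMb) { this.capacityMb = capacityMb; }

    long nodeUsedMb() {
        // The node stays charged for containers awaiting death confirmation.
        return running.values().stream().mapToLong(Long::longValue).sum()
             + releasedUnconfirmed.values().stream().mapToLong(Long::longValue).sum();
    }

    boolean allocate(String id, long mb) {
        if (nodeUsedMb() + mb > capacityMb) {
            return false;  // node itself is still full
        }
        running.put(id, mb);
        queueUsedMb += mb;
        return true;
    }

    void release(String id) {
        Long mb = running.remove(id);
        if (mb != null) {
            queueUsedMb -= mb;               // frees headroom on OTHER nodes
            releasedUnconfirmed.put(id, mb); // this node stays charged
        }
    }

    void nmConfirmedCompleted(String id) {
        releasedUnconfirmed.remove(id);      // only now does the node free space
    }

    long getQueueUsedMb() { return queueUsedMb; }
}
```

With this shape, releasing a container on a full node immediately restores user-limit headroom, yet a same-sized allocation on that same node is refused until the NM reports the old container dead.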

There is an interesting corner case where the RM has handed out a container to an AM (i.e.:
the container is in the ACQUIRED state) but hasn't seen it running on a node yet.  If the container
is killed by the RM or AM, there's still a chance the container could appear on the
node after the RM has considered those resources freed.  We'll have to decide how to handle
that race.  One way to solve it is to assume the container's resources could still be "used"
until the RM has had a chance to tell the NM that the container token for that container is no
longer valid and has confirmed in a subsequent NM heartbeat that the container has not appeared
since.  Maybe there's a simpler/faster way to safely free the container's resources in that
race condition?
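The conservative handling described for this ACQUIRED-state race amounts to a small two-heartbeat handshake. A sketch with purely illustrative names (this is not YARN's actual container state machine):

```java
// Hypothetical sketch: resources of a killed-while-ACQUIRED container stay
// charged until the NM has been told the container token is invalid AND a
// subsequent heartbeat confirms the container never appeared on the node.
class AcquiredKillTracker {
    enum State { ACQUIRED, KILL_ISSUED, TOKEN_INVALIDATED, RESOURCES_FREED }

    private State state = State.ACQUIRED;

    // RM or AM kills the container before it was ever seen running.
    void kill() {
        if (state == State.ACQUIRED) state = State.KILL_ISSUED;
    }

    // Heartbeat 1: NM is told the container token is no longer valid.
    void tokenInvalidationDelivered() {
        if (state == State.KILL_ISSUED) state = State.TOKEN_INVALIDATED;
    }

    // Heartbeat 2: NM confirms the container has not appeared since.
    void confirmedNeverLaunched() {
        if (state == State.TOKEN_INVALIDATED) state = State.RESOURCES_FREED;
    }

    // Resources stay charged until the full handshake completes.
    boolean resourcesFreed() { return state == State.RESOURCES_FREED; }
}
```

The cost of this safety is two extra heartbeat round-trips before the resources are truly free, which is exactly the trade-off the question at the end of the paragraph is probing.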

> ResourceManager can place more containers on a node than the node size allows
> -----------------------------------------------------------------------------
>                 Key: YARN-5290
>                 URL: https://issues.apache.org/jira/browse/YARN-5290
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: Jason Lowe
> When the ResourceManager or an ApplicationMaster kills a container, the RM scheduler instantly
> thinks the container is dead and frees those resources within the scheduler bookkeeping.
> However, that container can still be running on the node until the node heartbeats back into
> the RM and is told to kill the container.  If the RM allocates the space associated with the
> released container and gives it to an AM quickly enough, the AM can launch a new container
> while the old container is still running on the NM.  That leads to a scenario where we're
> technically running more resources on the node than the node advertised to the RM.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
