hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-7019) Ability for applications to notify YARN about container reuse
Date Wed, 16 Aug 2017 13:59:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-7019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16128826#comment-16128826 ]

Jason Lowe commented on YARN-7019:

I'm totally OK if we want to generalize this to a preemption score or whatever as long as
applications have the ability to inform YARN in some way.

bq. In YARN-3784, we were trying to send preemption timeout to AMs, so could AM take an immediate
action about checkpointing OR even send a feedback to RM with alternative container to preempt
instead of selected one?

The problem with relying on the AM is that the feedback loop can be too long for some use cases.
For example, YARN-1011 proposes having the NM preempt containers on its own in order to
preserve the health of the node when too many containers end up on the node and resource
utilization reaches critical levels.  In those cases the NM doesn't have time to report the
problem to the RM, have the RM wait for the AM to heartbeat in so it can tell the AM about it,
wait for the AM to respond with a preemption preference, and then wait for its own next
heartbeat so the RM can relay the priority.  It would be nice if the NM could be told
proactively, during container execution, when the cost of preemption changes so it can
make better decisions on its own when pressed for time.
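
To make the length of that loop concrete, here is a rough sketch of the worst case, assuming
each hop has to wait for the next scheduled heartbeat and the AM replies on a later heartbeat
of its own. The class name, method name and interval values are made up for illustration;
they are not YARN APIs or configured defaults.

```java
// Hypothetical sketch: worst-case delay of the AM-mediated feedback loop.
// Names and interval values are illustrative assumptions, not YARN APIs or defaults.
final class FeedbackLoopSketch {

    /**
     * Upper bound on how long the NM waits for a preemption preference when
     * every hop has to wait for the next scheduled heartbeat.
     */
    static long worstCaseLoopMillis(long nmHeartbeatMillis,
                                    long amHeartbeatMillis,
                                    long amDecisionMillis) {
        // NM reports the resource pressure on its next heartbeat to the RM.
        long nmTellsRm = nmHeartbeatMillis;
        // RM waits for the AM to heartbeat in before it can pass the news on.
        long rmTellsAm = amHeartbeatMillis;
        // AM decides, then sends its preference on a later heartbeat (assumed).
        long amResponds = amDecisionMillis + amHeartbeatMillis;
        // RM relays the priority when the NM heartbeats in again.
        long rmTellsNm = nmHeartbeatMillis;
        return nmTellsRm + rmTellsAm + amResponds + rmTellsNm;
    }

    public static void main(String[] args) {
        // Illustrative inputs: 1 s NM heartbeat, 1 s AM heartbeat, 200 ms AM decision time.
        long loop = worstCaseLoopMillis(1000L, 1000L, 200L);
        System.out.println("worst-case loop: " + loop + " ms");
    }
}
```

With those made-up inputs the bound is several seconds, which is a long time for a node that
is already at critical utilization.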

> Ability for applications to notify YARN about container reuse
> -------------------------------------------------------------
>                 Key: YARN-7019
>                 URL: https://issues.apache.org/jira/browse/YARN-7019
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Jason Lowe
> During preemption calculations YARN can try to reduce the amount of work lost by considering
> how long a container has been running.  However, when an application framework like Tez reuses
> a container across multiple tasks it changes the work lost calculation since the container
> has essentially checkpointed between task assignments.  It would be nice if applications could
> inform YARN when a container has been reused/checkpointed and therefore is a better candidate
> for preemption with respect to lost work than other, younger containers.
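
A minimal sketch of the lost-work calculation described above, assuming the application can
report the time of the last reuse/checkpoint for each container. ContainerInfo,
lastReuseMillis and preemptionOrder are hypothetical names for illustration only; none of
them exist in YARN.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: rank containers by work lost if preempted.
// None of these types or fields are part of YARN.
final class LostWorkSketch {

    static final class ContainerInfo {
        final String id;
        final long startMillis;
        final long lastReuseMillis; // -1 if the application never reported a reuse

        ContainerInfo(String id, long startMillis, long lastReuseMillis) {
            this.id = id;
            this.startMillis = startMillis;
            this.lastReuseMillis = lastReuseMillis;
        }

        /** Work lost is measured from the last reuse if reported, else from container start. */
        long lostWorkMillis(long nowMillis) {
            return nowMillis - Math.max(startMillis, lastReuseMillis);
        }
    }

    /** Returns a copy of the list ordered so the cheapest container to preempt comes first. */
    static List<ContainerInfo> preemptionOrder(List<ContainerInfo> containers, long nowMillis) {
        List<ContainerInfo> sorted = new ArrayList<>(containers);
        sorted.sort(Comparator.comparingLong((ContainerInfo c) -> c.lostWorkMillis(nowMillis)));
        return sorted;
    }

    public static void main(String[] args) {
        List<ContainerInfo> containers = new ArrayList<>();
        containers.add(new ContainerInfo("young", 80_000L, -1L));
        containers.add(new ContainerInfo("old-reused", 0L, 95_000L));
        for (ContainerInfo c : preemptionOrder(containers, 100_000L)) {
            System.out.println(c.id + " would lose " + c.lostWorkMillis(100_000L) + " ms");
        }
    }
}
```

In this example the older container is listed first because it was reused recently, so less
work is lost by preempting it than by preempting the younger container that has never been
reused.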

