hadoop-yarn-issues mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate "long lived"
Date Mon, 18 May 2015 23:27:02 GMT

    [ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549505#comment-14549505 ]

Chris Douglas commented on YARN-1039:

The semantics of a boolean flag are opaque. The policies enforced by different RM configurations
(and versions) will not be, and cannot be made to be, consistent. Application and container
priority are already encoded (or in progress, YARN-1963), so it's not just preemption priority
or cost. Affinity and anti-affinity are also covered by different features. Discussion has
been wide-ranging because it is unclear what "long-lived" guarantees across existing features
(beyond removing the progress bar from the UI, which I hope we can stop mentioning).

An implementation that only recognizes infinite and undefined leases could be mapped into
duration. Lease duration could also be used to communicate when security tokens cannot be
renewed, short-lived guarantees for YARN-2877 containers, boundaries of YARN-1051 reservations,
and planned decommissioning. In contrast, the "long-lived" flag cannot be used for these cases.
We could expose probabilistic guarantees (which are what we give in reality), but that's a
later issue.
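
As a rough illustration of the mapping described above, here is a minimal sketch in which the boolean flag becomes the two-valued special case of a lease duration. Everything in it is hypothetical: LeaseSketch, Lease, fromLongLivedFlag and endsWithin are invented names for this example and are not part of any YARN API or of the attached patches.

```java
import java.time.Duration;
import java.util.Optional;

/** Hypothetical sketch only; none of these types exist in YARN. */
final class LeaseSketch {

    /** A requested lease: undefined, infinite, or bounded by a duration. */
    static final class Lease {
        // Present only for bounded leases.
        private final Optional<Duration> bound;
        // True only for the "runs until stopped" case.
        private final boolean infinite;

        private Lease(Optional<Duration> bound, boolean infinite) {
            this.bound = bound;
            this.infinite = infinite;
        }

        static Lease undefined() {
            return new Lease(Optional.empty(), false);
        }

        static Lease infinite() {
            return new Lease(Optional.empty(), true);
        }

        static Lease of(Duration duration) {
            if (duration.isNegative() || duration.isZero()) {
                throw new IllegalArgumentException("lease must be positive");
            }
            return new Lease(Optional.of(duration), false);
        }

        /** The boolean flag maps onto the two unbounded cases. */
        static Lease fromLongLivedFlag(boolean longLived) {
            return longLived ? infinite() : undefined();
        }

        boolean isInfinite() {
            return infinite;
        }

        boolean isUndefined() {
            return !infinite && !bound.isPresent();
        }

        /** True if the lease is bounded and ends within the given window. */
        boolean endsWithin(Duration window) {
            return bound.isPresent() && bound.get().compareTo(window) <= 0;
        }
    }
}
```

Under these assumptions, a node scheduled for decommissioning in two hours could accept only requests for which endsWithin(Duration.ofHours(2)) is true, a question the boolean flag alone cannot answer.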

Considering the blockers more concretely:
bq. (a) reservations (b) white-listed requests or (c) node-label requests getting stuck on
a node used by other services' containers that don't exit.

Aren't these handled by adding a timeout to allocations, which would also catch cases where
this flag is _not_ set? The timeout value could be set across the scheduler to start, but
could even be user-visible in later versions...
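
A minimal sketch of the timeout idea, assuming a single scheduler-wide value. The names AllocationTimeoutSketch, PendingAllocation and isStuck are hypothetical and do not correspond to any existing scheduler class; how a real scheduler would release and re-place a stuck request is left out.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch only; not an existing YARN scheduler class. */
final class AllocationTimeoutSketch {

    /** A request that is waiting for resources on one specific node. */
    static final class PendingAllocation {
        final String nodeId;
        final Instant requestedAt;

        PendingAllocation(String nodeId, Instant requestedAt) {
            this.nodeId = nodeId;
            this.requestedAt = requestedAt;
        }
    }

    // Assumed to be one scheduler-wide setting, as suggested for a first version.
    private final Duration timeout;

    AllocationTimeoutSketch(Duration timeout) {
        this.timeout = timeout;
    }

    /**
     * True once the request has waited on its node for at least the timeout.
     * The check does not consult any "long-lived" flag, so it also covers
     * requests blocked by containers that never set one.
     */
    boolean isStuck(PendingAllocation pending, Instant now) {
        return Duration.between(pending.requestedAt, now).compareTo(timeout) >= 0;
    }
}
```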

All said, I don't have time to work on this, agree the API can be evolved from the flag, and
am -0 on it.

> Add parameter for YARN resource requests to indicate "long lived"
> -----------------------------------------------------------------
>                 Key: YARN-1039
>                 URL: https://issues.apache.org/jira/browse/YARN-1039
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 3.0.0, 2.1.1-beta
>            Reporter: Steve Loughran
>            Assignee: Craig Welch
>         Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch
> A container request could support a new parameter "long-lived". This could be used by
> a scheduler that would know not to host the service on a transient (cloud: spot priced) node.
> Schedulers could also decide whether or not to allocate multiple long-lived containers
> on the same node.

