hadoop-yarn-issues mailing list archives

From "Konstantinos Karanasos (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1039) Add parameter for YARN resource requests to indicate "long lived"
Date Wed, 21 Jan 2015 00:46:37 GMT

    [ https://issues.apache.org/jira/browse/YARN-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284920#comment-14284920
] 

Konstantinos Karanasos commented on YARN-1039:
----------------------------------------------

Let me add my thoughts regarding whether we should allow duration to be reported instead of
just a boolean switch for short tasks.
I am actively involved in adding distributed scheduling capabilities ([YARN-2877]). We have
performed an extensive experimental evaluation that has shown significant performance improvements
in terms of throughput and latency, especially when short tasks are concerned. In that scenario,
having the ability to specify the duration of the task is crucial (for deciding what type
of container to use [[YARN-2882]], for estimating the waiting time in the NMs [[YARN-2886]],
etc.).

I understand the concerns that have been raised about how to provide an accurate task
duration. However, this can be done either based on historical information (previous waves
of this task type or previous executions of the same job) or on application-level knowledge.
We are already experimenting with ways to deal with imprecise task durations.

That said, I definitely agree with [~john.jian.fang] that the user should not *have to* provide
any task duration (i.e., the system should work properly in case no durations are provided),
but on the other hand, in case she does, we should be able to take advantage of it.
Moreover, as [~curino] pointed out, if the API exposes an integer instead of a boolean, we
can simulate the boolean switch (e.g., by setting the value to MAX_INT for long tasks), but
if we simply use a boolean, we would have to change the API in the future to support duration.
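
To make the last point concrete, here is a minimal sketch of how an integer duration could
subsume the boolean switch. The class DurationHint, its methods, and the UNSPECIFIED sentinel
are hypothetical names chosen for illustration; they are not part of any YARN API or of the
attached patches. Only the idea of using MAX_INT for long tasks comes from the discussion above.

```java
// Hypothetical illustration only: not a YARN class.
final class DurationHint {
    // Sentinel for "long lived", following the MAX_INT idea from the discussion.
    static final int LONG_LIVED = Integer.MAX_VALUE;
    // Assumed sentinel meaning "the user gave no duration"; the system must still work.
    static final int UNSPECIFIED = -1;

    private final int expectedSeconds;

    private DurationHint(int expectedSeconds) {
        this.expectedSeconds = expectedSeconds;
    }

    // No duration supplied: schedulers fall back to their default behaviour.
    static DurationHint unspecified() {
        return new DurationHint(UNSPECIFIED);
    }

    // Equivalent of the boolean "long lived" switch set to true.
    static DurationHint longLived() {
        return new DurationHint(LONG_LIVED);
    }

    // An explicit estimate, e.g. derived from previous waves of the same task type.
    static DurationHint ofSeconds(int seconds) {
        if (seconds < 0) {
            throw new IllegalArgumentException("duration must be non-negative: " + seconds);
        }
        return new DurationHint(seconds);
    }

    boolean isSpecified() {
        return expectedSeconds != UNSPECIFIED;
    }

    // Recovers the boolean view from the integer value.
    boolean isLongLived() {
        return expectedSeconds == LONG_LIVED;
    }

    int expectedSeconds() {
        return expectedSeconds;
    }
}
```

A scheduler that only cares about the boolean would call isLongLived(), while one that can
exploit durations would read expectedSeconds() when isSpecified() is true. Going the other
way, from a boolean field to a duration, would require an API change.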

> Add parameter for YARN resource requests to indicate "long lived"
> -----------------------------------------------------------------
>
>                 Key: YARN-1039
>                 URL: https://issues.apache.org/jira/browse/YARN-1039
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 3.0.0, 2.1.1-beta
>            Reporter: Steve Loughran
>            Assignee: Craig Welch
>         Attachments: YARN-1039.1.patch, YARN-1039.2.patch, YARN-1039.3.patch
>
>
> A container request could support a new parameter "long-lived". This could be used by
> a scheduler that would know not to host the service on a transient (cloud: spot priced) node.
> Schedulers could also decide whether or not to allocate multiple long-lived containers
> on the same node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
