hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN
Date Thu, 18 Feb 2016 00:18:18 GMT

    [ https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151477#comment-15151477 ]

Wangda Tan commented on YARN-4692:

Thanks [~vinodkv] and the other folks working on this. The documentation is already pretty
comprehensive. Some thoughts/suggestions:

1) For running containers, instead of classifying them into service/batch, I would prefer
to tag them by application priority. For example, 0 is production service tasks, 5 is batch
jobs, etc. The reasons are:
- A service container is not always more important than other containers.
- An important service can preempt containers from less important services.
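
A minimal sketch of how priority-based tagging could drive preemption, written in plain Python rather than against any YARN API. The `Container` class and `pick_preemption_victims` function are hypothetical names, and the sketch assumes, following the example above, that a lower number means a more important application:

```python
from dataclasses import dataclass


@dataclass
class Container:
    container_id: str
    app_priority: int  # assumption: lower number = more important (0 = production service)
    memory_mb: int


def pick_preemption_victims(running, requester_priority, needed_mb):
    """Choose containers to preempt on behalf of a requester.

    Only containers strictly less important than the requester are candidates,
    and the least important ones are taken first. Returns an empty list if the
    demand cannot be met, so nothing is preempted in vain.
    """
    candidates = sorted(
        (c for c in running if c.app_priority > requester_priority),
        key=lambda c: c.app_priority,
        reverse=True,
    )
    victims, freed = [], 0
    for c in candidates:
        if freed >= needed_mb:
            break
        victims.append(c)
        freed += c.memory_mb
    return victims if freed >= needed_mb else []


running = [
    Container("svc-a", 0, 4096),
    Container("svc-b", 2, 2048),
    Container("batch-1", 5, 2048),
]
victims = pick_preemption_victims(running, requester_priority=0, needed_mb=3072)
```

In this sketch nothing depends on whether a container is labelled service or batch: a priority-2 service is preempted for a priority-0 service just as a batch job would be.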

2) Whether a container is service or batch depends on the duration of the task; we have
already had lots of discussions on this in YARN-1039.

3) For 3.2.2 container auto restart: beyond restarting a container when it dies, we could let
the framework check the health of running tasks. For example, support an embedded REST API to
get the health status of containers. With this, the framework can restart malfunctioning containers.
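
A rough sketch of such a health-check loop, in Python for brevity. Everything here is an assumption made for illustration: the endpoint URL, the `{"healthy": true}` response shape, and the names `probe_health` and `HealthMonitor` are hypothetical and are not part of YARN:

```python
import json
from urllib.request import urlopen


def probe_health(url, timeout=2.0):
    """Return True if a (hypothetical) container health endpoint reports healthy."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            body = json.load(resp)
            return resp.status == 200 and isinstance(body, dict) and body.get("healthy") is True
    except (OSError, ValueError):
        # Connection errors, timeouts and malformed JSON all count as unhealthy.
        return False


class HealthMonitor:
    """Restart a container after `max_failures` consecutive failed probes."""

    def __init__(self, probe, restart, max_failures=3):
        self.probe = probe          # callable: url -> bool
        self.restart = restart      # callable: container_id -> None
        self.max_failures = max_failures
        self.failures = {}          # container_id -> consecutive failure count

    def check(self, container_id, url):
        """Probe once; return True if a restart was triggered."""
        if self.probe(url):
            self.failures[container_id] = 0
            return False
        count = self.failures.get(container_id, 0) + 1
        if count >= self.max_failures:
            self.failures[container_id] = 0
            self.restart(container_id)
            return True
        self.failures[container_id] = count
        return False
```

Requiring several consecutive failures before restarting is a design choice of this sketch, intended to avoid restarting a service because of a single slow response.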

4) For 3.2.7 Scheduling / Queue model:
Beyond the queue model, we should take long-running containers into account when reserving
a large container on a node.

5) Debuggability for service containers is also very important:
- Tools similar to [cAdvisor|https://github.com/google/cadvisor] could be very helpful for
figuring out issues with service tasks.
- We also need a tool to show aggregated scheduling-related information for apps/queues/cluster.

*For comments from [~asuresh]:*
bq. we can give applications the ability to specify Preemptability of containers in a particular
Instead of adding a new field, I think we can reuse container priority and application priority
to describe preemptability.

bq. Allow LR Applications to specify peak, min and variance/mean (also many transient and
steady-state) of a Resource request to allow schedulers to make better allocation decisions.
I think this is hard for end users to know. Our framework should be able to figure out such
metrics for running containers. For newly requested containers, we had better assume they will
use 100% of the requested resources.
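
As a small illustration of that split, the sketch below derives peak/min/mean/variance from observed usage samples of a running container and falls back to the full request when no samples exist. The function names and the choice of observed peak as the planning figure are assumptions of this sketch, not anything the scheduler does today:

```python
from statistics import mean, pvariance


def usage_profile(samples_mb):
    """Summarize observed memory usage samples (in MB) of a running container."""
    return {
        "peak": max(samples_mb),
        "min": min(samples_mb),
        "mean": mean(samples_mb),
        "variance": pvariance(samples_mb),
    }


def planned_usage_mb(requested_mb, samples_mb=None):
    """Usage figure a scheduler could plan with.

    New containers (no samples yet) are assumed to use 100% of the request;
    running containers are planned at their observed peak, capped by the request.
    """
    if not samples_mb:
        return requested_mb
    return min(requested_mb, max(samples_mb))
```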

bq. In YARN-4597 Chris Douglas proposed ...
In my mind, YARN-4597 is targeted at low-latency batch tasks. If service tasks run
for one hour or more, it's not a big deal for them to take several minutes to set up.

And I agree that the reservation system (YARN-1051) is the ultimate solution for the queue model
and container allocation for services.

> [Umbrella] Simplified and first-class support for services in YARN
> ------------------------------------------------------------------
>                 Key: YARN-4692
>                 URL: https://issues.apache.org/jira/browse/YARN-4692
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Vinod Kumar Vavilapalli
>         Attachments: YARN-First-Class-And-Simplified-Support-For-Services-v0.pdf
> YARN-896 focused on getting the ball rolling on the support for services (long running
applications) on YARN.
> I’d like to propose the next stage of this effort: _Simplified and first-class support
for services in YARN_.
> The chief rationale for filing a separate new JIRA is threefold:
>  - Do a fresh survey of all the things that are already implemented in the project
>  - Weave a comprehensive story around what we further need and attempt to rally the community
around a concrete end-goal, and
>  - Additionally focus on functionality that YARN-896 and friends left for higher layers
to take care of and see how much of that is better integrated into the YARN platform itself.

This message was sent by Atlassian JIRA
