hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8320) Support CPU isolation for latency-sensitive (LS) service
Date Wed, 23 May 2018 16:34:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16487596#comment-16487596 ]

Wangda Tan commented on YARN-8320:

Thanks [~cheersyang] / [~yangjiandan] for the detailed design; it is very helpful for understanding
the context.

I took a quick look at the proposal and have a few questions / comments:

1) It seems that #vcore must be divisible by #physical-core; otherwise rounding will cause
containers to get less or more than the requested resources. If an admin enables the feature,
YARN should validate this value before starting the NM.

2) I'm still trying to understand the benefit of the RESERVED / SHARED modes. If a RESERVED core
can be used by ANY containers, then in my mind the RESERVED container can be affected by ad hoc
ANY containers. Similarly, if we allow SHARED containers to bind to the same set of cores, then,
considering that SHARED containers run LS services and are CPU-intensive, they could compute
heavily on these shared cores, which could lead to even worse latency and more contention.

3) Relationship to other features:
- Related to NUMA allocation on YARN (YARN-5764): to me the two features are related to each
other. Allocating reserved cores for a process on the same or the closest NUMA zone(s) gives the
best performance, but satisfying one condition can break the other. We should be very careful
to make sure the two features can work together.

- Related to GPU allocation on YARN: On one machine, GPU performance is sensitive to the topology
of the GPUs. Communication latency and bandwidth differ a lot depending on whether GPUs are
connected by NVLink, PCI-E, etc. It might be valuable to consider whether a single framework on
the NM could do resource-specific scheduling and placement.

- Related to the ResourcePlugin framework: We added the ResourcePlugin framework in YARN-7224,
and GPU/FPGA now use it to implement their features. I'm not sure whether this feature can
benefit from the ResourcePlugin framework as-is, or whether some refactoring of the framework
is required. It would be better if we can extract the common parts and workflow.

4) To me, only privileged users and applications should be able to request a non-ANY CPU mode.
How can we enforce this? (Maybe not in phase #1, but we need a plan here.)
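
To make point 1) concrete, here is a minimal sketch of the kind of pre-start check I have in
mind. It is Python for brevity, not NM code; the function name and its parameters are
hypothetical and do not correspond to any existing YARN API or configuration key.

```python
def vcores_per_physical_core(vcores: int, physical_cores: int) -> int:
    """Return how many vcores map onto one physical core.

    Raises ValueError when the split is uneven, which is the case where
    rounding would give a container less or more CPU than it requested.
    Both arguments are assumed to come from node configuration / hardware
    discovery; how they are obtained is out of scope for this sketch.
    """
    if vcores <= 0 or physical_cores <= 0:
        raise ValueError("vcores and physical_cores must be positive")
    if vcores % physical_cores != 0:
        raise ValueError(
            f"{vcores} vcores cannot be split evenly across "
            f"{physical_cores} physical cores"
        )
    return vcores // physical_cores


if __name__ == "__main__":
    # 64 vcores on 32 physical cores divides evenly: 2 vcores per core.
    print(vcores_per_physical_core(64, 32))
```

Failing fast at startup, instead of rounding silently, keeps the mismatch visible to the admin
who enabled the feature.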

> Support CPU isolation for latency-sensitive (LS) service
> --------------------------------------------------------
>                 Key: YARN-8320
>                 URL: https://issues.apache.org/jira/browse/YARN-8320
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>            Reporter: Jiandan Yang 
>            Priority: Major
>         Attachments: CPU-isolation-for-latency-sensitive-services-v1.pdf, CPU-isolation-for-latency-sensitive-services-v2.pdf,
> Currently NodeManager uses “cpu.cfs_period_us”, “cpu.cfs_quota_us” and “cpu.shares”
> to isolate CPU resources. However,
>  * Linux Completely Fair Scheduling (CFS) is a throughput-oriented scheduler with no support
> for differentiated latency.
>  * The request latency of services running in containers can fluctuate frequently when all
> containers share CPUs, which latency-sensitive services in our production environment cannot afford.
> So we need more fine-grained CPU isolation.
> Here we propose a solution that uses cgroup cpuset to bind containers to different processors;
> this is inspired by the isolation technique in the [Borg system|http://schd.ws/hosted_files/lcccna2016/a7/CAT%20@%20Scale.pdf].
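
As a rough illustration of the cpuset binding proposed in the description above, the sketch
below renders a set of core IDs in the list syntax that the cgroup v1 file `cpuset.cpus`
accepts (for example `0-3,8`) and writes it into a cgroup directory. The helper names and the
directory argument are hypothetical; this is not how NodeManager does it, and a real cpuset
cgroup would also need `cpuset.mems` populated before tasks can be attached.

```python
from pathlib import Path
from typing import Iterable


def format_cpuset(cores: Iterable[int]) -> str:
    """Render core IDs as a cpuset list string, collapsing consecutive runs."""
    ranges = []  # list of [first, last] pairs, built in ascending order
    for core in sorted(set(cores)):
        if ranges and core == ranges[-1][1] + 1:
            ranges[-1][1] = core  # extend the current run
        else:
            ranges.append([core, core])  # start a new run
    return ",".join(
        str(first) if first == last else f"{first}-{last}"
        for first, last in ranges
    )


def write_cpuset(cgroup_dir: Path, cores: Iterable[int]) -> None:
    """Write the core list to <cgroup_dir>/cpuset.cpus.

    cgroup_dir is a caller-supplied path (an assumption of this sketch);
    on a real host it would be the container's cpuset cgroup directory.
    """
    (cgroup_dir / "cpuset.cpus").write_text(format_cpuset(cores))


if __name__ == "__main__":
    print(format_cpuset([0, 1, 2, 3, 8]))
```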

