hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4088) RM should be able to process heartbeats from NM asynchronously
Date Wed, 02 Sep 2015 22:36:47 GMT

    [ https://issues.apache.org/jira/browse/YARN-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14728158#comment-14728158 ]

Jason Lowe commented on YARN-4088:

bq. Not sure if this (out-of-band heartbeat upon container completion) happens today
The OOB heartbeat is occurring to a degree if the cluster is running a lot of MapReduce. 
Currently the MapReduce AM will proactively kill any task that reports a terminal status over
the umbilical protocol.  There's then a race between the task container completing on its
own and the AM killing the task via the NM.  If the latter wins the race then we get an OOB
heartbeat since today a stop container request generates it.  I see the kill winning the race
fairly often on our clusters, so we are getting a lot of OOB heartbeats in practice.

General OOB heartbeats on any type of container completion do not occur today but are proposed
by YARN-2046.

bq.  Processing one NM at a time is unlikely to cope well with the storms of heartbeats.
This was a big problem in the past and the scheduler could fall far behind, but this was mitigated
to a large degree with batching of heartbeats (e.g.: YARN-365).
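
To illustrate the general idea of batching mentioned above, here is a minimal sketch. It is not the YARN-365 patch or any actual RM code; the class `HeartbeatBatcher` and its methods are hypothetical names chosen for illustration. The assumption is that updates from a node accumulate while a scheduler event for that node is still pending, so a burst of heartbeats from one node costs the scheduler a single event.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of the Hadoop code base.
public class HeartbeatBatcher {
    // Updates accumulated per node since the scheduler last drained that node.
    private final Map<String, List<String>> pendingByNode = new HashMap<>();
    // Counts scheduler events enqueued so far (stands in for a real event queue).
    private int schedulerEvents = 0;

    /** Record a heartbeat; count a new scheduler event only if none is pending for the node. */
    public synchronized void onHeartbeat(String nodeId, List<String> containerUpdates) {
        List<String> pending = pendingByNode.get(nodeId);
        if (pending == null) {
            pending = new ArrayList<>();
            pendingByNode.put(nodeId, pending);
            schedulerEvents++; // first heartbeat since the last drain
        }
        pending.addAll(containerUpdates);
    }

    /** Called by the scheduler: take everything accumulated for the node as one batch. */
    public synchronized List<String> drain(String nodeId) {
        List<String> batch = pendingByNode.remove(nodeId);
        return batch == null ? new ArrayList<>() : batch;
    }

    public synchronized int schedulerEventCount() {
        return schedulerEvents;
    }
}
```

With this shape, a scheduler that falls behind sees larger batches rather than a longer queue of events, which is the property that helps during heartbeat storms.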

Agree in general though that allowing the scheduler to be more concurrent would be nice. 

> RM should be able to process heartbeats from NM asynchronously
> --------------------------------------------------------------
>                 Key: YARN-4088
>                 URL: https://issues.apache.org/jira/browse/YARN-4088
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager, scheduler
>            Reporter: Srikanth Kandula
> Today, the RM sequentially processes one heartbeat after another. 
> Imagine a 3000 server cluster with each server heart-beating every 3s. This gives the
> RM 1ms on average to process each NM heartbeat. That is tough.
> It is true that there are several underlying data structures that will be touched during
> heartbeat processing. So, it is non-trivial to parallelize the NM heartbeat. Yet, it is quite
> Parallelizing the NM heartbeat would substantially improve the scalability of the RM,
> allowing it to either
> a) run larger clusters or
> b) support faster heartbeats or dynamic scaling of heartbeats or
> c) take more asks from each application or
> d) use cleverer/more expensive algorithms such as node labels or better packing or ...
> Indeed the RM's scalability limit has been cited as the motivating reason for a variety
> of efforts which will become less needed if this can be solved. Ditto for slow heartbeats.
> See Sparrow and Mercury papers for example.
> Can we take a shot at this?
> If not, could we discuss why.
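
The 1ms figure in the description follows from dividing the heartbeat interval by the number of nodes, assuming heartbeats are processed strictly one at a time. A small sketch of that arithmetic (the class and method names here are hypothetical, chosen only for illustration):

```java
import java.time.Duration;

// Hypothetical helper: not part of the Hadoop code base.
public class HeartbeatBudget {
    /** Average time available per heartbeat when the RM handles them sequentially. */
    static Duration perHeartbeatBudget(int nodeCount, Duration heartbeatInterval) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        // Every node reports once per interval, so the interval is shared by all nodes.
        return heartbeatInterval.dividedBy(nodeCount);
    }

    public static void main(String[] args) {
        // Numbers from the issue description: 3000 servers, one heartbeat every 3s.
        Duration budget = perHeartbeatBudget(3000, Duration.ofSeconds(3));
        System.out.println(budget.toMillis() + " ms per heartbeat");
    }
}
```

Doubling the cluster size or halving the interval halves this budget, which is why larger clusters and faster heartbeats both press on the same limit.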

This message was sent by Atlassian JIRA
