hadoop-yarn-issues mailing list archives

From "Carlo Curino (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4198) CapacityScheduler locking / synchronization improvements
Date Fri, 25 Sep 2015 17:26:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-4198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908355#comment-14908355 ]

Carlo Curino commented on YARN-4198:
------------------------------------

[~kshukla], I am happy to collaborate, but we already have a patch in the works with [~atumanov] and
[~chris.douglas]. We tested it at scale and it seems to work well; we now want to double-check
it carefully, clean it up, and submit it for review. However, as this is a very delicate piece,
it would be great if you could help us go over it and analyze it carefully. It is also likely that
we missed some further opportunities for improvement.

The general observation is that we are holding a bunch of big locks (e.g., the CS lock) to make modifications
to data structures that could be protected by much finer-grained locks, or made concurrency-safe
and not locked at all (as the entire CS operates on a stale view of the cluster state anyway,
due to heartbeats etc.).
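
To make the contrast concrete, here is a minimal sketch of the two styles. It is an illustration only and not taken from the patch: the class names, the per-queue "used MB" counter, and the queue name "root.a" are all hypothetical. The first tracker serializes every update on one big lock; the second uses a concurrent map of atomic counters and takes no explicit lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.IntStream;

// Hypothetical names throughout; nothing here is an actual YARN class.

// Coarse style: every read and write takes the tracker-wide monitor.
class CoarseUsageTracker {
    private final Map<String, Long> usedMb = new HashMap<>();

    synchronized void add(String queue, long mb) {
        usedMb.merge(queue, mb, Long::sum);
    }

    synchronized long get(String queue) {
        return usedMb.getOrDefault(queue, 0L);
    }
}

// Concurrency-safe style: no explicit lock; updates to different queues
// do not contend, and updates to the same queue are atomic increments.
class FineGrainedUsageTracker {
    private final ConcurrentHashMap<String, AtomicLong> usedMb =
        new ConcurrentHashMap<>();

    void add(String queue, long mb) {
        usedMb.computeIfAbsent(queue, q -> new AtomicLong()).addAndGet(mb);
    }

    long get(String queue) {
        AtomicLong v = usedMb.get(queue);
        return v == null ? 0L : v.get();
    }
}

class UsageTrackerDemo {
    public static void main(String[] args) {
        FineGrainedUsageTracker tracker = new FineGrainedUsageTracker();
        // Many concurrent updates to the same (hypothetical) queue.
        IntStream.range(0, 4000).parallel()
                 .forEach(i -> tracker.add("root.a", 1));
        System.out.println(tracker.get("root.a"));
    }
}
```

The second form only guarantees that each counter is individually consistent; a reader scanning several queues may see a slightly out-of-date mix of values. That trade-off is the one hinted at above: if the scheduler already works from a stale view, such reads may be acceptable.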

We will post something soon, and I would really like your help in reviewing/extending this.

> CapacityScheduler locking / synchronization improvements
> --------------------------------------------------------
>
>                 Key: YARN-4198
>                 URL: https://issues.apache.org/jira/browse/YARN-4198
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Carlo Curino
>
> In the context of YARN-4193 (which stresses the RM/CS performance) we found several performance
problems in the locking/synchronization of the CapacityScheduler, as well as inconsistencies
that do not normally surface (incorrect locking order of queues protected by CS locks, etc.).
This JIRA proposes several refactorings that improve this.
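
For readers unfamiliar with why locking order matters: if two threads acquire the same pair of locks in opposite orders, they can deadlock. One common remedy, sketched below, is to always acquire locks in a single global order. This is a generic illustration and not necessarily what the patch does; the `QueueSketch` class and its numeric `id` used for ordering are hypothetical.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a scheduler queue; not a YARN class.
class QueueSketch {
    final int id;                                   // assumed unique; defines the lock order
    final ReentrantLock lock = new ReentrantLock();
    long usedMb;

    QueueSketch(int id, long usedMb) {
        this.id = id;
        this.usedMb = usedMb;
    }
}

class OrderedLocking {
    // Moves usage between two queues, always locking the lower id first,
    // so two concurrent moves in opposite directions cannot deadlock.
    static void move(QueueSketch from, QueueSketch to, long mb) {
        QueueSketch first = from.id < to.id ? from : to;
        QueueSketch second = (first == from) ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.usedMb -= mb;
                to.usedMb += mb;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }
}
```

With this convention, `move(a, b, ...)` and `move(b, a, ...)` both lock the queue with the smaller id first, regardless of argument order.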



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
