hadoop-yarn-issues mailing list archives

From "Li Lu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3091) [Umbrella] Improve locks of RM scheduler
Date Fri, 23 Jan 2015 02:43:35 GMT

    [ https://issues.apache.org/jira/browse/YARN-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288633#comment-14288633 ]

Li Lu commented on YARN-3091:
-----------------------------

Maybe we want to tweak the wording/organization of this JIRA a little bit? In the description
of this JIRA, two major points are raised:

bq. Many unnecessary synchronized locks, we have seen several cases recently that too frequent
access of scheduler makes scheduler hang. Which could be addressed by using read/write lock.
Components include scheduler, CS queues, apps
I agree that a readers-writer lock is a viable approach to many synchronization performance
issues, but other synchronization mechanisms (such as concurrent data structures) are also
options.
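
To make the two options concrete, here is a minimal sketch of the same per-queue counter guarded first by a readers-writer lock and then by a concurrent map. All names here (QueueUsageSketch, RwLockedUsage, ConcurrentUsage, usedMb) are hypothetical and are not taken from the YARN code base; this only illustrates the trade-off, not how the scheduler should actually be changed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; none of these names come from the YARN code base.
final class QueueUsageSketch {

  /** Option 1: guard a plain HashMap with a readers-writer lock. */
  static final class RwLockedUsage {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, Long> usedMb = new HashMap<>();

    long get(String queue) {
      lock.readLock().lock();   // several readers may hold the read lock at once
      try {
        return usedMb.getOrDefault(queue, 0L);
      } finally {
        lock.readLock().unlock();
      }
    }

    void add(String queue, long mb) {
      lock.writeLock().lock();  // a writer excludes readers and other writers
      try {
        usedMb.merge(queue, mb, Long::sum);
      } finally {
        lock.writeLock().unlock();
      }
    }
  }

  /** Option 2: use a concurrent map and no explicit lock. */
  static final class ConcurrentUsage {
    private final ConcurrentHashMap<String, Long> usedMb = new ConcurrentHashMap<>();

    long get(String queue) {
      return usedMb.getOrDefault(queue, 0L);
    }

    void add(String queue, long mb) {
      usedMb.merge(queue, mb, Long::sum);  // atomic update of a single key
    }
  }

  public static void main(String[] args) {
    RwLockedUsage rw = new RwLockedUsage();
    rw.add("root.a", 512);
    rw.add("root.a", 256);
    System.out.println(rw.get("root.a"));

    ConcurrentUsage cu = new ConcurrentUsage();
    cu.add("root.a", 512);
    cu.add("root.a", 256);
    System.out.println(cu.get("root.a"));
  }
}
```

The lock-based version can keep several fields consistent under one lock, while the concurrent map only makes each single-key update atomic, so which one fits depends on the invariants each component needs.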

bq. Some fields not properly locked (Like clusterResource)
Improperly synchronized accesses may cause data races, and data races are generally considered bugs
in Java programs (even though the Java memory model still provides some guarantees for racy
programs). To me, it would be better to categorize the second point as a bug fix,
rather than an improvement, for the RM scheduler code.
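
As an illustration of what a correctness fix for a shared field could look like, here is a minimal sketch that publishes an immutable snapshot through an AtomicReference, so readers never see a half-updated value. The class ClusterResourceSketch, its Snapshot type, and the memoryMb/vcores fields are assumptions made up for this example; they do not reflect how clusterResource is actually declared or how it should be fixed in the RM scheduler.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; ClusterResourceSketch and its members are not YARN classes.
final class ClusterResourceSketch {

  /** Immutable snapshot: both fields are always observed together. */
  static final class Snapshot {
    final long memoryMb;
    final int vcores;

    Snapshot(long memoryMb, int vcores) {
      this.memoryMb = memoryMb;
      this.vcores = vcores;
    }
  }

  // AtomicReference provides visibility across threads and an atomic
  // read-modify-write, without a synchronized block.
  private final AtomicReference<Snapshot> total =
      new AtomicReference<>(new Snapshot(0, 0));

  void addNode(long memoryMb, int vcores) {
    total.updateAndGet(s -> new Snapshot(s.memoryMb + memoryMb, s.vcores + vcores));
  }

  Snapshot get() {
    return total.get();
  }

  public static void main(String[] args) throws InterruptedException {
    ClusterResourceSketch cluster = new ClusterResourceSketch();
    Thread[] workers = new Thread[4];
    for (int i = 0; i < workers.length; i++) {
      workers[i] = new Thread(() -> {
        for (int n = 0; n < 1000; n++) {
          cluster.addNode(1024, 2);  // concurrent updates, none are lost
        }
      });
      workers[i].start();
    }
    for (Thread t : workers) {
      t.join();
    }
    Snapshot s = cluster.get();
    System.out.println(s.memoryMb + " MB, " + s.vcores + " vcores");
  }
}
```

With a plain non-volatile field and no lock, the same updates could be lost or seen out of date by other threads, which is the kind of issue I would file as a bug rather than as an improvement.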

Therefore, maybe we want to solve the problem in two steps: a) fix improperly synchronized
data accesses in the RM scheduler (correctness) and b) improve synchronization performance of the
RM scheduler code (performance)? I'm not sure whether there should be two separate JIRAs to track
this, or whether we can combine both in one "giant" JIRA.

> [Umbrella] Improve locks of RM scheduler
> ----------------------------------------
>
>                 Key: YARN-3091
>                 URL: https://issues.apache.org/jira/browse/YARN-3091
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler, fairscheduler, resourcemanager, scheduler
>            Reporter: Wangda Tan
>
> In existing YARN RM scheduler, there're some issues of using locks. For example:
> - Many unnecessary synchronized locks, we have seen several cases recently that too frequent
> access of scheduler makes scheduler hang. Which could be addressed by using read/write lock.
> Components include scheduler, CS queues, apps
> - Some fields not properly locked (Like clusterResource)
> We can address them together in this ticket.
> (More details see comments below)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
