hadoop-yarn-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-1929) DeadLock in RM when automatic failover is enabled.
Date Mon, 14 Apr 2014 14:44:19 GMT

     [ https://issues.apache.org/jira/browse/YARN-1929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Kambatla updated YARN-1929:

    Attachment: yarn-1929-1.patch

Here is a first-cut patch that removes unnecessary synchronization from EmbeddedElectorService,
AdminService and CompositeService.

I am thinking about the best way to write a unit test for this to avoid regressions in the future.
We could perhaps override becomeActive to sleep for some time and then try to shut the RM down. If
it doesn't shut down within a particular amount of time, fail the test? Any other ideas?
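
A rough sketch of that test idea follows. It does not use the real RM classes:
FakeElector and FakeService are hypothetical stand-ins for ActiveStandbyElector
and EmbeddedElectorService, and the useServiceLock flag is an assumption used to
switch between the old (synchronized) and patched (unsynchronized) behavior.
becomeActive sleeps while holding the elector monitor, a second thread then
tries to stop the service, and the helper reports whether the stop finished
within the timeout.

```java
import java.util.concurrent.CountDownLatch;

class ShutdownTimeoutSketch {

  // Hypothetical stand-in for ActiveStandbyElector.
  static class FakeElector {
    final CountDownLatch activeStarted = new CountDownLatch(1);

    // Holds the elector monitor, sleeps, then calls back into the service.
    synchronized void becomeActive(FakeService service, long sleepMs) {
      activeStarted.countDown();
      try {
        Thread.sleep(sleepMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      service.onBecomeActive();
    }

    synchronized void quitElection() {
      // Nothing to do; only the monitor acquisition matters here.
    }
  }

  // Hypothetical stand-in for EmbeddedElectorService.
  static class FakeService {
    private final boolean useServiceLock;

    FakeService(boolean useServiceLock) {
      this.useServiceLock = useServiceLock;
    }

    void onBecomeActive() {
      if (useServiceLock) {
        synchronized (this) { /* transition work would go here */ }
      }
    }

    void stop(FakeElector elector) {
      if (useServiceLock) {
        // Service monitor first, elector monitor second: the inverted order.
        synchronized (this) {
          elector.quitElection();
        }
      } else {
        elector.quitElection();
      }
    }
  }

  // Returns true if stop() completed within timeoutMs.
  static boolean stopsWithin(boolean useServiceLock, long timeoutMs) {
    FakeElector elector = new FakeElector();
    FakeService service = new FakeService(useServiceLock);

    Thread eventThread = new Thread(() -> elector.becomeActive(service, 300));
    eventThread.setDaemon(true); // daemon, so a hung thread cannot block JVM exit
    eventThread.start();

    Thread stopper = new Thread(() -> service.stop(elector));
    stopper.setDaemon(true);
    try {
      elector.activeStarted.await(); // elector monitor is now held
      stopper.start();
      stopper.join(timeoutMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    return !stopper.isAlive();
  }

  public static void main(String[] args) {
    System.out.println("with service lock: stopped=" + stopsWithin(true, 1500));
    System.out.println("without service lock: stopped=" + stopsWithin(false, 1500));
  }
}
```

A real test would presumably subclass the actual service and call the RM's own
stop path instead; the timeout value here is arbitrary and would need tuning.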

> DeadLock in RM when automatic failover is enabled.
> --------------------------------------------------
>                 Key: YARN-1929
>                 URL: https://issues.apache.org/jira/browse/YARN-1929
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>         Environment: Yarn HA cluster
>            Reporter: Rohith
>            Assignee: Karthik Kambatla
>            Priority: Blocker
>         Attachments: yarn-1929-1.patch
> Deadlock detected in RM when automatic failover is enabled.
> {noformat}
> Found one Java-level deadlock:
> =============================
> "Thread-2":
>   waiting to lock monitor 0x00007fb514303cf0 (object 0x00000000ef153fd0, a org.apache.hadoop.ha.ActiveStandbyElector),
>   which is held by "main-EventThread"
> "main-EventThread":
>   waiting to lock monitor 0x00007fb514750a48 (object 0x00000000ef154020, a org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService),
>   which is held by "Thread-2"
> {noformat}
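
The trace above is a lock-order inversion: each thread holds one monitor and
waits for the other. As a minimal illustration (plain Java, not Hadoop code),
the sketch below reproduces that pattern with two placeholder lock objects and
detects it programmatically through ThreadMXBean.findMonitorDeadlockedThreads().
The class name, thread names and lock objects are all made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockProbeSketch {

  // Starts two daemon threads that take two monitors in opposite order, then
  // polls the JVM for up to waitMs and returns how many threads it reports
  // as monitor-deadlocked (0 if none were found in time).
  static int countDeadlockedThreads(long waitMs) {
    final Object electorLock = new Object(); // placeholder for the elector monitor
    final Object serviceLock = new Object(); // placeholder for the service monitor
    final CountDownLatch bothHeld = new CountDownLatch(2);

    lockInOrder("sketch-stopper", serviceLock, electorLock, bothHeld).start();
    lockInOrder("sketch-event", electorLock, serviceLock, bothHeld).start();

    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    long deadline = System.currentTimeMillis() + waitMs;
    while (System.currentTimeMillis() < deadline) {
      long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock
      if (ids != null) {
        return ids.length;
      }
      try {
        Thread.sleep(20);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return 0;
      }
    }
    return 0;
  }

  // Builds a daemon thread that locks 'first', waits until both threads hold
  // their first monitor, and then tries to lock 'second'.
  static Thread lockInOrder(String name, Object first, Object second,
                            CountDownLatch bothHeld) {
    Thread t = new Thread(() -> {
      synchronized (first) {
        bothHeld.countDown();
        try {
          bothHeld.await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        synchronized (second) {
          // Never reached: the other thread holds 'second' and wants 'first'.
        }
      }
    }, name);
    t.setDaemon(true); // daemon, so the stuck threads do not keep the JVM alive
    return t;
  }

  public static void main(String[] args) {
    System.out.println("deadlocked threads: " + countDeadlockedThreads(3000));
  }
}
```

A check like this could complement a timeout-based test, since it
distinguishes a true deadlock from a shutdown that is merely slow.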

This message was sent by Atlassian JIRA
