hadoop-yarn-issues mailing list archives

From "Tsuyoshi OZAWA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1861) Both RM stuck in standby mode when automatic failover is enabled
Date Mon, 14 Apr 2014 16:20:17 GMT

    [ https://issues.apache.org/jira/browse/YARN-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968487#comment-13968487 ]

Tsuyoshi OZAWA commented on YARN-1861:
--------------------------------------

[~kasha], I think this problem looks very similar to YARN-1929 - deadlock after losing ZK
session.

(*ASE#processResult* -> *EES#becomeStandby* -> *AS#transitionToStandby* -> *RM#transitionToStandby*)
and (RM#serviceStop -> RM.super#serviceStop -> *RM.super#stop* -> AS#stop -> *AS#serviceStop*
-> *EES#serviceStop* -> *ASE#quitElection*)
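
The two chains above run in opposite directions: the first starts in the elector and ends in the RM, the second starts in the RM and ends in the elector. The following is only a minimal sketch of why that shape can deadlock, assuming each chain holds a lock at its starting component and needs the other component's lock at its end. The names `electorLock`, `rmLock`, `acquireInOrder`, and `opposingOrdersStall` are hypothetical stand-ins, not Hadoop code, and a timed `tryLock` is used so the sketch terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderSketch {
    // Hypothetical stand-ins for the monitors held by the elector and the RM.
    static final ReentrantLock electorLock = new ReentrantLock();
    static final ReentrantLock rmLock = new ReentrantLock();

    /**
     * Takes `first`, waits until the peer thread holds its own first lock,
     * then tries `second` with a timeout. Returns whether `second` was acquired.
     */
    static boolean acquireInOrder(ReentrantLock first, ReentrantLock second,
                                  CountDownLatch bothHoldFirst,
                                  CountDownLatch bothTried) throws InterruptedException {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();           // both threads now hold their first lock
            boolean got = second.tryLock(100, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await();               // keep `first` until the peer has also tried
            return got;
        } finally {
            first.unlock();
        }
    }

    /** Returns true when neither path could obtain its second lock. */
    static boolean opposingOrdersStall() throws InterruptedException {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] got = new boolean[2];

        // Path 1 (assumed): elector callback -> ... -> RM#transitionToStandby
        Thread callback = new Thread(() -> {
            try {
                got[0] = acquireInOrder(electorLock, rmLock, bothHoldFirst, bothTried);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Path 2 (assumed): RM#serviceStop -> ... -> elector quitElection
        Thread shutdown = new Thread(() -> {
            try {
                got[1] = acquireInOrder(rmLock, electorLock, bothHoldFirst, bothTried);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        callback.start();
        shutdown.start();
        callback.join();
        shutdown.join();
        return !got[0] && !got[1];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("both paths stalled: " + opposingOrdersStall());
    }
}
```

With untimed `lock()` calls in place of `tryLock`, both threads would block forever. Acquiring the two locks in one agreed order on both paths is the usual way to avoid this shape of deadlock.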

IIUC, Karthik's patch on YARN-1929 partially solves this problem, but not completely. Please
correct me if I'm wrong. Thanks.

> Both RM stuck in standby mode when automatic failover is enabled
> ----------------------------------------------------------------
>
>                 Key: YARN-1861
>                 URL: https://issues.apache.org/jira/browse/YARN-1861
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.4.0
>            Reporter: Arpit Gupta
>            Assignee: Vinod Kumar Vavilapalli
>            Priority: Critical
>
> In our HA tests we noticed that the tests got stuck because both RMs went into standby state and neither became active.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
