hadoop-yarn-issues mailing list archives

From "Rohith (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2946) DeadLocks in RMStateStore<->ZKRMStateStore
Date Tue, 16 Dec 2014 03:23:14 GMT

    [ https://issues.apache.org/jira/browse/YARN-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247672#comment-14247672 ]

Rohith commented on YARN-2946:
------------------------------

Thanks [~jianhe] and [~varun_saxena] for your suggestions.

[~jianhe], before I implement a state machine for DT key updates on the store, I would like
to understand: is there any specific reason why a state machine was not implemented for them
originally? Would a state machine for updating DT keys cause any potential issues?

> DeadLocks in RMStateStore<->ZKRMStateStore
> ------------------------------------------
>
>                 Key: YARN-2946
>                 URL: https://issues.apache.org/jira/browse/YARN-2946
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.7.0
>            Reporter: Rohith
>            Assignee: Rohith
>            Priority: Blocker
>         Attachments: 0001-YARN-2946.patch, 0002-YARN-2946.patch, RM_BeforeFix_Deadlock_cycle_1.png, RM_BeforeFix_Deadlock_cycle_2.png, TestYARN2946.java
>
>
> Found a deadlock in ZKRMStateStore.
> # Initially, zkClient is null because of a ZooKeeper Disconnected event.
> # While ZKRMStateStore#runWithCheck() calls wait(zkSessionTimeout) for zkClient to re-establish
> the ZooKeeper connection (via either a SyncConnected or an Expired event), it is highly possible
> that another thread obtains the lock on {{ZKRMStateStore.this}} from state machine transition
> events. This causes a deadlock in ZKRMStateStore.
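
The second step depends on a general Java rule: a thread that calls wait(timeout) inside a synchronized block releases that monitor while it waits, so another thread can enter a block synchronized on the same object. The sketch below only illustrates that rule. It is not the ZKRMStateStore code, and the class, method, and variable names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; it does not reproduce any Hadoop code.
class WaitReleasesMonitorSketch {

    // Returns true if a second thread entered synchronized(monitor)
    // while the calling thread was inside monitor.wait(...).
    static boolean otherThreadAcquiredDuringWait(long timeoutMs) throws InterruptedException {
        final Object monitor = new Object(); // stands in for the store's own lock
        final CountDownLatch waiterHoldsMonitor = new CountDownLatch(1);
        final AtomicBoolean acquired = new AtomicBoolean(false);

        // Stands in for a thread that needs the same lock (assumption for illustration).
        Thread other = new Thread(() -> {
            try {
                waiterHoldsMonitor.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (monitor) { // blocked until the waiter calls wait()
                acquired.set(true);
                monitor.notifyAll();
            }
        });
        other.start();

        boolean seenWhileWaiting;
        synchronized (monitor) {
            waiterHoldsMonitor.countDown();
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            while (!acquired.get()) {
                long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMs <= 0) {
                    break; // timed out; never call wait(0), which waits forever
                }
                monitor.wait(remainingMs); // releases the monitor while waiting
            }
            // Read the flag before leaving the block, so a later acquisition is not counted.
            seenWhileWaiting = acquired.get();
        }
        other.join();
        return seenWhileWaiting;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("other thread acquired monitor during wait: "
                + otherThreadAcquiredDuringWait(2000));
    }
}
```

If that second thread already holds another lock that the waiting thread needs once it wakes up, the two can block each other. That is one plausible reading of the cycle described above, not a confirmed trace of it.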



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
