hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-862) Potential NN deadlock in processDistributedUpgradeCommand
Date Tue, 11 Sep 2012 05:00:08 GMT

    [ https://issues.apache.org/jira/browse/HDFS-862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452715#comment-13452715 ]

Todd Lipcon commented on HDFS-862:
----------------------------------

Given that we removed the "distributed upgrade" code recently, maybe we should just backport
that patch to earlier branches to avoid this issue entirely? Thanks for digging into this,
Andrey!
                
> Potential NN deadlock in processDistributedUpgradeCommand
> ---------------------------------------------------------
>
>                 Key: HDFS-862
>                 URL: https://issues.apache.org/jira/browse/HDFS-862
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.22.0, 0.23.1
>            Reporter: Todd Lipcon
>         Attachments: cycle.png, org.apache.hadoop.hdfs.server.common.TestDistributedUpgrade-output.txt
>
>
> Haven't seen this in practice, but the lock order is inconsistent. processReport locks
> FSNamesystem, then calls UpgradeManager.startUpgrade, getUpgradeState, and getUpgradeStatus
> (each of which locks the UpgradeManager). FSNamesystem.processDistributedUpgradeCommand calls
> upgradeManager.processUpgradeCommand, which is synchronized on UpgradeManager and can call
> FSNamesystem.leaveSafeMode, which synchronizes on FSNamesystem.
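
The inconsistent lock order described above can be illustrated with a minimal Java sketch. This is not the Hadoop code: the class LockOrderSketch, the two lock objects, and the methods reportPath, upgradeCommandPath, and ordersConflict are hypothetical stand-ins that only mirror the acquisition order named in the report. The sketch runs both paths on a single thread, so it cannot deadlock; it just records the order in which each path takes the two monitors. If two threads ran these paths concurrently, each could hold one monitor while waiting for the other.

```java
import java.util.ArrayList;
import java.util.List;

public class LockOrderSketch {
    // Hypothetical stand-ins for the FSNamesystem and UpgradeManager monitors.
    static final Object namesystemLock = new Object();
    static final Object upgradeManagerLock = new Object();

    // Mirrors the processReport path: FSNamesystem first, then UpgradeManager.
    static List<String> reportPath() {
        List<String> order = new ArrayList<>();
        synchronized (namesystemLock) {
            order.add("FSNamesystem");
            synchronized (upgradeManagerLock) {
                order.add("UpgradeManager");
            }
        }
        return order;
    }

    // Mirrors the processDistributedUpgradeCommand path:
    // UpgradeManager first, then FSNamesystem (via a leaveSafeMode-like call).
    static List<String> upgradeCommandPath() {
        List<String> order = new ArrayList<>();
        synchronized (upgradeManagerLock) {
            order.add("UpgradeManager");
            synchronized (namesystemLock) {
                order.add("FSNamesystem");
            }
        }
        return order;
    }

    // Two two-lock paths conflict when one takes the same locks in reverse order.
    static boolean ordersConflict(List<String> a, List<String> b) {
        return a.size() == 2 && b.size() == 2
                && a.get(0).equals(b.get(1))
                && a.get(1).equals(b.get(0));
    }

    public static void main(String[] args) {
        List<String> report = reportPath();
        List<String> command = upgradeCommandPath();
        System.out.println("report path order:  " + report);
        System.out.println("command path order: " + command);
        System.out.println("conflicting order:  " + ordersConflict(report, command));
    }
}
```

One common remedy for this kind of cycle is to pick a single global order (for example, always FSNamesystem before UpgradeManager) and restructure the second path to follow it; whether that suits HDFS-862 is a judgment for the maintainers.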

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
