hadoop-hdfs-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5124) Namenode in secure cluster deadlocks
Date Thu, 22 Aug 2013 17:35:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747694#comment-13747694 ]

Chris Nauroth commented on HDFS-5124:
-------------------------------------

bq. Do we even need a read lock on the namespace at all? Token verification has nothing to
do with the namespace.

The only potential problem I see with removing the namesystem read lock is that failover acquires
the write lock in {{HAState#setStateInternal}}.  If this NN is in the middle of transitioning
from standby to active, then acquiring the read lock in {{DelegationTokenSecretManager#retrievePassword}}
blocks clients until the transition to active has completed.  Without the read lock, clients
will get an immediate {{StandbyException}} even though this NN is about to become active.
If the client retries rapidly enough, it could exhaust its max retries and get an
RPC error before the transition to active completes.
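
To make the difference concrete, here is a minimal, self-contained sketch of the two behaviors. It
is not the actual HDFS code: the class {{FailoverLockSketch}}, the nested {{StandbyException}}
stand-in, the 200 ms sleep, and the {{simulate}} helper are all assumptions made for illustration.
A plain {{ReentrantReadWriteLock}} plays the role of the namesystem lock, and
{{transitionToActive}} only imitates the fact that the transition runs under the write lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch only; every name here is hypothetical, not Hadoop code.
final class FailoverLockSketch {
    enum State { STANDBY, ACTIVE }

    // Local stand-in for the real StandbyException.
    static final class StandbyException extends Exception {
        StandbyException(String msg) { super(msg); }
    }

    // Plays the role of the namesystem lock.
    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock(true);
    private volatile State state = State.STANDBY;

    // Imitates a state transition that runs while holding the write lock.
    void transitionToActive(CountDownLatch started) throws InterruptedException {
        fsLock.writeLock().lock();
        try {
            started.countDown();   // signal that the write lock is now held
            Thread.sleep(200);     // assumed duration of the transition work
            state = State.ACTIVE;
        } finally {
            fsLock.writeLock().unlock();
        }
    }

    // Imitates the token lookup, with or without the read lock.
    String retrievePassword(boolean takeReadLock) throws StandbyException {
        if (takeReadLock) {
            fsLock.readLock().lock();   // blocks while the transition holds the write lock
        }
        try {
            if (state != State.ACTIVE) {
                throw new StandbyException("operation not supported in standby state");
            }
            return "password";
        } finally {
            if (takeReadLock) {
                fsLock.readLock().unlock();
            }
        }
    }

    // Starts a transition, then issues one client call while it is in flight.
    static String simulate(boolean takeReadLock) {
        FailoverLockSketch nn = new FailoverLockSketch();
        CountDownLatch started = new CountDownLatch(1);
        Thread failover = new Thread(() -> {
            try {
                nn.transitionToActive(started);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        failover.start();
        try {
            started.await();            // the write lock is held from here on
            String result;
            try {
                result = nn.retrievePassword(takeReadLock);
            } catch (StandbyException e) {
                result = "StandbyException";
            }
            failover.join();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("with read lock:    " + simulate(true));
        System.out.println("without read lock: " + simulate(false));
    }
}
```

With the read lock, the client call waits out the transition and is then served. Without it, the
same call fails right away while the NN is still standby, and the outcome then depends on how many
retries the client has left.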

I kicked off a Jenkins run for the new patch:

https://builds.apache.org/job/PreCommit-HDFS-Build/4870/

                
> Namenode in secure cluster deadlocks
> ------------------------------------
>
>                 Key: HDFS-5124
>                 URL: https://issues.apache.org/jira/browse/HDFS-5124
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.1.1-beta
>         Environment: Secure Hadoop 2 cluster
>            Reporter: Deepesh Khandelwal
>            Assignee: Jing Zhao
>            Priority: Blocker
>         Attachments: HADOOP-5124.patch, HDFS-5124.001.patch, HDFS-5124.002.patch, nn_jstack.out
>
>
> Namenode deadlocks after a while in use.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
