hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3374) hdfs' TestDelegationToken fails intermittently with a race condition
Date Tue, 08 Jan 2013 17:20:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547024#comment-13547024 ]

Todd Lipcon commented on HDFS-3374:
-----------------------------------

This is still only in branch-1 and not in trunk. Any plans to forward port?

Also, jcarder noticed that this added a lock order inversion:
- FSNamesystem.saveNamespace (holding the FSN lock) calls DTSM.saveSecretManagerState (which takes the DTSM lock)
- ExpiredTokenRemover.run (holding the DTSM lock) calls rollMasterKey, which calls updateCurrentKey, which calls logUpdateMasterKey, which takes the FSN lock

So if a saveNamespace runs at the same time as the expired token remover, the NN might deadlock.
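
To make the inversion concrete, here is a minimal standalone sketch. It is not Hadoop code: the class name, the two ReentrantLock fields standing in for the FSN and DTSM locks, and the helper methods are all hypothetical. The timed tryLock is used only so the demonstration terminates instead of hanging; with plain blocking acquisition the first scenario would deadlock. The second method shows one common remedy, a single acquisition order for both paths. Whether that is the right fix for the NN is not something this sketch establishes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; none of these names exist in Hadoop.
final class LockOrderSketch {
    // Stand-ins (assumptions) for the FSNamesystem and DTSM locks.
    private final ReentrantLock fsnLock = new ReentrantLock();
    private final ReentrantLock dtsmLock = new ReentrantLock();

    // One thread takes FSN then DTSM, the other DTSM then FSN.
    // Returns how many threads failed to get their second lock.
    public int countBlockedWithInvertedOrder() {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothAttempted = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();
        runBoth(
            () -> attempt(fsnLock, dtsmLock, bothHoldFirst, bothAttempted, blocked),
            () -> attempt(dtsmLock, fsnLock, bothHoldFirst, bothAttempted, blocked));
        return blocked.get();
    }

    // Both threads take FSN first, then DTSM, so neither can hold the
    // lock the other is waiting for. Returns how many threads finished.
    public int countCompletedWithConsistentOrder() {
        AtomicInteger completed = new AtomicInteger();
        Runnable task = () -> {
            fsnLock.lock();
            try {
                dtsmLock.lock();
                try {
                    completed.incrementAndGet();
                } finally {
                    dtsmLock.unlock();
                }
            } finally {
                fsnLock.unlock();
            }
        };
        runBoth(task, task);
        return completed.get();
    }

    private static void attempt(ReentrantLock first, ReentrantLock second,
                                CountDownLatch bothHoldFirst,
                                CountDownLatch bothAttempted,
                                AtomicInteger blocked) {
        first.lock();
        try {
            // Wait until each thread holds its first lock.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // Timed attempt instead of lock() so the demo cannot hang.
            if (second.tryLock(100, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                blocked.incrementAndGet();
            }
            // Keep holding the first lock until both attempts are over.
            bothAttempted.countDown();
            bothAttempted.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    private static void runBoth(Runnable a, Runnable b) {
        Thread ta = new Thread(a);
        Thread tb = new Thread(b);
        ta.start();
        tb.start();
        try {
            ta.join();
            tb.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        LockOrderSketch s = new LockOrderSketch();
        System.out.println("blocked with inverted order: "
            + s.countBlockedWithInvertedOrder());
        System.out.println("completed with consistent order: "
            + s.countCompletedWithConsistentOrder());
    }
}
```

In the inverted case each thread holds the lock the other one needs, so both timed attempts fail. With a consistent order both threads run to completion.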

                
> hdfs' TestDelegationToken fails intermittently with a race condition
> --------------------------------------------------------------------
>
>                 Key: HDFS-3374
>                 URL: https://issues.apache.org/jira/browse/HDFS-3374
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 1.0.3
>
>         Attachments: HDFS-3374-branch-1.0.patch, hdfs-3374.patch, HDFS-3374.patch
>
>
> The testcase is failing because the MiniDFSCluster is shut down before the secret manager can change the key, which calls system.exit with no edit streams available.
> {code}
>     [junit] 2012-05-04 15:03:51,521 WARN  common.Storage (FSImage.java:updateRemovedDirs(224)) - Removing storage dir /home/horton/src/hadoop/build/test/data/dfs/name1
>     [junit] 2012-05-04 15:03:51,522 FATAL namenode.FSNamesystem (FSEditLog.java:fatalExit(388)) - No edit streams are accessible
>     [junit] java.lang.Exception: No edit streams are accessible
>     [junit]     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.fatalExit(FSEditLog.java:388)
>     [junit]     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.exitIfNoStreams(FSEditLog.java:407)
>     [junit]     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.removeEditsAndStorageDir(FSEditLog.java:432)
>     [junit]     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.removeEditsStreamsAndStorageDirs(FSEditLog.java:468)
>     [junit]     at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:1028)
>     [junit]     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.logUpdateMasterKey(FSNamesystem.java:5641)
>     [junit]     at org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenSecretManager.logUpdateMasterKey(DelegationTokenSecretManager.java:286)
>     [junit]     at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.updateCurrentKey(AbstractDelegationTokenSecretManager.java:150)
>     [junit]     at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.rollMasterKey(AbstractDelegationTokenSecretManager.java:174)
>     [junit]     at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager$ExpiredTokenRemover.run(AbstractDelegationTokenSecretManager.java:385)
>     [junit]     at java.lang.Thread.run(Thread.java:662)
>     [junit] Running org.apache.hadoop.hdfs.security.TestDelegationToken
>     [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
>     [junit] Test org.apache.hadoop.hdfs.security.TestDelegationToken FAILED (crashed)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
