hadoop-hdfs-issues mailing list archives

From "Erik Krogen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-11208) Deadlock in WebHDFS on shutdown
Date Mon, 05 Dec 2016 23:11:58 GMT

     [ https://issues.apache.org/jira/browse/HDFS-11208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen updated HDFS-11208:
-------------------------------
    Attachment: HDFS-11208-test-deadlock.patch

I am attaching a patch containing a unit test that demonstrates this issue. When the patch
is applied, the test currently times out because of the deadlock.

I am open to ideas on how best to solve this deadlock issue. 

> Deadlock in WebHDFS on shutdown
> -------------------------------
>
>                 Key: HDFS-11208
>                 URL: https://issues.apache.org/jira/browse/HDFS-11208
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: webhdfs
>    Affects Versions: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1
>            Reporter: Erik Krogen
>            Assignee: Erik Krogen
>         Attachments: HDFS-11208-test-deadlock.patch
>
>
> Currently, on the client side, if the {{DelegationTokenRenewer}} attempts to renew a WebHdfs
> delegation token while the client system is shutting down (i.e. while
> {{FileSystem.Cache.ClientFinalizer}} is running), a deadlock may occur. This happens because
> {{ClientFinalizer}} calls {{FileSystem.Cache.closeAll()}}, which first takes a lock on the
> {{FileSystem.Cache}} object and then locks each file system in the cache as it iterates over
> them. {{DelegationTokenRenewer}} takes a lock on a file system object while it is renewing
> that file system's token, but {{TokenAspect.TokenManager.renew()}} (used for renewal of WebHdfs
> tokens) calls {{FileSystem.get}}, which in turn takes a lock on the {{FileSystem.Cache}} object.
> The two threads therefore acquire the same two locks in opposite order, which can cause a
> deadlock if {{ClientFinalizer}} is running at the same time.
> See below for example deadlock output:
> {code}
> Found one Java-level deadlock:
> =============================
> "Thread-8572":
>    waiting to lock monitor 0x00007eff401f9878 (object 0x000000051ec3f930, a dali.hdfs.web.WebHdfsFileSystem),
>    which is held by "FileSystem-DelegationTokenRenewer"
> "FileSystem-DelegationTokenRenewer":
>    waiting to lock monitor 0x00007f005c08f5c8 (object 0x000000050389c8b8, a dali.fs.FileSystem$Cache),
>    which is held by "Thread-8572"
>
> Java stack information for the threads listed above:
> ===================================================
> "Thread-8572":
>    at dali.hdfs.web.WebHdfsFileSystem.close(WebHdfsFileSystem.java:864)
>    - waiting to lock <0x000000051ec3f930> (a dali.hdfs.web.WebHdfsFileSystem)
>    at dali.fs.FilterFileSystem.close(FilterFileSystem.java:449)
>    at dali.fs.FileSystem$Cache.closeAll(FileSystem.java:2407)
>    - locked <0x000000050389c8b8> (a dali.fs.FileSystem$Cache)
>    at dali.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2424)
>    - locked <0x000000050389c8d0> (a dali.fs.FileSystem$Cache$ClientFinalizer)
>    at dali.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
> "FileSystem-DelegationTokenRenewer":
>    at dali.fs.FileSystem$Cache.getInternal(FileSystem.java:2343)
>    - waiting to lock <0x000000050389c8b8> (a dali.fs.FileSystem$Cache)
>    at dali.fs.FileSystem$Cache.get(FileSystem.java:2332)
>    at dali.fs.FileSystem.get(FileSystem.java:369)
>    at dali.hdfs.web.TokenAspect$TokenManager.getInstance(TokenAspect.java:92)
>    at dali.hdfs.web.TokenAspect$TokenManager.renew(TokenAspect.java:72)
>    at dali.security.token.Token.renew(Token.java:373)
>    at dali.fs.DelegationTokenRenewer$RenewAction.renew(DelegationTokenRenewer.java:127)
>    - locked <0x000000051ec3f930> (a dali.hdfs.web.WebHdfsFileSystem)
>    at dali.fs.DelegationTokenRenewer$RenewAction.access$300(DelegationTokenRenewer.java:57)
>    at dali.fs.DelegationTokenRenewer.run(DelegationTokenRenewer.java:258)
>
> Found 1 deadlock.
> {code}
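
The lock-order inversion described above can be reproduced in isolation with a minimal
sketch. This is not the attached test and uses no Hadoop code: the class name, thread
names, and the two plain lock objects are hypothetical stand-ins for {{FileSystem.Cache}}
and a cached {{WebHdfsFileSystem}}. A latch forces the unlucky interleaving, and the
standard {{ThreadMXBean}} is polled to confirm that the JVM sees a monitor deadlock.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

/** Hypothetical stand-alone demo; none of these names come from Hadoop. */
public class LockOrderInversionDemo {

    /** Returns true if the JVM reports a monitor deadlock within about 5 seconds. */
    static boolean reproduceDeadlock() throws InterruptedException {
        final Object cacheLock = new Object();      // stand-in for the FileSystem.Cache monitor
        final Object fileSystemLock = new Object(); // stand-in for one cached file system's monitor
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        // Mimics the shutdown path: cache lock first, then the file system lock.
        Thread finalizer = new Thread(() -> {
            synchronized (cacheLock) {
                awaitOther(bothHoldFirstLock);
                synchronized (fileSystemLock) {
                    // closing the file system would happen here
                }
            }
        }, "demo-finalizer");

        // Mimics the renewal path: file system lock first, then the cache lock.
        Thread renewer = new Thread(() -> {
            synchronized (fileSystemLock) {
                awaitOther(bothHoldFirstLock);
                synchronized (cacheLock) {
                    // the cache lookup would happen here
                }
            }
        }, "demo-renewer");

        // Daemon threads so the JVM can still exit although they never finish.
        finalizer.setDaemon(true);
        renewer.setDaemon(true);
        finalizer.start();
        renewer.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] deadlocked = threads.findMonitorDeadlockedThreads();
            if (deadlocked != null) {
                return true;
            }
            Thread.sleep(50);
        }
        return false;
    }

    /** Signals that this thread holds its first lock, then waits for the other thread. */
    private static void awaitOther(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlock detected: " + reproduceDeadlock());
    }
}
```

In the sketch, making both threads take the two locks in the same order removes the
cycle. Whether that, or avoiding the nested lock altogether, is the right approach for
WebHDFS is still an open question.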



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

