hadoop-hdfs-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5755) Filesystem close blocks SystemExit if a DFS client is trying to talk to a nonexistent NN
Date Fri, 10 Jan 2014 11:08:53 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867717#comment-13867717 ]

Steve Loughran commented on HDFS-5755:
--------------------------------------

Fixing the formatting of the last sequence:

{code}
"Thread-0" prio=5 tid=0x00007feed32ed800 nid=0x5707 waiting on condition [0x000000011a45d000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.ipc.Client.stop(Client.java:1173)
    at org.apache.hadoop.ipc.ClientCache.stopClient(ClientCache.java:100)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.close(ProtobufRpcEngine.java:251)
    at org.apache.hadoop.ipc.RPC.stopProxy(RPC.java:626)
    at org.apache.hadoop.io.retry.DefaultFailoverProxyProvider.close(DefaultFailoverProxyProvider.java:57)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.close(RetryInvocationHandler.java:206)
    at org.apache.hadoop.ipc.RPC.stopProxy(RPC.java:626)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.close(ClientNamenodeProtocolTranslatorPB.java:174)
    at org.apache.hadoop.ipc.RPC.stopProxy(RPC.java:621)
    at org.apache.hadoop.hdfs.DFSClient.closeConnectionToNamenode(DFSClient.java:738)
    at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:794)
    locked <0x00000007fec26860> (a org.apache.hadoop.hdfs.DFSClient)
    at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:847)
    at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2524)
    locked <0x00000007fec254e0> (a org.apache.hadoop.fs.FileSystem$Cache)
    at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2541)
    locked <0x00000007fec254f8> (a org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
{code}
h3. shutdown hook sequence:
# Hadoop shutdown hook triggers {{FileSystem.Cache.closeAll()}}
# which closes the {{DistributedFileSystem}}, invoking {{DFSClient.closeConnectionToNamenode()}}
# which tries to stop the RPC proxy
# which trickles all the way through to the IPC {{ClientCache.stopClient()}}
# which calls {{Client.stop()}}, which waits until all the client connections are closed

{code}
// wait until all connections are closed
while (!connections.isEmpty()) {
  try {
    Thread.sleep(100);
  } catch (InterruptedException e) {
  }
}
{code}
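
For comparison, here is a minimal sketch of how a wait like the one above could be bounded. It is not the fix adopted in Hadoop, only an illustration under stated assumptions: the class {{BoundedConnectionWait}}, the method {{awaitEmpty()}} and its timeout and poll parameters are all hypothetical, and a plain {{Collection}} stands in for the IPC client's connection table. The two differences from the loop above are a deadline and an interrupt that ends the wait instead of being swallowed.

```java
import java.util.Collection;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Hadoop.
public class BoundedConnectionWait {

  /**
   * Polls until the collection is empty, the timeout expires,
   * or the calling thread is interrupted.
   *
   * @return true if the collection drained, false on timeout or interrupt
   */
  public static boolean awaitEmpty(Collection<?> connections,
                                   long timeoutMillis,
                                   long pollMillis) {
    final long deadline =
        System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (!connections.isEmpty()) {
      long remainingNanos = deadline - System.nanoTime();
      if (remainingNanos <= 0) {
        return false; // gave up: connections never drained
      }
      try {
        // Sleep for the poll interval, or for whatever time is left if shorter.
        long remainingMillis =
            Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
        Thread.sleep(Math.min(pollMillis, remainingMillis));
      } catch (InterruptedException e) {
        // Restore the interrupt flag and stop waiting instead of looping on.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Stand-in for a connection to a NN that never answers (assumption).
    Collection<String> connections = new ConcurrentLinkedQueue<>();
    connections.add("unreachable-namenode");

    boolean drained = awaitEmpty(connections, 500, 100);
    System.out.println("drained=" + drained);
  }
}
```

With a connection that never closes, the call returns {{false}} once the timeout has passed, so a shutdown hook built on it would delay {{System.exit()}} by at most the timeout rather than for as long as the client keeps retrying.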

> Filesystem close blocks SystemExit if a DFS client is trying to talk to a nonexistent NN
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-5755
>                 URL: https://issues.apache.org/jira/browse/HDFS-5755
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Steve Loughran
>            Priority: Minor
>
> If a DFS client instance is spinning in a connection refused cycle, and you try to stop the process via System.exit(), the exit call does not complete until the client has actually given up.
> That is: you can't exit a process with a standard kill while a DFS client operation is blocking.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
