hadoop-hdfs-issues mailing list archives

From "Abhiraj Butala (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-6378) NFS: when portmap/rpcbind is not available, NFS registration should timeout instead of hanging
Date Sun, 29 Jun 2014 10:23:25 GMT

     [ https://issues.apache.org/jira/browse/HDFS-6378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhiraj Butala updated HDFS-6378:
---------------------------------

    Attachment: HDFS-6378.patch

Attaching a simple patch that sets a receive timeout on the DatagramSocket, which otherwise
blocks indefinitely on receive(). I have set the timeout to 500 ms; let me know if a different
value would be more appropriate.
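
To illustrate the idea, here is a minimal sketch of a UDP request that gives up waiting for a reply after a timeout. It is not the attached patch: the class name UdpReceiveTimeoutSketch and the method sendAndAwaitReply are hypothetical, and the only part taken from the description above is the call to DatagramSocket.setSoTimeout(), which makes receive() throw SocketTimeoutException instead of blocking forever.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;

// Hypothetical illustration; not code from HDFS-6378.patch.
public class UdpReceiveTimeoutSketch {

    /**
     * Sends one datagram and waits for a reply for at most timeoutMillis.
     * Returns true if a reply arrived, false if the wait timed out.
     */
    static boolean sendAndAwaitReply(InetAddress host, int port, byte[] request,
                                     int timeoutMillis) throws IOException {
        try (DatagramSocket socket = new DatagramSocket()) {
            // Without this call, receive() below would block indefinitely
            // when nothing answers on the other side.
            socket.setSoTimeout(timeoutMillis);
            socket.send(new DatagramPacket(request, request.length, host, port));

            byte[] buffer = new byte[65535];
            DatagramPacket reply = new DatagramPacket(buffer, buffer.length);
            try {
                socket.receive(reply);
                return true;
            } catch (SocketTimeoutException e) {
                return false;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // A bound socket that never replies stands in for an unresponsive portmap.
        try (DatagramSocket silent = new DatagramSocket(0, loopback)) {
            boolean replied = sendAndAwaitReply(
                    loopback, silent.getLocalPort(), new byte[] {1}, 500);
            System.out.println("reply received: " + replied);
        }
    }
}
```

The sketch returns a boolean so the caller can decide what to do on timeout; the patched code, judging by the stack trace below, lets the SocketTimeoutException propagate instead.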

With the patch, Ctrl-C can now kill the NFS gateway when portmap is not running or has exited.
Note that an exception is logged when portmap is not running, but the NFS gateway does not exit
until Ctrl-C is pressed.
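
For context on why the process can linger: a JVM keeps running after the main thread dies with an uncaught exception as long as any non-daemon thread is still alive. The sketch below demonstrates that general JVM behavior and one possible way to exit on a fatal startup error. It is an assumption that this is what keeps the gateway alive here; the names LingeringThreadSketch, startWorker and register are hypothetical stand-ins, not Hadoop code.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration of JVM thread semantics; not Hadoop code.
public class LingeringThreadSketch {

    /** Starts a non-daemon worker that blocks until the latch is released. */
    static Thread startWorker(CountDownLatch stop) {
        Thread worker = new Thread(() -> {
            try {
                stop.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "listener-stand-in");
        worker.setDaemon(false); // non-daemon threads keep the JVM alive
        worker.start();
        return worker;
    }

    /** Stand-in for a registration step that fails after its timeout. */
    static void register() {
        throw new RuntimeException("Registration failure");
    }

    public static void main(String[] args) {
        CountDownLatch stop = new CountDownLatch(1);
        Thread worker = startWorker(stop);
        try {
            register();
        } catch (RuntimeException e) {
            // Without this handler the main thread would die, but the JVM
            // would keep running because the worker is still alive.
            System.err.println("Fatal: " + e.getMessage()
                    + "; worker alive: " + worker.isAlive());
            stop.countDown();
            System.exit(1); // runs shutdown hooks, then terminates the JVM
        }
    }
}
```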

Output logs:
{code}
14/06/29 03:11:46 INFO oncrpc.SimpleUdpServer: Started listening to UDP requests at port 4242 for Rpc program: mountd at localhost:4242 with workerCount 1
14/06/29 03:11:46 INFO oncrpc.SimpleTcpServer: Started listening to TCP requests at port 4242 for Rpc program: mountd at localhost:4242 with workerCount 1
14/06/29 03:11:46 ERROR oncrpc.RpcProgram: Registration failure with localhost:4242, portmap entry: (PortmapMapping-100005:1:17:4242)
java.net.SocketTimeoutException: Receive timed out
	at java.net.PlainDatagramSocketImpl.receive0(Native Method)
	at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:145)
	at java.net.DatagramSocket.receive(DatagramSocket.java:786)
	at org.apache.hadoop.oncrpc.SimpleUdpClient.run(SimpleUdpClient.java:66)
	at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:130)
	at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:101)
	at org.apache.hadoop.mount.MountdBase.start(MountdBase.java:77)
	at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startServiceInternal(Nfs3.java:55)
	at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startService(Nfs3.java:68)
	at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:72)
Exception in thread "main" java.lang.RuntimeException: Registration failure
	at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:135)
	at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:101)
	at org.apache.hadoop.mount.MountdBase.start(MountdBase.java:77)
	at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startServiceInternal(Nfs3.java:55)
	at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.startService(Nfs3.java:68)
	at org.apache.hadoop.hdfs.nfs.nfs3.Nfs3.main(Nfs3.java:72)
Caused by: java.net.SocketTimeoutException: Receive timed out
	at java.net.PlainDatagramSocketImpl.receive0(Native Method)
	at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:145)
	at java.net.DatagramSocket.receive(DatagramSocket.java:786)
	at org.apache.hadoop.oncrpc.SimpleUdpClient.run(SimpleUdpClient.java:66)
	at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:130)
	... 5 more
^C14/06/29 03:18:51 ERROR nfs3.Nfs3Base: RECEIVED SIGNAL 2: SIGINT
14/06/29 03:18:52 ERROR oncrpc.RpcProgram: Unregistration failure with localhost:4242, portmap entry: (PortmapMapping-100005:1:17:4242)
java.net.SocketTimeoutException: Receive timed out
	at java.net.PlainDatagramSocketImpl.receive0(Native Method)
	at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:145)
	at java.net.DatagramSocket.receive(DatagramSocket.java:786)
	at org.apache.hadoop.oncrpc.SimpleUdpClient.run(SimpleUdpClient.java:66)
	at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:130)
	at org.apache.hadoop.oncrpc.RpcProgram.unregister(RpcProgram.java:118)
	at org.apache.hadoop.mount.MountdBase$Unregister.run(MountdBase.java:90)
	at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
14/06/29 03:18:52 WARN util.ShutdownHookManager: ShutdownHook 'Unregister' failed, java.lang.RuntimeException: Unregistration failure
java.lang.RuntimeException: Unregistration failure
	at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:135)
	at org.apache.hadoop.oncrpc.RpcProgram.unregister(RpcProgram.java:118)
	at org.apache.hadoop.mount.MountdBase$Unregister.run(MountdBase.java:90)
	at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
Caused by: java.net.SocketTimeoutException: Receive timed out
	at java.net.PlainDatagramSocketImpl.receive0(Native Method)
	at java.net.AbstractPlainDatagramSocketImpl.receive(AbstractPlainDatagramSocketImpl.java:145)
	at java.net.DatagramSocket.receive(DatagramSocket.java:786)
	at org.apache.hadoop.oncrpc.SimpleUdpClient.run(SimpleUdpClient.java:66)
	at org.apache.hadoop.oncrpc.RpcProgram.register(RpcProgram.java:130)
	... 3 more
14/06/29 03:18:52 INFO nfs3.Nfs3Base: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down Nfs3 at abutala-vBox/127.0.1.1
************************************************************/
{code}

> NFS: when portmap/rpcbind is not available, NFS registration should timeout instead of hanging
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6378
>                 URL: https://issues.apache.org/jira/browse/HDFS-6378
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: nfs
>            Reporter: Brandon Li
>         Attachments: HDFS-6378.patch
>
>
> When portmap/rpcbind is not available, NFS could be stuck at registration. Instead, the NFS
> gateway should shut down automatically with a proper error message.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
