hama-dev mailing list archives

From "lujing.zui (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HAMA-890) PipesApplication connect to ZooKeeperSyncClientImpl always timeout
Date Sat, 15 Mar 2014 08:37:42 GMT
lujing.zui created HAMA-890:
-------------------------------

             Summary: PipesApplication connect to ZooKeeperSyncClientImpl always timeout
                 Key: HAMA-890
                 URL: https://issues.apache.org/jira/browse/HAMA-890
             Project: Hama
          Issue Type: Bug
    Affects Versions: 0.7.0
         Environment: Hadoop 2.2.0, distributed mode
            Reporter: lujing.zui


I built a cluster that contains 4 GroomServers.
I ran a Pipes application (matrixmultiplication). On one GroomServer the task fails with what
looks like a problem connecting to the ZooKeeperSyncClient, so the entire job fails. On the
other GroomServers everything is fine.
Rebooting the problematic node did not solve the problem.

As I understand it, both sides of this connection are on the same node, so an accept timeout
seems impossible.
iptables is off and the network is normal; every node responds to ping.
I am confused. Can anyone help me or give me a hint or suggestion?
Thanks so much!

The log from the failing node is listed below:
14/03/15 16:21:05 INFO ipc.Server: Starting Socket Reader #1 for port 61002
14/03/15 16:21:05 INFO ipc.Server: IPC Server Responder: starting
14/03/15 16:21:05 INFO ipc.Server: IPC Server listener on 61002: starting
14/03/15 16:21:05 INFO ipc.Server: IPC Server handler 0 on 61002: starting
14/03/15 16:21:05 INFO ipc.Server: IPC Server handler 2 on 61002: starting
14/03/15 16:21:05 INFO ipc.Server: IPC Server handler 1 on 61002: starting
14/03/15 16:21:05 INFO ipc.Server: IPC Server handler 3 on 61002: starting
14/03/15 16:21:05 INFO message.HamaMessageManagerImpl: BSPPeer address:hd1.hadoop.lab port:61002
14/03/15 16:21:05 INFO ipc.Server: IPC Server handler 4 on 61002: starting
14/03/15 16:21:05 INFO Configuration.deprecation: mapred.cache.localFiles is deprecated. Instead, use mapreduce.job.cache.local.files
14/03/15 16:21:05 INFO sync.ZKSyncClient: Initializing ZK Sync Client
14/03/15 16:21:05 INFO sync.ZooKeeperSyncClientImpl: Start connecting to Zookeeper! At hd1.hadoop.lab/222.195.92.69:61002
14/03/15 16:21:08 ERROR bsp.BSPTask: Error running bsp setup and bsp function.
java.net.SocketTimeoutException: Accept timed out
	at java.net.PlainSocketImpl.socketAccept(Native Method)
	at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:375)
	at java.net.ServerSocket.implAccept(ServerSocket.java:478)
	at java.net.ServerSocket.accept(ServerSocket.java:446)
	at org.apache.hama.pipes.PipesApplication.start(PipesApplication.java:286)
	at org.apache.hama.pipes.PipesBSP.setup(PipesBSP.java:43)
	at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:170)
	at org.apache.hama.bsp.BSPTask.run(BSPTask.java:144)
	at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1243)
14/03/15 16:21:08 ERROR bsp.BSPTask: Error cleaning up after bsp executed.
java.lang.NullPointerException
	at org.apache.hama.pipes.PipesBSP.cleanup(PipesBSP.java:95)
	at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:177)
	at org.apache.hama.bsp.BSPTask.run(BSPTask.java:144)
	at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1243)
14/03/15 16:21:08 INFO ipc.Server: Stopping server on 61002
14/03/15 16:21:08 INFO ipc.Server: IPC Server handler 0 on 61002: exiting
14/03/15 16:21:08 INFO ipc.Server: IPC Server handler 2 on 61002: exiting
14/03/15 16:21:08 INFO ipc.Server: Stopping IPC Server listener on 61002
14/03/15 16:21:08 INFO ipc.Server: IPC Server handler 3 on 61002: exiting
14/03/15 16:21:08 INFO ipc.Server: IPC Server handler 4 on 61002: exiting
14/03/15 16:21:08 INFO ipc.Server: IPC Server handler 1 on 61002: exiting
14/03/15 16:21:08 INFO ipc.Server: Stopping IPC Server Responder
14/03/15 16:21:08 ERROR bsp.BSPTask: Shutting down ping service.
14/03/15 16:21:08 FATAL bsp.GroomServer: Error running child
java.net.SocketTimeoutException: Accept timed out
	at java.net.PlainSocketImpl.socketAccept(Native Method)
	at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:375)
	at java.net.ServerSocket.implAccept(ServerSocket.java:478)
	at java.net.ServerSocket.accept(ServerSocket.java:446)
	at org.apache.hama.pipes.PipesApplication.start(PipesApplication.java:286)
	at org.apache.hama.pipes.PipesBSP.setup(PipesBSP.java:43)
	at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:170)
	at org.apache.hama.bsp.BSPTask.run(BSPTask.java:144)
	at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1243)
java.net.SocketTimeoutException: Accept timed out
	at java.net.PlainSocketImpl.socketAccept(Native Method)
	at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:375)
	at java.net.ServerSocket.implAccept(ServerSocket.java:478)
	at java.net.ServerSocket.accept(ServerSocket.java:446)
	at org.apache.hama.pipes.PipesApplication.start(PipesApplication.java:286)
	at org.apache.hama.pipes.PipesBSP.setup(PipesBSP.java:43)
	at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:170)
	at org.apache.hama.bsp.BSPTask.run(BSPTask.java:144)
	at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1243)
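
For reference, the trace above shows the exception being thrown inside java.net.ServerSocket.accept, called from PipesApplication.start, rather than from the ZooKeeper client classes. The following is a minimal sketch, not Hama code, of how an accept timeout can arise on a single host: a listening socket with a read timeout set simply gets no incoming connection before the timeout expires. The class name AcceptTimeoutDemo, the method acceptTimesOut, and the timeout values are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; it does not reproduce Hama's PipesApplication.
public class AcceptTimeoutDemo {

    // Returns true if accept() timed out because no client connected in time.
    static boolean acceptTimesOut(int timeoutMillis) throws IOException {
        // Port 0 lets the OS pick a free port; listen on the loopback address only.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            // With a timeout set, accept() blocks for at most timeoutMillis.
            server.setSoTimeout(timeoutMillis);
            try (Socket client = server.accept()) {
                // A peer connected before the timeout expired.
                return false;
            } catch (SocketTimeoutException e) {
                // Nobody connected in time, even though everything is local.
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("accept timed out: " + acceptTimesOut(500));
    }
}
```

If this reading of the trace is right, the thing to check on the failing node is whether the process expected to connect back to the task actually starts and reaches the listening port, rather than ZooKeeper connectivity. That interpretation is an assumption drawn from the stack trace alone.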



--
This message was sent by Atlassian JIRA
(v6.2#6252)
