hama-dev mailing list archives

From "lujing.zui (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-890) PipesApplication connect to ZooKeeperSyncClinetImpl always timeout
Date Sat, 15 Mar 2014 09:14:42 GMT

    [ https://issues.apache.org/jira/browse/HAMA-890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936093#comment-13936093 ]

lujing.zui commented on HAMA-890:
---------------------------------

I found the reason.
This node got its name from DNS and its hostname was updated.
So ZooKeeper was running under the new hostname, but the conf file still used the old hostname.
ZooKeeper checked this, found the mismatch, and exited immediately.
That is why the connection always failed.

Sorry for bothering everyone.
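
For anyone who hits the same symptom, the mismatch described above can be checked before starting the daemons. The following is only a minimal sketch, not part of Hama or ZooKeeper: the class name HostnameCheck, its methods, and the way the configured name is passed in (first command-line argument, defaulting to the hd1.hadoop.lab name seen in the log) are all assumptions for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

// Hypothetical helper (not part of Hama or ZooKeeper): compares the hostname
// written in a configuration file with the name the node currently reports.
final class HostnameCheck {

    // Normalize a host name: trim, lower-case, drop a trailing dot.
    static String normalize(String host) {
        String h = host.trim().toLowerCase(Locale.ROOT);
        return h.endsWith(".") ? h.substring(0, h.length() - 1) : h;
    }

    // True when both names refer to the same host name after normalization.
    static boolean sameHost(String configured, String actual) {
        return normalize(configured).equals(normalize(actual));
    }

    public static void main(String[] args) throws UnknownHostException {
        // Assumption: the configured name is given as the first argument.
        String configured = args.length > 0 ? args[0] : "hd1.hadoop.lab";
        String actual = InetAddress.getLocalHost().getCanonicalHostName();
        if (sameHost(configured, actual)) {
            System.out.println("OK: hostname matches configuration");
        } else {
            System.out.println("MISMATCH: configured=" + configured
                    + " actual=" + actual);
        }
    }
}
```

Running it on each node with the hostname from that node's conf file as the argument would flag a node whose name changed after the configuration was written.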

> PipesApplication connect to ZooKeeperSyncClinetImpl always timeout
> ------------------------------------------------------------------
>
>                 Key: HAMA-890
>                 URL: https://issues.apache.org/jira/browse/HAMA-890
>             Project: Hama
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Hadoop 2.2.0 distribute mode
>            Reporter: lujing.zui
>
> I built a cluster containing 4 groomservers.
> I ran a Pipes application, matrixmultiplication, and on one groomserver it failed to connect to ZooKeeperSyncClient, so the entire job failed. On the other groomservers everything was fine.
> I rebooted the problematic node, but that did not solve the problem.
> As I understand it, both ends of this connection are on the same node, so an accept timeout seems impossible. iptables is off and the network is normal; every node responds to ping.
> I am very confused. Can anyone help me or give me a hint or suggestion?
> Thanks so much!
> The log is listed below:
> 14/03/15 16:21:05 INFO ipc.Server: Starting Socket Reader #1 for port 61002
> 14/03/15 16:21:05 INFO ipc.Server: IPC Server Responder: starting
> 14/03/15 16:21:05 INFO ipc.Server: IPC Server listener on 61002: starting
> 14/03/15 16:21:05 INFO ipc.Server: IPC Server handler 0 on 61002: starting
> 14/03/15 16:21:05 INFO ipc.Server: IPC Server handler 2 on 61002: starting
> 14/03/15 16:21:05 INFO ipc.Server: IPC Server handler 1 on 61002: starting
> 14/03/15 16:21:05 INFO ipc.Server: IPC Server handler 3 on 61002: starting
> 14/03/15 16:21:05 INFO message.HamaMessageManagerImpl: BSPPeer address:hd1.hadoop.lab port:61002
> 14/03/15 16:21:05 INFO ipc.Server: IPC Server handler 4 on 61002: starting
> 14/03/15 16:21:05 INFO Configuration.deprecation: mapred.cache.localFiles is deprecated. Instead, use mapreduce.job.cache.local.files
> 14/03/15 16:21:05 INFO sync.ZKSyncClient: Initializing ZK Sync Client
> 14/03/15 16:21:05 INFO sync.ZooKeeperSyncClientImpl: Start connecting to Zookeeper! At hd1.hadoop.lab/222.195.92.69:61002
> 14/03/15 16:21:08 ERROR bsp.BSPTask: Error running bsp setup and bsp function.
> java.net.SocketTimeoutException: Accept timed out
> 	at java.net.PlainSocketImpl.socketAccept(Native Method)
> 	at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:375)
> 	at java.net.ServerSocket.implAccept(ServerSocket.java:478)
> 	at java.net.ServerSocket.accept(ServerSocket.java:446)
> 	at org.apache.hama.pipes.PipesApplication.start(PipesApplication.java:286)
> 	at org.apache.hama.pipes.PipesBSP.setup(PipesBSP.java:43)
> 	at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:170)
> 	at org.apache.hama.bsp.BSPTask.run(BSPTask.java:144)
> 	at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1243)
> 14/03/15 16:21:08 ERROR bsp.BSPTask: Error cleaning up after bsp executed.
> java.lang.NullPointerException
> 	at org.apache.hama.pipes.PipesBSP.cleanup(PipesBSP.java:95)
> 	at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:177)
> 	at org.apache.hama.bsp.BSPTask.run(BSPTask.java:144)
> 	at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1243)
> 14/03/15 16:21:08 INFO ipc.Server: Stopping server on 61002
> 14/03/15 16:21:08 INFO ipc.Server: IPC Server handler 0 on 61002: exiting
> 14/03/15 16:21:08 INFO ipc.Server: IPC Server handler 2 on 61002: exiting
> 14/03/15 16:21:08 INFO ipc.Server: Stopping IPC Server listener on 61002
> 14/03/15 16:21:08 INFO ipc.Server: IPC Server handler 3 on 61002: exiting
> 14/03/15 16:21:08 INFO ipc.Server: IPC Server handler 4 on 61002: exiting
> 14/03/15 16:21:08 INFO ipc.Server: IPC Server handler 1 on 61002: exiting
> 14/03/15 16:21:08 INFO ipc.Server: Stopping IPC Server Responder
> 14/03/15 16:21:08 ERROR bsp.BSPTask: Shutting down ping service.
> 14/03/15 16:21:08 FATAL bsp.GroomServer: Error running child
> java.net.SocketTimeoutException: Accept timed out
> 	at java.net.PlainSocketImpl.socketAccept(Native Method)
> 	at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:375)
> 	at java.net.ServerSocket.implAccept(ServerSocket.java:478)
> 	at java.net.ServerSocket.accept(ServerSocket.java:446)
> 	at org.apache.hama.pipes.PipesApplication.start(PipesApplication.java:286)
> 	at org.apache.hama.pipes.PipesBSP.setup(PipesBSP.java:43)
> 	at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:170)
> 	at org.apache.hama.bsp.BSPTask.run(BSPTask.java:144)
> 	at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1243)
> java.net.SocketTimeoutException: Accept timed out
> 	at java.net.PlainSocketImpl.socketAccept(Native Method)
> 	at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:375)
> 	at java.net.ServerSocket.implAccept(ServerSocket.java:478)
> 	at java.net.ServerSocket.accept(ServerSocket.java:446)
> 	at org.apache.hama.pipes.PipesApplication.start(PipesApplication.java:286)
> 	at org.apache.hama.pipes.PipesBSP.setup(PipesBSP.java:43)
> 	at org.apache.hama.bsp.BSPTask.runBSP(BSPTask.java:170)
> 	at org.apache.hama.bsp.BSPTask.run(BSPTask.java:144)
> 	at org.apache.hama.bsp.GroomServer$BSPPeerChild.main(GroomServer.java:1243)
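
A note on reading the trace above: the exception is raised from java.net.ServerSocket.accept, which means a listening socket gave up waiting for an incoming connection; it is not an outgoing connect call that failed. The sketch below only illustrates that standard Java behaviour and does not reproduce Hama's code. The class name AcceptTimeoutDemo, the method name, and the 200 ms timeout are assumptions chosen for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;

// Hypothetical demo (not Hama code): accept() on a ServerSocket with a
// timeout throws SocketTimeoutException when no client connects in time.
final class AcceptTimeoutDemo {

    // Returns true if accept() timed out, false if a client connected.
    static boolean acceptTimesOut(int timeoutMillis) {
        // Port 0 asks the OS for any free port; nobody will connect to it.
        try (ServerSocket server = new ServerSocket(0)) {
            server.setSoTimeout(timeoutMillis);
            try {
                server.accept().close();
                return false;
            } catch (SocketTimeoutException e) {
                // Same exception type as in the trace above.
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + acceptTimesOut(200));
    }
}
```

So a timeout of this kind says that the expected peer never connected within the allowed time, which fits a peer that could not start or could not resolve the right host.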



--
This message was sent by Atlassian JIRA
(v6.2#6252)
