giraph-user mailing list archives

From Nicolas DUGUE <nicolas.du...@univ-orleans.fr>
Subject RPC Error
Date Fri, 20 Jul 2012 14:56:45 GMT
Hi,

     We ran a PageRank benchmark on a graph with 120 million vertices and
one edge per vertex.
     We distributed the job across 128 workers.

     The graph loads without problems.
     However, several workers fail at superstep 0. Any idea what the
problem might be? Thanks.
     The error trace:

java.lang.IllegalStateException: flush: Got ExecutionException
	at org.apache.giraph.comm.BasicRPCCommunications.flush(BasicRPCCommunications.java:1102)
	at org.apache.giraph.graph.BspServiceWorker.finishSuperstep(BspServiceWorker.java:968)
	at org.apache.giraph.graph.GraphMapper.map(GraphMapper.java:613)
	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:657)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Unknown Source)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.IOException: Call to hadoop-0.univ-orleans.fr/172.18.1.200:30069 failed on local exception: java.io.IOException: Connection reset by peer
	at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source)
	at java.util.concurrent.FutureTask.get(Unknown Source)
	at org.apache.giraph.comm.BasicRPCCommunications.flush(BasicRPCCommunications.java:1097)
	... 10 more
Caused by: java.lang.RuntimeException: java.io.IOException: Call to hadoop-0.univ-orleans.fr/172.18.1.200:30069 failed on local exception: java.io.IOException: Connection reset by peer
	at org.apache.giraph.comm.BasicRPCCommunications$PeerFlushExecutor.run(BasicRPCCommunications.java:368)
	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
	at java.util.concurrent.FutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: Call to hadoop-0.univ-orleans.fr/172.18.1.200:30069 failed on local exception: java.io.IOException: Connection reset by peer
	at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
	at org.apache.hadoop.ipc.Client.call(Client.java:1075)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
	at $Proxy3.putVertexIdMessagesList(Unknown Source)
	at org.apache.giraph.comm.BasicRPCCommunications$PeerFlushExecutor.run(BasicRPCCommunications.java:328)
	... 6 more
Caused by: java.io.IOException: Connection reset by peer
	at sun.nio.ch.FileDispatcher.write0(Native Method)
	at sun.nio.ch.SocketDispatcher.write(Unknown Source)
	at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
	at sun.nio.ch.IOUtil.write(Unknown Source)
	at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
	at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:55)
	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
	at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
	at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
	at java.io.BufferedOutputStream.write(Unknown Source)
	at java.io.DataOutputStream.write(Unknown Source)
	at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:782)
	at org.apache.hadoop.ipc.Client.call(Client.java:1051)
	... 9 more

Best regards,
Nicolas
