giraph-user mailing list archives

From Avery Ching <ach...@apache.org>
Subject Re: RPC Error
Date Fri, 20 Jul 2012 19:50:47 GMT
I would try -Dgiraph.useNetty=true to use Netty rather than Hadoop RPC.  
My guess, however, is that you had an error (likely memory) that 
caused a task to fail, which in turn caused the connection reset.  We try 
to assign the port numbers based on the task id so that you can work 
backwards to debug.  This task failed because it couldn't connect to a 
worker on port 30069.  I would look at the map task that owns port 30069 
and see why it failed.
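
To make the "work backwards" step concrete, here is a minimal sketch that pulls the port out of the exception message and turns it into a task number. It assumes the job kept a base RPC port of 30000 and that each worker binds to the base port plus its task partition; both are assumptions to check against your own job configuration. The class and method names are hypothetical and are not part of Giraph.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PortToTask {
    // Assumption: base RPC port of 30000; change this if your job overrides it.
    static final int ASSUMED_BASE_PORT = 30000;

    // Matches the "Call to host/ip:port failed" text seen in the IOException message.
    private static final Pattern CALL_TO =
        Pattern.compile("Call to (\\S+)/(\\d+\\.\\d+\\.\\d+\\.\\d+):(\\d+) failed");

    /** Returns the port found in the message, or -1 if the message does not match. */
    static int extractPort(String message) {
        Matcher m = CALL_TO.matcher(message);
        return m.find() ? Integer.parseInt(m.group(3)) : -1;
    }

    /** Assumption: port = base port + task partition, so the partition is the offset. */
    static int taskPartition(int port, int basePort) {
        return port - basePort;
    }

    public static void main(String[] args) {
        String msg = "java.io.IOException: Call to "
            + "hadoop-0.univ-orleans.fr/172.18.1.200:30069 failed on local exception: "
            + "java.io.IOException: Connection reset by peer";
        int port = extractPort(msg);
        // Prints: port=30069 task partition=69
        System.out.println("port=" + port
            + " task partition=" + taskPartition(port, ASSUMED_BASE_PORT));
    }
}
```

Under those assumptions the failing peer in this trace would be task partition 69, so that map task's logs are the first place to look for the original failure.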

Avery

On 7/20/12 7:56 AM, Nicolas DUGUE wrote:
> Hi,
>
>     We ran a PageRank benchmark with 120 million vertices and 
> one edge per vertex.
>     We distributed it across 128 workers.
>
>     The graph loads fine.
>     But several workers fail at superstep 0. Any idea what the 
> problem is? Thanks.
>     The error trace:
>
> java.lang.IllegalStateException: flush: Got ExecutionException
>     at org.apache.giraph.comm.BasicRPCCommunications.flush(BasicRPCCommunications.java:1102)
>     at org.apache.giraph.graph.BspServiceWorker.finishSuperstep(BspServiceWorker.java:968)
>     at org.apache.giraph.graph.GraphMapper.map(GraphMapper.java:613)
>     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:657)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Unknown Source)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.IOException: Call to hadoop-0.univ-orleans.fr/172.18.1.200:30069 failed on local exception: java.io.IOException: Connection reset by peer
>     at java.util.concurrent.FutureTask$Sync.innerGet(Unknown Source)
>     at java.util.concurrent.FutureTask.get(Unknown Source)
>     at org.apache.giraph.comm.BasicRPCCommunications.flush(BasicRPCCommunications.java:1097)
>     ... 10 more
> Caused by: java.lang.RuntimeException: java.io.IOException: Call to hadoop-0.univ-orleans.fr/172.18.1.200:30069 failed on local exception: java.io.IOException: Connection reset by peer
>     at org.apache.giraph.comm.BasicRPCCommunications$PeerFlushExecutor.run(BasicRPCCommunications.java:368)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>     at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>     at java.util.concurrent.FutureTask.run(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.lang.Thread.run(Unknown Source)
> Caused by: java.io.IOException: Call to hadoop-0.univ-orleans.fr/172.18.1.200:30069 failed on local exception: java.io.IOException: Connection reset by peer
>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1075)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>     at $Proxy3.putVertexIdMessagesList(Unknown Source)
>     at org.apache.giraph.comm.BasicRPCCommunications$PeerFlushExecutor.run(BasicRPCCommunications.java:328)
>     ... 6 more
> Caused by: java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcher.write0(Native Method)
>     at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
>     at sun.nio.ch.IOUtil.write(Unknown Source)
>     at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>     at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:55)
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
>     at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
>     at java.io.BufferedOutputStream.write(Unknown Source)
>     at java.io.DataOutputStream.write(Unknown Source)
>     at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:782)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1051)
>     ... 9 more
>
> Best regards,
> Nicolas
