From: Avery Ching
To: user@giraph.apache.org
Subject: Re: reason behind a java.io.EOFException
Date: Tue, 11 Sep 2012 00:38:26 -0700

Hi Franco,

I think the problem is a bug in the serialization/deserialization of
LongDoubleNullDoubleVertex (this is your class, right?). That's why it
works with one worker, but not with more.

Avery

On 9/10/12 11:24 PM, Franco Maria Nardini wrote:
> Thanks a lot, Avery.
>
> I tried your solution but now I got this error that seems related to
> netty. Am I wrong?
>
> Best,
>
> FM
>
> ---
> 2012-09-11 08:19:41,796 WARN org.apache.giraph.comm.netty.handler.RequestServerHandler: exceptionCaught: Channel failed with remote address /172.20.10.3:50077
> java.io.EOFException: fieldSize is too long! Length is 8, but maximum is 5
>     at org.jboss.netty.buffer.ChannelBufferInputStream.checkAvailable(ChannelBufferInputStream.java:230)
>     at org.jboss.netty.buffer.ChannelBufferInputStream.readLong(ChannelBufferInputStream.java:198)
>     at org.jboss.netty.buffer.ChannelBufferInputStream.readDouble(ChannelBufferInputStream.java:153)
>     at org.apache.giraph.graph.LongDoubleNullDoubleVertex.readFields(LongDoubleNullDoubleVertex.java:157)
>     at org.apache.giraph.comm.requests.SendVertexRequest.readFieldsRequest(SendVertexRequest.java:79)
>     at org.apache.giraph.comm.requests.WritableRequest.readFields(WritableRequest.java:90)
>     at org.apache.giraph.comm.netty.handler.RequestDecoder.decode(RequestDecoder.java:82)
>     at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:67)
>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
>     at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:458)
>     at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:439)
>     at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
>     at org.apache.giraph.comm.netty.ByteCounter.handleUpstream(ByteCounter.java:61)
>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>     at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:91)
>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:385)
>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:256)
>     at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:35)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:680)
> 2012-09-11 08:19:41,798 WARN org.apache.giraph.comm.netty.handler.RequestServerHandler: exceptionCaught: Channel failed with remote address /172.20.10.3:50077
> java.io.EOFException: fieldSize is too long! Length is 8, but maximum is 1
>     at org.jboss.netty.buffer.ChannelBufferInputStream.checkAvailable(ChannelBufferInputStream.java:230)
>     at org.jboss.netty.buffer.ChannelBufferInputStream.readLong(ChannelBufferInputStream.java:198)
>     at org.jboss.netty.buffer.ChannelBufferInputStream.readDouble(ChannelBufferInputStream.java:153)
>     at org.apache.giraph.graph.LongDoubleNullDoubleVertex.readFields(LongDoubleNullDoubleVertex.java:157)
>     at org.apache.giraph.comm.requests.SendVertexRequest.readFieldsRequest(SendVertexRequest.java:79)
>     at org.apache.giraph.comm.requests.WritableRequest.readFields(WritableRequest.java:90)
>     at org.apache.giraph.comm.netty.handler.RequestDecoder.decode(RequestDecoder.java:82)
>     at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:67)
>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
>     at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:458)
>     at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:439)
>     at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
>     at org.apache.giraph.comm.netty.ByteCounter.handleUpstream(ByteCounter.java:61)
>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
>     at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
>     at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:91)
>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:385)
>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:256)
>     at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:35)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:680)
>
> On Tue, Sep 11, 2012 at 7:53 AM, Avery Ching wrote:
>> These days we are focusing more on the netty IPC. Can you try
>> -Dgiraph.useNetty=true?
>>
>> Avery
>>
>>
>> On 9/10/12 2:08 PM, Franco Maria Nardini wrote:
>>> Dear all,
>>>
>>> I am working with Giraph 0.2/Hadoop 1.0.3. In particular, I am trying
>>> to run the following command:
>>>
>>> hadoop jar giraph-0.2-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar \
>>>   org.apache.giraph.GiraphRunner \
>>>   org.apache.giraph.examples.SimplePageRankVertex \
>>>   -w 2 \
>>>   -if org.apache.giraph.examples.SimplePageRankVertex\$SimplePageRankVertexInputFormat \
>>>   -ip bigGraph.txt \
>>>   -of org.apache.giraph.io.IdWithValueTextOutputFormat -op output \
>>>   -mc org.apache.giraph.examples.SimplePageRankVertex\$HDFSBasedPageRankVertexMasterCompute
>>>
>>> If I set the number of workers to two, one of the mappers produces:
>>>
>>> java.lang.RuntimeException: java.io.IOException: Call to zipottero.local/172.20.10.3:30001 failed on local exception: java.io.EOFException
>>>     at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionRequest(BasicRPCCommunications.java:923)
>>>     at org.apache.giraph.graph.BspServiceWorker.loadVertices(BspServiceWorker.java:327)
>>>     at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:604)
>>>     at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:377)
>>>     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:578)
>>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>>>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>> Caused by: java.io.IOException: Call to zipottero.local/172.20.10.3:30001 failed on local exception: java.io.EOFException
>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:1107)
>>>     at org.apache.hadoop.ipc.Client.call(Client.java:1075)
>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
>>>     at $Proxy3.putVertexList(Unknown Source)
>>>     at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionRequest(BasicRPCCommunications.java:920)
>>>     ... 11 more
>>> Caused by: java.io.EOFException
>>>     at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>>     at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:804)
>>>     at org.apache.hadoop.ipc.Client$Connection.run(Client.java:749)
>>>
>>> while it works perfectly if the number of workers is set to 1. I am
>>> experiencing the problem on both small and big graphs.
>>>
>>> Any idea of the reasons behind this behavior?
>>>
>>> Thanks a lot in advance.
>>>
>>> Best,
>>>
>>> FM
>>> --
>>> Franco Maria Nardini
>>>
>>> High Performance Computing Laboratory
>>> Istituto di Scienza e Tecnologie dell'Informazione (ISTI)
>>> Consiglio Nazionale delle Ricerche (CNR)
>>> Via G. Moruzzi, 1
>>> 56124, Pisa, Italy
>>>
>>> Phone: +39 050 315 3496
>>> Fax: +39 050 315 2040
>>> Mail: francomaria.nardini@isti.cnr.it
>>> Skype: francomaria.nardini
>>> Web: http://hpc.isti.cnr.it/~nardini/
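
For reference, here is a minimal sketch of the kind of write/readFields mismatch suspected at the top of this thread. It is not taken from the LongDoubleNullDoubleVertex source: SketchVertex, readFieldsBroken and the field layout are hypothetical, and plain java.io streams stand in for the netty buffer. The point it illustrates is that readFields must consume exactly the bytes that write produced, in the same order; a reader that asks for more runs off the end of the buffer, much like the readDouble call in the trace above that wanted 8 bytes when fewer remained. Presumably the single-worker run never hits this because vertices are not sent between workers in that case.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.Arrays;

/** Hypothetical stand-in for a vertex with a long id, a double value and double messages. */
public class VertexRoundTrip {

    static final class SketchVertex {
        long id;
        double value;
        double[] messages = new double[0];

        /** Writes id, value, message count, then each message. */
        void write(DataOutput out) throws IOException {
            out.writeLong(id);
            out.writeDouble(value);
            out.writeInt(messages.length);
            for (double m : messages) {
                out.writeDouble(m);
            }
        }

        /** Reads the fields back in exactly the order write() emitted them. */
        void readFields(DataInput in) throws IOException {
            id = in.readLong();
            value = in.readDouble();
            messages = new double[in.readInt()];
            for (int i = 0; i < messages.length; i++) {
                messages[i] = in.readDouble();
            }
        }

        /** Deliberately broken reader: expects one extra double that write() never emitted. */
        void readFieldsBroken(DataInput in) throws IOException {
            readFields(in);
            in.readDouble(); // asks for 8 more bytes than the writer produced
        }
    }

    /** Serializes a vertex into a byte array using its write() method. */
    static byte[] serialize(SketchVertex v) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        v.write(new DataOutputStream(bytes));
        return bytes.toByteArray();
    }

    /** Returns true if the broken reader hits EOFException on a correctly written buffer. */
    static boolean brokenReaderHitsEof(byte[] data) throws IOException {
        try {
            new SketchVertex().readFieldsBroken(
                    new DataInputStream(new ByteArrayInputStream(data)));
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        SketchVertex v = new SketchVertex();
        v.id = 42L;
        v.value = 0.15;
        v.messages = new double[] {1.0, 2.5};

        byte[] data = serialize(v);
        SketchVertex copy = new SketchVertex();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(data)));

        boolean same = copy.id == v.id && copy.value == v.value
                && Arrays.equals(copy.messages, v.messages);
        System.out.println("round trip ok: " + same);
        System.out.println("broken reader hits EOF: " + brokenReaderHitsEof(data));
    }
}
```

A round-trip check like this, run against the real vertex class in a unit test, is one way to confirm or rule out the suspected bug without starting a multi-worker job.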