From: Pascal Jäger
To: user@giraph.apache.org
Subject: Question concerning Aggregators
Date: Wed, 17 Jul 2013 14:13:58 +0000

Hi everyone,

I am trying to use an Aggregator I have written. Unfortunately, I get an IOException. As far as I can see, the exception occurs when data is sent from the Aggregator to the Master, during a readFields() of one of my classes.

The write() and readFields() methods of my classes do not seem to be the problem, because a lot of messages get passed without any issue. But shortly before the exception, the read methods get values that make no sense for my application, e.g. it reads Long.MAX_VALUE / 2 instead of 3. And then suddenly the EOFException occurs during the readFields() of one of my classes.

I checked whether my code accidentally produces the errors by comparing what gets written out with what is read in later, but this seems to be okay, except for the case below.

Do you have any idea?

Regards
Pascal

2013-07-17 15:52:09,087 INFO org.apache.giraph.master.BspServiceMaster: aggregateWorkerStats: Aggregation found (vtx=6,finVtx=0,edges=16,msgCount=8,haltComputation=false) on superstep = 1
2013-07-17 15:52:09,088 INFO org.apache.giraph.master.BspServiceMaster: coordinateSuperstep: Cleaning up old Superstep /_hadoopBsp/job_201307171551_0001/_applicationAttemptsDir/0/_superstepDir/0
2013-07-17 15:52:09,108 INFO org.apache.giraph.master.MasterThread: masterThread: Coordination of superstep 1 took 0.07 seconds ended with state THIS_SUPERSTEP_DONE and is now on superstep 2
2013-07-17 15:52:09,112 INFO org.apache.giraph.comm.netty.NettyClient: connectAllAddresses: Successfully added 0 connections, (0 total connected) 0 failed, 0 failures total.
2013-07-17 15:52:09,112 INFO org.apache.giraph.partition.PartitionBalancer: balancePartitionsAcrossWorkers: Using algorithm static
2013-07-17 15:52:09,112 INFO org.apache.giraph.partition.PartitionUtils: analyzePartitionStats: Vertices - Mean: 6, Min: Worker(hostname=127.0.0.1, MRtaskID=1, port=30001) - 6, Max: Worker(hostname=127.0.0.1, MRtaskID=1, port=30001) - 6
2013-07-17 15:52:09,112 INFO org.apache.giraph.partition.PartitionUtils: analyzePartitionStats: Edges - Mean: 16, Min: Worker(hostname=127.0.0.1, MRtaskID=1, port=30001) - 16, Max: Worker(hostname=127.0.0.1, MRtaskID=1, port=30001) - 16
2013-07-17 15:52:09,119 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep 2 on path /_hadoopBsp/job_201307171551_0001/_applicationAttemptsDir/0/_superstepDir/2/_workerFinishedDir
2013-07-17 15:52:09,119 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [127.0.0.1_1]
2013-07-17 15:52:09,144 WARN org.apache.giraph.comm.netty.handler.RequestServerHandler: exceptionCaught: Channel failed with remote address /127.0.0.1:60439
java.lang.IllegalStateException: doRequest: IOException occurred while processing request
	at org.apache.giraph.comm.requests.SendAggregatorsToMasterRequest.doRequest(SendAggregatorsToMasterRequest.java:52)
	at org.apache.giraph.comm.netty.handler.MasterRequestServerHandler.processRequest(MasterRequestServerHandler.java:51)
	at org.apache.giraph.comm.netty.handler.MasterRequestServerHandler.processRequest(MasterRequestServerHandler.java:27)
	at org.apache.giraph.comm.netty.handler.RequestServerHandler.messageReceived(RequestServerHandler.java:106)
	at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
	at org.jboss.netty.handler.codec.oneone.OneToOneDecoder.handleUpstream(OneToOneDecoder.java:71)
	at org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:45)
	at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:69)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
	at java.lang.Thread.run(Thread.java:680)
Caused by: java.io.EOFException
	at java.io.DataInputStream.readFully(DataInputStream.java:180)
	at java.io.DataInputStream.readLong(DataInputStream.java:399)
	at java.io.DataInputStream.readDouble(DataInputStream.java:451)
	at mystuff.maxflow.ExcessPath.readFields(ExcessPath.java:71)
	at mystuff.maxflow.MFMessage.readFields(MFMessage.java:42)
	at org.apache.giraph.master.MasterAggregatorHandler.acceptAggregatedValues(MasterAggregatorHandler.java:253)
	at org.apache.giraph.comm.requests.SendAggregatorsToMasterRequest.doRequest(SendAggregatorsToMasterRequest.java:50)
	... 10 more
2013-07-17 15:57:09,151 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: 0 out of 1 workers finished on superstep 2 on path /_hadoopBsp/job_201307171551_0001/_applicationAttemptsDir/0/_superstepDir/2/_workerFinishedDir
2013-07-17 15:57:09,151 INFO org.apache.giraph.master.BspServiceMaster: barrierOnWorkerList: Waiting on [127.0.0.1_1]
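
The check mentioned in the message (comparing what write() emits with what readFields() consumes) can be sketched as a standalone round trip. This is only a minimal sketch, not the actual ExcessPath or MFMessage code: the class PathSketch, its two fields (a long and a double), and the helper roundTrips are assumptions made up for illustration. It uses only java.io, so it should run without Hadoop or Giraph on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class RoundTripCheck {

    /** Hypothetical stand-in for a Writable value; the fields are assumed. */
    static class PathSketch {
        long vertexId;
        double excess;

        PathSketch() { }

        PathSketch(long vertexId, double excess) {
            this.vertexId = vertexId;
            this.excess = excess;
        }

        // Same shape as Writable.write: fields go out in a fixed order.
        void write(DataOutput out) throws IOException {
            out.writeLong(vertexId);
            out.writeDouble(excess);
        }

        // Same shape as Writable.readFields: must read the same fields
        // in the same order that write() emitted them.
        void readFields(DataInput in) throws IOException {
            vertexId = in.readLong();
            excess = in.readDouble();
        }
    }

    /**
     * Serializes the value, reads it back into a fresh instance, and reports
     * whether the fields match and every written byte was consumed.
     * Any IOException (including EOFException from reading too much)
     * counts as a failed round trip.
     */
    static boolean roundTrips(PathSketch original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            original.write(new DataOutputStream(buffer));

            ByteArrayInputStream bytes = new ByteArrayInputStream(buffer.toByteArray());
            PathSketch copy = new PathSketch();
            copy.readFields(new DataInputStream(bytes));

            boolean sameFields = copy.vertexId == original.vertexId
                    && Double.compare(copy.excess, original.excess) == 0;
            boolean allConsumed = bytes.available() == 0; // leftover bytes mean readFields read too little
            return sameFields && allConsumed;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrips(new PathSketch(3L, 0.5)));
    }
}
```

A round trip that passes here only rules out a field-order or field-count mismatch for that particular value. It says nothing about values of other shapes, such as an empty or very long path, so those would need their own checks.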