Return-Path: X-Original-To: apmail-giraph-user-archive@www.apache.org Delivered-To: apmail-giraph-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id EE0A81010C for ; Thu, 29 Aug 2013 18:16:38 +0000 (UTC) Received: (qmail 6845 invoked by uid 500); 29 Aug 2013 18:16:36 -0000 Delivered-To: apmail-giraph-user-archive@giraph.apache.org Received: (qmail 6794 invoked by uid 500); 29 Aug 2013 18:16:31 -0000 Mailing-List: contact user-help@giraph.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@giraph.apache.org Delivered-To: mailing list user@giraph.apache.org Received: (qmail 6781 invoked by uid 99); 29 Aug 2013 18:16:30 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 29 Aug 2013 18:16:30 +0000 X-ASF-Spam-Status: No, hits=2.2 required=5.0 tests=HTML_MESSAGE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of yasser.altowim@ericsson.com designates 198.24.6.45 as permitted sender) Received: from [198.24.6.45] (HELO usevmg20.ericsson.net) (198.24.6.45) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 29 Aug 2013 18:16:23 +0000 X-AuditID: c618062d-b7fda8e0000024c6-55-521f8fe136ce Received: from EUSAAHC001.ericsson.se (Unknown_Domain [147.117.188.75]) by usevmg20.ericsson.net (Symantec Mail Security) with SMTP id 9D.EC.09414.1EF8F125; Thu, 29 Aug 2013 20:16:02 +0200 (CEST) Received: from EUSAAMB105.ericsson.se ([147.117.188.122]) by EUSAAHC001.ericsson.se ([147.117.188.75]) with mapi id 14.02.0328.009; Thu, 29 Aug 2013 14:16:01 -0400 From: Yasser Altowim To: "user@giraph.apache.org" Subject: Exception with Large Graphs Thread-Topic: Exception with Large Graphs Thread-Index: Ac6k47v/i2ZsrXe0SSW1MFuPYsHJkA== Date: Thu, 29 Aug 2013 18:16:01 +0000 Message-ID: <1158B25680108D4DBDD0558E33209370012D2F45@eusaamb105.ericsson.se> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [147.117.188.135] Content-Type: multipart/alternative; boundary="_000_1158B25680108D4DBDD0558E33209370012D2F45eusaamb105erics_" MIME-Version: 1.0 X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFtrKLMWRmVeSWpSXmKPExsUyuXSPt+6jfvkggwnTOSz+v/nN6MDosWfn GcYAxigum5TUnMyy1CJ9uwSujPbVz9gLTh9lqviw6QNbA+ONZUxdjBwcEgImEv9nWHYxcgKZ YhIX7q1n62Lk4hASOMoo8fD9dxYIZzmjRNuln0wgVWwCRhKHX35jBmkWETCVWLW1AiQsLKAi 0Xp0BRuILSKgKdGzrYsVwtaT2DalD6ycRUBV4t7lEJAwr4CvxOZzuxlBbEagvd9PrQGbziwg LnHryXwmiHsEJJbsOc8MYYtKvHz8jxXCVpb4PucRC8hIZoF8iQMXaiBGCkqcnPmEZQKj0Cwk k2YhVM1CUgVRoiOxYPcnNghbW2LZwtfMMPaZA4+ZkMUXMLKvYuQoLU4ty003MtjECAz6YxJs ujsY97y0PMQozcGiJM67Su9MoJBAemJJanZqakFqUXxRaU5q8SFGJg5OqQZG+R2T7vs9mJH3 vud8jtKlzRHXlvy2Kd3rucX2jI9WWO2FqTo9IZuil+XnMq7Y8eli14e0iQ9fdm5a9+KtxQun KRLrUo1f3Ps+ofSFgoVzxezlxzdrm2t9fpAizbvptcV5u95fT4463FFPDOKqlHujryq45Y3S b/bbMY2VxQmZXPIVkZ9mfjm4XYmlOCPRUIu5qDgRANNZ6WtIAgAA X-Virus-Checked: Checked by ClamAV on apache.org --_000_1158B25680108D4DBDD0558E33209370012D2F45eusaamb105erics_ Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Hi, I am implementing an algorithm using Giraph, and I was able to run= my algorithm on relatively small datasets (64,000,000 vertices and 128,000= ,000 edges). However, when I increase the size of the dataset to 128,000,00= 0 vertices and 256,000,000 edges, the job takes so much time to load the ve= rtices, and then it gives me the following exception. I have tried to increase the heap size and the task timeout value i= n the mapred-site.xml configuration file, and even vary the number of worke= rs from 1 to 10, but still getting the same exceptions. I have a cluster of= 10 nodes, and each node has a 4G of ram. Thanks in advance. 2013-08-29 10:22:53,150 INFO org.apache.giraph.utils.ProgressableUtils: wai= tFor: Future result not ready yet java.util.concurrent.FutureTask@1a129460 2013-08-29 10:22:53,151 INFO org.apache.giraph.utils.ProgressableUtils: wai= tFor: Waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@= 30d320e4 2013-08-29 10:23:07,938 INFO org.apache.giraph.worker.VertexInputSplitsCall= able: readVertexInputSplit: Loaded 7769685 vertices at 14250.953615591572 v= ertices/sec 15539370 edges at 28500.77593053654 edges/sec Memory (free/tota= l/max) =3D 680.21M / 3207.44M / 3555.56M 2013-08-29 10:23:14,538 INFO org.apache.giraph.worker.VertexInputSplitsCall= able: readVertexInputSplit: Loaded 8019685 vertices at 14533.557468366102 v= ertices/sec 16039370 edges at 29065.97491865343 edges/sec Memory (free/tota= l/max) =3D 906.80M / 3242.75M / 3555.56M 2013-08-29 10:23:21,888 INFO org.apache.giraph.worker.InputSplitsCallable: = loadFromInputSplit: Finished loading /_hadoopBsp/job_201308290837_0003/_ver= texInputSplitDir/9 (v=3D1212852, e=3D2425704) 2013-08-29 10:23:37,911 INFO org.apache.giraph.worker.InputSplitsHandler: r= eserveInputSplit: Reserved input split path /_hadoopBsp/job_201308290837_00= 03/_vertexInputSplitDir/19, overall roughly 7.518797% input splits reserved 2013-08-29 10:23:37,923 INFO org.apache.giraph.worker.InputSplitsCallable: = getInputSplit: Reserved /_hadoopBsp/job_201308290837_0003/_vertexInputSplit= Dir/19 from ZooKeeper and got input split 'org.apache.giraph.io.formats.mul= ti.InputSplitWithInputFormatIndex@24004559' 2013-08-29 10:23:44,313 INFO org.apache.giraph.worker.VertexInputSplitsCall= able: readVertexInputSplit: Loaded 8482537 vertices at 14585.340134636266 v= ertices/sec 16965074 edges at 29169.59449002283 edges/sec Memory (free/tota= l/max) =3D 538.93M / 3186.13M / 3555.56M 2013-08-29 10:23:49,963 INFO org.apache.giraph.worker.VertexInputSplitsCall= able: readVertexInputSplit: Loaded 8732537 vertices at 14870.726503632277 v= ertices/sec 17465074 edges at 29740.356341344923 edges/sec Memory (free/tot= al/max) =3D 489.84M / 3222.56M / 3555.56M 2013-08-29 10:34:28,371 INFO org.apache.giraph.utils.ProgressableUtils: wai= tFor: Future result not ready yet java.util.concurrent.FutureTask@1a129460 2013-08-29 10:34:34,847 INFO org.apache.giraph.utils.ProgressableUtils: wai= tFor: Waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@= 30d320e4 2013-08-29 10:34:34,850 INFO org.apache.giraph.comm.netty.handler.RequestDe= coder: decode: Server window metrics MBytes/sec sent =3D 0, MBytes/sec rece= ived =3D 0.0161, MBytesSent =3D 0.0002, MBytesReceived =3D 12.3175, ave sen= t req MBytes =3D 0, ave received req MBytes =3D 0.0587, secs waited =3D 765= .881 2013-08-29 10:34:35,698 INFO org.apache.zookeeper.ClientCnxn: Client sessio= n timed out, have not heard from server in 649805ms for sessionid 0x140cb11= 40540006, closing socket connection and attempting reconnect 2013-08-29 10:34:42,471 WARN org.apache.giraph.bsp.BspService: process: Dis= connected from ZooKeeper (will automatically try to recover) WatchedEvent s= tate:Disconnected type:None path:null 2013-08-29 10:34:42,472 WARN org.apache.giraph.worker.InputSplitsHandler: p= rocess: Problem with zookeeper, got event with path null, state Disconnecte= d, event type None 2013-08-29 10:34:43,819 INFO org.apache.zookeeper.ClientCnxn: Opening socke= t connection to server slave5.ericsson-magic.net/10.126.72.165:22181 2013-08-29 10:34:44,077 INFO org.apache.zookeeper.ClientCnxn: Socket connec= tion established to slave5.ericsson-magic.net/10.126.72.165:22181, initiati= ng session 2013-08-29 10:34:44,220 WARN org.apache.giraph.bsp.BspService: process: Got= unknown null path event WatchedEvent state:Expired type:None path:null 2013-08-29 10:34:44,220 WARN org.apache.giraph.worker.InputSplitsHandler: p= rocess: Problem with zookeeper, got event with path null, state Expired, ev= ent type None 2013-08-29 10:34:44,221 INFO org.apache.zookeeper.ClientCnxn: EventThread s= hut down 2013-08-29 10:34:44,240 INFO org.apache.zookeeper.ClientCnxn: Unable to rec= onnect to ZooKeeper service, session 0x140cb1140540006 has expired, closing= socket connection 2013-08-29 10:35:35,442 INFO org.apache.giraph.utils.ProgressableUtils: wai= tFor: Future result not ready yet java.util.concurrent.FutureTask@1a129460 2013-08-29 10:35:35,443 INFO org.apache.giraph.utils.ProgressableUtils: wai= tFor: Waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@= 30d320e4 2013-08-29 10:35:42,161 INFO org.apache.giraph.comm.netty.handler.RequestDe= coder: decode: Server window metrics MBytes/sec sent =3D 0, MBytes/sec rece= ived =3D 0.1059, MBytesSent =3D 0.0001, MBytesReceived =3D 7.1305, ave sent= req MBytes =3D 0, ave received req MBytes =3D 0.0291, secs waited =3D 67.3= 11 2013-08-29 10:35:48,659 INFO org.apache.giraph.worker.VertexInputSplitsCall= able: readVertexInputSplit: Loaded 8982537 vertices at 6882.0673288665985 v= ertices/sec 17965074 edges at 13763.906358998607 edges/sec Memory (free/tot= al/max) =3D 1040.32M / 3537.00M / 3555.56M 2013-08-29 10:36:14,680 INFO org.apache.giraph.worker.VertexInputSplitsCall= able: readVertexInputSplit: Loaded 9232537 vertices at 6931.612280518087 ve= rtices/sec 18465074 edges at 13862.99925688887 edges/sec Memory (free/total= /max) =3D 607.82M / 3564.69M / 3564.69M 2013-08-29 10:36:35,690 INFO org.apache.giraph.utils.ProgressableUtils: wai= tFor: Future result not ready yet java.util.concurrent.FutureTask@1a129460 2013-08-29 10:36:35,690 INFO org.apache.giraph.utils.ProgressableUtils: wai= tFor: Waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@= 30d320e4 2013-08-29 10:36:47,220 INFO org.apache.giraph.worker.InputSplitsCallable: = loadFromInputSplit: Finished loading /_hadoopBsp/job_201308290837_0003/_ver= texInputSplitDir/19 (v=3D1191319, e=3D2382638) 2013-08-29 10:36:47,667 ERROR org.apache.giraph.utils.LogStacktraceCallable= : Execution of callable failed java.lang.IllegalStateException: markInputSplitPathFinished: KeeperExceptio= n on /_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19/_vertexInput= SplitFinished at org.apache.giraph.worker.InputSplitsHandler.markInputSplitPathFi= nished(InputSplitsHandler.java:168) at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(Inpu= tSplitsCallable.java:272) at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCal= lable.java:211) at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCal= lable.java:60) at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktrace= Callable.java:51) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:33= 4) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExec= utor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExe= cutor.java:615) at java.lang.Thread.run(Thread.java:724) Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: Ke= eperErrorCode =3D Session expired for /_hadoopBsp/job_201308290837_0003/_ve= rtexInputSplitDir/19/_vertexInputSplitFinished at org.apache.zookeeper.KeeperException.create(KeeperException.java= :118) at org.apache.zookeeper.KeeperException.create(KeeperException.java= :42) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637) at org.apache.giraph.zk.ZooKeeperExt.createExt(ZooKeeperExt.java:15= 2) at org.apache.giraph.worker.InputSplitsHandler.markInputSplitPathFi= nished(InputSplitsHandler.java:159) ... 9 more 2013-08-29 10:36:50,349 ERROR org.apache.giraph.worker.BspServiceWorker: un= registerHealth: Got failure, unregistering health on /_hadoopBsp/job_201308= 290837_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/sl= ave8.ericsson-magic.net_5 on superstep -1 2013-08-29 10:36:52,498 ERROR org.apache.giraph.graph.GraphTaskManager: run= : Worker failure failed on another RuntimeException, original expection wil= l be rethrown java.lang.IllegalStateException: unregisterHealth: KeeperException - Couldn= 't delete /_hadoopBsp/job_201308290837_0003/_applicationAttemptsDir/0/_supe= rstepDir/-1/_workerHealthyDir/slave8.ericsson-magic.net_5 at org.apache.giraph.worker.BspServiceWorker.unregisterHealth(BspSe= rviceWorker.java:654) at org.apache.giraph.worker.BspServiceWorker.failureCleanup(BspServ= iceWorker.java:662) at org.apache.giraph.graph.GraphTaskManager.workerFailureCleanup(Gr= aphTaskManager.java:897) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:100) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupIn= formation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: Ke= eperErrorCode =3D Session expired for /_hadoopBsp/job_201308290837_0003/_ap= plicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/slave8.ericsson-m= agic.net_5 at org.apache.zookeeper.KeeperException.create(KeeperException.java= :118) at org.apache.zookeeper.KeeperException.create(KeeperException.java= :42) at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728) at org.apache.giraph.zk.ZooKeeperExt.deleteExt(ZooKeeperExt.java:30= 2) at org.apache.giraph.worker.BspServiceWorker.unregisterHealth(BspSe= rviceWorker.java:648) ... 10 more 2013-08-29 10:36:54,571 INFO org.apache.hadoop.mapred.TaskLogsTruncater: In= itializing logs' truncater with mapRetainSize=3D-1 and reduceRetainSize=3D-= 1 2013-08-29 10:37:15,417 INFO org.apache.hadoop.io.nativeio.NativeIO: Initia= lized cache for UID to User mapping with a cache timeout of 14400 seconds. 2013-08-29 10:37:15,456 INFO org.apache.hadoop.io.nativeio.NativeIO: Got Us= erName bigdatauser for UID 1007 from the native implementation 2013-08-29 10:37:16,047 WARN org.apache.hadoop.mapred.Child: Error running = child java.lang.IllegalStateException: run: Caught an unrecoverable exception wai= tFor: ExecutionException occurred while waiting for org.apache.giraph.utils= .ProgressableUtils$FutureWaitable@30d320e4 at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:101) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupIn= formation.java:1190) at org.apache.hadoop.mapred.Child.main(Child.java:249) Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occ= urred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWai= table@30d320e4 at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUt= ils.java:181) at org.apache.giraph.utils.ProgressableUtils.waitForever(Progressab= leUtils.java:139) at org.apache.giraph.utils.ProgressableUtils.waitForever(Progressab= leUtils.java:124) at org.apache.giraph.utils.ProgressableUtils.getFutureResult(Progre= ssableUtils.java:87) at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallabl= es(ProgressableUtils.java:221) at org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspSer= viceWorker.java:279) at org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServic= eWorker.java:323) at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker= .java:504) at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManage= r.java:246) at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91) ... 7 more Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalStateE= xception: markInputSplitPathFinished: KeeperException on /_hadoopBsp/job_20= 1308290837_0003/_vertexInputSplitDir/19/_vertexInputSplitFinished at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:26= 2) at java.util.concurrent.FutureTask.get(FutureTask.java:119) at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor= (ProgressableUtils.java:300) at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUt= ils.java:173) ... 16 more Caused by: java.lang.IllegalStateException: markInputSplitPathFinished: Kee= perException on /_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19/_= vertexInputSplitFinished at org.apache.giraph.worker.InputSplitsHandler.markInputSplitPathFi= nished(InputSplitsHandler.java:168) at org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(Inpu= tSplitsCallable.java:272) at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCal= lable.java:211) at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCal= lable.java:60) at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktrace= Callable.java:51) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:33= 4) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExec= utor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExe= cutor.java:615) at java.lang.Thread.run(Thread.java:724) Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: Ke= eperErrorCode =3D Session expired for /_hadoopBsp/job_201308290837_0003/_ve= rtexInputSplitDir/19/_vertexInputSplitFinished at org.apache.zookeeper.KeeperException.create(KeeperException.java= :118) at org.apache.zookeeper.KeeperException.create(KeeperException.java= :42) at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637) at org.apache.giraph.zk.ZooKeeperExt.createExt(ZooKeeperExt.java:15= 2) at org.apache.giraph.worker.InputSplitsHandler.markInputSplitPathFi= nished(InputSplitsHandler.java:159) ... 9 more 2013-08-29 10:37:17,481 INFO org.apache.hadoop.mapred.Task: Runnning cleanu= p for the task Best, Yasser --_000_1158B25680108D4DBDD0558E33209370012D2F45eusaamb105erics_ Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

Hi,<= /p>

 

   &nbs= p;     I am implementing an algorithm using Giraph, and= I was able to run my algorithm on relatively small datasets (64,000,000 ve= rtices and 128,000,000 edges). However, when I increase the size of the dat= aset to 128,000,000 vertices and 256,000,000 edges, the job takes so much time = to load the vertices, and then it gives me the following exception.

 

   &nbs= p;    I have tried to increase the heap size and the task ti= meout value in the mapred-site.xml configuration file, and even vary the nu= mber of workers from 1 to 10, but still getting the same exceptions. I have a cluster of 10 nodes, and each node has  a 4G of ram.  Tha= nks in advance.

 

2013-08-29 10:22:53,150 INFO org.apache.giraph.utils= .ProgressableUtils: waitFor: Future result not ready yet java.util.concurre= nt.FutureTask@1a129460

2013-08-29 10:22:53,151 INFO org.apache.giraph.utils= .ProgressableUtils: waitFor: Waiting for org.apache.giraph.utils.Progressab= leUtils$FutureWaitable@30d320e4

2013-08-29 10:23:07,938 INFO org.apache.giraph.worke= r.VertexInputSplitsCallable: readVertexInputSplit: Loaded 7769685 vertices = at 14250.953615591572 vertices/sec 15539370 edges at 28500.77593053654 edge= s/sec Memory (free/total/max) =3D 680.21M / 3207.44M / 3555.56M

2013-08-29 10:23:14,538 INFO org.apache.giraph.worke= r.VertexInputSplitsCallable: readVertexInputSplit: Loaded 8019685 vertices = at 14533.557468366102 vertices/sec 16039370 edges at 29065.97491865343 edge= s/sec Memory (free/total/max) =3D 906.80M / 3242.75M / 3555.56M

2013-08-29 10:23:21,888 INFO org.apache.giraph.worke= r.InputSplitsCallable: loadFromInputSplit: Finished loading /_hadoopBsp/job= _201308290837_0003/_vertexInputSplitDir/9 (v=3D1212852, e=3D2425704)

2013-08-29 10:23:37,911 INFO org.apache.giraph.worke= r.InputSplitsHandler: reserveInputSplit: Reserved input split path /_hadoop= Bsp/job_201308290837_0003/_vertexInputSplitDir/19, overall roughly 7.518797= % input splits reserved

2013-08-29 10:23:37,923 INFO org.apache.giraph.worke= r.InputSplitsCallable: getInputSplit: Reserved /_hadoopBsp/job_201308290837= _0003/_vertexInputSplitDir/19 from ZooKeeper and got input split 'org.apach= e.giraph.io.formats.multi.InputSplitWithInputFormatIndex@24004559'

2013-08-29 10:23:44,313 INFO org.apache.giraph.worke= r.VertexInputSplitsCallable: readVertexInputSplit: Loaded 8482537 vertices = at 14585.340134636266 vertices/sec 16965074 edges at 29169.59449002283 edge= s/sec Memory (free/total/max) =3D 538.93M / 3186.13M / 3555.56M

2013-08-29 10:23:49,963 INFO org.apache.giraph.worke= r.VertexInputSplitsCallable: readVertexInputSplit: Loaded 8732537 vertices = at 14870.726503632277 vertices/sec 17465074 edges at 29740.356341344923 edg= es/sec Memory (free/total/max) =3D 489.84M / 3222.56M / 3555.56M

2013-08-29 10:34:28,371 INFO org.apache.giraph.utils= .ProgressableUtils: waitFor: Future result not ready yet java.util.concurre= nt.FutureTask@1a129460

2013-08-29 10:34:34,847 INFO org.apache.giraph.utils= .ProgressableUtils: waitFor: Waiting for org.apache.giraph.utils.Progressab= leUtils$FutureWaitable@30d320e4

2013-08-29 10:34:34,850 INFO org.apache.giraph.comm.= netty.handler.RequestDecoder: decode: Server window metrics MBytes/sec sent= =3D 0, MBytes/sec received =3D 0.0161, MBytesSent =3D 0.0002, MBytesReceiv= ed =3D 12.3175, ave sent req MBytes =3D 0, ave received req MBytes =3D 0.0587, secs waited =3D 765.881

2013-08-29 10:34:35,698 INFO org.apache.zookeeper.Cl= ientCnxn: Client session timed out, have not heard from server in 649805ms = for sessionid 0x140cb1140540006, closing socket connection and attempting r= econnect

2013-08-29 10:34:42,471 WARN org.apache.giraph.bsp.B= spService: process: Disconnected from ZooKeeper (will automatically try to = recover) WatchedEvent state:Disconnected type:None path:null

2013-08-29 10:34:42,472 WARN org.apache.giraph.worke= r.InputSplitsHandler: process: Problem with zookeeper, got event with path = null, state Disconnected, event type None

2013-08-29 10:34:43,819 INFO org.apache.zookeeper.Cl= ientCnxn: Opening socket connection to server slave5.ericsson-magic.net/10.= 126.72.165:22181

2013-08-29 10:34:44,077 INFO org.apache.zookeeper.Cl= ientCnxn: Socket connection established to slave5.ericsson-magic.net/10.126= .72.165:22181, initiating session

2013-08-29 10:34:44,220 WARN org.apache.giraph.bsp.B= spService: process: Got unknown null path event WatchedEvent state:Expired = type:None path:null

2013-08-29 10:34:44,220 WARN org.apache.giraph.worke= r.InputSplitsHandler: process: Problem with zookeeper, got event with path = null, state Expired, event type None

2013-08-29 10:34:44,221 INFO org.apache.zookeeper.Cl= ientCnxn: EventThread shut down

2013-08-29 10:34:44,240 INFO org.apache.zookeeper.Cl= ientCnxn: Unable to reconnect to ZooKeeper service, session 0x140cb11405400= 06 has expired, closing socket connection

2013-08-29 10:35:35,442 INFO org.apache.giraph.utils= .ProgressableUtils: waitFor: Future result not ready yet java.util.concurre= nt.FutureTask@1a129460

2013-08-29 10:35:35,443 INFO org.apache.giraph.utils= .ProgressableUtils: waitFor: Waiting for org.apache.giraph.utils.Progressab= leUtils$FutureWaitable@30d320e4

2013-08-29 10:35:42,161 INFO org.apache.giraph.comm.= netty.handler.RequestDecoder: decode: Server window metrics MBytes/sec sent= =3D 0, MBytes/sec received =3D 0.1059, MBytesSent =3D 0.0001, MBytesReceiv= ed =3D 7.1305, ave sent req MBytes =3D 0, ave received req MBytes =3D 0.0291, secs waited =3D 67.311

2013-08-29 10:35:48,659 INFO org.apache.giraph.worke= r.VertexInputSplitsCallable: readVertexInputSplit: Loaded 8982537 vertices = at 6882.0673288665985 vertices/sec 17965074 edges at 13763.906358998607 edg= es/sec Memory (free/total/max) =3D 1040.32M / 3537.00M / 3555.56M

2013-08-29 10:36:14,680 INFO org.apache.giraph.worke= r.VertexInputSplitsCallable: readVertexInputSplit: Loaded 9232537 vertices = at 6931.612280518087 vertices/sec 18465074 edges at 13862.99925688887 edges= /sec Memory (free/total/max) =3D 607.82M / 3564.69M / 3564.69M

2013-08-29 10:36:35,690 INFO org.apache.giraph.utils= .ProgressableUtils: waitFor: Future result not ready yet java.util.concurre= nt.FutureTask@1a129460

2013-08-29 10:36:35,690 INFO org.apache.giraph.utils= .ProgressableUtils: waitFor: Waiting for org.apache.giraph.utils.Progressab= leUtils$FutureWaitable@30d320e4

2013-08-29 10:36:47,220 INFO org.apache.giraph.worke= r.InputSplitsCallable: loadFromInputSplit: Finished loading /_hadoopBsp/job= _201308290837_0003/_vertexInputSplitDir/19 (v=3D1191319, e=3D2382638)<= /o:p>

2013-08-29 10:36:47,667 ERROR org.apache.giraph.util= s.LogStacktraceCallable: Execution of callable failed

java.lang.IllegalStateException: markInputSplitPathF= inished: KeeperException on /_hadoopBsp/job_201308290837_0003/_vertexInputS= plitDir/19/_vertexInputSplitFinished

        at org.ap= ache.giraph.worker.InputSplitsHandler.markInputSplitPathFinished(InputSplit= sHandler.java:168)

        at org.ap= ache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.j= ava:272)

        at org.ap= ache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)

        at org.ap= ache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)

        at org.ap= ache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)=

        at java.u= til.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)

        at java.u= til.concurrent.FutureTask.run(FutureTask.java:166)

        at java.u= til.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

        at java.u= til.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

        at java.l= ang.Thread.run(Thread.java:724)

Caused by: org.apache.zookeeper.KeeperException$Sess= ionExpiredException: KeeperErrorCode =3D Session expired for /_hadoopBsp/jo= b_201308290837_0003/_vertexInputSplitDir/19/_vertexInputSplitFinished<= /o:p>

        at org.ap= ache.zookeeper.KeeperException.create(KeeperException.java:118)<= /p>

        at org.ap= ache.zookeeper.KeeperException.create(KeeperException.java:42)

        at org.ap= ache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)

        at org.ap= ache.giraph.zk.ZooKeeperExt.createExt(ZooKeeperExt.java:152)

        at org.ap= ache.giraph.worker.InputSplitsHandler.markInputSplitPathFinished(InputSplit= sHandler.java:159)

        ... 9 mor= e

2013-08-29 10:36:50,349 ERROR org.apache.giraph.work= er.BspServiceWorker: unregisterHealth: Got failure, unregistering health on= /_hadoopBsp/job_201308290837_0003/_applicationAttemptsDir/0/_superstepDir/= -1/_workerHealthyDir/slave8.ericsson-magic.net_5 on superstep -1

2013-08-29 10:36:52,498 ERROR org.apache.giraph.grap= h.GraphTaskManager: run: Worker failure failed on another RuntimeException,= original expection will be rethrown

java.lang.IllegalStateException: unregisterHealth: K= eeperException - Couldn't delete /_hadoopBsp/job_201308290837_0003/_applica= tionAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/slave8.ericsson-magic.= net_5

        at org.ap= ache.giraph.worker.BspServiceWorker.unregisterHealth(BspServiceWorker.java:= 654)

        at org.ap= ache.giraph.worker.BspServiceWorker.failureCleanup(BspServiceWorker.java:66= 2)

        at org.ap= ache.giraph.graph.GraphTaskManager.workerFailureCleanup(GraphTaskManager.ja= va:897)

        at org.ap= ache.giraph.graph.GraphMapper.run(GraphMapper.java:100)

        at org.ap= ache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)

        at org.ap= ache.hadoop.mapred.MapTask.run(MapTask.java:364)

        at org.ap= ache.hadoop.mapred.Child$4.run(Child.java:255)

        at java.s= ecurity.AccessController.doPrivileged(Native Method)

        at javax.= security.auth.Subject.doAs(Subject.java:415)

        at org.ap= ache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:11= 90)

        at org.ap= ache.hadoop.mapred.Child.main(Child.java:249)

Caused by: org.apache.zookeeper.KeeperException$Sess= ionExpiredException: KeeperErrorCode =3D Session expired for /_hadoopBsp/jo= b_201308290837_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealt= hyDir/slave8.ericsson-magic.net_5

        at org.ap= ache.zookeeper.KeeperException.create(KeeperException.java:118)<= /p>

        at org.ap= ache.zookeeper.KeeperException.create(KeeperException.java:42)

        at org.ap= ache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)

        at org.ap= ache.giraph.zk.ZooKeeperExt.deleteExt(ZooKeeperExt.java:302)

        at org.ap= ache.giraph.worker.BspServiceWorker.unregisterHealth(BspServiceWorker.java:= 648)

        ... 10 mo= re

2013-08-29 10:36:54,571 INFO org.apache.hadoop.mapre= d.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=3D-1 a= nd reduceRetainSize=3D-1

2013-08-29 10:37:15,417 INFO org.apache.hadoop.io.na= tiveio.NativeIO: Initialized cache for UID to User mapping with a cache tim= eout of 14400 seconds.

2013-08-29 10:37:15,456 INFO org.apache.hadoop.io.na= tiveio.NativeIO: Got UserName bigdatauser for UID 1007 from the native impl= ementation

2013-08-29 10:37:16,047 WARN org.apache.hadoop.mapre= d.Child: Error running child

java.lang.IllegalStateException: run: Caught an unre= coverable exception waitFor: ExecutionException occurred while waiting for = org.apache.giraph.utils.ProgressableUtils$FutureWaitable@30d320e4

        at org.ap= ache.giraph.graph.GraphMapper.run(GraphMapper.java:101)

        at org.ap= ache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)

        at org.ap= ache.hadoop.mapred.MapTask.run(MapTask.java:364)

        at org.ap= ache.hadoop.mapred.Child$4.run(Child.java:255)

        at java.s= ecurity.AccessController.doPrivileged(Native Method)

        at javax.= security.auth.Subject.doAs(Subject.java:415)

        at org.ap= ache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:11= 90)

        at org.ap= ache.hadoop.mapred.Child.main(Child.java:249)

Caused by: java.lang.IllegalStateException: waitFor:= ExecutionException occurred while waiting for org.apache.giraph.utils.Prog= ressableUtils$FutureWaitable@30d320e4

        at org.ap= ache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:181)

        at org.ap= ache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:139)=

        at org.ap= ache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:124)=

        at org.ap= ache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:= 87)

        at org.ap= ache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUt= ils.java:221)

        at org.ap= ache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:2= 79)

        at org.ap= ache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:323)=

        at org.ap= ache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:504)

        at org.ap= ache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:246)<= /o:p>

        at org.ap= ache.giraph.graph.GraphMapper.run(GraphMapper.java:91)

        ... 7 mor= e

Caused by: java.util.concurrent.ExecutionException: = java.lang.IllegalStateException: markInputSplitPathFinished: KeeperExceptio= n on /_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19/_vertexInput= SplitFinished

        at java.u= til.concurrent.FutureTask$Sync.innerGet(FutureTask.java:262)

        at java.u= til.concurrent.FutureTask.get(FutureTask.java:119)

        at org.ap= ache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtil= s.java:300)

        at org.ap= ache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:173)

        ... 16 mo= re

Caused by: java.lang.IllegalStateException: markInpu= tSplitPathFinished: KeeperException on /_hadoopBsp/job_201308290837_0003/_v= ertexInputSplitDir/19/_vertexInputSplitFinished

        at org.ap= ache.giraph.worker.InputSplitsHandler.markInputSplitPathFinished(InputSplit= sHandler.java:168)

        at org.ap= ache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.j= ava:272)

        at org.ap= ache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)

        at org.ap= ache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)

        at org.ap= ache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)=

        at java.u= til.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)

        at java.u= til.concurrent.FutureTask.run(FutureTask.java:166)

        at java.u= til.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

        at java.u= til.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

        at java.l= ang.Thread.run(Thread.java:724)

Caused by: org.apache.zookeeper.KeeperException$Sess= ionExpiredException: KeeperErrorCode =3D Session expired for /_hadoopBsp/jo= b_201308290837_0003/_vertexInputSplitDir/19/_vertexInputSplitFinished<= /o:p>

        at org.ap= ache.zookeeper.KeeperException.create(KeeperException.java:118)<= /p>

        at org.ap= ache.zookeeper.KeeperException.create(KeeperException.java:42)

        at org.ap= ache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)

        at org.ap= ache.giraph.zk.ZooKeeperExt.createExt(ZooKeeperExt.java:152)

        at org.ap= ache.giraph.worker.InputSplitsHandler.markInputSplitPathFinished(InputSplit= sHandler.java:159)

        ... 9 mor= e

2013-08-29 10:37:17,481 INFO org.apache.hadoop.mapre= d.Task: Runnning cleanup for the task

 

 

Best,

Yasser

 

--_000_1158B25680108D4DBDD0558E33209370012D2F45eusaamb105erics_--