giraph-user mailing list archives

From Avery Ching <ach...@apache.org>
Subject Re: Exception with Large Graphs
Date Fri, 30 Aug 2013 23:43:06 GMT
That error is from the master dying (likely as a result of another 
worker dying).  Can you do a rough calculation of the size of the data 
you expect to be loaded and check whether there is enough memory?
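
A rough calculation of this kind could be sketched as below. This is only a back-of-envelope estimate: the class name, the per-vertex and per-edge byte costs, and the usable heap fraction are all assumptions for illustration, not figures measured from Giraph. The real cost depends on the vertex, edge, and message types the job uses. The heap size is taken from the "max" value reported in the worker log (about 3555 MB).

```java
// Back-of-envelope sketch. The byte costs and usable fraction below are
// illustrative assumptions, not measured Giraph figures.
class GiraphMemoryEstimate {

    /** Bytes each worker must hold if the graph is spread evenly (rounded up). */
    static long estimateBytesPerWorker(long vertices, long edges, int workers,
                                       long bytesPerVertex, long bytesPerEdge) {
        long totalBytes = vertices * bytesPerVertex + edges * bytesPerEdge;
        return (totalBytes + workers - 1) / workers; // ceiling division
    }

    /** True if the estimate fits within the given fraction of the maximum heap. */
    static boolean fits(long neededBytes, long maxHeapBytes, double usableFraction) {
        return neededBytes <= maxHeapBytes * usableFraction;
    }

    public static void main(String[] args) {
        long vertices = 128_000_000L;
        long edges = 256_000_000L;
        long maxHeapBytes = 3_555L * 1024 * 1024; // approximate max heap from the worker log
        long bytesPerVertex = 100L;   // assumption: object overhead + id + value
        long bytesPerEdge = 50L;      // assumption: target id + edge value + overhead
        double usableFraction = 0.5;  // assumption: leave room for messages and buffers

        for (int workers : new int[] {1, 5, 10}) {
            long needed = estimateBytesPerWorker(
                    vertices, edges, workers, bytesPerVertex, bytesPerEdge);
            System.out.printf("workers=%d needs ~%d MB per worker, fits=%b%n",
                    workers, needed / (1024 * 1024),
                    fits(needed, maxHeapBytes, usableFraction));
        }
    }
}
```

To replace the assumed byte costs with something closer to reality, one option is to load a small sample of the graph, note the heap in use once loading finishes, and divide by the number of vertices and edges loaded.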

On 8/30/13 11:19 AM, Yasser Altowim wrote:
>
> Guys,
>
>        Can someone please help me with this issue? Thanks.
>
> Best,
>
> Yasser
>
> *From:* Yasser Altowim
> *Sent:* Thursday, August 29, 2013 11:16 AM
> *To:* user@giraph.apache.org
> *Subject:* Exception with Large Graphs
>
> Hi,
>
>          I am implementing an algorithm using Giraph, and I was able 
> to run it on relatively small datasets (64,000,000 vertices 
> and 128,000,000 edges). However, when I increase the size of the 
> dataset to 128,000,000 vertices and 256,000,000 edges, the job takes 
> a very long time to load the vertices and then fails with the 
> following exception.
>
>         I have tried increasing the heap size and the task timeout 
> value in the mapred-site.xml configuration file, and varying the 
> number of workers from 1 to 10, but I still get the same exception. 
> I have a cluster of 10 nodes, and each node has 4 GB of RAM.  Thanks 
> in advance.
>
> 2013-08-29 10:22:53,150 INFO 
> org.apache.giraph.utils.ProgressableUtils: waitFor: Future result not 
> ready yet java.util.concurrent.FutureTask@1a129460 
>
> 2013-08-29 10:22:53,151 INFO 
> org.apache.giraph.utils.ProgressableUtils: waitFor: Waiting for 
> org.apache.giraph.utils.ProgressableUtils$FutureWaitable@30d320e4 
>
> 2013-08-29 10:23:07,938 INFO 
> org.apache.giraph.worker.VertexInputSplitsCallable: 
> readVertexInputSplit: Loaded 7769685 vertices at 14250.953615591572 
> vertices/sec 15539370 edges at 28500.77593053654 edges/sec Memory 
> (free/total/max) = 680.21M / 3207.44M / 3555.56M
>
> 2013-08-29 10:23:14,538 INFO 
> org.apache.giraph.worker.VertexInputSplitsCallable: 
> readVertexInputSplit: Loaded 8019685 vertices at 14533.557468366102 
> vertices/sec 16039370 edges at 29065.97491865343 edges/sec Memory 
> (free/total/max) = 906.80M / 3242.75M / 3555.56M
>
> 2013-08-29 10:23:21,888 INFO 
> org.apache.giraph.worker.InputSplitsCallable: loadFromInputSplit: 
> Finished loading 
> /_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/9 (v=1212852, 
> e=2425704)
>
> 2013-08-29 10:23:37,911 INFO 
> org.apache.giraph.worker.InputSplitsHandler: reserveInputSplit: 
> Reserved input split path 
> /_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19, overall 
> roughly 7.518797% input splits reserved
>
> 2013-08-29 10:23:37,923 INFO 
> org.apache.giraph.worker.InputSplitsCallable: getInputSplit: Reserved 
> /_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19 from 
> ZooKeeper and got input split 
> 'org.apache.giraph.io.formats.multi.InputSplitWithInputFormatIndex@24004559'
>
> 2013-08-29 10:23:44,313 INFO 
> org.apache.giraph.worker.VertexInputSplitsCallable: 
> readVertexInputSplit: Loaded 8482537 vertices at 14585.340134636266 
> vertices/sec 16965074 edges at 29169.59449002283 edges/sec Memory 
> (free/total/max) = 538.93M / 3186.13M / 3555.56M
>
> 2013-08-29 10:23:49,963 INFO 
> org.apache.giraph.worker.VertexInputSplitsCallable: 
> readVertexInputSplit: Loaded 8732537 vertices at 14870.726503632277 
> vertices/sec 17465074 edges at 29740.356341344923 edges/sec Memory 
> (free/total/max) = 489.84M / 3222.56M / 3555.56M
>
> 2013-08-29 10:34:28,371 INFO 
> org.apache.giraph.utils.ProgressableUtils: waitFor: Future result not 
> ready yet java.util.concurrent.FutureTask@1a129460 
>
> 2013-08-29 10:34:34,847 INFO 
> org.apache.giraph.utils.ProgressableUtils: waitFor: Waiting for 
> org.apache.giraph.utils.ProgressableUtils$FutureWaitable@30d320e4 
>
> 2013-08-29 10:34:34,850 INFO 
> org.apache.giraph.comm.netty.handler.RequestDecoder: decode: Server 
> window metrics MBytes/sec sent = 0, MBytes/sec received = 0.0161, 
> MBytesSent = 0.0002, MBytesReceived = 12.3175, ave sent req MBytes = 
> 0, ave received req MBytes = 0.0587, secs waited = 765.881
>
> 2013-08-29 10:34:35,698 INFO org.apache.zookeeper.ClientCnxn: Client 
> session timed out, have not heard from server in 649805ms for 
> sessionid 0x140cb1140540006, closing socket connection and attempting 
> reconnect
>
> 2013-08-29 10:34:42,471 WARN org.apache.giraph.bsp.BspService: 
> process: Disconnected from ZooKeeper (will automatically try to 
> recover) WatchedEvent state:Disconnected type:None path:null
>
> 2013-08-29 10:34:42,472 WARN 
> org.apache.giraph.worker.InputSplitsHandler: process: Problem with 
> zookeeper, got event with path null, state Disconnected, event type None
>
> 2013-08-29 10:34:43,819 INFO org.apache.zookeeper.ClientCnxn: Opening 
> socket connection to server slave5.ericsson-magic.net/10.126.72.165:22181
>
> 2013-08-29 10:34:44,077 INFO org.apache.zookeeper.ClientCnxn: Socket 
> connection established to 
> slave5.ericsson-magic.net/10.126.72.165:22181, initiating session
>
> 2013-08-29 10:34:44,220 WARN org.apache.giraph.bsp.BspService: 
> process: Got unknown null path event WatchedEvent state:Expired 
> type:None path:null
>
> 2013-08-29 10:34:44,220 WARN 
> org.apache.giraph.worker.InputSplitsHandler: process: Problem with 
> zookeeper, got event with path null, state Expired, event type None
>
> 2013-08-29 10:34:44,221 INFO org.apache.zookeeper.ClientCnxn: 
> EventThread shut down
>
> 2013-08-29 10:34:44,240 INFO org.apache.zookeeper.ClientCnxn: Unable 
> to reconnect to ZooKeeper service, session 0x140cb1140540006 has 
> expired, closing socket connection
>
> 2013-08-29 10:35:35,442 INFO 
> org.apache.giraph.utils.ProgressableUtils: waitFor: Future result not 
> ready yet java.util.concurrent.FutureTask@1a129460 
>
> 2013-08-29 10:35:35,443 INFO 
> org.apache.giraph.utils.ProgressableUtils: waitFor: Waiting for 
> org.apache.giraph.utils.ProgressableUtils$FutureWaitable@30d320e4 
>
> 2013-08-29 10:35:42,161 INFO 
> org.apache.giraph.comm.netty.handler.RequestDecoder: decode: Server 
> window metrics MBytes/sec sent = 0, MBytes/sec received = 0.1059, 
> MBytesSent = 0.0001, MBytesReceived = 7.1305, ave sent req MBytes = 0, 
> ave received req MBytes = 0.0291, secs waited = 67.311
>
> 2013-08-29 10:35:48,659 INFO 
> org.apache.giraph.worker.VertexInputSplitsCallable: 
> readVertexInputSplit: Loaded 8982537 vertices at 6882.0673288665985 
> vertices/sec 17965074 edges at 13763.906358998607 edges/sec Memory 
> (free/total/max) = 1040.32M / 3537.00M / 3555.56M
>
> 2013-08-29 10:36:14,680 INFO 
> org.apache.giraph.worker.VertexInputSplitsCallable: 
> readVertexInputSplit: Loaded 9232537 vertices at 6931.612280518087 
> vertices/sec 18465074 edges at 13862.99925688887 edges/sec Memory 
> (free/total/max) = 607.82M / 3564.69M / 3564.69M
>
> 2013-08-29 10:36:35,690 INFO 
> org.apache.giraph.utils.ProgressableUtils: waitFor: Future result not 
> ready yet java.util.concurrent.FutureTask@1a129460 
>
> 2013-08-29 10:36:35,690 INFO 
> org.apache.giraph.utils.ProgressableUtils: waitFor: Waiting for 
> org.apache.giraph.utils.ProgressableUtils$FutureWaitable@30d320e4 
>
> 2013-08-29 10:36:47,220 INFO 
> org.apache.giraph.worker.InputSplitsCallable: loadFromInputSplit: 
> Finished loading 
> /_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19 (v=1191319, 
> e=2382638)
>
> 2013-08-29 10:36:47,667 ERROR 
> org.apache.giraph.utils.LogStacktraceCallable: Execution of callable 
> failed
>
> java.lang.IllegalStateException: markInputSplitPathFinished: 
> KeeperException on 
> /_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19/_vertexInputSplitFinished
>
>         at 
> org.apache.giraph.worker.InputSplitsHandler.markInputSplitPathFinished(InputSplitsHandler.java:168)
>
>         at 
> org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:272)
>
>         at 
> org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
>
>         at 
> org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
>
>         at 
> org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
>
>         at 
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>
>         at java.lang.Thread.run(Thread.java:724)
>
> Caused by: 
> org.apache.zookeeper.KeeperException$SessionExpiredException: 
> KeeperErrorCode = Session expired for 
> /_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19/_vertexInputSplitFinished
>
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
>
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>
>         at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
>
>         at 
> org.apache.giraph.zk.ZooKeeperExt.createExt(ZooKeeperExt.java:152)
>
>         at 
> org.apache.giraph.worker.InputSplitsHandler.markInputSplitPathFinished(InputSplitsHandler.java:159)
>
>         ... 9 more
>
> 2013-08-29 10:36:50,349 ERROR 
> org.apache.giraph.worker.BspServiceWorker: unregisterHealth: Got 
> failure, unregistering health on 
> /_hadoopBsp/job_201308290837_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/slave8.ericsson-magic.net_5 
> on superstep -1
>
> 2013-08-29 10:36:52,498 ERROR 
> org.apache.giraph.graph.GraphTaskManager: run: Worker failure failed 
> on another RuntimeException, original expection will be rethrown
>
> java.lang.IllegalStateException: unregisterHealth: KeeperException - 
> Couldn't delete 
> /_hadoopBsp/job_201308290837_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/slave8.ericsson-magic.net_5
>
>         at 
> org.apache.giraph.worker.BspServiceWorker.unregisterHealth(BspServiceWorker.java:654)
>
>         at 
> org.apache.giraph.worker.BspServiceWorker.failureCleanup(BspServiceWorker.java:662)
>
>         at 
> org.apache.giraph.graph.GraphTaskManager.workerFailureCleanup(GraphTaskManager.java:897)
>
>         at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:100)
>
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
>
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
> Caused by: 
> org.apache.zookeeper.KeeperException$SessionExpiredException: 
> KeeperErrorCode = Session expired for 
> /_hadoopBsp/job_201308290837_0003/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/slave8.ericsson-magic.net_5
>
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
>
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>
>         at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)
>
>         at 
> org.apache.giraph.zk.ZooKeeperExt.deleteExt(ZooKeeperExt.java:302)
>
>         at 
> org.apache.giraph.worker.BspServiceWorker.unregisterHealth(BspServiceWorker.java:648)
>
>         ... 10 more
>
> 2013-08-29 10:36:54,571 INFO 
> org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' 
> truncater with mapRetainSize=-1 and reduceRetainSize=-1
>
> 2013-08-29 10:37:15,417 INFO org.apache.hadoop.io.nativeio.NativeIO: 
> Initialized cache for UID to User mapping with a cache timeout of 
> 14400 seconds.
>
> 2013-08-29 10:37:15,456 INFO org.apache.hadoop.io.nativeio.NativeIO: 
> Got UserName bigdatauser for UID 1007 from the native implementation
>
> 2013-08-29 10:37:16,047 WARN org.apache.hadoop.mapred.Child: Error 
> running child
>
> java.lang.IllegalStateException: run: Caught an unrecoverable 
> exception waitFor: ExecutionException occurred while waiting for 
> org.apache.giraph.utils.ProgressableUtils$FutureWaitable@30d320e4 
>
>         at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:101)
>
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
>
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>
>         at java.security.AccessController.doPrivileged(Native Method)
>
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
> Caused by: java.lang.IllegalStateException: waitFor: 
> ExecutionException occurred while waiting for 
> org.apache.giraph.utils.ProgressableUtils$FutureWaitable@30d320e4 
>
>         at 
> org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:181)
>
>         at 
> org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:139)
>
>         at 
> org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:124)
>
>         at 
> org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:87)
>
>         at 
> org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:221)
>
>         at 
> org.apache.giraph.worker.BspServiceWorker.loadInputSplits(BspServiceWorker.java:279)
>
>         at 
> org.apache.giraph.worker.BspServiceWorker.loadVertices(BspServiceWorker.java:323)
>
>         at 
> org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:504)
>
>         at 
> org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:246)
>
>         at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:91)
>
>         ... 7 more
>
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.IllegalStateException: markInputSplitPathFinished: 
> KeeperException on 
> /_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19/_vertexInputSplitFinished
>
>         at 
> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:262)
>
>         at java.util.concurrent.FutureTask.get(FutureTask.java:119)
>
>         at 
> org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(ProgressableUtils.java:300)
>
>         at 
> org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:173)
>
>         ... 16 more
>
> Caused by: java.lang.IllegalStateException: 
> markInputSplitPathFinished: KeeperException on 
> /_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19/_vertexInputSplitFinished
>
>         at 
> org.apache.giraph.worker.InputSplitsHandler.markInputSplitPathFinished(InputSplitsHandler.java:168)
>
>         at 
> org.apache.giraph.worker.InputSplitsCallable.loadInputSplit(InputSplitsCallable.java:272)
>
>         at 
> org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:211)
>
>         at 
> org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
>
>         at 
> org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
>
>         at 
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>
>         at java.lang.Thread.run(Thread.java:724)
>
> Caused by: 
> org.apache.zookeeper.KeeperException$SessionExpiredException: 
> KeeperErrorCode = Session expired for 
> /_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19/_vertexInputSplitFinished
>
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
>
>         at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>
>         at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
>
>         at 
> org.apache.giraph.zk.ZooKeeperExt.createExt(ZooKeeperExt.java:152)
>
>         at 
> org.apache.giraph.worker.InputSplitsHandler.markInputSplitPathFinished(InputSplitsHandler.java:159)
>
>         ... 9 more
>
> 2013-08-29 10:37:17,481 INFO org.apache.hadoop.mapred.Task: Runnning 
> cleanup for the task
>
> Best,
>
> Yasser
>

