giraph-user mailing list archives

From: Ahmet Emre Aladağ <>
Subject: Only one worker is running
Date: Thu, 12 Sep 2013 14:03:36 GMT

I have a custom PageRank computation that reads its input from HBase and 
writes the results back to it.
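
For context, the computation follows the usual PageRank recurrence; here is a 
simplified sketch (not my exact code) in the Giraph 1.0 Vertex API. The 
HBase reading/writing side is handled by my own input/output formats and is 
not shown here.

    import org.apache.giraph.graph.Vertex;
    import org.apache.hadoop.io.DoubleWritable;
    import org.apache.hadoop.io.FloatWritable;
    import org.apache.hadoop.io.LongWritable;

    // Simplified sketch (not my exact code): the standard PageRank
    // recurrence with damping factor 0.85, run for a fixed number of
    // supersteps.
    public class PageRankVertex extends
        Vertex<LongWritable, DoubleWritable, FloatWritable, DoubleWritable> {
        private static final int MAX_SUPERSTEPS = 30;

        @Override
        public void compute(Iterable<DoubleWritable> messages) {
            if (getSuperstep() >= 1) {
                // Sum the rank contributions received from in-neighbors.
                double sum = 0;
                for (DoubleWritable message : messages) {
                    sum += message.get();
                }
                setValue(new DoubleWritable(
                    0.15 / getTotalNumVertices() + 0.85 * sum));
            }
            if (getSuperstep() < MAX_SUPERSTEPS) {
                // Spread this vertex's rank evenly over its out-edges.
                sendMessageToAllEdges(
                    new DoubleWritable(getValue().get() / getNumEdges()));
            } else {
                voteToHalt();
            }
        }
    }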

I submit the job on a real distributed Hadoop cluster that can allocate 
320 map tasks, and I started the job with 100 workers. What I see is that 
only one of the workers is actually reading the input, and it runs out of memory:

readVertexInputSplit: Loaded 2000000 vertices at 23706.551705289407 
vertices/sec 10945068 edges at 129730.61549455763 edges/sec Memory 
(free/total/max) = 124.31M / 910.25M / 910.25M

Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead
limit exceeded
	at java.util.concurrent.FutureTask$Sync.innerGet(
	at java.util.concurrent.FutureTask.get(
	at org.apache.giraph.utils.ProgressableUtils$FutureWaitable.waitFor(
	at org.apache.giraph.utils.ProgressableUtils.waitFor(
	... 16 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
	at java.util.Arrays.copyOfRange(
	at java.lang.String.<init>(
	at java.lang.String.substring(

and the rest of the workers say:

startSuperstep: WORKER_ONLY - Attempt=0, Superstep=-1

The master says:
MASTER_ONLY - 99 finished out of 100 on superstep -1

What configuration should I use to solve this problem?
Currently I use:
         giraphConf.setWorkerConfiguration(1, 100, 85.0f);
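
For reference, my understanding is that the three arguments are 
(minWorkers, maxWorkers, minPercentResponded), so the line above allows the 
job to proceed with as few as one worker. Would requiring all workers up 
front, as in the hypothetical variant below (unverified), make any 
difference here?

    import org.apache.giraph.conf.GiraphConfiguration;

    // Hypothetical variant (unverified): require all 100 workers and a
    // 100% response rate before proceeding, instead of allowing the job
    // to continue with a single worker.
    public class StrictWorkerConfiguration {
        public static void main(String[] args) {
            GiraphConfiguration giraphConf = new GiraphConfiguration();
            giraphConf.setWorkerConfiguration(100, 100, 100.0f);
        }
    }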

