incubator-giraph-user mailing list archives

From Avery Ching <ach...@apache.org>
Subject Re: PageRankBenchmark failing with zooKeeper.KeeperException
Date Tue, 06 Mar 2012 07:53:32 GMT
Hi Abhishek,

Nice to meet you.  Can you try it with fewer workers?  For instance, -w 1 
or -w 2?  I think the likely issue is that you need to have as many map 
slots as the number of workers, plus at least one for the master.  If you 
don't have enough slots, the job will fail.  Also, you might want to dial 
down the number of vertices (-V) a bit, unless you have oodles of memory.  
Please let us know if that helps.
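
The slot arithmetic above can be sketched roughly as follows. This is only an illustration of the "workers + at least one master" rule of thumb, not anything taken from Giraph itself: the function names are made up, and the available slot count of 2 is an assumed value for a single-node setup that you should replace with whatever your own cluster is configured for.

```python
def required_map_slots(workers: int, masters: int = 1) -> int:
    """Map slots a job needs: one per worker plus one per master."""
    if workers < 1 or masters < 1:
        raise ValueError("need at least one worker and one master")
    return workers + masters


def max_workers(available_map_slots: int, masters: int = 1) -> int:
    """Largest -w value that still leaves room for the master(s)."""
    return max(available_map_slots - masters, 0)


# Assumed slot count for a single-node setup (hypothetical); check your
# own cluster configuration rather than relying on this number.
AVAILABLE_MAP_SLOTS = 2

for w in (30, 2, 1):
    needed = required_map_slots(w)
    verdict = "fits" if needed <= AVAILABLE_MAP_SLOTS else "too many"
    print(f"-w {w}: needs {needed} map slots, have {AVAILABLE_MAP_SLOTS}: {verdict}")
```

Under that assumption, -w 30 asks for 31 map slots, far more than a single node is likely to offer, which is why starting from -w 1 is the safer first test.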

Avery

On 3/5/12 9:03 PM, Abhishek Srivastava wrote:
> Hi All,
>
> I have been trying (quite unsuccessfully for a while now) to run the 
> PageRankBenchmark
> to play around with Giraph. I got hadoop running in a single node 
> setup and hadoop
> jobs and jars run just fine. When I try to run the PageRankBenchmark, 
> I get this
> incomprehensible error which I'm not able to diagnose.
>
>
>
> -----------------------------------CUT 
> HERE---------------------------------------------
> abhi@darkstar:trunk $ hadoop jar 
> target/giraph-0.70-jar-with-dependencies.jar 
> org.apache.giraph.benchmark.PageRankBenchmark -e 1 -s 3 -v -V 50000000 
> -w 30
> Warning: $HADOOP_HOME is deprecated.
>
> Using org.apache.giraph.benchmark.PageRankBenchmark$PageRankVertex
> 12/03/04 03:44:08 WARN bsp.BspOutputFormat: checkOutputSpecs: 
> ImmutableOutputCommiter will not check anything
> 12/03/04 03:44:09 INFO mapred.JobClient: Running job: 
> job_201203031851_0004
> 12/03/04 03:44:10 INFO mapred.JobClient:  map 0% reduce 0%
> 12/03/04 03:44:26 INFO mapred.JobClient:  map 3% reduce 0%
> 12/03/04 10:43:52 INFO mapred.JobClient:  map 0% reduce 0%
> 12/03/04 10:43:57 INFO mapred.JobClient: Task Id : 
> attempt_201203031851_0004_m_000000_0, Status : FAILED
> Task attempt_201203031851_0004_m_000000_0 failed to report status for 
> 24979 seconds. Killing!
> 12/03/04 10:44:00 INFO mapred.JobClient: Task Id : 
> attempt_201203031851_0004_m_000001_0, Status : FAILED
> Task attempt_201203031851_0004_m_000001_0 failed to report status for 
> 25159 seconds. Killing!
> 12/03/04 10:44:07 INFO mapred.JobClient:  map 3% reduce 0%
> 12/03/04 10:49:07 INFO mapred.JobClient:  map 0% reduce 0%
> 12/03/04 10:49:12 INFO mapred.JobClient: Task Id : 
> attempt_201203031851_0004_m_000000_1, Status : FAILED
> java.lang.Throwable: Child Error
>     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
> Caused by: java.io.IOException: Task process exit with nonzero status 
> of 1.
>     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
>
> 12/03/04 10:49:22 INFO mapred.JobClient:  map 3% reduce 0%
> 12/03/04 10:54:23 INFO mapred.JobClient:  map 0% reduce 0%
> 12/03/04 10:54:28 INFO mapred.JobClient: Task Id : 
> attempt_201203031851_0004_m_000000_2, Status : FAILED
> java.lang.Throwable: Child Error
>     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
> Caused by: java.io.IOException: Task process exit with nonzero status 
> of 1.
>     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
>
> 12/03/04 10:54:38 INFO mapred.JobClient:  map 3% reduce 0%
> 12/03/04 10:59:10 INFO mapred.JobClient: Task Id : 
> attempt_201203031851_0004_m_000001_1, Status : FAILED
> java.lang.IllegalStateException: unregisterHealth: KeeperException - 
> Couldn't delete 
> /_hadoopBsp/job_201203031851_0004/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/darkstar_1
>     at 
> org.apache.giraph.graph.BspServiceWorker.unregisterHealth(BspServiceWorker.java:727)
>     at 
> org.apache.giraph.graph.BspServiceWorker.failureCleanup(BspServiceWorker.java:735)
>     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:648)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: 
> org.apache.zookeeper.KeeperException$ConnectionLossException: 
> KeeperErrorCode = ConnectionLoss for 
> /_hadoopBsp/job_201203031851_0004/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/darkstar_1
>     at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>     at 
> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>     at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)
>     at 
> org.apache.giraph.graph.BspServiceWorker.unregisterHealth(BspServiceWorker.java:721)
>     ... 9 more
>
> Task attempt_201203031851_0004_m_000001_1 failed to report status for 
> 601 seconds. Killing!
> attempt_201203031851_0004_m_000001_1: log4j:WARN No appenders could be 
> found for logger (org.apache.zookeeper.ClientCnxn).
> attempt_201203031851_0004_m_000001_1: log4j:WARN Please initialize the 
> log4j system properly.
> 12/03/04 10:59:47 INFO mapred.JobClient:  map 0% reduce 0%
> 12/03/04 10:59:58 INFO mapred.JobClient: Job complete: 
> job_201203031851_0004
> 12/03/04 10:59:58 INFO mapred.JobClient: Counters: 6
> 12/03/04 10:59:58 INFO mapred.JobClient:   Job Counters
> 12/03/04 10:59:58 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=977551
> 12/03/04 10:59:58 INFO mapred.JobClient:     Total time spent by all 
> reduces waiting after reserving slots (ms)=0
> 12/03/04 10:59:58 INFO mapred.JobClient:     Total time spent by all 
> maps waiting after reserving slots (ms)=0
> 12/03/04 10:59:58 INFO mapred.JobClient:     Launched map tasks=7
> 12/03/04 10:59:58 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> 12/03/04 10:59:58 INFO mapred.JobClient:     Failed map tasks=1
> -----------------------------------CUT 
> HERE---------------------------------------------
>
>
> Thanks,
> Abhishek.

