incubator-giraph-user mailing list archives

From Abhishek Srivastava <asr.l...@gmail.com>
Subject Re: PageRankBenchmark failing with zooKeeper.KeeperException
Date Tue, 06 Mar 2012 14:33:55 GMT
Thanks, Avery!
It runs to completion with -w 1 and about 500 vertices, but it does not go
through with -w 2 because of the lack of map slots, as you pointed out.

- Abhishek.
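
For reference, the slot arithmetic Avery describes below (one map slot per worker, plus at least one for the master) can be written out as a small check. This is only a sketch: the function names are made up for illustration, and the figure of 2 available map slots is an assumption about this single-node setup, not something stated in the thread.

```python
def required_map_slots(workers: int, masters: int = 1) -> int:
    """Map slots a job needs: one per worker plus one per master."""
    if workers < 1:
        raise ValueError("need at least one worker")
    return workers + masters


def job_fits(workers: int, available_map_slots: int) -> bool:
    """True if the cluster has enough map slots for workers + master."""
    return required_map_slots(workers) <= available_map_slots


# Assumption: the node offers 2 map slots (hypothetical value).
AVAILABLE_MAP_SLOTS = 2

for w in (1, 2, 30):
    print(f"-w {w}: needs {required_map_slots(w)} slots, "
          f"fits={job_fits(w, AVAILABLE_MAP_SLOTS)}")
```

Under that assumption, -w 1 fits and -w 2 does not, which matches what I saw.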

On Monday 05 March 2012 11:53 PM, Avery Ching wrote:
> Hi Abhishek,
>
> Nice to meet you.  Can you try it with fewer workers?  For instance,
> -w 1 or -w 2?  I think the likely issue is that you need to have as many
> map slots as the number of workers + at least one master.  If you don't
> have enough slots, the job will fail.  Also, you might want to dial
> down the number of vertices a bit, unless you have oodles of memory.
> Please let us know if that helps.
>
> Avery
>
> On 3/5/12 9:03 PM, Abhishek Srivastava wrote:
>> Hi All,
>>
>> I have been trying (quite unsuccessfully for a while now) to run the
>> PageRankBenchmark to play around with Giraph. I got Hadoop running in
>> a single-node setup, and Hadoop jobs and jars run just fine. When I try
>> to run the PageRankBenchmark, I get this incomprehensible error, which
>> I'm not able to diagnose.
>>
>>
>>
>> -----------------------------------CUT HERE---------------------------------------------
>> abhi@darkstar:trunk $ hadoop jar target/giraph-0.70-jar-with-dependencies.jar org.apache.giraph.benchmark.PageRankBenchmark -e 1 -s 3 -v -V 50000000 -w 30
>> Warning: $HADOOP_HOME is deprecated.
>>
>> Using org.apache.giraph.benchmark.PageRankBenchmark$PageRankVertex
>> 12/03/04 03:44:08 WARN bsp.BspOutputFormat: checkOutputSpecs: 
>> ImmutableOutputCommiter will not check anything
>> 12/03/04 03:44:09 INFO mapred.JobClient: Running job: 
>> job_201203031851_0004
>> 12/03/04 03:44:10 INFO mapred.JobClient:  map 0% reduce 0%
>> 12/03/04 03:44:26 INFO mapred.JobClient:  map 3% reduce 0%
>> 12/03/04 10:43:52 INFO mapred.JobClient:  map 0% reduce 0%
>> 12/03/04 10:43:57 INFO mapred.JobClient: Task Id : 
>> attempt_201203031851_0004_m_000000_0, Status : FAILED
>> Task attempt_201203031851_0004_m_000000_0 failed to report status for 
>> 24979 seconds. Killing!
>> 12/03/04 10:44:00 INFO mapred.JobClient: Task Id : 
>> attempt_201203031851_0004_m_000001_0, Status : FAILED
>> Task attempt_201203031851_0004_m_000001_0 failed to report status for 
>> 25159 seconds. Killing!
>> 12/03/04 10:44:07 INFO mapred.JobClient:  map 3% reduce 0%
>> 12/03/04 10:49:07 INFO mapred.JobClient:  map 0% reduce 0%
>> 12/03/04 10:49:12 INFO mapred.JobClient: Task Id : 
>> attempt_201203031851_0004_m_000000_1, Status : FAILED
>> java.lang.Throwable: Child Error
>>     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
>> Caused by: java.io.IOException: Task process exit with nonzero status 
>> of 1.
>>     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
>>
>> 12/03/04 10:49:22 INFO mapred.JobClient:  map 3% reduce 0%
>> 12/03/04 10:54:23 INFO mapred.JobClient:  map 0% reduce 0%
>> 12/03/04 10:54:28 INFO mapred.JobClient: Task Id : 
>> attempt_201203031851_0004_m_000000_2, Status : FAILED
>> java.lang.Throwable: Child Error
>>     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
>> Caused by: java.io.IOException: Task process exit with nonzero status 
>> of 1.
>>     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
>>
>> 12/03/04 10:54:38 INFO mapred.JobClient:  map 3% reduce 0%
>> 12/03/04 10:59:10 INFO mapred.JobClient: Task Id : 
>> attempt_201203031851_0004_m_000001_1, Status : FAILED
>> java.lang.IllegalStateException: unregisterHealth: KeeperException - 
>> Couldn't delete 
>> /_hadoopBsp/job_201203031851_0004/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/darkstar_1
>>     at 
>> org.apache.giraph.graph.BspServiceWorker.unregisterHealth(BspServiceWorker.java:727)
>>     at 
>> org.apache.giraph.graph.BspServiceWorker.failureCleanup(BspServiceWorker.java:735)
>>     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:648)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>     at 
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
>> Caused by: 
>> org.apache.zookeeper.KeeperException$ConnectionLossException: 
>> KeeperErrorCode = ConnectionLoss for 
>> /_hadoopBsp/job_201203031851_0004/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/darkstar_1
>>     at 
>> org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>>     at 
>> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>     at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)
>>     at 
>> org.apache.giraph.graph.BspServiceWorker.unregisterHealth(BspServiceWorker.java:721)
>>     ... 9 more
>>
>> Task attempt_201203031851_0004_m_000001_1 failed to report status for 
>> 601 seconds. Killing!
>> attempt_201203031851_0004_m_000001_1: log4j:WARN No appenders could 
>> be found for logger (org.apache.zookeeper.ClientCnxn).
>> attempt_201203031851_0004_m_000001_1: log4j:WARN Please initialize 
>> the log4j system properly.
>> 12/03/04 10:59:47 INFO mapred.JobClient:  map 0% reduce 0%
>> 12/03/04 10:59:58 INFO mapred.JobClient: Job complete: 
>> job_201203031851_0004
>> 12/03/04 10:59:58 INFO mapred.JobClient: Counters: 6
>> 12/03/04 10:59:58 INFO mapred.JobClient:   Job Counters
>> 12/03/04 10:59:58 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=977551
>> 12/03/04 10:59:58 INFO mapred.JobClient:     Total time spent by all 
>> reduces waiting after reserving slots (ms)=0
>> 12/03/04 10:59:58 INFO mapred.JobClient:     Total time spent by all 
>> maps waiting after reserving slots (ms)=0
>> 12/03/04 10:59:58 INFO mapred.JobClient:     Launched map tasks=7
>> 12/03/04 10:59:58 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
>> 12/03/04 10:59:58 INFO mapred.JobClient:     Failed map tasks=1
>> -----------------------------------CUT HERE---------------------------------------------
>>
>>
>> Thanks,
>> Abhishek.
>

