giraph-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: PageRankBenchmark failing with zooKeeper.KeeperException
Date Wed, 07 Mar 2012 05:57:44 GMT
Two workers need three map tasks (two workers plus one master), which
is more than the two map slots a TaskTracker offers by default, so the
job cannot run with the default configuration.

Set "mapred.tasktracker.map.tasks.maximum" to 3 in your
mapred-site.xml and restart the TaskTracker; you should then be able
to run with two workers.
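
For reference, a minimal sketch of the entry described above, assuming a
Hadoop 1.x style mapred-site.xml. It belongs inside the file's existing
<configuration> element, and the value 3 covers two workers plus one
master on a single node:

```xml
<!-- mapred-site.xml fragment: place inside the existing <configuration> element -->
<property>
  <!-- Maximum number of map tasks one TaskTracker runs at the same time -->
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <!-- 2 Giraph workers + 1 master = 3 map slots -->
  <value>3</value>
</property>
```

The TaskTracker reads this value only at startup, which is why the
restart is needed before the extra slot becomes available.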

On Wed, Mar 7, 2012 at 10:31 AM, Abhishek Srivastava <asr.lkml@gmail.com> wrote:
> Yes I'm running it in pseudo-distributed mode on a single physical host.
> Should it be able to handle 2 workers ?
>
>
> On Tuesday 06 March 2012 08:36 AM, Claudio Martella wrote:
>>
>> Are you running on a real cluster or a pseudo-distributed node?
>> Because 2 map slots is the default config there.
>>
>> On Tue, Mar 6, 2012 at 3:33 PM, Abhishek Srivastava<asr.lkml@gmail.com>
>>  wrote:
>>>
>>> Thanks Avery !
>>> So it runs to completion with -w 1 and about 500 vertices;  does not go
>>> through with -w 2 due to
>>> the lack of map slots as you pointed out.
>>>
>>> - Abhishek.
>>>
>>>
>>> On Monday 05 March 2012 11:53 PM, Avery Ching wrote:
>>>>
>>>> Hi Abhishek,
>>>>
>>>> Nice to meet you.  Can you try it with fewer workers?  For instance -w 1
>>>> or -w 2?  I think the likely issue is that you need to have as many map
>>>> slots as the number of workers, plus at least one for the master.  If you
>>>> don't have enough slots, the job will fail.  Also, you might want to dial
>>>> down the number of vertices a bit, unless you have oodles of memory.
>>>> Please let us know if that helps.
>>>>
>>>> Avery
>>>>
>>>> On 3/5/12 9:03 PM, Abhishek Srivastava wrote:
>>>>>
>>>>> Hi All,
>>>>>
>>>>> I have been trying (quite unsuccessfully for a while now) to run the
>>>>> PageRankBenchmark to play around with Giraph. I got hadoop running in a
>>>>> single node setup and hadoop jobs and jars run just fine. When I try to
>>>>> run the PageRankBenchmark, I get this incomprehensible error which I'm
>>>>> not able to diagnose.
>>>>>
>>>>>
>>>>>
>>>>> -----------------------------------CUT
>>>>> HERE---------------------------------------------
>>>>> abhi@darkstar:trunk $ hadoop jar
>>>>> target/giraph-0.70-jar-with-dependencies.jar
>>>>> org.apache.giraph.benchmark.PageRankBenchmark -e 1 -s 3 -v -V 50000000
>>>>> -w 30
>>>>> Warning: $HADOOP_HOME is deprecated.
>>>>>
>>>>> Using org.apache.giraph.benchmark.PageRankBenchmark$PageRankVertex
>>>>> 12/03/04 03:44:08 WARN bsp.BspOutputFormat: checkOutputSpecs:
>>>>> ImmutableOutputCommiter will not check anything
>>>>> 12/03/04 03:44:09 INFO mapred.JobClient: Running job:
>>>>> job_201203031851_0004
>>>>> 12/03/04 03:44:10 INFO mapred.JobClient:  map 0% reduce 0%
>>>>> 12/03/04 03:44:26 INFO mapred.JobClient:  map 3% reduce 0%
>>>>> 12/03/04 10:43:52 INFO mapred.JobClient:  map 0% reduce 0%
>>>>> 12/03/04 10:43:57 INFO mapred.JobClient: Task Id :
>>>>> attempt_201203031851_0004_m_000000_0, Status : FAILED
>>>>> Task attempt_201203031851_0004_m_000000_0 failed to report status for
>>>>> 24979 seconds. Killing!
>>>>> 12/03/04 10:44:00 INFO mapred.JobClient: Task Id :
>>>>> attempt_201203031851_0004_m_000001_0, Status : FAILED
>>>>> Task attempt_201203031851_0004_m_000001_0 failed to report status for
>>>>> 25159 seconds. Killing!
>>>>> 12/03/04 10:44:07 INFO mapred.JobClient:  map 3% reduce 0%
>>>>> 12/03/04 10:49:07 INFO mapred.JobClient:  map 0% reduce 0%
>>>>> 12/03/04 10:49:12 INFO mapred.JobClient: Task Id :
>>>>> attempt_201203031851_0004_m_000000_1, Status : FAILED
>>>>> java.lang.Throwable: Child Error
>>>>>    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
>>>>> Caused by: java.io.IOException: Task process exit with nonzero status
>>>>> of
>>>>> 1.
>>>>>    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
>>>>>
>>>>> 12/03/04 10:49:22 INFO mapred.JobClient:  map 3% reduce 0%
>>>>> 12/03/04 10:54:23 INFO mapred.JobClient:  map 0% reduce 0%
>>>>> 12/03/04 10:54:28 INFO mapred.JobClient: Task Id :
>>>>> attempt_201203031851_0004_m_000000_2, Status : FAILED
>>>>> java.lang.Throwable: Child Error
>>>>>    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
>>>>> Caused by: java.io.IOException: Task process exit with nonzero status
>>>>> of
>>>>> 1.
>>>>>    at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
>>>>>
>>>>> 12/03/04 10:54:38 INFO mapred.JobClient:  map 3% reduce 0%
>>>>> 12/03/04 10:59:10 INFO mapred.JobClient: Task Id :
>>>>> attempt_201203031851_0004_m_000001_1, Status : FAILED
>>>>> java.lang.IllegalStateException: unregisterHealth: KeeperException -
>>>>> Couldn't delete
>>>>>
>>>>> /_hadoopBsp/job_201203031851_0004/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/darkstar_1
>>>>>    at
>>>>>
>>>>> org.apache.giraph.graph.BspServiceWorker.unregisterHealth(BspServiceWorker.java:727)
>>>>>    at
>>>>>
>>>>> org.apache.giraph.graph.BspServiceWorker.failureCleanup(BspServiceWorker.java:735)
>>>>>    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:648)
>>>>>    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>>>>    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>>>    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>>>    at java.security.AccessController.doPrivileged(Native Method)
>>>>>    at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>    at
>>>>>
>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>>>>>    at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>> Caused by:
>>>>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>>>>> KeeperErrorCode = ConnectionLoss for
>>>>>
>>>>> /_hadoopBsp/job_201203031851_0004/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/darkstar_1
>>>>>    at
>>>>> org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>>>>>    at
>>>>> org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>>>>    at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:728)
>>>>>    at
>>>>>
>>>>> org.apache.giraph.graph.BspServiceWorker.unregisterHealth(BspServiceWorker.java:721)
>>>>>    ... 9 more
>>>>>
>>>>> Task attempt_201203031851_0004_m_000001_1 failed to report status for
>>>>> 601
>>>>> seconds. Killing!
>>>>> attempt_201203031851_0004_m_000001_1: log4j:WARN No appenders could be
>>>>> found for logger (org.apache.zookeeper.ClientCnxn).
>>>>> attempt_201203031851_0004_m_000001_1: log4j:WARN Please initialize the
>>>>> log4j system properly.
>>>>> 12/03/04 10:59:47 INFO mapred.JobClient:  map 0% reduce 0%
>>>>> 12/03/04 10:59:58 INFO mapred.JobClient: Job complete:
>>>>> job_201203031851_0004
>>>>> 12/03/04 10:59:58 INFO mapred.JobClient: Counters: 6
>>>>> 12/03/04 10:59:58 INFO mapred.JobClient:   Job Counters
>>>>> 12/03/04 10:59:58 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=977551
>>>>> 12/03/04 10:59:58 INFO mapred.JobClient:     Total time spent by all
>>>>> reduces waiting after reserving slots (ms)=0
>>>>> 12/03/04 10:59:58 INFO mapred.JobClient:     Total time spent by all
>>>>> maps
>>>>> waiting after reserving slots (ms)=0
>>>>> 12/03/04 10:59:58 INFO mapred.JobClient:     Launched map tasks=7
>>>>> 12/03/04 10:59:58 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
>>>>> 12/03/04 10:59:58 INFO mapred.JobClient:     Failed map tasks=1
>>>>> -----------------------------------CUT
>>>>> HERE---------------------------------------------
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Abhishek.
>>>>
>>>>
>>
>>
>



-- 
Harsh J
