giraph-user mailing list archives

From Young Han <young....@uwaterloo.ca>
Subject Re: Giraph EC2 Map task fails
Date Sun, 24 Nov 2013 19:19:06 GMT
Actually, it turned out to be a dumber error than that... The name of the
input file was wrong, so it was using an empty/non-existent graph.
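
A pre-flight check along these lines could catch that kind of mistake before
the job is submitted. This is only a sketch under some assumptions: it runs
against a local copy of the input file (before it is copied into HDFS), the
helper name check_graph_file is hypothetical, and the expected line layout
[id, value, [[target, weight], ...]] is inferred from the sample graph quoted
further down in this thread rather than from the Giraph source.

```python
import json
import os
import tempfile


def check_graph_file(path):
    """Return the vertex count; raise ValueError if the file is missing,
    contains no vertices, or has a line that is not [id, value, edges]."""
    if not os.path.isfile(path):
        raise ValueError(f"input file not found: {path}")
    count = 0
    with open(path) as fh:
        for lineno, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                # Each line is assumed to be a JSON array of three items.
                vertex_id, value, edges = json.loads(line)
            except (ValueError, TypeError) as exc:
                raise ValueError(f"line {lineno}: not [id, value, edges]") from exc
            # Every edge is assumed to be a [target, weight] pair.
            if not isinstance(edges, list) or not all(
                isinstance(e, list) and len(e) == 2 for e in edges
            ):
                raise ValueError(f"line {lineno}: edges must be [target, weight] pairs")
            count += 1
    if count == 0:
        raise ValueError(f"input file has no vertices: {path}")
    return count


# Usage: write the sample graph from this thread to a temp file and check it.
sample_graph = """[0,0,[[1,1],[3,3]]]
[1,0,[[0,1],[2,2],[3,1]]]
[2,0,[[1,2],[4,4]]]
[3,0,[[0,3],[1,1],[4,4]]]
[4,0,[[3,4],[2,4]]]
"""
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample_graph)
    sample_path = tmp.name

print(check_graph_file(sample_path))  # prints 5
```

A mistyped file name fails at the first check, and an empty file fails at the
last one, so either case is reported before any map task starts.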

We'll keep the zookeeper bit in mind if we run into further problems.

Thanks,
Young


On Sun, Nov 24, 2013 at 2:06 PM, Gustavo Enrique Salazar Torres <
gsalazar@ime.usp.br> wrote:

> I guess from your stack trace that you didn't start the ZooKeeper cluster.
>
> Cheers
> Gustavo
>
>
> On Sunday, November 24, 2013, Young Han <young.han@uwaterloo.ca> wrote:
> > Hi,
> >
> > We are attempting to get Giraph running on EC2, using Hadoop 1.0.4. We
> are using page rank with the following command:
> >
> > hadoop jar
> $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.0.0-for-hadoop-1.0.2-jar-with-dependencies.jar
> org.apache.giraph.GiraphRunner
> org.apache.giraph.examples.SimplePageRankVertex -c
> org.apache.giraph.combiner.DoubleSumCombiner -vif
> org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
> -vip /user/ubuntu/giraph-input/tiny_graph.txt -of
> org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op
> /user/ubuntu/giraph-output/pagerank -w 1
> >
> >
> > The input graph is the sample graph provided on the website:
> >
> > [0,0,[[1,1],[3,3]]]
> > [1,0,[[0,1],[2,2],[3,1]]]
> > [2,0,[[1,2],[4,4]]]
> > [3,0,[[0,3],[1,1],[4,4]]]
> > [4,0,[[3,4],[2,4]]]
> >
> >
> > We've tried small, medium, and xlarge instances; 4 instances and 3
> instances; and various numbers of workers (-w 1, -w 2, -w 5, -w 10, etc.).
> Hadoop has Xmx (max Java heap size) set to 1024m.
> >
> > The pattern is that the *first* map task always fails. The error
> appears in Hadoop's jobtracker log:
> >
> > 2013-11-24 03:07:43,414 INFO org.apache.hadoop.mapred.JobInProgress:
> job_201311240306_0001: nMaps=2 nReduces=0 max=-1
> > 2013-11-24 03:07:43,417 INFO org.apache.hadoop.mapred.JobTracker: Job
> job_201311240306_0001 added successfully for user 'ubuntu' to queue
> 'default'
> > 2013-11-24 03:07:43,418 INFO org.apache.hadoop.mapred.JobTracker:
> Initializing job_201311240306_0001
> > 2013-11-24 03:07:43,419 INFO org.apache.hadoop.mapred.JobInProgress:
> Initializing job_201311240306_0001
> > 2013-11-24 03:07:43,422 INFO org.apache.hadoop.mapred.AuditLogger:
> USER=ubuntu  IP=172.31.14.182        OPERATION=SUBMIT_JOB
> TARGET=job_201311240306_0001    RESULT=SUCCESS
> > 2013-11-24 03:07:43,828 INFO org.apache.hadoop.mapred.JobInProgress:
> jobToken generated and stored with users keys in
> /home/ubuntu/hadoop_data/hadoop_tmp-ubuntu/mapred/system/job_201311240306_0001/jobToken
> > 2013-11-24 03:07:43,846 INFO org.apache.hadoop.mapred.JobInProgress:
> Input size for job job_201311240306_0001 = 0. Number of splits = 2
> > 2013-11-24 03:07:43,846 INFO org.apache.hadoop.mapred.JobInProgress:
> job_201311240306_0001 LOCALITY_WAIT_FACTOR=0.0
> > 2013-11-24 03:07:43,847 INFO org.apache.hadoop.mapred.JobInProgress: Job
> job_201311240306_0001 initialized successfully with 2 map tasks and 0
> reduce tasks.
> > 2013-11-24 03:07:45,152 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task (JOB_SETUP) 'attempt_201311240306_0001_m_000003_0' to tip
> task_201311240306_0001_m_000003, for tracker
> 'tracker_cloud3:localhost/127.0.0.1:47021'
> > 2013-11-24 03:07:54,222 INFO org.apache.hadoop.mapred.JobInProgress:
> Task 'attempt_201311240306_0001_m_000003_0' has completed
> task_201311240306_0001_m_000003 successfully.
> > 2013-11-24 03:07:54,228 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing a non-local task task_201311240306_0001_m_000000
> > 2013-11-24 03:07:54,229 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task (MAP) 'attempt_201311240306_0001_m_000000_0' to tip
> task_201311240306_0001_m_000000, for tracker
> 'tracker_cloud3:localhost/127.0.0.1:47021'
> > 2013-11-24 03:07:54,361 INFO org.apache.hadoop.mapred.JobInProgress:
> Choosing a non-local task task_201311240306_0001_m_000001
> > 2013-11-24 03:07:54,362 INFO org.apache.hadoop.mapred.JobTracker: Adding
> task (MAP) 'attempt_201311240306_0001_m_000001_0' to tip
> task_201311240306_0001_m_000001, for tracker
> 'tracker_cloud2:localhost/127.0.0.1:55161'
> > 2013-11-24 03:08:03,243 INFO org.apache.hadoop.mapred.TaskInProgress:
> Error from attempt_201311240306_0001_m_000000_0: java.lang.Throwable: Child
> Error
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
> > Caused by: java.io.IOException: Task process exit with nonzero status of
> 1.
> >         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
> >
> >
> > Thereafter, all other workers will fail with:
> >
> > 2013-11-24 03:08:42,471 INFO org.apache.hadoop.mapred.TaskInProgress:
> Error from attempt_201311240306_0001_m_000001_0:
> java.lang.IllegalStateException: run: Caught an unrecoverable exception
> exists: Failed to check
> /_hadoopBsp/job_201311240306_0001/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions
> after 3 tries!
> >         at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:102)
> >         at
> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> >         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:396)
> >         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> >         at org.apache.hadoop.mapred.Child.main(Child.java:249)
> > Caused by: java.lang.IllegalStateException: exists: Failed to check
> /_hadoopBsp/job_201311240306_0001/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions
> after 3 tries!
> >         at
> org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:369)
> >         at
> org.apache.giraph.worker.BspServiceWorker.startSuperstep(BspServiceWorker.java:689)
> >         at
> org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:488)
> >         at
> org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:230)
> >         at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:92)
> >         ... 7 more
> >
> >
> > Any suggestions about why this might be happening?
> >
> > Thanks,
> > Young
> >
>
