giraph-user mailing list archives

From David Garcia <dgar...@potomacfusion.com>
Subject RE: cannot run Giraph trunk with Hadoop 2.0.0-alpha
Date Tue, 21 Aug 2012 01:56:51 GMT
You can remove this error by recursively removing the _bsp folder from the ZooKeeper file system and then running the job again. You should probably remove the folder from HDFS as well.
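As a rough sketch of what that cleanup looks like (the ZooKeeper znode path, server address, and zkCli.sh location are assumptions that depend on your install; the HDFS path is the one from the logs below):

```shell
# Recursively remove stale Giraph coordination state from ZooKeeper.
# The znode path and server address here are assumptions; adjust for your setup.
zkCli.sh -server localhost:2181 rmr /_bsp

# Also remove the on-disk manager directory from HDFS before re-running the job.
hadoop fs -rm -r _bsp
```

Both commands are destructive, so make sure no other Giraph job is currently using that state.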

________________________________________
From: Johnny Zhang [xiaoyuz@cloudera.com]
Sent: Monday, August 20, 2012 6:59 PM
To: user@giraph.apache.org
Subject: Re: cannot run Giraph trunk with Hadoop 2.0.0-alpha

Sorry for the wide distribution. I checked further: the folder '_bsp/_defaultZkManagerDir/job_1344903945125_0032'
exists, and it has one subfolder '_bsp/_defaultZkManagerDir/job_1344903945125_0032/_task'
and another file inside, so HDFS file permissions should not be the issue. But I am not sure why
Giraph still complains that '_bsp/_defaultZkManagerDir/job_1344903945125_0032/_zkServer does not
exist'.

Does ZooKeeper need further configuration? Or is there any other reason it might fail to create
the _zkServer folder?

Thanks,
Johnny


On Mon, Aug 20, 2012 at 11:59 AM, Johnny Zhang <xiaoyuz@cloudera.com<mailto:xiaoyuz@cloudera.com>>
wrote:
Alessandro:
Thanks for reminding me of that. Now I can run the PageRank example, though I still get one
ZooKeeper-server-related exception. Here is part of the log:

12/08/20 11:56:44 WARN mapreduce.Job: Error reading task output Server returned HTTP response
code: 400 for URL: http://cs-10-20-76-76.cloud.cloudera.com:8080/tasklog?plaintext=true&attemptid=attempt_1344903945125_0032_m_000002_2&filter=stdout
12/08/20 11:56:44 WARN mapreduce.Job: Error reading task output Server returned HTTP response
code: 400 for URL: http://cs-10-20-76-76.cloud.cloudera.com:8080/tasklog?plaintext=true&attemptid=attempt_1344903945125_0032_m_000002_2&filter=stderr
12/08/20 11:56:44 INFO mapreduce.Job: Task Id : attempt_1344903945125_0032_m_000001_2, Status
: FAILED
Error: java.lang.RuntimeException: java.io.FileNotFoundException: File _bsp/_defaultZkManagerDir/job_1344903945125_0032/_zkServer
does not exist.
at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:749)
at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:320)
at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:570)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:725)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
Caused by: java.io.FileNotFoundException: File _bsp/_defaultZkManagerDir/job_1344903945125_0032/_zkServer
does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:365)
at org.apache.giraph.zk.ZooKeeperManager.onlineZooKeeperServers(ZooKeeperManager.java:708)
... 9 more

12/08/20 11:56:44 WARN mapreduce.Job: Error reading task output Server returned HTTP response
code: 400 for URL: http://cs-10-20-76-76.cloud.cloudera.com:8080/tasklog?plaintext=true&attemptid=attempt_1344903945125_0032_m_000001_2&filter=stdout
12/08/20 11:56:44 WARN mapreduce.Job: Error reading task output Server returned HTTP response
code: 400 for URL: http://cs-10-20-76-76.cloud.cloudera.com:8080/tasklog?plaintext=true&attemptid=attempt_1344903945125_0032_m_000001_2&filter=stderr
12/08/20 11:56:45 INFO mapreduce.Job: Job job_1344903945125_0032 failed with state FAILED
due to:
12/08/20 11:56:45 INFO mapreduce.Job: Counters: 28
File System Counters
FILE: Number of bytes read=120
FILE: Number of bytes written=49450
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=44
HDFS: Number of bytes written=0
HDFS: Number of read operations=8
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Job Counters
Failed map tasks=10
Launched map tasks=13
Other local map tasks=13
Total time spent by all maps in occupied slots (ms)=692328
Total time spent by all reduces in occupied slots (ms)=0
Map-Reduce Framework
Map input records=0
Map output records=0
Input split bytes=44
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=34
CPU time spent (ms)=450
Physical memory (bytes) snapshot=96169984
Virtual memory (bytes) snapshot=1599012864
Total committed heap usage (bytes)=76087296
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0


Thanks,
Johnny

On Mon, Aug 20, 2012 at 11:47 AM, Alessandro Presta <alessandro@fb.com<mailto:alessandro@fb.com>>
wrote:
Looks like you compiled for Hadoop 0.20.203, which had a different API (that's why we have
to use Munge). Can you try recompiling with the hadoop_2.0.0 profile?
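For reference, the recompile step might look like the following, run from the Giraph source root (the profile name comes from the advice above, but the exact Maven invocation and flags are assumptions and may differ for your checkout):

```shell
# Rebuild Giraph against the Hadoop 2.0.0 APIs instead of the default
# 0.20.203 profile; skip tests to speed up the build.
mvn -Phadoop_2.0.0 -DskipTests clean package
```

The resulting jar name should then reference hadoop-2.0.0 rather than hadoop-0.20.203.0, which is a quick way to confirm the right profile was used.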

From: Johnny Zhang <xiaoyuz@cloudera.com<mailto:xiaoyuz@cloudera.com>>
Reply-To: "user@giraph.apache.org<mailto:user@giraph.apache.org>" <user@giraph.apache.org<mailto:user@giraph.apache.org>>
Date: Monday, August 20, 2012 7:31 PM
To: "user@giraph.apache.org<mailto:user@giraph.apache.org>" <user@giraph.apache.org<mailto:user@giraph.apache.org>>
Subject: cannot run Giraph trunk with Hadoop 2.0.0-alpha

Hi all,
I am trying to run Giraph trunk with Hadoop 2.0.0-alpha, and I get the error below when I run
a PageRank example job with 3 workers.

# hadoop jar target/giraph-0.2-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar org.apache.giraph.benchmark.PageRankBenchmark
-e 1 -s 3 -v -V 50000000 -w 3
12/08/20 11:10:38 WARN mapred.JobConf: The variable mapred.child.ulimit is no longer used.
12/08/20 11:10:38 INFO benchmark.PageRankBenchmark: Using class org.apache.giraph.benchmark.PageRankBenchmark
12/08/20 11:10:38 WARN conf.Configuration: mapred.job.tracker is deprecated. Instead, use
mapreduce.jobtracker.address
12/08/20 11:10:38 WARN conf.Configuration: mapred.job.map.memory.mb is deprecated. Instead,
use mapreduce.map.memory.mb
12/08/20 11:10:38 WARN conf.Configuration: mapred.job.reduce.memory.mb is deprecated. Instead,
use mapreduce.reduce.memory.mb
12/08/20 11:10:38 WARN conf.Configuration: mapred.map.tasks.speculative.execution is deprecated.
Instead, use mapreduce.map.speculative
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext,
but class was expected
at org.apache.giraph.bsp.BspOutputFormat.checkOutputSpecs(BspOutputFormat.java:43)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:411)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:326)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244)
at org.apache.giraph.graph.GiraphJob.run(GiraphJob.java:714)
at org.apache.giraph.benchmark.PageRankBenchmark.run(PageRankBenchmark.java:150)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.apache.giraph.benchmark.PageRankBenchmark.main(PageRankBenchmark.java:164)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
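For context on this error: "Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected" means the JVM loaded JobContext as an interface (as it is in Hadoop 2.x) while the Giraph classes were compiled against a version where it was a class (Hadoop 0.20.x), so the bytecode no longer links. A small self-contained sketch (hypothetical helper; java.lang types are used only for illustration) of how one could check what kind of type a given name resolves to on the classpath:

```java
// Sketch: report whether a fully-qualified type name on the current
// classpath is a class or an interface. Running it with
// org.apache.hadoop.mapreduce.JobContext (with the Hadoop jars on the
// classpath) would show which API flavor you are actually linking against.
public class TypeKindCheck {
    public static String kindOf(String name) throws ClassNotFoundException {
        Class<?> c = Class.forName(name);
        return c.isInterface() ? "interface" : "class";
    }

    public static void main(String[] args) throws Exception {
        String name = args.length > 0 ? args[0] : "java.lang.Runnable";
        // java.lang.Runnable is an interface, so the default run reports "interface".
        System.out.println(name + " is a(n) " + kindOf(name));
    }
}
```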


My $HADOOP_MAPRED_HOME and $JAVA_HOME are set up correctly. Could anyone tell me if I need
to set up anything else? Thanks a lot.

Johnny


