giraph-user mailing list archives

From Cheng Wang <superwangch...@gmail.com>
Subject Re: problems running my simple page rank example
Date Thu, 30 Apr 2015 18:48:46 GMT
Hello,

Could someone respond to my question?

Thanks

On Wed, Apr 29, 2015 at 4:28 PM, Cheng Wang <superwangcheng@gmail.com>
wrote:

> Hi,
>
> I am new to Giraph. I am trying to write a very simple PageRank
> program with it, shown below:
>
> package org.apache.giraph.examples;
>
> import org.apache.giraph.graph.BasicComputation;
> import org.apache.giraph.conf.LongConfOption;
> import org.apache.giraph.edge.Edge;
> import org.apache.giraph.graph.Vertex;
> import org.apache.hadoop.io.DoubleWritable;
> import org.apache.hadoop.io.FloatWritable;
> import org.apache.hadoop.io.LongWritable;
> import org.apache.log4j.Logger;
>
> import java.io.IOException;
>
> /**
>  * My simplified Google page rank example.
>  */
> @Algorithm(
>     name = "Page Rank",
>     description = "My simplified page rank"
> )
>
> public class MyPageRankComputation extends BasicComputation<
>     LongWritable, DoubleWritable, FloatWritable, DoubleWritable> {
>
>   public static final int MAX_SUPERSTEPS = 2;
>
>   @Override
>   public void compute(
>       Vertex<LongWritable, DoubleWritable, FloatWritable> vertex,
>       Iterable<DoubleWritable> messages) throws IOException {
>
>     if (getSuperstep() >= 1) {
>       double sum = 0;
>       for (DoubleWritable message : messages) {
>         sum += message.get();
>       }
>       vertex.setValue(new DoubleWritable(sum));
>     }
>
>     if (getSuperstep() < MAX_SUPERSTEPS) {
>       int numEdges = vertex.getNumEdges();
>       DoubleWritable message =
>           new DoubleWritable(vertex.getValue().get() / numEdges);
>       sendMessageToAllEdges(vertex, message);
>     } else {
>       vertex.voteToHalt();
>     }
>   }
> }
>
> I didn't use an Aggregator, to keep the program simple.
> I put the program under the path of the Giraph examples:
>
> /home/hduser/my-giraph/giraph-examples/src/main/java/org/apache/giraph/examples
>
> where I just extracted the giraph-examples folder from the Giraph repo and
> put it into another folder called my-giraph.
>
> The compilation is fine. I also set HADOOP_CLASSPATH and LIBJARS as:
>
> export HADOOP_CLASSPATH=/home/hduser/my-giraph/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar:$HADOOP_PATH
>
> export LIBJARS=/home/hduser/my-giraph/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar:/usr/local/giraph/giraph-core.jar
>
>
> To run the program, I use a command line modeled on the one in the
> "Giraph Quick Start Guide, Running a Giraph Job",
> http://giraph.apache.org/quick_start.html
>
> $HADOOP_HOME/bin/hadoop jar \
>   $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar \
>   org.apache.giraph.GiraphRunner \
>   org.apache.giraph.examples.MyPageRankComputation \
>   -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
>   -vip /user/hduser/page_rank/input/tiny_input.txt \
>   -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
>   -op /user/hduser/page_rank/output \
>   -w 1
>
> The input is very similar to SSSP's:
>
> [1,0.2,[[2,0],[4,0]]]
> [2,0.2,[3,0],[5,0]]
> [3,0.2,[4,0]]
> [4,0.2,[5,0]]
> [5,0.2,[1,0],[2,0],[3,0]]
>
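A note on the input above: as far as I can tell, JsonLongDoubleFloatDoubleVertexInputFormat expects every line to have the shape [vertexId, vertexValue, [[targetId, edgeWeight], ...]], as in the Quick Start Guide's sample graph. Only the first line follows that shape; in the others the edge list is not wrapped in its own brackets, which may be why the worker fails while loading the graph. The sketch below is a dependency-free shape check, not Giraph's own parser; the class name VertexLineCheck and the method matchesExpectedShape are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class VertexLineCheck {

  /** Minimal parser for JSON made only of numbers and nested arrays. */
  private static final class Parser {
    private final String s;
    private int pos = 0;

    Parser(String line) {
      this.s = line.replaceAll("\\s+", ""); // whitespace carries no meaning here
    }

    Object parseValue() {
      if (s.charAt(pos) == '[') {
        pos++; // consume '['
        List<Object> items = new ArrayList<>();
        if (s.charAt(pos) == ']') {
          pos++;
          return items;
        }
        while (true) {
          items.add(parseValue());
          char c = s.charAt(pos++);
          if (c == ']') {
            return items;
          }
          if (c != ',') {
            throw new IllegalArgumentException("unexpected character: " + c);
          }
        }
      }
      int start = pos;
      while (pos < s.length() && "[],".indexOf(s.charAt(pos)) < 0) {
        pos++;
      }
      return Double.parseDouble(s.substring(start, pos));
    }
  }

  /** True if the line looks like [id, value, [[target, weight], ...]]. */
  static boolean matchesExpectedShape(String line) {
    try {
      Parser parser = new Parser(line);
      Object parsed = parser.parseValue();
      if (parser.pos != parser.s.length() || !(parsed instanceof List)) {
        return false;
      }
      List<?> vertex = (List<?>) parsed;
      if (vertex.size() != 3
          || !(vertex.get(0) instanceof Double)
          || !(vertex.get(1) instanceof Double)
          || !(vertex.get(2) instanceof List)) {
        return false;
      }
      for (Object item : (List<?>) vertex.get(2)) {
        if (!(item instanceof List)) {
          return false; // an edge must itself be an array
        }
        List<?> edge = (List<?>) item;
        if (edge.size() != 2
            || !(edge.get(0) instanceof Double)
            || !(edge.get(1) instanceof Double)) {
          return false;
        }
      }
      return true;
    } catch (RuntimeException e) {
      return false; // malformed or truncated line
    }
  }

  public static void main(String[] args) {
    String[] lines = {
      "[1,0.2,[[2,0],[4,0]]]",
      "[2,0.2,[3,0],[5,0]]",
      "[3,0.2,[4,0]]",
      "[4,0.2,[5,0]]",
      "[5,0.2,[1,0],[2,0],[3,0]]"
    };
    for (String line : lines) {
      System.out.println(line + " -> " + matchesExpectedShape(line));
    }
  }
}
```

Run against the five lines of tiny_input.txt, only the first one passes. If the intended edges are 2->3,5; 3->4; 4->5; 5->1,2,3, the remaining lines would presumably need to be written as [2,0.2,[[3,0],[5,0]]], [3,0.2,[[4,0]]], and so on.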
> So far so good!
>
> ---------------
> Now the problem: when I run the job, it hangs in the reduce phase,
> as shown below:
> ////////////////////////////////////////////////
> hduser@cwang ~/my-giraph/giraph-examples/target $ $HADOOP_HOME/bin/hadoop
> jar
> $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar
>  org.apache.giraph.GiraphRunner
> org.apache.giraph.examples.MyPageRankComputation -vif
> org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
> -vip /user/hduser/page_rank/input/tiny_input.txt -vof
> org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op
> /user/hduser/page_rank/output -w 1
> 15/04/29 16:14:59 INFO utils.ConfigurationUtils: No edge input format
> specified. Ensure your InputFormat does not require one.
> 15/04/29 16:14:59 INFO utils.ConfigurationUtils: No edge output format
> specified. Ensure your OutputFormat does not require one.
> 15/04/29 16:15:00 INFO job.GiraphJob: run: Since checkpointing is disabled
> (default), do not allow any task retries (setting mapred.map.max.attempts =
> 1, old value = 4)
> 15/04/29 16:15:02 INFO job.GiraphJob: Tracking URL:
> http://hdnode01:50030/jobdetails.jsp?jobid=job_201504291528_0005
> 15/04/29 16:15:02 INFO job.GiraphJob: Waiting for resources... Job will
> start only when it gets all 2 mappers
> 15/04/29 16:15:39 INFO
> job.HaltApplicationUtils$DefaultHaltInstructionsWriter:
> writeHaltInstructions: To halt after next superstep execute:
> 'bin/halt-application --zkServer cwang:22181 --zkNode
> /_hadoopBsp/job_201504291528_0005/_haltComputation'
> 15/04/29 16:15:39 INFO mapred.JobClient: Running job: job_201504291528_0005
> 15/04/29 16:15:40 INFO mapred.JobClient:  map 100% reduce 0%
> 15/04/29 16:20:28 INFO mapred.JobClient: Job complete:
> job_201504291528_0005
> 15/04/29 16:20:28 INFO mapred.JobClient: Counters: 5
> 15/04/29 16:20:28 INFO mapred.JobClient:   Job Counters
> 15/04/29 16:20:28 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=625803
> 15/04/29 16:20:28 INFO mapred.JobClient:     Total time spent by all
> reduces waiting after reserving slots (ms)=0
> 15/04/29 16:20:28 INFO mapred.JobClient:     Total time spent by all maps
> waiting after reserving slots (ms)=0
> 15/04/29 16:20:28 INFO mapred.JobClient:     Launched map tasks=2
> 15/04/29 16:20:28 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> //////////////////////////////////////////////////
>
> And the expected output is not generated.
>
> Can someone tell me where the problem is?
>
>
> Thanks
> Cheng
>
>
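For reference, the values the job should write can be worked out without Hadoop. The sketch below replays the superstep logic of the quoted compute() in plain Java. It is only an approximation of Giraph's execution model, not Giraph code: the class name PageRankSketch is made up, vertex ids are shifted to 0-based array indices, and the edge lists are my assumption about what the input was meant to describe (1->2,4; 2->3,5; 3->4; 4->5; 5->1,2,3).

```java
final class PageRankSketch {
  static final int MAX_SUPERSTEPS = 2;

  /**
   * Replays the supersteps of the quoted compute().
   * edges[i] lists the out-neighbours (0-based) of vertex i.
   */
  static double[] run(int[][] edges, double[] initial) {
    double[] value = initial.clone();
    double[] inbox = new double[value.length];
    for (int superstep = 0; superstep <= MAX_SUPERSTEPS; superstep++) {
      if (superstep >= 1) {
        value = inbox; // each vertex adopts the sum of its incoming messages
      }
      inbox = new double[value.length];
      if (superstep < MAX_SUPERSTEPS) {
        for (int v = 0; v < edges.length; v++) {
          // Same split as sendMessageToAllEdges(value / numEdges).
          double share = value[v] / edges[v].length;
          for (int target : edges[v]) {
            inbox[target] += share;
          }
        }
      }
      // At superstep == MAX_SUPERSTEPS every vertex votes to halt.
    }
    return value;
  }

  public static void main(String[] args) {
    // Assumed edges, 0-based: vertex 1 of the input is index 0, and so on.
    int[][] edges = { {1, 3}, {2, 4}, {3}, {4}, {0, 1, 2} };
    double[] initial = {0.2, 0.2, 0.2, 0.2, 0.2};
    double[] result = run(edges, initial);
    for (int v = 0; v < result.length; v++) {
      System.out.println((v + 1) + "\t" + result[v]);
    }
  }
}
```

Under those assumptions the final values for vertices 1 to 5 come out at roughly 0.1, 0.1333, 0.1833, 0.2 and 0.3833, which sum to 1. A successful run with IdWithValueTextOutputFormat should show values close to these.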
