giraph-user mailing list archives

From Cheng Wang <superwangch...@gmail.com>
Subject problems running my simple page rank example
Date Wed, 29 Apr 2015 21:28:28 GMT
Hi,

I am new to Giraph. I have recently been trying to write a very simple
PageRank program using Giraph, shown below:

package org.apache.giraph.examples;

import org.apache.giraph.graph.BasicComputation;
import org.apache.giraph.conf.LongConfOption;
import org.apache.giraph.edge.Edge;
import org.apache.giraph.graph.Vertex;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.log4j.Logger;

import java.io.IOException;

/**
 * My simplified Google page rank example.
 */
@Algorithm(
    name = "Page Rank",
    description = "My simplified page rank"
)

public class MyPageRankComputation extends BasicComputation<
    LongWritable, DoubleWritable, FloatWritable, DoubleWritable> {

  public static final int MAX_SUPERSTEPS = 2;

  @Override
  public void compute(
      Vertex<LongWritable, DoubleWritable, FloatWritable> vertex,
      Iterable<DoubleWritable> messages) throws IOException {

    if (getSuperstep() >= 1) {
      double sum = 0;
      for (DoubleWritable message : messages) {
        sum += message.get();
      }
      vertex.setValue(new DoubleWritable(sum));
    }

    if (getSuperstep() < MAX_SUPERSTEPS) {
      int numEdges = vertex.getNumEdges();
      DoubleWritable message =
          new DoubleWritable(vertex.getValue().get() / numEdges);
      sendMessageToAllEdges(vertex, message);
    } else {
      vertex.voteToHalt();
    }
  }
}
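
As a sanity check on what compute() should produce, the same update rule can
be replayed in plain Java without Giraph. This is only a sketch: the class
name PageRankSketch and the run helper are made up for illustration, and the
adjacency list assumes the edges that the input lines further down appear to
intend (1 -> 2,4; 2 -> 3,5; 3 -> 4; 4 -> 5; 5 -> 1,2,3), with every vertex
starting at 0.2.

```java
final class PageRankSketch {
  static final int MAX_SUPERSTEPS = 2;

  /**
   * Replays the update rule of compute() on an in-memory adjacency list.
   * edges[v] holds the 0-based target indices of vertex v (hypothetical
   * layout, not a Giraph data structure).
   */
  static double[] run(int[][] edges, double[] initial) {
    double[] value = initial.clone();
    double[] inbox = new double[value.length];
    for (int superstep = 0; superstep <= MAX_SUPERSTEPS; superstep++) {
      if (superstep >= 1) {
        // Each vertex replaces its value with the sum of incoming messages.
        value = inbox.clone();
      }
      inbox = new double[value.length]; // messages for the next superstep
      if (superstep < MAX_SUPERSTEPS) {
        for (int v = 0; v < edges.length; v++) {
          // Same as sendMessageToAllEdges(vertex, value / numEdges).
          double share = value[v] / edges[v].length;
          for (int target : edges[v]) {
            inbox[target] += share;
          }
        }
      }
      // Otherwise every vertex votes to halt and the loop ends.
    }
    return value;
  }

  public static void main(String[] args) {
    // Index k - 1 stands for vertex id k.
    int[][] edges = {{1, 3}, {2, 4}, {3}, {4}, {0, 1, 2}};
    double[] initial = {0.2, 0.2, 0.2, 0.2, 0.2};
    double[] result = run(edges, initial);
    for (int v = 0; v < result.length; v++) {
      System.out.printf("%d\t%.4f%n", v + 1, result[v]);
    }
  }
}
```

Under those assumptions the values after superstep 2 should be roughly 0.1,
0.1333, 0.1833, 0.2 and 0.3833 for vertices 1 to 5, summing to 1. A vertex
with no out-edges would divide by zero here, just as in compute().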

I didn't use an Aggregator, just to keep the program simple.
I put the program under the Giraph examples path:
/home/hduser/my-giraph/giraph-examples/src/main/java/org/apache/giraph/examples

Here I extracted the giraph-examples folder from the Giraph repo and put it
into another folder called my-giraph.

Compilation succeeds. I also set HADOOP_CLASSPATH and LIBJARS as follows:

export HADOOP_CLASSPATH=/home/hduser/my-giraph/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar:$HADOOP_PATH

export LIBJARS=/home/hduser/my-giraph/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar:/usr/local/giraph/giraph-core.jar


To run the program, I use the following command, modeled on the "Running a
Giraph Job" section of the Giraph Quick Start Guide,
http://giraph.apache.org/quick_start.html

$HADOOP_HOME/bin/hadoop jar \
  $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner \
  org.apache.giraph.examples.MyPageRankComputation \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip /user/hduser/page_rank/input/tiny_input.txt \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /user/hduser/page_rank/output \
  -w 1

The input is very similar to the one used for the SSSP example:

[1,0.2,[[2,0],[4,0]]]
[2,0.2,[3,0],[5,0]]
[3,0.2,[4,0]]
[4,0.2,[5,0]]
[5,0.2,[1,0],[2,0],[3,0]]
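
One thing that may be worth checking is the shape of each input line. The
quick start guide's SSSP input uses [id, value, [[target, weight], ...]],
with the edge list wrapped in its own pair of brackets. The sketch below
tests the lines above against that shape with a regular expression. It is a
stand-in written for illustration, not the parser Giraph uses, and the class
and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class InputLineCheck {
  // A plain decimal number such as 0, 0.2 or -1.5.
  private static final String NUM = "-?\\d+(?:\\.\\d+)?";
  // One edge: [targetId, weight].
  private static final String EDGE = "\\[\\d+," + NUM + "\\]";
  // A full line: [id, value, [edge, edge, ...]] with a possibly empty list.
  private static final Pattern LINE = Pattern.compile(
      "\\[\\d+," + NUM + ",\\[(?:" + EDGE + "(?:," + EDGE + ")*)?\\]\\]");

  /** Returns true if the line has the assumed shape, ignoring whitespace. */
  static boolean isWellFormed(String line) {
    return LINE.matcher(line.replaceAll("\\s+", "")).matches();
  }

  public static void main(String[] args) {
    String[] lines = {
      "[1,0.2,[[2,0],[4,0]]]",
      "[2,0.2,[3,0],[5,0]]",
      "[3,0.2,[4,0]]",
      "[4,0.2,[5,0]]",
      "[5,0.2,[1,0],[2,0],[3,0]]",
    };
    for (String line : lines) {
      System.out.println(line + " -> " + isWellFormed(line));
    }
  }
}
```

With this check only the first line matches. The other four lack the outer
brackets around the edge list (for example, [3,0.2,[[4,0]]] would match),
which could be related to the job failing, assuming the input format expects
the same shape.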

So far so good !!

---------------
Now the problem: when I run the job, it hangs at the reduce phase, as shown
below:
////////////////////////////////////////////////
hduser@cwang ~/my-giraph/giraph-examples/target $ $HADOOP_HOME/bin/hadoop jar \
  $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner \
  org.apache.giraph.examples.MyPageRankComputation \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip /user/hduser/page_rank/input/tiny_input.txt \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /user/hduser/page_rank/output \
  -w 1
15/04/29 16:14:59 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
15/04/29 16:14:59 INFO utils.ConfigurationUtils: No edge output format specified. Ensure your OutputFormat does not require one.
15/04/29 16:15:00 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 1, old value = 4)
15/04/29 16:15:02 INFO job.GiraphJob: Tracking URL: http://hdnode01:50030/jobdetails.jsp?jobid=job_201504291528_0005
15/04/29 16:15:02 INFO job.GiraphJob: Waiting for resources... Job will start only when it gets all 2 mappers
15/04/29 16:15:39 INFO job.HaltApplicationUtils$DefaultHaltInstructionsWriter: writeHaltInstructions: To halt after next superstep execute: 'bin/halt-application --zkServer cwang:22181 --zkNode /_hadoopBsp/job_201504291528_0005/_haltComputation'
15/04/29 16:15:39 INFO mapred.JobClient: Running job: job_201504291528_0005
15/04/29 16:15:40 INFO mapred.JobClient:  map 100% reduce 0%
15/04/29 16:20:28 INFO mapred.JobClient: Job complete: job_201504291528_0005
15/04/29 16:20:28 INFO mapred.JobClient: Counters: 5
15/04/29 16:20:28 INFO mapred.JobClient:   Job Counters
15/04/29 16:20:28 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=625803
15/04/29 16:20:28 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
15/04/29 16:20:28 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
15/04/29 16:20:28 INFO mapred.JobClient:     Launched map tasks=2
15/04/29 16:20:28 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
//////////////////////////////////////////////////

The expected output is not generated.

Can someone tell me where the problem is?


Thanks
Cheng
