giraph-user mailing list archives

From Ryan Compton <compton.r...@gmail.com>
Subject Re: Where can I find a simple "Hello World" example for Giraph
Date Fri, 22 Feb 2013 02:23:21 GMT
Well, looks like I need to split things across different files, right?
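
A guess at why splitting helps, assuming the framework creates the
vertex class reflectively through a no-argument constructor: a
non-static inner class (like EdgeListVertexTwoPlusTwo in the quoted
message below) has no such constructor, because it needs an enclosing
instance. A top-level class or a static nested class does. The class
names in this sketch are hypothetical and it uses plain Java only, no
Giraph:

```java
class NestedVertexSketch {
    /** Non-static inner class: its constructor implicitly takes the outer instance. */
    public class InnerVertex { }

    /** Static nested class: has a true no-argument constructor. */
    public static class StaticVertex { }

    /** Returns true if cls can be created through a no-argument constructor. */
    static boolean canInstantiate(Class<?> cls) {
        try {
            cls.getDeclaredConstructor().newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            // NoSuchMethodException lands here for the inner class.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("inner:  " + canInstantiate(InnerVertex.class));
        System.out.println("static: " + canInstantiate(StaticVertex.class));
    }
}
```

Running it prints "inner:  false" and "static: true", so declaring the
vertex as a static nested class should be an alternative to moving it
into its own file.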

I've now got this down to the most basic Giraph program one can run:

http://pastebin.com/HfuKEm5r
http://pastebin.com/iAFSsDay
http://pastebin.com/50qW6ACV

It seems to work just fine on cdh3. One weird problem: when I set a
number too high (i.e. SUPERSTEP_COUNT=100, or SUPERSTEP_COUNT=20 with
AGGREGATE_VERTICES=5000000), the map progress goes backwards (?!), as
in the log below. Since I will eventually want to work with ~50M node
graphs, I'll need to figure out a way around it. This might also be a
problem with the PseudoRandomVertexInputFormat. I have no idea what is
wrong, but at least I have some basic code working (?) now...

13/02/21 18:09:02 INFO graph.GiraphJob: run: Since checkpointing is
disabled (default), do not allow any task retries (setting
mapred.map.max.attempts = 0, old value = 4)
13/02/21 18:09:03 WARN bsp.BspOutputFormat: checkOutputSpecs:
ImmutableOutputCommiter will not check anything
13/02/21 18:09:08 INFO mapred.JobClient: Running job: job_201302201124_0147
13/02/21 18:09:10 INFO mapred.JobClient:  map 0% reduce 0%
13/02/21 18:09:21 INFO mapred.JobClient:  map 54% reduce 0%
13/02/21 18:09:22 INFO mapred.JobClient:  map 96% reduce 0%
13/02/21 18:09:31 INFO mapred.JobClient:  map 100% reduce 0%
13/02/21 18:18:17 INFO mapred.JobClient:  map 96% reduce 0%
13/02/21 18:19:13 INFO mapred.JobClient:  map 93% reduce 0%
13/02/21 18:19:48 INFO mapred.JobClient:  map 87% reduce 0%
13/02/21 18:19:59 INFO mapred.JobClient:  map 83% reduce 0%
13/02/21 18:20:07 INFO mapred.JobClient:  map 70% reduce 0%
13/02/21 18:20:08 INFO mapred.JobClient:  map 54% reduce 0%
13/02/21 18:20:13 INFO mapred.JobClient:  map 51% reduce 0%
13/02/21 18:20:14 INFO mapred.JobClient:  map 45% reduce 0%
13/02/21 18:20:15 INFO mapred.JobClient:  map 41% reduce 0%
13/02/21 18:20:22 INFO mapred.JobClient:  map 38% reduce 0%
13/02/21 18:20:29 INFO mapred.JobClient:  map 35% reduce 0%
13/02/21 18:20:51 INFO mapred.JobClient:  map 32% reduce 0%
13/02/21 18:22:11 INFO mapred.JobClient:  map 29% reduce 0%
13/02/21 18:22:12 INFO mapred.JobClient: Job complete: job_201302201124_0147
13/02/21 18:22:12 INFO mapred.JobClient: Counters: 6
13/02/21 18:22:12 INFO mapred.JobClient:   Job Counters
13/02/21 18:22:12 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=7776941
13/02/21 18:22:12 INFO mapred.JobClient:     Total time spent by all
reduces waiting after reserving slots (ms)=0
13/02/21 18:22:12 INFO mapred.JobClient:     Total time spent by all
maps waiting after reserving slots (ms)=0
13/02/21 18:22:12 INFO mapred.JobClient:     Launched map tasks=31
13/02/21 18:22:12 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/02/21 18:22:12 INFO mapred.JobClient:     Failed map tasks=1
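
One reading of the log above, offered as an assumption rather than a
diagnosis: the percentages fit the pattern where the client shows the
truncated mean progress over the 31 launched map tasks, and a task
that has died counts as 0%. Under that assumption every logged value
corresponds to a whole number of surviving tasks, which would mean
tasks are dropping out one by one rather than progress being
miscounted. The helper names are made up for illustration:

```java
class MapProgressSketch {
    /** Percent shown if `alive` of `total` map tasks sit at 100% and the rest count as 0%. */
    static int percent(int alive, int total) {
        return 100 * alive / total; // integer division truncates to a whole percent
    }

    /** Number of live tasks that would display `shown` percent, or -1 if none does. */
    static int aliveFor(int shown, int total) {
        for (int alive = 0; alive <= total; alive++) {
            if (percent(alive, total) == shown) {
                return alive;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int total = 31; // "Launched map tasks=31" in the counters above
        int[] logged = {96, 93, 87, 83, 70, 54, 51, 45, 41, 38, 35, 32, 29};
        for (int shown : logged) {
            System.out.println(shown + "% -> " + aliveFor(shown, total) + " of " + total);
        }
    }
}
```

Every logged value maps to a task count (96% to 30 of 31, down to 29%
to 9 of 31), so the task logs of the dying mappers are probably the
place to look next.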

On Thu, Feb 21, 2013 at 5:35 PM, Ryan Compton <compton.ryan@gmail.com> wrote:
> OK, ignoring everything about I/O, I can run the code below.
>
> But then it hangs at map 100%, reduce 0% forever:
>
> -bash-3.2$ hadoop jar
> target/geocoderV2-1.0-SNAPSHOT-jar-with-dependencies.jar
> com.hrl.issl.osi.networks.HelloGiraph0p2
> 13/02/21 17:33:28 INFO graph.GiraphJob: run: Since checkpointing is
> disabled (default), do not allow any task retries (setting
> mapred.map.max.attempts = 0, old value = 4)
> 13/02/21 17:33:29 WARN bsp.BspOutputFormat: checkOutputSpecs:
> ImmutableOutputCommiter will not check anything
> 13/02/21 17:33:29 INFO mapred.JobClient: Running job: job_201302201124_0136
> 13/02/21 17:33:30 INFO mapred.JobClient:  map 0% reduce 0%
> 13/02/21 17:34:03 INFO mapred.JobClient:  map 74% reduce 0%
> 13/02/21 17:34:04 INFO mapred.JobClient:  map 80% reduce 0%
> 13/02/21 17:34:06 INFO mapred.JobClient:  map 100% reduce 0%
>
>
> What did I miss?
>
>
>
> import java.io.IOException;
>
> import org.apache.giraph.combiner.DoubleSumCombiner;
> import org.apache.giraph.conf.GiraphConfiguration;
> import org.apache.giraph.graph.GiraphJob;
> import org.apache.giraph.io.formats.PseudoRandomVertexInputFormat;
> import org.apache.giraph.vertex.EdgeListVertex;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.io.DoubleWritable;
> import org.apache.hadoop.io.LongWritable;
> import org.apache.hadoop.util.Tool;
> import org.apache.hadoop.util.ToolRunner;
> import org.apache.log4j.Logger;
>
> /**
>  * Minimal Giraph 0.2 "hello world" job.
>  */
> public class HelloGiraph0p2 implements Tool {
>
> /**
> * Configuration from Configurable
> */
> private Configuration conf;
>
> @Override
> public Configuration getConf() {
> return conf;
> }
>
> @Override
> public void setConf(Configuration conf) {
> this.conf = conf;
> }
>
> @Override
> public final int run(final String[] args) throws Exception {
>
> String name = getClass().getName();
>
> GiraphJob job = new GiraphJob(getConf(), name);
> GiraphConfiguration configuration = job.getConfiguration();
>
> configuration.setVertexClass(EdgeListVertexTwoPlusTwo.class);
>
> configuration.useUnsafeSerialization(true);
>
> configuration.setVertexCombinerClass(DoubleSumCombiner.class);
>
> configuration.setVertexInputFormatClass(PseudoRandomVertexInputFormat.class);
> configuration.setLong(PseudoRandomVertexInputFormat.AGGREGATE_VERTICES,
> Long.parseLong("1000") );
> configuration.setLong(PseudoRandomVertexInputFormat.EDGES_PER_VERTEX,Long.parseLong("10"));
>
> int workers = Integer.parseInt("30");
> configuration.setWorkerConfiguration(workers, workers, 100.0f);
>
> //configuration.setInt(PageRankComputation.SUPERSTEP_COUNT, Integer.parseInt("10"));
>
> boolean isVerbose = true;
> if (job.run(isVerbose)) {
> return 0;
> } else {
> return -1;
> }
> }
>
> int NUM_SUPERSTEPS = 5;
> public class EdgeListVertexTwoPlusTwo extends
> EdgeListVertex<LongWritable, DoubleWritable, DoubleWritable,
> DoubleWritable> {
> @Override
> public void compute(Iterable<DoubleWritable> messages) throws IOException {
>
> if (this.getSuperstep() >= 1) {
> double four = 2+2;
> this.setValue( new DoubleWritable(four));
> }
>
> if (this.getSuperstep() < NUM_SUPERSTEPS) {
> this.sendMessageToAllEdges( new DoubleWritable(this.getValue().get()));
> } else {
> this.voteToHalt();
> }
> }
> }
>
> /**
> * Execute the benchmark.
> *
> * @param args Typically the command line arguments.
> * @throws Exception Any exception from the computation.
> */
> public static void main(final String[] args) throws Exception {
> System.exit(ToolRunner.run(new HelloGiraph0p2(), args));
> }
> }
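
For reference, a toy single-process imitation of the superstep loop,
assuming the usual Pregel rules (a vertex that voted to halt is woken
only by an incoming message, and the computation ends when every
vertex is halted and no messages are in flight). It is not Giraph
code; the ring graph and all names are hypothetical. The body of the
inner loop mirrors the compute() method above:

```java
import java.util.ArrayList;
import java.util.List;

class SuperstepSketch {
    static final int NUM_SUPERSTEPS = 5;

    /** One empty message list per vertex. */
    static List<List<Double>> emptyBoxes(int n) {
        List<List<Double>> boxes = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            boxes.add(new ArrayList<>());
        }
        return boxes;
    }

    /** Runs on a ring of values.length vertices; returns how many supersteps executed. */
    static int run(double[] values) {
        int n = values.length;
        boolean[] halted = new boolean[n];
        List<List<Double>> inbox = emptyBoxes(n);
        int superstep = 0;
        while (true) {
            List<List<Double>> outbox = emptyBoxes(n);
            boolean anyRan = false;
            for (int v = 0; v < n; v++) {
                // A halted vertex runs again only if a message arrived for it.
                if (halted[v] && inbox.get(v).isEmpty()) {
                    continue;
                }
                halted[v] = false;
                anyRan = true;
                if (superstep >= 1) {
                    values[v] = 2 + 2;
                }
                if (superstep < NUM_SUPERSTEPS) {
                    outbox.get((v + 1) % n).add(values[v]); // single out-edge v -> v+1
                } else {
                    halted[v] = true; // voteToHalt()
                }
            }
            if (!anyRan) {
                return superstep; // everyone halted, nothing in flight
            }
            inbox = outbox;
            superstep++;
        }
    }

    public static void main(String[] args) {
        double[] values = new double[3];
        System.out.println("supersteps run: " + run(values));
        System.out.println("value of vertex 0: " + values[0]);
    }
}
```

With NUM_SUPERSTEPS = 5 this runs supersteps 0 through 5 (six in
total) and stops with every value at 4.0, so under these assumptions
the compute() logic terminates on its own and the hang is more likely
in the job setup.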
>
> On Thu, Feb 21, 2013 at 4:10 PM, Ryan Compton <compton.ryan@gmail.com> wrote:
>> Ok, I've been looking at the PageRankBenchmark. There's a lot going on
>> in there...
>>
>> It looks like the minimum amount of stuff I need to run a do-nothing
>> job is what I've got below.
>>
>> But now it's telling me that the output directory is not set (and
>> PageRankBenchmark doesn't have the word "output" anywhere).
>>
>> 13/02/21 16:08:37 ERROR security.UserGroupInformation:
>> PriviledgedActionException as:rfcompton (auth:SIMPLE)
>> cause:org.apache.hadoop.mapred.InvalidJobConfException: Output
>> directory not set.
>> Exception in thread "main"
>> org.apache.hadoop.mapred.InvalidJobConfException: Output directory not
>> set.
>> at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:123)
>> at org.apache.giraph.io.formats.TextVertexOutputFormat.checkOutputSpecs(TextVertexOutputFormat.java:55)
>> at org.apache.giraph.bsp.BspOutputFormat.checkOutputSpecs(BspOutputFormat.java:48)
>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:872)
>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:396)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
>> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
>> at org.apache.giraph.graph.GiraphJob.run(GiraphJob.java:282)
>> at com.hrl.issl.osi.networks.HelloGiraph0p2.run(HelloGiraph0p2.java:49)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>> at com.hrl.issl.osi.networks.HelloGiraph0p2.main(HelloGiraph0p2.java:55)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> at java.lang.reflect.Method.invoke(Method.java:597)
>> at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
>>
>>
>>
>> import org.apache.giraph.conf.GiraphConfiguration;
>> import org.apache.giraph.graph.GiraphJob;
>> import org.apache.giraph.io.formats.JsonBase64VertexInputFormat;
>> import org.apache.giraph.io.formats.JsonBase64VertexOutputFormat;
>> import org.apache.giraph.io.formats.TextVertexInputFormat;
>> import org.apache.giraph.io.formats.TextVertexOutputFormat;
>> import org.apache.giraph.vertex.SimpleVertex;
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.util.Tool;
>> import org.apache.hadoop.util.ToolRunner;
>>
>> /**
>>  *
>>  * Hello world giraph 0.2...
>>  *
>>  */
>> public class HelloGiraph0p2 implements Tool {
>> /** Configuration */
>> private Configuration conf;
>>
>> @Override
>> public void setConf(Configuration conf) {
>> this.conf = conf;
>> }
>>
>> @Override
>> public Configuration getConf() {
>> return conf;
>> }
>>
>> @Override
>> public int run(String[] arg0) throws Exception {
>>
>> GiraphJob job = new GiraphJob(getConf(), getClass().getName());
>>
>> GiraphConfiguration configuration = job.getConfiguration();
>>
>> configuration.setVertexClass(SimpleVertex.class);
>>
>> configuration.setVertexInputFormatClass(JsonBase64VertexInputFormat.class);
>> configuration.setVertexOutputFormatClass(JsonBase64VertexOutputFormat.class);
>>
>> configuration.setWorkerConfiguration(30, 30, 100.0f);
>>
>> return job.run(true) ? 0 : -1;
>>
>> }
>>
>> public static void main(String[] args) throws Exception {
>>
>> System.exit(ToolRunner.run(new HelloGiraph0p2(), args));
>> }
>>
>> }
>>
>> On Thu, Feb 21, 2013 at 2:58 PM, Maja Kabiljo <majakabiljo@fb.com> wrote:
>>> Hi Ryan,
>>>
>>> Before running the job, you need to set Vertex and input/output format
>>> classes on it. Please take a look at one of the benchmarks to see how to
>>> do that. Alternatively, you can try using GiraphRunner, where you pass
>>> these classes as command line arguments.
>>>
>>> Maja
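
A plausible reading of the NullPointerException quoted below, assuming
it comes from inspecting a vertex class that was never set: frameworks
commonly recover a vertex's type parameters from its generic
superclass, and that lookup fails on a null class. The sketch shows
the general technique in plain Java; Vertex and HelloVertex here are
hypothetical stand-ins, and this is not Giraph's ReflectionUtils:

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

class TypeArgsSketch {
    /** Hypothetical stand-in for a four-parameter vertex base class. */
    static abstract class Vertex<I, V, E, M> { }

    /** A concrete vertex that fixes all four type parameters. */
    static class HelloVertex extends Vertex<Long, String, String, String> { }

    /** Reads the actual type arguments off the generic superclass of vertexClass. */
    static Type[] typeArguments(Class<?> vertexClass) {
        // Throws NullPointerException if vertexClass is null (no class configured).
        ParameterizedType parent = (ParameterizedType) vertexClass.getGenericSuperclass();
        return parent.getActualTypeArguments();
    }

    public static void main(String[] args) {
        for (Type t : typeArguments(HelloVertex.class)) {
            System.out.println(t.getTypeName());
        }
        try {
            typeArguments(null); // what happens when the vertex class was never set
        } catch (NullPointerException e) {
            System.out.println("no vertex class set -> NullPointerException");
        }
    }
}
```

This is why setting the vertex class on the configuration before
anything inspects it, as suggested above, would be expected to make
the exception go away.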
>>>
>>> On 2/21/13 2:43 PM, "Ryan Compton" <compton.ryan@gmail.com> wrote:
>>>
>>>>I'm still struggling with this. I am trying to use 0.2, and I don't
>>>>have permissions to edit core-site.xml.
>>>>
>>>>I think this is the most basic boilerplate code for a 0.2 Giraph
>>>>project, but I still can't run it.
>>>>
>>>>Exception in thread "main" java.lang.NullPointerException
>>>>at org.apache.giraph.utils.ReflectionUtils.getTypeArguments(ReflectionUtils.java:85)
>>>>at org.apache.giraph.conf.GiraphClasses.readFromConf(GiraphClasses.java:117)
>>>>at org.apache.giraph.conf.GiraphClasses.<init>(GiraphClasses.java:105)
>>>>at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.<init>(ImmutableClassesGiraphConfiguration.java:84)
>>>>at com.hrl.issl.osi.networks.HelloGiraph0p2.setConf(HelloGiraph0p2.java:34)
>>>>at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:61)
>>>>at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>>at com.hrl.issl.osi.networks.HelloGiraph0p2.main(HelloGiraph0p2.java:70)
>>>>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>at java.lang.reflect.Method.invoke(Method.java:597)
>>>>at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
>>>>
>>>>
>>>>
>>>>package networks;
>>>>
>>>>import java.io.IOException;
>>>>
>>>>import org.apache.giraph.conf.ImmutableClassesGiraphConfiguration;
>>>>import org.apache.giraph.graph.GiraphJob;
>>>>import org.apache.giraph.vertex.EdgeListVertex;
>>>>import org.apache.hadoop.conf.Configuration;
>>>>import org.apache.hadoop.io.LongWritable;
>>>>import org.apache.hadoop.io.Text;
>>>>import org.apache.hadoop.util.Tool;
>>>>import org.apache.hadoop.util.ToolRunner;
>>>>import org.apache.log4j.Logger;
>>>>
>>>>/**
>>>> *
>>>> * Hello world giraph 0.2...
>>>> *
>>>> */
>>>>public class HelloGiraph0p2 extends EdgeListVertex<LongWritable, Text,
>>>>Text, Text> implements Tool {
>>>>/** Configuration */
>>>>private ImmutableClassesGiraphConfiguration<LongWritable, Text, Text,
>>>>Text> conf;
>>>>/** Class logger */
>>>>private static final Logger LOG = Logger.getLogger(HelloGiraph0p2.class);
>>>>
>>>>@Override
>>>>public void compute(Iterable<Text> arg0) throws IOException {
>>>>int four = 2+2;
>>>>}
>>>>@Override
>>>>public void setConf(Configuration configurationIn) {
>>>>this.conf = new ImmutableClassesGiraphConfiguration<LongWritable,
>>>>Text, Text, Text>(configurationIn);
>>>>return;
>>>>}
>>>>@Override
>>>>public ImmutableClassesGiraphConfiguration<LongWritable, Text, Text,
>>>>Text> getConf() {
>>>>return conf;
>>>>}
>>>>
>>>>/**
>>>>*
>>>>* ToolRunner run
>>>>*
>>>>* @param arg0
>>>>* @return
>>>>* @throws Exception
>>>>*/
>>>>@Override
>>>>public int run(String[] arg0) throws Exception {
>>>>GiraphJob job = new GiraphJob(getConf(), getClass().getName());
>>>>
>>>>return job.run(true) ? 0 : -1;
>>>>
>>>>}
>>>>/**
>>>>* main...
>>>>*
>>>>* @param args
>>>>* @throws Exception
>>>>*/
>>>>public static void main(String[] args) throws Exception {
>>>>System.exit(ToolRunner.run(new HelloGiraph0p2(), args));
>>>>}
>>>>
>>>>}
>>>>
>>>>
>>>>
>>>>On Tue, Feb 5, 2013 at 4:24 AM, Gustavo Enrique Salazar Torres
>>>><gsalazar@ime.usp.br> wrote:
>>>>> Hi Ryan:
>>>>>
>>>>> I got that same error and discovered that I have to start a ZooKeeper
>>>>> instance. What I did was download ZooKeeper and write a new zoo.cfg
>>>>> file under the conf directory with the following:
>>>>>
>>>>> dataDir=/home/user/zookeeper-3.4.5/tmp
>>>>> clientPort=2181
>>>>>
>>>>> Also I added some lines in Hadoop's core-site.xml:
>>>>> <property>
>>>>>     <name>giraph.zkList</name>
>>>>>     <value>localhost:2181</value>
>>>>>   </property>
>>>>>
>>>>> Then I start ZooKeeper with bin/zkServer.sh start (you will also have
>>>>> to restart Hadoop) and then you can launch your Giraph job.
>>>>> This setup worked for me (maybe there is an easier way :D), hope it is
>>>>> useful.
>>>>>
>>>>> Best regards
>>>>> Gustavo
>>>>>
>>>>>
>>>>> On Mon, Feb 4, 2013 at 10:06 PM, Ryan Compton <compton.ryan@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Ok great, thanks. I've been working with 0.1. I can get things to
>>>>>> compile (see the code below), but they still don't run: the maps hang
>>>>>> (log also below). I have no idea how to fix it. I may update the code
>>>>>> that compiles to 0.2 and see if it works then. The only difference I
>>>>>> can see is that 0.2 requires everything to have a "message"
>>>>>>
>>>>>> -bash-3.2$ hadoop jar target/giraph-0.1-jar-with-dependencies.jar
>>>>>> com.SimpleGiraphSumEdgeWeights /user/rfcompton/giraphTSPInput
>>>>>> /user/rfcompton/giraphTSPOutput 3 3
>>>>>> 13/02/04 15:48:23 INFO mapred.JobClient: Running job:
>>>>>> job_201301230932_1199
>>>>>> 13/02/04 15:48:24 INFO mapred.JobClient:  map 0% reduce 0%
>>>>>> 13/02/04 15:48:35 INFO mapred.JobClient:  map 25% reduce 0%
>>>>>> 13/02/04 15:58:40 INFO mapred.JobClient: Task Id :
>>>>>> attempt_201301230932_1199_m_000003_0, Status : FAILED
>>>>>> java.lang.IllegalStateException: run: Caught an unrecoverable
>>>>>> exception setup: Offlining servers due to exception...
>>>>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:641)
>>>>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>>>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>>>>>> at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>>>>>> at org.apache.hadoop.mapred.Child.main(Child.java:264)
>>>>>> Caused by: java.lang.RuntimeException: setup: Offlining servers due to
>>>>>> exception...
>>>>>> at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:466)
>>>>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:630)
>>>>>> ... 7 more
>>>>>> Caused by: java.lang.IllegalStateException: setup: loadVertices failed
>>>>>> at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:582)
>>>>>> at org.apache.
>>>>>> Task attempt_201301230932_1199_m_000003_0 failed to report status for
>>>>>> 600 seconds. Killing!
>>>>>> 13/02/04 15:58:43 INFO mapred.JobClient: Task Id :
>>>>>> attempt_201301230932_1199_m_000002_0, Status : FAILED
>>>>>> Task attempt_201301230932_1199_m_000002_0 failed to report status for
>>>>>> 600 seconds. Killing!
>>>>>> 13/02/04 15:58:43 INFO mapred.JobClient: Task Id :
>>>>>> attempt_201301230932_1199_m_000000_0, Status : FAILED
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>========================================================================
>>>>>> This is the code I was using:
>>>>>>
>>>>>> import com.google.common.base.Preconditions;
>>>>>> import com.google.common.collect.Maps;
>>>>>>
>>>>>> import org.apache.giraph.comm.ArrayListWritable;
>>>>>> import org.apache.giraph.graph.BasicVertex;
>>>>>> import org.apache.giraph.graph.BspUtils;
>>>>>> import org.apache.giraph.graph.GiraphJob;
>>>>>> import org.apache.giraph.graph.EdgeListVertex;
>>>>>> import org.apache.giraph.graph.VertexReader;
>>>>>> import org.apache.giraph.graph.VertexWriter;
>>>>>> import org.apache.giraph.lib.TextVertexInputFormat;
>>>>>> import org.apache.giraph.lib.TextVertexInputFormat.TextVertexReader;
>>>>>> import org.apache.giraph.lib.TextVertexOutputFormat;
>>>>>> import org.apache.giraph.lib.TextVertexOutputFormat.TextVertexWriter;
>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>> import org.apache.hadoop.fs.Path;
>>>>>> import org.apache.hadoop.io.DoubleWritable;
>>>>>> import org.apache.hadoop.io.FloatWritable;
>>>>>> import org.apache.hadoop.io.IntWritable;
>>>>>> import org.apache.hadoop.io.LongWritable;
>>>>>> import org.apache.hadoop.io.Text;
>>>>>> import org.apache.hadoop.mapreduce.InputSplit;
>>>>>> import org.apache.hadoop.mapreduce.RecordReader;
>>>>>> import org.apache.hadoop.mapreduce.RecordWriter;
>>>>>> import org.apache.hadoop.mapreduce.TaskAttemptContext;
>>>>>> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
>>>>>> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
>>>>>> import org.apache.hadoop.util.Tool;
>>>>>> import org.apache.hadoop.util.ToolRunner;
>>>>>> import org.apache.log4j.Logger;
>>>>>> import org.json.JSONArray;
>>>>>> import org.json.JSONException;
>>>>>>
>>>>>> import java.io.IOException;
>>>>>> import java.util.ArrayList;
>>>>>> import java.util.Iterator;
>>>>>> import java.util.Map;
>>>>>> import java.util.StringTokenizer;
>>>>>>
>>>>>> /**
>>>>>>  * Shows an example of a brute-force implementation of the Travelling
>>>>>>  * Salesman Problem
>>>>>>  */
>>>>>> public class SimpleGiraphSumEdgeWeights extends
>>>>>> EdgeListVertex<LongWritable, ArrayListWritable<DoubleWritable>,
>>>>>> FloatWritable, ArrayListWritable<Text>> implements Tool {
>>>>>>     /** Configuration */
>>>>>>     private Configuration conf;
>>>>>>     /** Class logger */
>>>>>>     private static final Logger LOG =
>>>>>>         Logger.getLogger(SimpleGiraphSumEdgeWeights.class);
>>>>>>     /** The shortest paths id */
>>>>>>     public static String SOURCE_ID =
>>>>>>"SimpleShortestPathsVertex.sourceId";
>>>>>>     /** Default shortest paths id */
>>>>>>     public static long SOURCE_ID_DEFAULT = 1;
>>>>>>
>>>>>>
>>>>>>     /**
>>>>>>      * Is this vertex the source id?
>>>>>>      *
>>>>>>      * @return True if the source id
>>>>>>      */
>>>>>>     private boolean isSource() {
>>>>>>         return (getVertexId().get() ==
>>>>>>             getContext().getConfiguration().getLong(SOURCE_ID,
>>>>>>
>>>>>>SOURCE_ID_DEFAULT));
>>>>>>     }
>>>>>>     public class Message extends ArrayListWritable<Text> {
>>>>>>     public Message() {
>>>>>>       super();
>>>>>>     }
>>>>>>
>>>>>> @Override
>>>>>> public void setClass() {
>>>>>> // TODO Auto-generated method stub
>>>>>> }
>>>>>>   }
>>>>>>     public class Valeur extends ArrayListWritable<DoubleWritable> {
>>>>>>     public Valeur() {
>>>>>>       super();
>>>>>>     }
>>>>>>
>>>>>>   @Override
>>>>>>   public void setClass() {
>>>>>>   // TODO Auto-generated method stub
>>>>>>
>>>>>>   }
>>>>>>   }
>>>>>>
>>>>>>     @Override
>>>>>>     public void compute(Iterator<ArrayListWritable<Text>> msgIterator) {
>>>>>>         System.out.println("****     LAUNCHING COMPUTATION FOR VERTEX "
>>>>>>             + this.getVertexId().get() + ", SUPERSTEP "
>>>>>>             + this.getSuperstep() + " ****");
>>>>>>         // We get the source ID, we will need it
>>>>>>         String sourceID = new LongWritable(
>>>>>>             this.getContext().getConfiguration().getLong(SOURCE_ID,
>>>>>>                 SOURCE_ID_DEFAULT)).toString();
>>>>>>         // We get the total number of vertices and the current superstep
>>>>>>         // number, we will need them too
>>>>>>         int J = 1;
>>>>>>
>>>>>>         voteToHalt();
>>>>>>     }
>>>>>>
>>>>>>     /**
>>>>>>      * VertexInputFormat that supports {@link
>>>>>>SimpleGiraphSumEdgeWeights}
>>>>>>      */
>>>>>>     public static class SimpleShortestPathsVertexInputFormat extends
>>>>>> TextVertexInputFormat<LongWritable, ArrayListWritable<DoubleWritable>,
>>>>>>                                   FloatWritable,
>>>>>>                                   DoubleWritable> {
>>>>>>         @Override
>>>>>>         public VertexReader<LongWritable,
>>>>>> ArrayListWritable<DoubleWritable>, FloatWritable, DoubleWritable>
>>>>>>                 createVertexReader(InputSplit split,
>>>>>>                                    TaskAttemptContext context)
>>>>>>                                    throws IOException {
>>>>>>             return new SimpleShortestPathsVertexReader(
>>>>>>                 textInputFormat.createRecordReader(split, context));
>>>>>>         }
>>>>>>     }
>>>>>>
>>>>>>     /**
>>>>>>      * VertexReader that supports {@link SimpleGiraphSumEdgeWeights}. In
>>>>>>      * this case, the edge values are not used.  The files should be in
>>>>>>      * the following JSON format:
>>>>>>      * JSONArray(<vertex id>, <vertex value>,
>>>>>>      *           JSONArray(JSONArray(<dest vertex id>, <edge value>), ...))
>>>>>>      * Here is an example with vertex id 1, vertex value 4.3, and two edges.
>>>>>>      * First edge has a destination vertex 2, edge value 2.1.
>>>>>>      * Second edge has a destination vertex 3, edge value 0.7.
>>>>>>      * [1,4.3,[[2,2.1],[3,0.7]]]
>>>>>>      */
>>>>>>     public static class SimpleShortestPathsVertexReader extends
>>>>>>             TextVertexReader<LongWritable,
>>>>>>             ArrayListWritable<DoubleWritable>, FloatWritable,
>>>>>> DoubleWritable> {
>>>>>>
>>>>>>         public SimpleShortestPathsVertexReader(
>>>>>>                 RecordReader<LongWritable, Text> lineRecordReader) {
>>>>>>             super(lineRecordReader);
>>>>>>         }
>>>>>>
>>>>>>         public class Valeur extends ArrayListWritable<DoubleWritable> {
>>>>>>           public Valeur() {
>>>>>>             super();
>>>>>>           }
>>>>>>
>>>>>>       @Override
>>>>>>       public void setClass() {
>>>>>>       // TODO Auto-generated method stub
>>>>>>
>>>>>>       }
>>>>>>         }
>>>>>>
>>>>>>         @Override
>>>>>>         public BasicVertex<LongWritable,
>>>>>> ArrayListWritable<DoubleWritable>, FloatWritable,
>>>>>>                            DoubleWritable> getCurrentVertex()
>>>>>>             throws IOException, InterruptedException {
>>>>>>           BasicVertex<LongWritable, ArrayListWritable<DoubleWritable>,
>>>>>> FloatWritable,
>>>>>>               DoubleWritable> vertex = BspUtils.<LongWritable,
>>>>>> ArrayListWritable<DoubleWritable>, FloatWritable,
>>>>>>
>>>>>> DoubleWritable>createVertex(getContext().getConfiguration());
>>>>>>
>>>>>>             Text line = getRecordReader().getCurrentValue();
>>>>>>             try {
>>>>>>                 JSONArray jsonVertex = new JSONArray(line.toString());
>>>>>>                 LongWritable vertexId = new
>>>>>> LongWritable(jsonVertex.getLong(0));
>>>>>>                Valeur vertexValue = new Valeur();
>>>>>>                vertexValue.add(new
>>>>>> DoubleWritable(jsonVertex.getDouble(1)));
>>>>>>                 Map<LongWritable, FloatWritable> edges =
>>>>>> Maps.newHashMap();
>>>>>>                 JSONArray jsonEdgeArray = jsonVertex.getJSONArray(2);
>>>>>>                 for (int i = 0; i < jsonEdgeArray.length(); ++i) {
>>>>>>                     JSONArray jsonEdge = jsonEdgeArray.getJSONArray(i);
>>>>>>                     edges.put(new LongWritable(jsonEdge.getLong(0)),
>>>>>>                             new FloatWritable((float)
>>>>>> jsonEdge.getDouble(1)));
>>>>>>                 }
>>>>>>                 vertex.initialize(vertexId, vertexValue, edges, null);
>>>>>>             } catch (JSONException e) {
>>>>>>                 throw new IllegalArgumentException(
>>>>>>                     "next: Couldn't get vertex from line " +
>>>>>> line.toString(), e);
>>>>>>             }
>>>>>>             return vertex;
>>>>>>         }
>>>>>>
>>>>>>         @Override
>>>>>>         public boolean nextVertex() throws IOException,
>>>>>> InterruptedException {
>>>>>>             return getRecordReader().nextKeyValue();
>>>>>>         }
>>>>>>     }
>>>>>>
>>>>>>     /**
>>>>>>      * VertexOutputFormat that supports {@link
>>>>>>SimpleGiraphSumEdgeWeights}
>>>>>>      */
>>>>>>     public static class SimpleShortestPathsVertexOutputFormat extends
>>>>>>             TextVertexOutputFormat<LongWritable,
>>>>>> ArrayListWritable<DoubleWritable>,
>>>>>>             FloatWritable> {
>>>>>>
>>>>>>         @Override
>>>>>>         public VertexWriter<LongWritable,
>>>>>> ArrayListWritable<DoubleWritable>, FloatWritable>
>>>>>>                 createVertexWriter(TaskAttemptContext context)
>>>>>>                 throws IOException, InterruptedException {
>>>>>>             RecordWriter<Text, Text> recordWriter =
>>>>>>                 textOutputFormat.getRecordWriter(context);
>>>>>>             return new SimpleShortestPathsVertexWriter(recordWriter);
>>>>>>         }
>>>>>>     }
>>>>>>
>>>>>>     /**
>>>>>>      * VertexWriter that supports {@link SimpleGiraphSumEdgeWeights}
>>>>>>      */
>>>>>>     public static class SimpleShortestPathsVertexWriter extends
>>>>>>             TextVertexWriter<LongWritable,
>>>>>> ArrayListWritable<DoubleWritable>, FloatWritable> {
>>>>>>         public SimpleShortestPathsVertexWriter(
>>>>>>                 RecordWriter<Text, Text> lineRecordWriter) {
>>>>>>             super(lineRecordWriter);
>>>>>>         }
>>>>>>
>>>>>>         @Override
>>>>>>         public void writeVertex(BasicVertex<LongWritable,
>>>>>> ArrayListWritable<DoubleWritable>,
>>>>>>                                 FloatWritable, ?> vertex)
>>>>>>                 throws IOException, InterruptedException {
>>>>>>         String sourceID = new
>>>>>> LongWritable(vertex.getContext().getConfiguration().getLong(SOURCE_ID,
>>>>>>                     SOURCE_ID_DEFAULT)).toString();
>>>>>>         JSONArray jsonVertex = new JSONArray();
>>>>>>             try {
>>>>>>                 jsonVertex.put(vertex.getVertexId().get());
>>>>>>                 jsonVertex.put(vertex.getVertexValue().toString());
>>>>>>                 JSONArray jsonEdgeArray = new JSONArray();
>>>>>>                 for (LongWritable targetVertexId : vertex) {
>>>>>>                     JSONArray jsonEdge = new JSONArray();
>>>>>>                     jsonEdge.put(targetVertexId.get());
>>>>>>
>>>>>> jsonEdge.put(vertex.getEdgeValue(targetVertexId).get());
>>>>>>                     jsonEdgeArray.put(jsonEdge);
>>>>>>                 }
>>>>>>                 jsonVertex.put(jsonEdgeArray);
>>>>>>             } catch (JSONException e) {
>>>>>>                 throw new IllegalArgumentException(
>>>>>>                     "writeVertex: Couldn't write vertex " + vertex, e);
>>>>>>             }
>>>>>>             getRecordWriter().write(new Text(jsonVertex.toString()),
>>>>>> null);
>>>>>>         }
>>>>>>     }
>>>>>>
>>>>>>     @Override
>>>>>>     public Configuration getConf() {
>>>>>>         return conf;
>>>>>>     }
>>>>>>
>>>>>>     @Override
>>>>>>     public void setConf(Configuration conf) {
>>>>>>         this.conf = conf;
>>>>>>     }
>>>>>>
>>>>>>     @Override
>>>>>>     public int run(String[] argArray) throws Exception {
>>>>>>         Preconditions.checkArgument(argArray.length == 4,
>>>>>>             "run: Must have 4 arguments <input path> <output path> " +
>>>>>>             "<source vertex id> <# of workers>");
>>>>>>
>>>>>>         GiraphJob job = new GiraphJob(getConf(), getClass().getName());
>>>>>>         job.setVertexClass(getClass());
>>>>>>         job.setVertexInputFormatClass(
>>>>>>             SimpleShortestPathsVertexInputFormat.class);
>>>>>>         job.setVertexOutputFormatClass(
>>>>>>             SimpleShortestPathsVertexOutputFormat.class);
>>>>>>         FileInputFormat.addInputPath(job, new Path(argArray[0]));
>>>>>>         FileOutputFormat.setOutputPath(job, new Path(argArray[1]));
>>>>>>
>>>>>> job.getConfiguration().setLong(SimpleGiraphSumEdgeWeights.SOURCE_ID,
>>>>>>                                        Long.parseLong(argArray[2]));
>>>>>>         job.setWorkerConfiguration(Integer.parseInt(argArray[3]),
>>>>>>                                    Integer.parseInt(argArray[3]),
>>>>>>                                    100.0f);
>>>>>>
>>>>>>         return job.run(true) ? 0 : -1;
>>>>>>     }
>>>>>>
>>>>>>     public static void main(String[] args) throws Exception {
>>>>>>         System.exit(ToolRunner.run(new SimpleGiraphSumEdgeWeights(),
>>>>>> args));
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> On Fri, Feb 1, 2013 at 5:37 PM, Eli Reisman <apache.mailbox@gmail.com>
>>>>>> wrote:
>>>>>> > Your best bet is to look over the two code components that users
>>>>>> > most often have to tweak or implement to write application code.
>>>>>> > That is, the Vertex implementations in examples/ and benchmark/ and
>>>>>> > the IO formats and related goodies like RecordReaders etc. that are
>>>>>> > mostly in the io/ dir. You might also take a look at the test suite
>>>>>> > for some quick ideas of how some of the moving parts fit together.
>>>>>> >
>>>>>> > If you have real work to do with Giraph, you're going to need to
>>>>>> > get used to 0.2 and its API. The old API is both limited in what
>>>>>> > kind of data it will process, and not compatible into the future.
>>>>>> > The API we have now, while evolving, is much much closer to being
>>>>>> > "final" than anything in 0.1. And regardless, we now have (in
>>>>>> > hindsight) the sure knowledge that none of the code you write for
>>>>>> > 0.1 will be portable into the future.
>>>>>> >
>>>>>> > I am first in line to be sorry about the state of the docs. There
>>>>>> > are efforts underway now to fix this.  We all owe the users a
>>>>>> > collective apology for this. In lieu of proper apologies, feel free
>>>>>> > to ask any and all questions, no matter how dumb, they can't be as
>>>>>> > dumb as mine! The codebase is under heavy development and has a lot
>>>>>> > of confusingly-named moving parts so first get used to the plumbing
>>>>>> > an app writer has to know to function, get some apps up and
>>>>>> > running, then dig into the framework code and it will make more
>>>>>> > sense.
>>>>>> >
>>>>>> > One string to pull on to begin to look inside the framework is
>>>>>> > bin/giraph ->
>>>>>> > org.apache.giraph.GiraphRunner (hands job to Hadoop) -> ... ->
>>>>>> > o.a.g.graph.GraphMapper (is a mapper instance on a Hadoop cluster,
>>>>>> > started
>>>>>> > according to the Job sumbitted to Hadoop, but running our BSP code
>>>>>> > instead)
>>>>>> > -> o.a.g.graph.GraphTaskManager -> lots of places from there...
>>>>>> >
>>>>>> > The overarching BSP activity management for a single job run is
>>>>>> > basically
>>>>>> > all stemming out of GraphTaskManager now. You can look at setup() and
>>>>>> > execute() and get a decent idea of the major events in a job run, and
>>>>>> > where
>>>>>> > to look to get a better peek under the hood at any given task or
>>>>>>event.
>>>>>> > Good
>>>>>> > luck!
>>>>>> >
>>>>>> >
>>>>>> > On Fri, Feb 1, 2013 at 4:59 PM, Gustavo Enrique Salazar Torres
>>>>>> > <gsalazar@ime.usp.br> wrote:
>>>>>> >>
>>>>>> >> Hi Ryan:
>>>>>> >>
>>>>>> >> It's the simplest thing:
>>>>>> >> 1. Define your type of parameters for a type of Vertex (for example
>>>>>> >> EdgeListVertex)
>>>>>> >> 2. Implement compute method.
>>>>>> >>
>>>>>> >> From what I saw out there in the M/R world, Giraph provides the
>>>>>> >> simplest way to work with graphs.
>>>>>> >>
>>>>>> >> Take a look at
>>>>>> >> https://cwiki.apache.org/confluence/display/GIRAPH/Shortest+Paths+Example
>>>>>> >> and use release 0.1
>>>>>> >> (http://www.apache.org/dyn/closer.cgi/incubator/giraph/)
>>>>>> >> because 0.2-SNAPSHOT is under heavy work.
>>>>>> >>
>>>>>> >> Hope this helps you.
>>>>>> >>
>>>>>> >> Gustavo
>>>>>> >>
>>>>>> >> On Fri, Feb 1, 2013 at 9:17 PM, Ryan Compton <compton.ryan@gmail.com>
>>>>>> >> wrote:
>>>>>> >>>
>>>>>> >>> I am having trouble understanding what all the classes do and the
>>>>>> >>> documentation looks like it might be out of date. I searched around
>>>>>> >>> and found this: https://github.com/edaboussi/Giraph but it won't
>>>>>> >>> compile with 0.2, any suggestions?
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >
>>>

On Thu, Feb 21, 2013 at 5:35 PM, Ryan Compton <compton.ryan@gmail.com> wrote:
> OK, ignoring everything about I/O, I can run the code below.
>
> But then it hangs at 0% reduce forever:
>
> -bash-3.2$ hadoop jar
> target/geocoderV2-1.0-SNAPSHOT-jar-with-dependencies.jar
> com.hrl.issl.osi.networks.HelloGiraph0p2
> 13/02/21 17:33:28 INFO graph.GiraphJob: run: Since checkpointing is
> disabled (default), do not allow any task retries (setting
> mapred.map.max.attempts = 0, old value = 4)
> 13/02/21 17:33:29 WARN bsp.BspOutputFormat: checkOutputSpecs:
> ImmutableOutputCommiter will not check anything
> 13/02/21 17:33:29 INFO mapred.JobClient: Running job: job_201302201124_0136
> 13/02/21 17:33:30 INFO mapred.JobClient:  map 0% reduce 0%
> 13/02/21 17:34:03 INFO mapred.JobClient:  map 74% reduce 0%
> 13/02/21 17:34:04 INFO mapred.JobClient:  map 80% reduce 0%
> 13/02/21 17:34:06 INFO mapred.JobClient:  map 100% reduce 0%
>
>
> What did I miss?
>
>
>
> import java.io.IOException;
>
> import org.apache.giraph.combiner.DoubleSumCombiner;
> import org.apache.giraph.conf.GiraphConfiguration;
> import org.apache.giraph.graph.GiraphJob;
> import org.apache.giraph.io.formats.PseudoRandomVertexInputFormat;
> import org.apache.giraph.vertex.EdgeListVertex;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.io.DoubleWritable;
> import org.apache.hadoop.io.LongWritable;
> import org.apache.hadoop.util.Tool;
> import org.apache.hadoop.util.ToolRunner;
> import org.apache.log4j.Logger;
>
> /**
>  * Minimal Giraph 0.2 "hello world" job.
>  */
> public class HelloGiraph0p2 implements Tool {
>
> /**
> * Configuration from Configurable
> */
> private Configuration conf;
>
> @Override
> public Configuration getConf() {
> return conf;
> }
>
> @Override
> public void setConf(Configuration conf) {
> this.conf = conf;
> }
>
> @Override
> public final int run(final String[] args) throws Exception {
>
> String name = getClass().getName();
>
> GiraphJob job = new GiraphJob(getConf(), name);
> GiraphConfiguration configuration = job.getConfiguration();
>
> configuration.setVertexClass(EdgeListVertexTwoPlusTwo.class);
>
> configuration.useUnsafeSerialization(true);
>
> configuration.setVertexCombinerClass(DoubleSumCombiner.class);
>
> configuration.setVertexInputFormatClass(PseudoRandomVertexInputFormat.class);
> configuration.setLong(PseudoRandomVertexInputFormat.AGGREGATE_VERTICES,
> Long.parseLong("1000") );
> configuration.setLong(PseudoRandomVertexInputFormat.EDGES_PER_VERTEX,Long.parseLong("10"));
>
> int workers = Integer.parseInt("30");
> configuration.setWorkerConfiguration(workers, workers, 100.0f);
>
> //configuration.setInt(PageRankComputation.SUPERSTEP_COUNT, Integer.parseInt("10"));
>
> boolean isVerbose = true;
> if (job.run(isVerbose)) {
> return 0;
> } else {
> return -1;
> }
> }
>
> int NUM_SUPERSTEPS = 5;
> public class EdgeListVertexTwoPlusTwo extends
> EdgeListVertex<LongWritable, DoubleWritable, DoubleWritable,
> DoubleWritable> {
> @Override
> public void compute(Iterable<DoubleWritable> messages) throws IOException {
>
> if (this.getSuperstep() >= 1) {
> double four = 2+2;
> this.setValue( new DoubleWritable(four));
> }
>
> if (this.getSuperstep() < NUM_SUPERSTEPS) {
> this.sendMessageToAllEdges( new DoubleWritable(this.getValue().get()));
> } else {
> this.voteToHalt();
> }
> }
> }
>
> /**
> * Execute the benchmark.
> *
> * @param args Typically the command line arguments.
> * @throws Exception Any exception from the computation.
> */
> public static void main(final String[] args) throws Exception {
> System.exit(ToolRunner.run(new HelloGiraph0p2(), args));
> }
> }
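
One possible explanation for the stall above, offered as an assumption rather than a confirmed diagnosis: EdgeListVertexTwoPlusTwo is declared as a non-static inner class of HelloGiraph0p2. Frameworks that create user classes reflectively (as Hadoop-style code generally does) need a no-argument constructor, and a non-static inner class does not have one, because its constructor implicitly takes the enclosing instance. The sketch below uses only the JDK; NestedVertexReflectionDemo, InnerVertex, and StaticVertex are hypothetical names standing in for the Giraph types.

```java
import java.lang.reflect.Constructor;

public class NestedVertexReflectionDemo {
    /** Non-static inner class: its only constructor takes the enclosing instance. */
    public class InnerVertex { }

    /** Static nested class: has a genuine no-argument constructor. */
    public static class StaticVertex { }

    /**
     * Tries to create an instance through a no-argument constructor,
     * the way reflection-based frameworks typically do.
     */
    public static boolean hasNoArgConstructor(Class<?> cls) {
        try {
            Constructor<?> ctor = cls.getDeclaredConstructor();
            ctor.setAccessible(true);
            ctor.newInstance();
            return true;
        } catch (ReflectiveOperationException e) {
            // NoSuchMethodException lands here for the inner class.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("inner:  " + hasNoArgConstructor(InnerVertex.class));
        System.out.println("static: " + hasNoArgConstructor(StaticVertex.class));
    }
}
```

If this is indeed the cause, declaring the vertex as a `public static class`, or moving it into its own file, would give it a real no-argument constructor. NUM_SUPERSTEPS would then have to become a static constant, since a static nested class cannot read an instance field of the outer class.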
>
> On Thu, Feb 21, 2013 at 4:10 PM, Ryan Compton <compton.ryan@gmail.com> wrote:
>> Ok, I've been looking at the PageRankBenchmark. There's a lot going on
>> in there...
>>
>> It looks like the minimum amount of stuff I need to run a do-nothing
>> job is what I've got below.
>>
>> But now it's telling me that the output directory is not set (and
>> PageRankBenchmark doesn't have the word "output" anywhere):
>>
>> 13/02/21 16:08:37 ERROR security.UserGroupInformation:
>> PriviledgedActionException as:rfcompton (auth:SIMPLE)
>> cause:org.apache.hadoop.mapred.InvalidJobConfException: Output
>> directory not set.
>> Exception in thread "main"
>> org.apache.hadoop.mapred.InvalidJobConfException: Output directory not
>> set.
>> at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:123)
>> at org.apache.giraph.io.formats.TextVertexOutputFormat.checkOutputSpecs(TextVertexOutputFormat.java:55)
>> at org.apache.giraph.bsp.BspOutputFormat.checkOutputSpecs(BspOutputFormat.java:48)
>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:872)
>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:396)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
>> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
>> at org.apache.giraph.graph.GiraphJob.run(GiraphJob.java:282)
>> at com.hrl.issl.osi.networks.HelloGiraph0p2.run(HelloGiraph0p2.java:49)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>> at com.hrl.issl.osi.networks.HelloGiraph0p2.main(HelloGiraph0p2.java:55)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> at java.lang.reflect.Method.invoke(Method.java:597)
>> at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
>>
>>
>>
>> import org.apache.giraph.conf.GiraphConfiguration;
>> import org.apache.giraph.graph.GiraphJob;
>> import org.apache.giraph.io.formats.JsonBase64VertexInputFormat;
>> import org.apache.giraph.io.formats.JsonBase64VertexOutputFormat;
>> import org.apache.giraph.io.formats.TextVertexInputFormat;
>> import org.apache.giraph.io.formats.TextVertexOutputFormat;
>> import org.apache.giraph.vertex.SimpleVertex;
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.util.Tool;
>> import org.apache.hadoop.util.ToolRunner;
>>
>> /**
>>  *
>>  * Hello world giraph 0.2...
>>  *
>>  */
>> public class HelloGiraph0p2 implements Tool {
>> /** Configuration */
>> private Configuration conf;
>>
>> @Override
>> public void setConf(Configuration conf) {
>> this.conf = conf;
>> }
>>
>> @Override
>> public Configuration getConf() {
>> return conf;
>> }
>>
>> @Override
>> public int run(String[] arg0) throws Exception {
>>
>> GiraphJob job = new GiraphJob(getConf(), getClass().getName());
>>
>> GiraphConfiguration configuration = job.getConfiguration();
>>
>> configuration.setVertexClass(SimpleVertex.class);
>>
>> configuration.setVertexInputFormatClass(JsonBase64VertexInputFormat.class);
>> configuration.setVertexOutputFormatClass(JsonBase64VertexOutputFormat.class);
>>
>> configuration.setWorkerConfiguration(30, 30, 100.0f);
>>
>> return job.run(true) ? 0 : -1;
>>
>> }
>>
>> public static void main(String[] args) throws Exception {
>>
>> System.exit(ToolRunner.run(new HelloGiraph0p2(), args));
>> }
>>
>> }
>>
>> On Thu, Feb 21, 2013 at 2:58 PM, Maja Kabiljo <majakabiljo@fb.com> wrote:
>>> Hi Ryan,
>>>
>>> Before running the job, you need to set Vertex and input/output format
>>> classes on it. Please take a look at one of the benchmarks to see how to
>>> do that. Alternatively, you can try using GiraphRunner, where you pass
>>> these classes as command line arguments.
>>>
>>> Maja
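
The NullPointerException in the trace quoted below is raised inside ReflectionUtils.getTypeArguments, which fits Maja's point: with no Vertex class configured there is nothing to inspect. The sketch below shows the general JDK technique of reading a subclass's generic type arguments, and how it fails when the class is missing. It is not Giraph's implementation; BaseVertex, MyVertex, and typeArguments are hypothetical names.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class TypeArgumentsDemo {
    /** Hypothetical stand-in for a vertex base class with four type parameters. */
    public abstract static class BaseVertex<I, V, E, M> { }

    /** A concrete vertex that fixes all four type parameters. */
    public static class MyVertex extends BaseVertex<Long, Double, Float, String> { }

    /** Reads the actual type arguments of the class's direct generic superclass. */
    public static Type[] typeArguments(Class<?> vertexClass) {
        // Throws NullPointerException when vertexClass is null (no class configured).
        ParameterizedType parent =
            (ParameterizedType) vertexClass.getGenericSuperclass();
        return parent.getActualTypeArguments();
    }

    /** Returns true if the lookup fails with an NPE when no class is supplied. */
    public static boolean failsWithoutClass() {
        try {
            typeArguments(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        for (Type t : typeArguments(MyVertex.class)) {
            System.out.println(t.getTypeName());
        }
        System.out.println("fails without class: " + failsWithoutClass());
    }
}
```

Setting the vertex class on the job configuration before anything tries to resolve these types avoids the null lookup.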
>>>
>>> On 2/21/13 2:43 PM, "Ryan Compton" <compton.ryan@gmail.com> wrote:
>>>
>>>>I'm still struggling with this. I am trying to use 0.2, and I don't have
>>>>permissions to edit core-site.xml.
>>>>
>>>>I think this is the most basic boilerplate code for a 0.2 Giraph
>>>>project, but I still can't run it.
>>>>
>>>>Exception in thread "main" java.lang.NullPointerException
>>>>at org.apache.giraph.utils.ReflectionUtils.getTypeArguments(ReflectionUtils.java:85)
>>>>at org.apache.giraph.conf.GiraphClasses.readFromConf(GiraphClasses.java:117)
>>>>at org.apache.giraph.conf.GiraphClasses.<init>(GiraphClasses.java:105)
>>>>at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.<init>(ImmutableClassesGiraphConfiguration.java:84)
>>>>at com.hrl.issl.osi.networks.HelloGiraph0p2.setConf(HelloGiraph0p2.java:34)
>>>>at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:61)
>>>>at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>>at com.hrl.issl.osi.networks.HelloGiraph0p2.main(HelloGiraph0p2.java:70)
>>>>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>at java.lang.reflect.Method.invoke(Method.java:597)
>>>>at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
>>>>
>>>>
>>>>
>>>>package networks;
>>>>
>>>>import java.io.IOException;
>>>>
>>>>import org.apache.giraph.conf.ImmutableClassesGiraphConfiguration;
>>>>import org.apache.giraph.graph.GiraphJob;
>>>>import org.apache.giraph.vertex.EdgeListVertex;
>>>>import org.apache.hadoop.conf.Configuration;
>>>>import org.apache.hadoop.io.LongWritable;
>>>>import org.apache.hadoop.io.Text;
>>>>import org.apache.hadoop.util.Tool;
>>>>import org.apache.hadoop.util.ToolRunner;
>>>>import org.apache.log4j.Logger;
>>>>
>>>>/**
>>>> *
>>>> * Hello world giraph 0.2...
>>>> *
>>>> */
>>>>public class HelloGiraph0p2 extends EdgeListVertex<LongWritable, Text,
>>>>Text, Text> implements Tool {
>>>>/** Configuration */
>>>>private ImmutableClassesGiraphConfiguration<LongWritable, Text, Text,
>>>>Text> conf;
>>>>/** Class logger */
>>>>private static final Logger LOG = Logger.getLogger(HelloGiraph0p2.class);
>>>>
>>>>@Override
>>>>public void compute(Iterable<Text> arg0) throws IOException {
>>>>int four = 2+2;
>>>>}
>>>>@Override
>>>>public void setConf(Configuration configurationIn) {
>>>>this.conf = new ImmutableClassesGiraphConfiguration<LongWritable,
>>>>Text, Text, Text>(configurationIn);
>>>>return;
>>>>}
>>>>@Override
>>>>public ImmutableClassesGiraphConfiguration<LongWritable, Text, Text,
>>>>Text> getConf() {
>>>>return conf;
>>>>}
>>>>
>>>>/**
>>>>*
>>>>* ToolRunner run
>>>>*
>>>>* @param arg0
>>>>* @return
>>>>* @throws Exception
>>>>*/
>>>>@Override
>>>>public int run(String[] arg0) throws Exception {
>>>>GiraphJob job = new GiraphJob(getConf(), getClass().getName());
>>>>
>>>>return job.run(true) ? 0 : -1;
>>>>
>>>>}
>>>>/**
>>>>* main...
>>>>*
>>>>* @param args
>>>>* @throws Exception
>>>>*/
>>>>public static void main(String[] args) throws Exception {
>>>>System.exit(ToolRunner.run(new HelloGiraph0p2(), args));
>>>>}
>>>>
>>>>}
>>>>
>>>>
>>>>
>>>>On Tue, Feb 5, 2013 at 4:24 AM, Gustavo Enrique Salazar Torres
>>>><gsalazar@ime.usp.br> wrote:
>>>>> Hi Ryan:
>>>>>
>>>>> I got that same error and discovered that I have to start a zookeeper
>>>>> instance. What I did was to download Zookeeper, write a new zoo.cfg file
>>>>> under conf directory with the following:
>>>>>
>>>>> dataDir=/home/user/zookeeper-3.4.5/tmp
>>>>> clientPort=2181
>>>>>
>>>>> Also I added some lines in Hadoop's core-site.xml:
>>>>> <property>
>>>>>     <name>giraph.zkList</name>
>>>>>     <value>localhost:2181</value>
>>>>>   </property>
>>>>>
>>>>> Then I start ZooKeeper with bin/zkServer.sh start (you will also have to
>>>>> restart Hadoop) and then you can launch your Giraph job.
>>>>> This setup worked for me (maybe there is an easier way :D), hope it is
>>>>> useful.
>>>>>
>>>>> Best regards
>>>>> Gustavo
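
Before launching a job against the setup Gustavo describes, it can help to confirm that the ZooKeeper address is actually accepting connections. The following is a minimal JDK-only sketch, assuming the giraph.zkList value has the usual comma-separated host:port form; ZkListCheck, parse, and accepts are hypothetical helper names and not part of Giraph or ZooKeeper.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

public class ZkListCheck {
    /** Splits a comma-separated "host:port" list such as "localhost:2181". */
    public static List<InetSocketAddress> parse(String zkList) {
        List<InetSocketAddress> servers = new ArrayList<>();
        for (String entry : zkList.split(",")) {
            String[] hostPort = entry.trim().split(":");
            servers.add(InetSocketAddress.createUnresolved(
                hostPort[0], Integer.parseInt(hostPort[1])));
        }
        return servers;
    }

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean accepts(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the address used in the zoo.cfg / core-site.xml above.
        String zkList = args.length > 0 ? args[0] : "localhost:2181";
        for (InetSocketAddress s : parse(zkList)) {
            System.out.println(s.getHostString() + ":" + s.getPort()
                + " reachable: " + accepts(s.getHostString(), s.getPort(), 2000));
        }
    }
}
```

A successful connection only shows that something is listening on the port; it does not prove the listener is a healthy ZooKeeper server.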
>>>>>
>>>>>
>>>>> On Mon, Feb 4, 2013 at 10:06 PM, Ryan Compton <compton.ryan@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Ok great, thanks. I've been working with 0.1. I can get things to
>>>>>> compile (see the code below), but they still do not run: the maps hang
>>>>>> (log also below). I have no idea how to fix it. I may consider updating
>>>>>> the code I have that compiles to 0.2 and see if it works then. The
>>>>>> only difference I can see is that 0.2 requires everything to have a
>>>>>> "message".
>>>>>>
>>>>>> -bash-3.2$ hadoop jar target/giraph-0.1-jar-with-dependencies.jar
>>>>>> com.SimpleGiraphSumEdgeWeights /user/rfcompton/giraphTSPInput
>>>>>> /user/rfcompton/giraphTSPOutput 3 3
>>>>>> 13/02/04 15:48:23 INFO mapred.JobClient: Running job:
>>>>>> job_201301230932_1199
>>>>>> 13/02/04 15:48:24 INFO mapred.JobClient:  map 0% reduce 0%
>>>>>> 13/02/04 15:48:35 INFO mapred.JobClient:  map 25% reduce 0%
>>>>>> 13/02/04 15:58:40 INFO mapred.JobClient: Task Id :
>>>>>> attempt_201301230932_1199_m_000003_0, Status : FAILED
>>>>>> java.lang.IllegalStateException: run: Caught an unrecoverable
>>>>>> exception setup: Offlining servers due to exception...
>>>>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:641)
>>>>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>>>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>>>>>> at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>>>>>> at org.apache.hadoop.mapred.Child.main(Child.java:264)
>>>>>> Caused by: java.lang.RuntimeException: setup: Offlining servers due to
>>>>>> exception...
>>>>>> at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:466)
>>>>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:630)
>>>>>> ... 7 more
>>>>>> Caused by: java.lang.IllegalStateException: setup: loadVertices failed
>>>>>> at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:582)
>>>>>> at org.apache.
>>>>>> Task attempt_201301230932_1199_m_000003_0 failed to report status for
>>>>>> 600 seconds. Killing!
>>>>>> 13/02/04 15:58:43 INFO mapred.JobClient: Task Id :
>>>>>> attempt_201301230932_1199_m_000002_0, Status : FAILED
>>>>>> Task attempt_201301230932_1199_m_000002_0 failed to report status for
>>>>>> 600 seconds. Killing!
>>>>>> 13/02/04 15:58:43 INFO mapred.JobClient: Task Id :
>>>>>> attempt_201301230932_1199_m_000000_0, Status : FAILED
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>========================================================================
>>>>>>=========================
>>>>>> This is the code I was using:
>>>>>>
>>>>>> import com.google.common.base.Preconditions;
>>>>>> import com.google.common.collect.Maps;
>>>>>>
>>>>>> import org.apache.giraph.comm.ArrayListWritable;
>>>>>> import org.apache.giraph.graph.BasicVertex;
>>>>>> import org.apache.giraph.graph.BspUtils;
>>>>>> import org.apache.giraph.graph.GiraphJob;
>>>>>> import org.apache.giraph.graph.EdgeListVertex;
>>>>>> import org.apache.giraph.graph.VertexReader;
>>>>>> import org.apache.giraph.graph.VertexWriter;
>>>>>> import org.apache.giraph.lib.TextVertexInputFormat;
>>>>>> import org.apache.giraph.lib.TextVertexInputFormat.TextVertexReader;
>>>>>> import org.apache.giraph.lib.TextVertexOutputFormat;
>>>>>> import org.apache.giraph.lib.TextVertexOutputFormat.TextVertexWriter;
>>>>>> import org.apache.hadoop.conf.Configuration;
>>>>>> import org.apache.hadoop.fs.Path;
>>>>>> import org.apache.hadoop.io.DoubleWritable;
>>>>>> import org.apache.hadoop.io.FloatWritable;
>>>>>> import org.apache.hadoop.io.IntWritable;
>>>>>> import org.apache.hadoop.io.LongWritable;
>>>>>> import org.apache.hadoop.io.Text;
>>>>>> import org.apache.hadoop.mapreduce.InputSplit;
>>>>>> import org.apache.hadoop.mapreduce.RecordReader;
>>>>>> import org.apache.hadoop.mapreduce.RecordWriter;
>>>>>> import org.apache.hadoop.mapreduce.TaskAttemptContext;
>>>>>> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
>>>>>> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
>>>>>> import org.apache.hadoop.util.Tool;
>>>>>> import org.apache.hadoop.util.ToolRunner;
>>>>>> import org.apache.log4j.Logger;
>>>>>> import org.json.JSONArray;
>>>>>> import org.json.JSONException;
>>>>>>
>>>>>> import java.io.IOException;
>>>>>> import java.util.ArrayList;
>>>>>> import java.util.Iterator;
>>>>>> import java.util.Map;
>>>>>> import java.util.StringTokenizer;
>>>>>>
>>>>>> /**
>>>>>>  * Shows an example of a brute-force implementation of the Travelling
>>>>>>  * Salesman Problem
>>>>>>  */
>>>>>> public class SimpleGiraphSumEdgeWeights extends
>>>>>> EdgeListVertex<LongWritable, ArrayListWritable<DoubleWritable>,
>>>>>> FloatWritable, ArrayListWritable<Text>> implements Tool {
>>>>>>     /** Configuration */
>>>>>>     private Configuration conf;
>>>>>>     /** Class logger */
>>>>>>     private static final Logger LOG =
>>>>>>         Logger.getLogger(SimpleGiraphSumEdgeWeights.class);
>>>>>>     /** The shortest paths id */
>>>>>>     public static String SOURCE_ID =
>>>>>>         "SimpleShortestPathsVertex.sourceId";
>>>>>>     /** Default shortest paths id */
>>>>>>     public static long SOURCE_ID_DEFAULT = 1;
>>>>>>
>>>>>>
>>>>>>     /**
>>>>>>      * Is this vertex the source id?
>>>>>>      *
>>>>>>      * @return True if the source id
>>>>>>      */
>>>>>>     private boolean isSource() {
>>>>>>         return (getVertexId().get() ==
>>>>>>             getContext().getConfiguration().getLong(SOURCE_ID,
>>>>>>                                                     SOURCE_ID_DEFAULT));
>>>>>>     }
>>>>>>     public class Message extends ArrayListWritable<Text> {
>>>>>>     public Message() {
>>>>>>       super();
>>>>>>     }
>>>>>>
>>>>>> @Override
>>>>>> public void setClass() {
>>>>>> // TODO Auto-generated method stub
>>>>>> }
>>>>>>   }
>>>>>>     public class Valeur extends ArrayListWritable<DoubleWritable> {
>>>>>>     public Valeur() {
>>>>>>       super();
>>>>>>     }
>>>>>>
>>>>>>   @Override
>>>>>>   public void setClass() {
>>>>>>   // TODO Auto-generated method stub
>>>>>>
>>>>>>   }
>>>>>>   }
>>>>>>
>>>>>>     @Override
>>>>>>     public void compute(Iterator<ArrayListWritable<Text>> msgIterator) {
>>>>>>         System.out.println("****     LAUNCHING COMPUTATION FOR VERTEX "
>>>>>>             + this.getVertexId().get() + ", SUPERSTEP "
>>>>>>             + this.getSuperstep() + " ****");
>>>>>>         // We get the source ID, we will need it
>>>>>>         String sourceID = new LongWritable(
>>>>>>             this.getContext().getConfiguration().getLong(SOURCE_ID,
>>>>>>                 SOURCE_ID_DEFAULT)).toString();
>>>>>>         // We get the total number of vertices and the current superstep
>>>>>>         // number, we will need them too
>>>>>>         int J=1;
>>>>>>
>>>>>>         voteToHalt();
>>>>>>     }
>>>>>>
>>>>>>     /**
>>>>>>      * VertexInputFormat that supports {@link SimpleGiraphSumEdgeWeights}
>>>>>>      */
>>>>>>     public static class SimpleShortestPathsVertexInputFormat extends
>>>>>> TextVertexInputFormat<LongWritable, ArrayListWritable<DoubleWritable>,
>>>>>>                                   FloatWritable,
>>>>>>                                   DoubleWritable> {
>>>>>>         @Override
>>>>>>         public VertexReader<LongWritable,
>>>>>> ArrayListWritable<DoubleWritable>, FloatWritable, DoubleWritable>
>>>>>>                 createVertexReader(InputSplit split,
>>>>>>                                    TaskAttemptContext context)
>>>>>>                                    throws IOException {
>>>>>>             return new SimpleShortestPathsVertexReader(
>>>>>>                 textInputFormat.createRecordReader(split, context));
>>>>>>         }
>>>>>>     }
>>>>>>
>>>>>>     /**
>>>>>>      * VertexReader that supports {@link SimpleGiraphSumEdgeWeights}.
>>>>>>      * In this case, the edge values are not used.  The files should be
>>>>>>      * in the following JSON format:
>>>>>>      * JSONArray(<vertex id>, <vertex value>,
>>>>>>      *           JSONArray(JSONArray(<dest vertex id>, <edge value>), ...))
>>>>>>      * Here is an example with vertex id 1, vertex value 4.3, and two edges.
>>>>>>      * First edge has a destination vertex 2, edge value 2.1.
>>>>>>      * Second edge has a destination vertex 3, edge value 0.7.
>>>>>>      * [1,4.3,[[2,2.1],[3,0.7]]]
>>>>>>      */
>>>>>>     public static class SimpleShortestPathsVertexReader extends
>>>>>>             TextVertexReader<LongWritable,
>>>>>>             ArrayListWritable<DoubleWritable>, FloatWritable,
>>>>>> DoubleWritable> {
>>>>>>
>>>>>>         public SimpleShortestPathsVertexReader(
>>>>>>                 RecordReader<LongWritable, Text> lineRecordReader) {
>>>>>>             super(lineRecordReader);
>>>>>>         }
>>>>>>
>>>>>>         public class Valeur extends ArrayListWritable<DoubleWritable> {
>>>>>>           public Valeur() {
>>>>>>             super();
>>>>>>           }
>>>>>>
>>>>>>       @Override
>>>>>>       public void setClass() {
>>>>>>       // TODO Auto-generated method stub
>>>>>>
>>>>>>       }
>>>>>>         }
>>>>>>
>>>>>>         @Override
>>>>>>         public BasicVertex<LongWritable,
>>>>>> ArrayListWritable<DoubleWritable>, FloatWritable,
>>>>>>                            DoubleWritable> getCurrentVertex()
>>>>>>             throws IOException, InterruptedException {
>>>>>>           BasicVertex<LongWritable, ArrayListWritable<DoubleWritable>,
>>>>>> FloatWritable,
>>>>>>               DoubleWritable> vertex = BspUtils.<LongWritable,
>>>>>> ArrayListWritable<DoubleWritable>, FloatWritable,
>>>>>>
>>>>>> DoubleWritable>createVertex(getContext().getConfiguration());
>>>>>>
>>>>>>             Text line = getRecordReader().getCurrentValue();
>>>>>>             try {
>>>>>>                 JSONArray jsonVertex = new JSONArray(line.toString());
>>>>>>                 LongWritable vertexId = new
>>>>>> LongWritable(jsonVertex.getLong(0));
>>>>>>                Valeur vertexValue = new Valeur();
>>>>>>                vertexValue.add(new
>>>>>> DoubleWritable(jsonVertex.getDouble(1)));
>>>>>>                 Map<LongWritable, FloatWritable> edges =
>>>>>> Maps.newHashMap();
>>>>>>                 JSONArray jsonEdgeArray = jsonVertex.getJSONArray(2);
>>>>>>                 for (int i = 0; i < jsonEdgeArray.length(); ++i) {
>>>>>>                     JSONArray jsonEdge = jsonEdgeArray.getJSONArray(i);
>>>>>>                     edges.put(new LongWritable(jsonEdge.getLong(0)),
>>>>>>                             new FloatWritable((float)
>>>>>> jsonEdge.getDouble(1)));
>>>>>>                 }
>>>>>>                 vertex.initialize(vertexId, vertexValue, edges, null);
>>>>>>             } catch (JSONException e) {
>>>>>>                 throw new IllegalArgumentException(
>>>>>>                     "next: Couldn't get vertex from line " +
>>>>>> line.toString(), e);
>>>>>>             }
>>>>>>             return vertex;
>>>>>>         }
>>>>>>
>>>>>>         @Override
>>>>>>         public boolean nextVertex() throws IOException,
>>>>>> InterruptedException {
>>>>>>             return getRecordReader().nextKeyValue();
>>>>>>         }
>>>>>>     }
>>>>>>
>>>>>>     /**
>>>>>>      * VertexOutputFormat that supports {@link SimpleGiraphSumEdgeWeights}
>>>>>>      */
>>>>>>     public static class SimpleShortestPathsVertexOutputFormat extends
>>>>>>             TextVertexOutputFormat<LongWritable,
>>>>>> ArrayListWritable<DoubleWritable>,
>>>>>>             FloatWritable> {
>>>>>>
>>>>>>         @Override
>>>>>>         public VertexWriter<LongWritable,
>>>>>> ArrayListWritable<DoubleWritable>, FloatWritable>
>>>>>>                 createVertexWriter(TaskAttemptContext context)
>>>>>>                 throws IOException, InterruptedException {
>>>>>>             RecordWriter<Text, Text> recordWriter =
>>>>>>                 textOutputFormat.getRecordWriter(context);
>>>>>>             return new SimpleShortestPathsVertexWriter(recordWriter);
>>>>>>         }
>>>>>>     }
>>>>>>
>>>>>>     /**
>>>>>>      * VertexWriter that supports {@link SimpleGiraphSumEdgeWeights}
>>>>>>      */
>>>>>>     public static class SimpleShortestPathsVertexWriter extends
>>>>>>             TextVertexWriter<LongWritable,
>>>>>> ArrayListWritable<DoubleWritable>, FloatWritable> {
>>>>>>         public SimpleShortestPathsVertexWriter(
>>>>>>                 RecordWriter<Text, Text> lineRecordWriter) {
>>>>>>             super(lineRecordWriter);
>>>>>>         }
>>>>>>
>>>>>>         @Override
>>>>>>         public void writeVertex(BasicVertex<LongWritable,
>>>>>> ArrayListWritable<DoubleWritable>,
>>>>>>                                 FloatWritable, ?> vertex)
>>>>>>                 throws IOException, InterruptedException {
>>>>>>         String sourceID = new
>>>>>> LongWritable(vertex.getContext().getConfiguration().getLong(SOURCE_ID,
>>>>>>                     SOURCE_ID_DEFAULT)).toString();
>>>>>>         JSONArray jsonVertex = new JSONArray();
>>>>>>             try {
>>>>>>                 jsonVertex.put(vertex.getVertexId().get());
>>>>>>                 jsonVertex.put(vertex.getVertexValue().toString());
>>>>>>                 JSONArray jsonEdgeArray = new JSONArray();
>>>>>>                 for (LongWritable targetVertexId : vertex) {
>>>>>>                     JSONArray jsonEdge = new JSONArray();
>>>>>>                     jsonEdge.put(targetVertexId.get());
>>>>>>
>>>>>> jsonEdge.put(vertex.getEdgeValue(targetVertexId).get());
>>>>>>                     jsonEdgeArray.put(jsonEdge);
>>>>>>                 }
>>>>>>                 jsonVertex.put(jsonEdgeArray);
>>>>>>             } catch (JSONException e) {
>>>>>>                 throw new IllegalArgumentException(
>>>>>>                     "writeVertex: Couldn't write vertex " + vertex);
>>>>>>             }
>>>>>>             getRecordWriter().write(new Text(jsonVertex.toString()),
>>>>>> null);
>>>>>>         }
>>>>>>     }
>>>>>>
>>>>>>     @Override
>>>>>>     public Configuration getConf() {
>>>>>>         return conf;
>>>>>>     }
>>>>>>
>>>>>>     @Override
>>>>>>     public void setConf(Configuration conf) {
>>>>>>         this.conf = conf;
>>>>>>     }
>>>>>>
>>>>>>     @Override
>>>>>>     public int run(String[] argArray) throws Exception {
>>>>>>         Preconditions.checkArgument(argArray.length == 4,
>>>>>>             "run: Must have 4 arguments <input path> <output path> " +
>>>>>>             "<source vertex id> <# of workers>");
>>>>>>
>>>>>>         GiraphJob job = new GiraphJob(getConf(), getClass().getName());
>>>>>>         job.setVertexClass(getClass());
>>>>>>         job.setVertexInputFormatClass(
>>>>>>             SimpleShortestPathsVertexInputFormat.class);
>>>>>>         job.setVertexOutputFormatClass(
>>>>>>             SimpleShortestPathsVertexOutputFormat.class);
>>>>>>         FileInputFormat.addInputPath(job, new Path(argArray[0]));
>>>>>>         FileOutputFormat.setOutputPath(job, new Path(argArray[1]));
>>>>>>
>>>>>>         job.getConfiguration().setLong(SimpleGiraphSumEdgeWeights.SOURCE_ID,
>>>>>>                                        Long.parseLong(argArray[2]));
>>>>>>         job.setWorkerConfiguration(Integer.parseInt(argArray[3]),
>>>>>>                                    Integer.parseInt(argArray[3]),
>>>>>>                                    100.0f);
>>>>>>
>>>>>>         return job.run(true) ? 0 : -1;
>>>>>>     }
>>>>>>
>>>>>>     public static void main(String[] args) throws Exception {
>>>>>>         System.exit(ToolRunner.run(new SimpleGiraphSumEdgeWeights(),
>>>>>>                                    args));
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> On Fri, Feb 1, 2013 at 5:37 PM, Eli Reisman <apache.mailbox@gmail.com>
>>>>>> wrote:
>>>>>> > Your best bet is to look over the two code components that users most
>>>>>> > often
>>>>>> > have to tweak or implement to write application code. That is, the
>>>>>> > Vertex
>>>>>> > implementations in examples/ and benchmark/ and the IO formats and
>>>>>> > related
>>>>>> > goodies like RecordReaders etc. that are mostly in the io/ dir. You
>>>>>> > might
>>>>>> > also take a look at the test suite for some quick ideas of how some
>>>>>>of
>>>>>> > the
>>>>>> > moving parts fit together.
>>>>>> >
>>>>>> > If you have real work to do with Giraph, you're going to need to get
>>>>>> > used to
>>>>>> > 0.2 and its API. The old API is both limited in what kind of data it
>>>>>> > will
>>>>>> > process, and not compatible into the future. The API we have now,
>>>>>>while
>>>>>> > evolving, is much much closer to being "final" than anything in 0.1
>>>>>>And
>>>>>> > regardless, we now have (in hindsight) the sure knowledge that none
>>>>>>of
>>>>>> > the
>>>>>> > code you write for 0.1 will be portable into the future.
>>>>>> >
>>>>>> > I am first in line to be sorry about the state of the docs. There are
>>>>>> > efforts underway now to fix this. We all owe the users a collective
>>>>>> > apology for this. In lieu of proper apologies, feel free to ask any and
>>>>>> > all questions, no matter how dumb; they can't be as dumb as mine! The
>>>>>> > codebase is under heavy development and has a lot of confusingly-named
>>>>>> > moving parts, so first get used to the plumbing an app writer has to
>>>>>> > know to function, get some apps up and running, then dig into the
>>>>>> > framework code and it will make more sense.
>>>>>> >
>>>>>> > One string to pull on to begin to look inside the framework is
>>>>>> > bin/giraph -> org.apache.giraph.GiraphRunner (hands job to Hadoop) ->
>>>>>> > ... -> o.a.g.graph.GraphMapper (is a mapper instance on a Hadoop
>>>>>> > cluster, started according to the Job submitted to Hadoop, but running
>>>>>> > our BSP code instead) -> o.a.g.graph.GraphTaskManager -> lots of places
>>>>>> > from there...
>>>>>> >
>>>>>> > The overarching BSP activity management for a single job run is
>>>>>> > basically all stemming out of GraphTaskManager now. You can look at
>>>>>> > setup() and execute() and get a decent idea of the major events in a
>>>>>> > job run, and where to look to get a better peek under the hood at any
>>>>>> > given task or event. Good luck!
>>>>>> >
>>>>>> >
>>>>>> > On Fri, Feb 1, 2013 at 4:59 PM, Gustavo Enrique Salazar Torres
>>>>>> > <gsalazar@ime.usp.br> wrote:
>>>>>> >>
>>>>>> >> Hi Ryan:
>>>>>> >>
>>>>>> >> It's the simplest thing:
>>>>>> >> 1. Choose the type parameters for a kind of Vertex (for example
>>>>>> >> EdgeListVertex).
>>>>>> >> 2. Implement the compute method.
>>>>>> >>
>>>>>> >> From what I saw out there in the M/R world, Giraph provides the
>>>>>> >> simplest way to work with graphs.
>>>>>> >>
>>>>>> >> Take a look at
>>>>>> >> https://cwiki.apache.org/confluence/display/GIRAPH/Shortest+Paths+Example
>>>>>> >> and use release 0.1
>>>>>> >> (http://www.apache.org/dyn/closer.cgi/incubator/giraph/)
>>>>>> >> because 0.2-SNAPSHOT is under heavy work.
>>>>>> >>
>>>>>> >> Hope this helps you.
>>>>>> >>
>>>>>> >> Gustavo
>>>>>> >>
>>>>>> >> On Fri, Feb 1, 2013 at 9:17 PM, Ryan Compton <compton.ryan@gmail.com>
>>>>>> >> wrote:
>>>>>> >>>
>>>>>> >>> I am having trouble understanding what all the classes do, and the
>>>>>> >>> documentation looks like it might be out of date. I searched around
>>>>>> >>> and found this: https://github.com/edaboussi/Giraph but it won't
>>>>>> >>> compile with 0.2. Any suggestions?
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >>
>>>>>> >
>>>
