From: Ryan Compton
Date: Thu, 21 Feb 2013 17:35:50 -0800
Subject: Re: Where can I find a simple "Hello World" example for Giraph
To: user@giraph.apache.org

Ok, ignoring everything about I/O I can run the below code. But then it
hangs at 0% reduce forever

-bash-3.2$ hadoop jar target/geocoderV2-1.0-SNAPSHOT-jar-with-dependencies.jar com.hrl.issl.osi.networks.HelloGiraph0p2
13/02/21 17:33:28 INFO graph.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
13/02/21 17:33:29 WARN bsp.BspOutputFormat: checkOutputSpecs: ImmutableOutputCommiter will not check anything
13/02/21 17:33:29 INFO mapred.JobClient: Running job: job_201302201124_0136
13/02/21 17:33:30 INFO mapred.JobClient: map 0% reduce 0%
13/02/21 17:34:03 INFO mapred.JobClient: map 74% reduce 0%
13/02/21 17:34:04 INFO mapred.JobClient: map 80% reduce 0%
13/02/21 17:34:06 INFO mapred.JobClient: map 100% reduce 0%

What did I miss?

import java.io.IOException;

import org.apache.giraph.combiner.DoubleSumCombiner;
import org.apache.giraph.conf.GiraphConfiguration;
import org.apache.giraph.graph.GiraphJob;
import org.apache.giraph.io.formats.PseudoRandomVertexInputFormat;
import org.apache.giraph.vertex.EdgeListVertex;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.log4j.Logger;

/**
 * Default Pregel-style PageRank computation.
 */
public class HelloGiraph0p2 implements Tool {
  /**
   * Configuration from Configurable
   */
  private Configuration conf;

  @Override
  public Configuration getConf() {
    return conf;
  }

  @Override
  public void setConf(Configuration conf) {
    this.conf = conf;
  }

  @Override
  public final int run(final String[] args) throws Exception {
    String name = getClass().getName();

    GiraphJob job = new GiraphJob(getConf(), name);
    GiraphConfiguration configuration = job.getConfiguration();

    configuration.setVertexClass(EdgeListVertexTwoPlusTwo.class);
    configuration.useUnsafeSerialization(true);
    configuration.setVertexCombinerClass(DoubleSumCombiner.class);
    configuration.setVertexInputFormatClass(PseudoRandomVertexInputFormat.class);
    configuration.setLong(PseudoRandomVertexInputFormat.AGGREGATE_VERTICES,
        Long.parseLong("1000"));
    configuration.setLong(PseudoRandomVertexInputFormat.EDGES_PER_VERTEX,
        Long.parseLong("10"));

    int workers = Integer.parseInt("30");
    configuration.setWorkerConfiguration(workers, workers, 100.0f);
    //configuration.setInt(PageRankComputation.SUPERSTEP_COUNT, Integer.parseInt("10"));

    boolean isVerbose = true;
    if (job.run(isVerbose)) {
      return 0;
    } else {
      return -1;
    }
  }

  int NUM_SUPERSTEPS = 5;

  public class EdgeListVertexTwoPlusTwo extends EdgeListVertex {
    @Override
    public void compute(Iterable messages) throws IOException {
      if (this.getSuperstep() >= 1) {
        double four = 2+2;
        this.setValue(new DoubleWritable(four));
      }
      if (this.getSuperstep() < NUM_SUPERSTEPS) {
        this.sendMessageToAllEdges(new DoubleWritable(this.getValue().get()));
      } else {
        this.voteToHalt();
      }
    }
  }

  /**
   * Execute the benchmark.
   *
   * @param args Typically the command line arguments.
   * @throws Exception Any exception from the computation.
   */
  public static void main(final String[] args) throws Exception {
    System.exit(ToolRunner.run(new HelloGiraph0p2(), args));
  }
}

On Thu, Feb 21, 2013 at 4:10 PM, Ryan Compton wrote:
> Ok, I've been looking at the PageRankBenchmark. There's a lot going on
> in there...
>
> It looks like the minimum amount of stuff I need to run a do-nothing
> job is what I've got below.
>
> But now it's telling me that (and PageRankBenchmark doesn't have the
> word "output" anywhere).
>
> 13/02/21 16:08:37 ERROR security.UserGroupInformation:
> PriviledgedActionException as:rfcompton (auth:SIMPLE)
> cause:org.apache.hadoop.mapred.InvalidJobConfException: Output
> directory not set.
> Exception in thread "main"
> org.apache.hadoop.mapred.InvalidJobConfException: Output directory not
> set.
> at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:123)
> at org.apache.giraph.io.formats.TextVertexOutputFormat.checkOutputSpecs(TextVertexOutputFormat.java:55)
> at org.apache.giraph.bsp.BspOutputFormat.checkOutputSpecs(BspOutputFormat.java:48)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:872)
> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
> at org.apache.giraph.graph.GiraphJob.run(GiraphJob.java:282)
> at com.hrl.issl.osi.networks.HelloGiraph0p2.run(HelloGiraph0p2.java:49)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> at com.hrl.issl.osi.networks.HelloGiraph0p2.main(HelloGiraph0p2.java:55)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
>
>
>
> import org.apache.giraph.conf.GiraphConfiguration;
> import org.apache.giraph.graph.GiraphJob;
> import org.apache.giraph.io.formats.JsonBase64VertexInputFormat;
> import org.apache.giraph.io.formats.JsonBase64VertexOutputFormat;
> import org.apache.giraph.io.formats.TextVertexInputFormat;
> import org.apache.giraph.io.formats.TextVertexOutputFormat;
> import org.apache.giraph.vertex.SimpleVertex;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.util.Tool;
> import org.apache.hadoop.util.ToolRunner;
>
> /**
>  *
>  * Hello world giraph 0.2...
>  *
>  */
> public class HelloGiraph0p2 implements Tool {
>   /** Configuration */
>   private Configuration conf;
>
>   @Override
>   public void setConf(Configuration conf) {
>     this.conf = conf;
>   }
>
>   @Override
>   public Configuration getConf() {
>     return conf;
>   }
>
>   @Override
>   public int run(String[] arg0) throws Exception {
>
>     GiraphJob job = new GiraphJob(getConf(), getClass().getName());
>
>     GiraphConfiguration configuration = job.getConfiguration();
>
>     configuration.setVertexClass(SimpleVertex.class);
>
>     configuration.setVertexInputFormatClass(JsonBase64VertexInputFormat.class);
>     configuration.setVertexOutputFormatClass(JsonBase64VertexOutputFormat.class);
>
>     configuration.setWorkerConfiguration(30, 30, 100.0f);
>
>     return job.run(true) ? 0 : -1;
>
>   }
>
>   public static void main(String[] args) throws Exception {
>
>     System.exit(ToolRunner.run(new HelloGiraph0p2(), args));
>   }
>
> }
>
> On Thu, Feb 21, 2013 at 2:58 PM, Maja Kabiljo wrote:
>> Hi Ryan,
>>
>> Before running the job, you need to set Vertex and input/output format
>> classes on it. Please take a look at one of the benchmarks to see how to
>> do that. Alternatively, you can try using GiraphRunner, where you pass
>> these classes as command line arguments.
>>
>> Maja
>>
>> On 2/21/13 2:43 PM, "Ryan Compton" wrote:
>>
>>>I'm still struggling with this. I am trying to use 0.2, I don't have
>>>permissions to edit core-site.xml
>>>
>>>I think this is the most basic boilerplate code for a 0.2 Giraph
>>>project, but I still can't run it.
>>>
>>>Exception in thread "main" java.lang.NullPointerException
>>>at org.apache.giraph.utils.ReflectionUtils.getTypeArguments(ReflectionUtils.java:85)
>>>at org.apache.giraph.conf.GiraphClasses.readFromConf(GiraphClasses.java:117)
>>>at org.apache.giraph.conf.GiraphClasses.<init>(GiraphClasses.java:105)
>>>at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.<init>(ImmutableClassesGiraphConfiguration.java:84)
>>>at com.hrl.issl.osi.networks.HelloGiraph0p2.setConf(HelloGiraph0p2.java:34)
>>>at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:61)
>>>at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>at com.hrl.issl.osi.networks.HelloGiraph0p2.main(HelloGiraph0p2.java:70)
>>>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>at java.lang.reflect.Method.invoke(Method.java:597)
>>>at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
>>>
>>>
>>>
>>>package networks;
>>>
>>>import java.io.IOException;
>>>
>>>import org.apache.giraph.conf.ImmutableClassesGiraphConfiguration;
>>>import org.apache.giraph.graph.GiraphJob;
>>>import org.apache.giraph.vertex.EdgeListVertex;
>>>import org.apache.hadoop.conf.Configuration;
>>>import org.apache.hadoop.io.LongWritable;
>>>import org.apache.hadoop.io.Text;
>>>import org.apache.hadoop.util.Tool;
>>>import org.apache.hadoop.util.ToolRunner;
>>>import org.apache.log4j.Logger;
>>>
>>>/**
>>> *
>>> * Hello world giraph 0.2...
>>> *
>>> */
>>>public class HelloGiraph0p2 extends EdgeListVertex
>>>Text, Text> implements Tool {
>>>  /** Configuration */
>>>  private ImmutableClassesGiraphConfiguration
>>>  Text> conf;
>>>  /** Class logger */
>>>  private static final Logger LOG = Logger.getLogger(HelloGiraph0p2.class);
>>>
>>>  @Override
>>>  public void compute(Iterable arg0) throws IOException {
>>>    int four = 2+2;
>>>  }
>>>  @Override
>>>  public void setConf(Configuration configurationIn) {
>>>    this.conf = new ImmutableClassesGiraphConfiguration
>>>    Text, Text, Text>(configurationIn);
>>>    return;
>>>  }
>>>  @Override
>>>  public ImmutableClassesGiraphConfiguration
>>>  Text> getConf() {
>>>    return conf;
>>>  }
>>>
>>>  /**
>>>   *
>>>   * ToolRunner run
>>>   *
>>>   * @param arg0
>>>   * @return
>>>   * @throws Exception
>>>   */
>>>  @Override
>>>  public int run(String[] arg0) throws Exception {
>>>    GiraphJob job = new GiraphJob(getConf(), getClass().getName());
>>>
>>>    return job.run(true) ? 0 : -1;
>>>
>>>  }
>>>  /**
>>>   * main...
>>>   *
>>>   * @param args
>>>   * @throws Exception
>>>   */
>>>  public static void main(String[] args) throws Exception {
>>>    System.exit(ToolRunner.run(new HelloGiraph0p2(), args));
>>>  }
>>>
>>>}
>>>
>>>
>>>
>>>On Tue, Feb 5, 2013 at 4:24 AM, Gustavo Enrique Salazar Torres
>>> wrote:
>>>> Hi Ryan:
>>>>
>>>> I got that same error and discovered that I have to start a ZooKeeper
>>>> instance. What I did was to download ZooKeeper, write a new zoo.cfg file
>>>> under conf directory with the following:
>>>>
>>>> dataDir=/home/user/zookeeper-3.4.5/tmp
>>>> clientPort=2181
>>>>
>>>> Also I added some lines in Hadoop's core-site.xml:
>>>>
>>>> <property>
>>>>   <name>giraph.zkList</name>
>>>>   <value>localhost:2181</value>
>>>> </property>
>>>>
>>>> Then I start ZooKeeper with bin/zkServer.sh start (also you will have to
>>>> restart Hadoop) and then you can launch your Giraph Job.
>>>> This setup worked for me (maybe there is an easier way :D), hope it is
>>>> useful.
>>>>
>>>> Best regards
>>>> Gustavo
>>>>
>>>>
>>>> On Mon, Feb 4, 2013 at 10:06 PM, Ryan Compton
>>>> wrote:
>>>>>
>>>>> Ok great, thanks. I've been working with 0.1, I can get things to
>>>>> compile (see below code) but they still are not running, the maps hang
>>>>> (also below). I have no idea how to fix it, I may consider updating
>>>>> that code I have that compiles to 0.2 and see if it works then. The
>>>>> only difference I can see is that 0.2 requires everything have a
>>>>> "message"
>>>>>
>>>>> -bash-3.2$ hadoop jar target/giraph-0.1-jar-with-dependencies.jar
>>>>> com.SimpleGiraphSumEdgeWeights /user/rfcompton/giraphTSPInput
>>>>> /user/rfcompton/giraphTSPOutput 3 3
>>>>> 13/02/04 15:48:23 INFO mapred.JobClient: Running job:
>>>>> job_201301230932_1199
>>>>> 13/02/04 15:48:24 INFO mapred.JobClient: map 0% reduce 0%
>>>>> 13/02/04 15:48:35 INFO mapred.JobClient: map 25% reduce 0%
>>>>> 13/02/04 15:58:40 INFO mapred.JobClient: Task Id :
>>>>> attempt_201301230932_1199_m_000003_0, Status : FAILED
>>>>> java.lang.IllegalStateException: run: Caught an unrecoverable
>>>>> exception setup: Offlining servers due to exception...
>>>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:641)
>>>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>>>>> at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>> at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>>>>> at org.apache.hadoop.mapred.Child.main(Child.java:264)
>>>>> Caused by: java.lang.RuntimeException: setup: Offlining servers due to
>>>>> exception...
>>>>> at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:466)
>>>>> at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:630)
>>>>> ... 7 more
>>>>> Caused by: java.lang.IllegalStateException: setup: loadVertices failed
>>>>> at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:582)
>>>>> at org.apache.
>>>>> Task attempt_201301230932_1199_m_000003_0 failed to report status for
>>>>> 600 seconds. Killing!
>>>>> 13/02/04 15:58:43 INFO mapred.JobClient: Task Id :
>>>>> attempt_201301230932_1199_m_000002_0, Status : FAILED
>>>>> Task attempt_201301230932_1199_m_000002_0 failed to report status for
>>>>> 600 seconds. Killing!
>>>>> 13/02/04 15:58:43 INFO mapred.JobClient: Task Id :
>>>>> attempt_201301230932_1199_m_000000_0, Status : FAILED
>>>>>
>>>>>
>>>>>
>>>>> =========================================================================
>>>>> This is the code I was using:
>>>>>
>>>>> import com.google.common.base.Preconditions;
>>>>> import com.google.common.collect.Maps;
>>>>>
>>>>> import org.apache.giraph.comm.ArrayListWritable;
>>>>> import org.apache.giraph.graph.BasicVertex;
>>>>> import org.apache.giraph.graph.BspUtils;
>>>>> import org.apache.giraph.graph.GiraphJob;
>>>>> import org.apache.giraph.graph.EdgeListVertex;
>>>>> import org.apache.giraph.graph.VertexReader;
>>>>> import org.apache.giraph.graph.VertexWriter;
>>>>> import org.apache.giraph.lib.TextVertexInputFormat;
>>>>> import org.apache.giraph.lib.TextVertexInputFormat.TextVertexReader;
>>>>> import org.apache.giraph.lib.TextVertexOutputFormat;
>>>>> import org.apache.giraph.lib.TextVertexOutputFormat.TextVertexWriter;
>>>>> import org.apache.hadoop.conf.Configuration;
>>>>> import org.apache.hadoop.fs.Path;
>>>>> import org.apache.hadoop.io.DoubleWritable;
>>>>> import org.apache.hadoop.io.FloatWritable;
>>>>> import org.apache.hadoop.io.IntWritable;
>>>>> import org.apache.hadoop.io.LongWritable;
>>>>> import org.apache.hadoop.io.Text;
>>>>> import org.apache.hadoop.mapreduce.InputSplit;
>>>>> import org.apache.hadoop.mapreduce.RecordReader;
>>>>> import org.apache.hadoop.mapreduce.RecordWriter;
>>>>> import org.apache.hadoop.mapreduce.TaskAttemptContext;
>>>>> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
>>>>> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
>>>>> import org.apache.hadoop.util.Tool;
>>>>> import org.apache.hadoop.util.ToolRunner;
>>>>> import org.apache.log4j.Logger;
>>>>> import org.json.JSONArray;
>>>>> import org.json.JSONException;
>>>>>
>>>>> import java.io.IOException;
>>>>> import java.util.ArrayList;
>>>>> import java.util.Iterator;
>>>>> import java.util.Map;
>>>>> import java.util.StringTokenizer;
>>>>>
>>>>> /**
>>>>>  * Shows an example of a brute-force implementation of the Travelling
>>>>>  * Salesman Problem
>>>>>  */
>>>>> public class SimpleGiraphSumEdgeWeights extends
>>>>>     EdgeListVertex,
>>>>>     FloatWritable, ArrayListWritable> implements Tool {
>>>>>   /** Configuration */
>>>>>   private Configuration conf;
>>>>>   /** Class logger */
>>>>>   private static final Logger LOG =
>>>>>       Logger.getLogger(SimpleGiraphSumEdgeWeights.class);
>>>>>   /** The shortest paths id */
>>>>>   public static String SOURCE_ID = "SimpleShortestPathsVertex.sourceId";
>>>>>   /** Default shortest paths id */
>>>>>   public static long SOURCE_ID_DEFAULT = 1;
>>>>>
>>>>>
>>>>>   /**
>>>>>    * Is this vertex the source id?
>>>>>    *
>>>>>    * @return True if the source id
>>>>>    */
>>>>>   private boolean isSource() {
>>>>>     return (getVertexId().get() ==
>>>>>         getContext().getConfiguration().getLong(SOURCE_ID,
>>>>>             SOURCE_ID_DEFAULT));
>>>>>   }
>>>>>   public class Message extends ArrayListWritable {
>>>>>     public Message() {
>>>>>       super();
>>>>>     }
>>>>>
>>>>>     @Override
>>>>>     public void setClass() {
>>>>>       // TODO Auto-generated method stub
>>>>>     }
>>>>>   }
>>>>>   public class Valeur extends ArrayListWritable {
>>>>>     public Valeur() {
>>>>>       super();
>>>>>     }
>>>>>
>>>>>     @Override
>>>>>     public void setClass() {
>>>>>       // TODO Auto-generated method stub
>>>>>
>>>>>     }
>>>>>   }
>>>>>
>>>>>   @Override
>>>>>   public void compute(Iterator> msgIterator) {
>>>>>     System.out.println("**** LAUNCHING COMPUTATION FOR VERTEX "
>>>>>         +this.getVertexId().get()+", SUPERSTEP "+this.getSuperstep()+" ****");
>>>>>     //We get the source ID, we will need it
>>>>>     String sourceID = new
>>>>>         LongWritable(this.getContext().getConfiguration().getLong(SOURCE_ID,
>>>>>             SOURCE_ID_DEFAULT)).toString();
>>>>>     //We get the total number of vertices, and the current superstep
>>>>>     //number, we will need it too
>>>>>     int J=1;
>>>>>
>>>>>     voteToHalt();
>>>>>   }
>>>>>
>>>>>   /**
>>>>>    * VertexInputFormat that supports {@link SimpleGiraphSumEdgeWeights}
>>>>>    */
>>>>>   public static class SimpleShortestPathsVertexInputFormat extends
>>>>>       TextVertexInputFormat,
>>>>>       FloatWritable,
>>>>>       DoubleWritable> {
>>>>>     @Override
>>>>>     public VertexReader
>>>>>         ArrayListWritable, FloatWritable, DoubleWritable>
>>>>>         createVertexReader(InputSplit split,
>>>>>             TaskAttemptContext context)
>>>>>         throws IOException {
>>>>>       return new SimpleShortestPathsVertexReader(
>>>>>           textInputFormat.createRecordReader(split, context));
>>>>>     }
>>>>>   }
>>>>>
>>>>>   /**
>>>>>    * VertexReader that supports {@link SimpleGiraphSumEdgeWeights}. In this
>>>>>    * case, the edge values are not used. The files should be in the
>>>>>    * following JSON format:
>>>>>    * JSONArray(<vertex id>, <vertex value>,
>>>>>    *           JSONArray(JSONArray(<dest vertex id>, <edge value>), ...))
>>>>>    * Here is an example with vertex id 1, vertex value 4.3, and two edges.
>>>>>    * First edge has a destination vertex 2, edge value 2.1.
>>>>>    * Second edge has a destination vertex 3, edge value 0.7.
>>>>>    * [1,4.3,[[2,2.1],[3,0.7]]]
>>>>>    */
>>>>>   public static class SimpleShortestPathsVertexReader extends
>>>>>       TextVertexReader
>>>>>       ArrayListWritable, FloatWritable,
>>>>>       DoubleWritable> {
>>>>>
>>>>>     public SimpleShortestPathsVertexReader(
>>>>>         RecordReader lineRecordReader) {
>>>>>       super(lineRecordReader);
>>>>>     }
>>>>>
>>>>>     public class Valeur extends ArrayListWritable {
>>>>>       public Valeur() {
>>>>>         super();
>>>>>       }
>>>>>
>>>>>       @Override
>>>>>       public void setClass() {
>>>>>         // TODO Auto-generated method stub
>>>>>
>>>>>       }
>>>>>     }
>>>>>
>>>>>     @Override
>>>>>     public BasicVertex
>>>>>         ArrayListWritable, FloatWritable,
>>>>>         DoubleWritable> getCurrentVertex()
>>>>>         throws IOException, InterruptedException {
>>>>>       BasicVertex,
>>>>>           FloatWritable,
>>>>>           DoubleWritable> vertex = BspUtils.
>>>>>           ArrayListWritable, FloatWritable,
>>>>>           DoubleWritable>createVertex(getContext().getConfiguration());
>>>>>
>>>>>       Text line = getRecordReader().getCurrentValue();
>>>>>       try {
>>>>>         JSONArray jsonVertex = new JSONArray(line.toString());
>>>>>         LongWritable vertexId = new
>>>>>             LongWritable(jsonVertex.getLong(0));
>>>>>         Valeur vertexValue = new Valeur();
>>>>>         vertexValue.add(new
>>>>>             DoubleWritable(jsonVertex.getDouble(1)));
>>>>>         Map edges =
>>>>>             Maps.newHashMap();
>>>>>         JSONArray jsonEdgeArray = jsonVertex.getJSONArray(2);
>>>>>         for (int i = 0; i < jsonEdgeArray.length(); ++i) {
>>>>>           JSONArray jsonEdge = jsonEdgeArray.getJSONArray(i);
>>>>>           edges.put(new LongWritable(jsonEdge.getLong(0)),
>>>>>               new FloatWritable((float)
>>>>>                   jsonEdge.getDouble(1)));
>>>>>         }
>>>>>         vertex.initialize(vertexId, vertexValue, edges, null);
>>>>>       } catch (JSONException e) {
>>>>>         throw new IllegalArgumentException(
>>>>>             "next: Couldn't get vertex from line " +
>>>>>                 line.toString(), e);
>>>>>       }
>>>>>       return vertex;
>>>>>     }
>>>>>
>>>>>     @Override
>>>>>     public boolean nextVertex() throws IOException,
>>>>>         InterruptedException {
>>>>>       return getRecordReader().nextKeyValue();
>>>>>     }
>>>>>   }
>>>>>
>>>>>   /**
>>>>>    * VertexOutputFormat that supports {@link SimpleGiraphSumEdgeWeights}
>>>>>    */
>>>>>   public static class SimpleShortestPathsVertexOutputFormat extends
>>>>>       TextVertexOutputFormat
>>>>>       ArrayListWritable,
>>>>>       FloatWritable> {
>>>>>
>>>>>     @Override
>>>>>     public VertexWriter
>>>>>         ArrayListWritable, FloatWritable>
>>>>>         createVertexWriter(TaskAttemptContext context)
>>>>>         throws IOException, InterruptedException {
>>>>>       RecordWriter recordWriter =
>>>>>           textOutputFormat.getRecordWriter(context);
>>>>>       return new SimpleShortestPathsVertexWriter(recordWriter);
>>>>>     }
>>>>>   }
>>>>>
>>>>>   /**
>>>>>    * VertexWriter that supports {@link SimpleGiraphSumEdgeWeights}
>>>>>    */
>>>>>   public static class SimpleShortestPathsVertexWriter extends
>>>>>       TextVertexWriter
>>>>>       ArrayListWritable, FloatWritable> {
>>>>>     public SimpleShortestPathsVertexWriter(
>>>>>         RecordWriter lineRecordWriter) {
>>>>>       super(lineRecordWriter);
>>>>>     }
>>>>>
>>>>>     @Override
>>>>>     public void writeVertex(BasicVertex
>>>>>         ArrayListWritable,
>>>>>         FloatWritable, ?> vertex)
>>>>>         throws IOException, InterruptedException {
>>>>>       String sourceID = new
>>>>>           LongWritable(vertex.getContext().getConfiguration().getLong(SOURCE_ID,
>>>>>               SOURCE_ID_DEFAULT)).toString();
>>>>>       JSONArray jsonVertex = new JSONArray();
>>>>>       try {
>>>>>         jsonVertex.put(vertex.getVertexId().get());
>>>>>         jsonVertex.put(vertex.getVertexValue().toString());
>>>>>         JSONArray jsonEdgeArray = new JSONArray();
>>>>>         for (LongWritable targetVertexId : vertex) {
>>>>>           JSONArray jsonEdge = new JSONArray();
>>>>>           jsonEdge.put(targetVertexId.get());
>>>>>
>>>>>           jsonEdge.put(vertex.getEdgeValue(targetVertexId).get());
>>>>>           jsonEdgeArray.put(jsonEdge);
>>>>>         }
>>>>>         jsonVertex.put(jsonEdgeArray);
>>>>>       } catch (JSONException e) {
>>>>>         throw new IllegalArgumentException(
>>>>>             "writeVertex: Couldn't write vertex " + vertex);
>>>>>       }
>>>>>       getRecordWriter().write(new Text(jsonVertex.toString()),
>>>>>           null);
>>>>>     }
>>>>>   }
>>>>>
>>>>>   @Override
>>>>>   public Configuration getConf() {
>>>>>     return conf;
>>>>>   }
>>>>>
>>>>>   @Override
>>>>>   public void setConf(Configuration conf) {
>>>>>     this.conf = conf;
>>>>>   }
>>>>>
>>>>>   @Override
>>>>>   public int run(String[] argArray) throws Exception {
>>>>>     Preconditions.checkArgument(argArray.length == 4,
>>>>>         "run: Must have 4 arguments <input path> <output path> " +
>>>>>         "<source vertex id> <# of workers>");
>>>>>
>>>>>     GiraphJob job = new GiraphJob(getConf(), getClass().getName());
>>>>>     job.setVertexClass(getClass());
>>>>>     job.setVertexInputFormatClass(
>>>>>         SimpleShortestPathsVertexInputFormat.class);
>>>>>     job.setVertexOutputFormatClass(
>>>>>         SimpleShortestPathsVertexOutputFormat.class);
>>>>>     FileInputFormat.addInputPath(job, new Path(argArray[0]));
>>>>>     FileOutputFormat.setOutputPath(job, new Path(argArray[1]));
>>>>>
>>>>>     job.getConfiguration().setLong(SimpleGiraphSumEdgeWeights.SOURCE_ID,
>>>>>         Long.parseLong(argArray[2]));
>>>>>     job.setWorkerConfiguration(Integer.parseInt(argArray[3]),
>>>>>         Integer.parseInt(argArray[3]),
>>>>>         100.0f);
>>>>>
>>>>>     return job.run(true) ? 0 : -1;
>>>>>   }
>>>>>
>>>>>   public static void main(String[] args) throws Exception {
>>>>>     System.exit(ToolRunner.run(new SimpleGiraphSumEdgeWeights(),
>>>>>         args));
>>>>>   }
>>>>> }
>>>>>
>>>>> On Fri, Feb 1, 2013 at 5:37 PM, Eli Reisman
>>>>> wrote:
>>>>> > Your best bet is to look over the two code components that users most
>>>>> > often have to tweak or implement to write application code. That is,
>>>>> > the Vertex implementations in examples/ and benchmark/ and the IO
>>>>> > formats and related goodies like RecordReaders etc. that are mostly in
>>>>> > the io/ dir. You might also take a look at the test suite for some
>>>>> > quick ideas of how some of the moving parts fit together.
>>>>> >
>>>>> > If you have real work to do with Giraph, you're going to need to get
>>>>> > used to 0.2 and its API. The old API is both limited in what kind of
>>>>> > data it will process, and not compatible into the future. The API we
>>>>> > have now, while evolving, is much much closer to being "final" than
>>>>> > anything in 0.1. And regardless, we now have (in hindsight) the sure
>>>>> > knowledge that none of the code you write for 0.1 will be portable
>>>>> > into the future.
>>>>> >
>>>>> > I am first in line to be sorry about the state of the docs. There are
>>>>> > efforts underway now to fix this. We all owe the users a collective
>>>>> > apology for this. In lieu of proper apologies, feel free to ask any
>>>>> > and all questions, no matter how dumb, they can't be as dumb as mine!
>>>>> > The codebase is under heavy development and has a lot of
>>>>> > confusingly-named moving parts so first get used to the plumbing an
>>>>> > app writer has to know to function, get some apps up and running, then
>>>>> > dig into the framework code and it will make more sense.
>>>>> >
>>>>> > One string to pull on to begin to look inside the framework is
>>>>> > bin/giraph -> org.apache.giraph.GiraphRunner (hands job to Hadoop) ->
>>>>> > ... -> o.a.g.graph.GraphMapper (is a mapper instance on a Hadoop
>>>>> > cluster, started according to the Job submitted to Hadoop, but running
>>>>> > our BSP code instead) -> o.a.g.graph.GraphTaskManager -> lots of
>>>>> > places from there...
>>>>> >
>>>>> > The overarching BSP activity management for a single job run is
>>>>> > basically all stemming out of GraphTaskManager now. You can look at
>>>>> > setup() and execute() and get a decent idea of the major events in a
>>>>> > job run, and where to look to get a better peek under the hood at any
>>>>> > given task or event. Good luck!
>>>>> >
>>>>> >
>>>>> > On Fri, Feb 1, 2013 at 4:59 PM, Gustavo Enrique Salazar Torres
>>>>> > wrote:
>>>>> >>
>>>>> >> Hi Ryan:
>>>>> >>
>>>>> >> It's the simplest thing:
>>>>> >> 1. Define your type of parameters for a type of Vertex (for example
>>>>> >> EdgeListVertex)
>>>>> >> 2. Implement compute method.
>>>>> >>
>>>>> >> From what I saw out there in the M/R world, Giraph provides the
>>>>> >> simplest way to work with graphs.
>>>>> >>
>>>>> >> Take a look at
>>>>> >> https://cwiki.apache.org/confluence/display/GIRAPH/Shortest+Paths+Example
>>>>> >> and use release 0.1
>>>>> >> (http://www.apache.org/dyn/closer.cgi/incubator/giraph/)
>>>>> >> because 0.2-SNAPSHOT is under heavy work.
>>>>> >>
>>>>> >> Hope this helps you.
>>>>> >>
>>>>> >> Gustavo
>>>>> >>
>>>>> >> On Fri, Feb 1, 2013 at 9:17 PM, Ryan Compton
>>>>> >> wrote:
>>>>> >>>
>>>>> >>> I am having trouble understanding what all the classes do and the
>>>>> >>> documentation looks like it might be out of date. I searched around
>>>>> >>> and found this: https://github.com/edaboussi/Giraph but it won't
>>>>> >>> compile with 0.2, any suggestions?
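
For anyone reading the thread later: the two steps Gustavo lists (pick the vertex type parameters, implement compute) and the superstep / vote-to-halt logic of the 2+2 vertex at the top can be tried out without a cluster. What follows is only a toy, single-process sketch in plain Java. It uses nothing from Giraph: ToyBsp, Vertex, ring and run are hypothetical names made up for illustration, and the loop is a rough model of how supersteps, message delivery and halting behave, not a description of how Giraph schedules work.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy single-process model of a BSP run. Not the Giraph API; all names are hypothetical. */
class ToyBsp {
    /** Same constant as in the HelloGiraph0p2 code at the top of the thread. */
    static final int NUM_SUPERSTEPS = 5;

    /** A vertex with a value, out-edges (target indices), an inbox and a halted flag. */
    static final class Vertex {
        double value;
        boolean halted;
        final List<Integer> edges = new ArrayList<>();
        List<Double> inbox = new ArrayList<>();
    }

    /** Builds a directed ring 0 -> 1 -> ... -> n-1 -> 0. */
    static List<Vertex> ring(int n) {
        List<Vertex> graph = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Vertex v = new Vertex();
            v.edges.add((i + 1) % n);
            graph.add(v);
        }
        return graph;
    }

    /**
     * Runs supersteps until every vertex has halted and no messages are pending.
     * Returns the number of supersteps in which at least one vertex was active.
     */
    static int run(List<Vertex> graph) {
        int superstep = 0;
        while (true) {
            // Messages sent in this superstep only become visible in the next one.
            List<List<Double>> outbox = new ArrayList<>();
            for (int i = 0; i < graph.size(); i++) {
                outbox.add(new ArrayList<>());
            }
            boolean anyActive = false;
            for (Vertex v : graph) {
                if (!v.inbox.isEmpty()) {
                    v.halted = false; // an incoming message wakes a halted vertex
                }
                if (v.halted) {
                    continue;
                }
                anyActive = true;
                // Body of compute(), mirroring EdgeListVertexTwoPlusTwo.
                if (superstep >= 1) {
                    v.value = 2 + 2;
                }
                if (superstep < NUM_SUPERSTEPS) {
                    for (int target : v.edges) {
                        outbox.get(target).add(v.value); // stands in for sendMessageToAllEdges
                    }
                } else {
                    v.halted = true; // stands in for voteToHalt
                }
            }
            if (!anyActive) {
                return superstep;
            }
            for (int i = 0; i < graph.size(); i++) {
                graph.get(i).inbox = outbox.get(i);
            }
            superstep++;
        }
    }

    public static void main(String[] args) {
        List<Vertex> graph = ring(4);
        int steps = run(graph);
        System.out.println("supersteps=" + steps + " value=" + graph.get(0).value);
    }
}
```

With NUM_SUPERSTEPS = 5 the vertices are active in supersteps 0 through 5, so the ring of four prints supersteps=6 value=4.0. The point of the sketch is just that the run ends only once every vertex has voted to halt and no messages are in flight.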