From: claudio@apache.org
To: commits@giraph.apache.org
Date: Wed, 13 Nov 2013 00:08:31 -0000
Message-Id: <3b6bec18a449414e95d30019a50e92c7@git.apache.org>
Subject: [1/2] git commit: updated refs/heads/trunk to 4dd605a

Updated Branches:
  refs/heads/trunk 4a8b5c3f1 -> 4dd605a3a

GIRAPH-749: Update documentation according to the new EdgeOutputFormat API

Project: http://git-wip-us.apache.org/repos/asf/giraph/repo
Commit: http://git-wip-us.apache.org/repos/asf/giraph/commit/0be911bc
Tree: http://git-wip-us.apache.org/repos/asf/giraph/tree/0be911bc
Diff: http://git-wip-us.apache.org/repos/asf/giraph/diff/0be911bc

Branch: refs/heads/trunk
Commit: 0be911bc430d736befce5b3a272be485e835460c
Parents: 4a8b5c3
Author: Claudio Martella
Authored: Wed Nov 13 01:06:15 2013 +0100
Committer: Claudio Martella
Committed: Wed Nov 13 01:06:15 2013 +0100

----------------------------------------------------------------------
 src/site/xdoc/io.xml          | 14 +++++++++++++-
 src/site/xdoc/quick_start.xml | 18 ++++++++++++++----
 2 files changed, 27 insertions(+), 5 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/giraph/blob/0be911bc/src/site/xdoc/io.xml
----------------------------------------------------------------------
diff --git a/src/site/xdoc/io.xml b/src/site/xdoc/io.xml
index 38c74e0..6276de1 100644
--- a/src/site/xdoc/io.xml
+++ b/src/site/xdoc/io.xml
@@ -48,7 +48,7 @@
 To summarize, VertexInputFormat is usually used by itself, whereas EdgeInputFormat may be used in combination with VertexValueInputFormat.

- Output is always done on a per-vertex basis: a VertexOutputFormat will specify what data to write for each vertex. This usually means (some function of) the vertex value, but nothing prevents us from writing back the edges instead.
+ Output can be done both on a per-vertex and a per-edge basis: a VertexOutputFormat will specify what data to write for each vertex while EdgeOutputFormat will specify what data to write for each edge. This usually means (some function of) the vertex value, but nothing prevents us from writing back the edges instead.

Let's have a quick look at the base classes:
@@ -71,6 +71,18 @@

  • EdgeReader<I, E>: the main methods are getCurrentSourceId(), which returns the source vertex id, and getCurrentEdge(), which returns an Edge<I, E> (i.e., the target vertex id, possibly with an edge value).
  • +
  • + VertexOutputFormat<I, V, E>: modeled on the Hadoop OutputFormat class, this class is intended for writing out vertices and their edges after the computation. The createVertexWriter method returns a VertexWriter that saves the vertices. Additionally, getOutputCommitter returns an OutputCommitter used to guarantee that the output is correctly committed, and checkOutputSpecs is used to check that the setup is correct before running the computation. +
  • +
  • + VertexWriter<I, V, E>: this is where the user defines how to write vertices and possibly edges. The infrastructure just provides an initialize and a close method to deal with the initial and final parts of the output. It also inherits SimpleVertexWriter#writeVertex, which is the main method used to actually save the vertices. +
  • +
  • + EdgeOutputFormat<I, V, E>: modeled on VertexOutputFormat, this class is intended for writing out edges after the computation. The createEdgeWriter method returns an EdgeWriter that saves the edges. Additionally, getOutputCommitter returns an OutputCommitter used to guarantee that the output is correctly committed, and checkOutputSpecs is used to check that the setup is correct before running the computation. +
  • +
  • + EdgeWriter<I, V, E>: this class is similar to VertexWriter, providing initialization and closing facilities. It is intended to save edges, and the main method that the user needs to implement for this purpose is writeEdge. +

http://git-wip-us.apache.org/repos/asf/giraph/blob/0be911bc/src/site/xdoc/quick_start.xml
----------------------------------------------------------------------
diff --git a/src/site/xdoc/quick_start.xml b/src/site/xdoc/quick_start.xml
index d704cba..cdef668 100644
--- a/src/site/xdoc/quick_start.xml
+++ b/src/site/xdoc/quick_start.xml
@@ -218,9 +218,10 @@ $HADOOP_HOME/bin/hadoop jar $GIRAPH_HOME/giraph-examples/target/giraph-examples-

    This will output the following:

 usage: org.apache.giraph.utils.ConfigurationUtils [-aw <arg>] [-c <arg>]
-       [-ca <arg>] [-cf <arg>] [-eif <arg>] [-eip <arg>] [-h] [-la] [-mc
-       <arg>] [-vof <arg>] [-op <arg>] [-pc <arg>] [-q] [-ve <arg>] [-vif
-       <arg>] [-vip <arg>] [-vvf <arg>] [-w <arg>] [-wc <arg>] [-yh <arg>]
+       [-ca <arg>] [-cf <arg>] [-eif <arg>] [-eip <arg>] [-eof <arg>]
+       [-esd <arg>] [-h] [-jyc <arg>] [-la] [-mc <arg>] [-op <arg>] [-pc
+       <arg>] [-q] [-th <arg>] [-ve <arg>] [-vif <arg>] [-vip <arg>] [-vof
+       <arg>] [-vsd <arg>] [-vvf <arg>] [-w <arg>] [-wc <arg>] [-yh <arg>]
        [-yj <arg>]
  -aw,--aggregatorWriter <arg>           AggregatorWriter class
  -c,--messageCombiner <arg>             Message messageCombiner class
@@ -234,16 +235,25 @@ usage: org.apache.giraph.utils.ConfigurationUtils [-aw <arg>] [-c <arg&
  -eif,--edgeInputFormat <arg>           Edge input format
  -eip,--edgeInputPath <arg>             Edge input path
  -eof,--vertexOutputFormat <arg>        Edge output format
+ -esd,--edgeSubDir <arg>                subdirectory to be used for the
+                                        edge output
  -h,--help                              Help
+ -jyc,--jythonClass <arg>               Jython class name, used if
+                                        computation passed in is a python
+                                        script
  -la,--listAlgorithms                   List supported algorithms
  -mc,--masterCompute <arg>              MasterCompute class
- -vof,--vertexOutputFormat <arg>        Vertex output format
  -op,--outputPath <arg>                 Vertex output path
  -pc,--partitionClass <arg>             Partition class
  -q,--quiet                             Quiet output
+ -th,--typesHolder <arg>                Class that holds types. Needed
+                                        only if Computation is not set
  -ve,--outEdges <arg>                   Vertex edges class
  -vif,--vertexInputFormat <arg>         Vertex input format
  -vip,--vertexInputPath <arg>           Vertex input path
+ -vof,--vertexOutputFormat <arg>        Vertex output format
+ -vsd,--vertexSubDir <arg>              subdirectory to be used for the
+                                        vertex output
  -vvf,--vertexValueFactoryClass <arg>   Vertex value factory class
  -w,--workers <arg>                     Number of workers
  -wc,--workerContext <arg>              WorkerContext class