giraph-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alessandro Presta" <alessan...@fb.com>
Subject Re: Review Request: Multigraph support.
Date Thu, 13 Dec 2012 02:57:46 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8560/
-----------------------------------------------------------

(Updated Dec. 13, 2012, 2:57 a.m.)


Review request for giraph.


Changes
-------

Fixed some Checkstyle and Findbugs errors.


Summary (updated)
-----------------

Multigraph support.


Description
-------

This patch introduces support for multigraphs in Giraph, by:
- changing Vertex#initialize() and Vertex#setEdges() to take a collection of edges instead
of a map
- adding two vertex implementations (MultiGraphEdgeListVertex and MultiGraphRepresentativeVertex)
where addEdge() only appends the edge, instead of checking for duplicates

This is not only useful for representing multigraphs, but also for implementing addEdge()
efficiently (critical for edge-based input when the out-degree can be really large).

Other minor changes:
- changed MutableVertex#addEdge() to get an Edge instead of an id and value, mainly for consistency
with other code
- added PseudoRandomEdgeInputFormat and extended PageRankBenchmark to accept edge-based input
- various fixes on the way

I ran several benchmarks to compare this patch with the current version, and edge-based input
with vertex-based input:

Graph: 10M vertices, 1B edges, 10 workers

EdgeListVertex + VertexInputFormat, current version:
Total (milliseconds) 118,377
Input superstep (milliseconds) 44,041
Superstep 0 (milliseconds) 20,970
Superstep 1 (milliseconds) 23,731
Superstep 2 (milliseconds) 24,269

EdgeListVertex + VertexInputFormat, patched version:
Total (milliseconds) 116,298
Input superstep (milliseconds) 40,441
Superstep 0 (milliseconds) 22,444
Superstep 1 (milliseconds) 24,164
Superstep 2 (milliseconds) 25,036

MultiGraphEdgeListVertex + EdgeInputFormat, patched version:
Total (milliseconds) 148,905
Input superstep (milliseconds) 72,425
Superstep 0 (milliseconds) 23,684
Superstep 1 (milliseconds) 25,580
Superstep 2 (milliseconds) 22,456

RepresentativeVertex + VertexInputFormat, current version:
Total (milliseconds) 111,450
Input superstep (milliseconds) 39,301
Superstep 0 (milliseconds) 21,615
Superstep 1 (milliseconds) 22,213
Superstep 2 (milliseconds) 21,840

RepresentativeVertex + VertexInputFormat, patched version:
Total (milliseconds) 106,512
Input superstep (milliseconds) 38,142
Superstep 0 (milliseconds) 20,812
Superstep 1 (milliseconds) 22,906
Superstep 2 (milliseconds) 21,250

MultiGraphRepresentativeVertex + EdgeInputFormat, patched version:
Total (milliseconds) 143,831
Input superstep (milliseconds) 75,030
Superstep 0 (milliseconds) 20,661
Superstep 1 (milliseconds) 21,406
Superstep 2 (milliseconds) 22,456


This addresses bug GIRAPH-141.
    https://issues.apache.org/jira/browse/GIRAPH-141


Diffs (updated)
-----

  /trunk/giraph-formats-contrib/src/main/java/org/apache/giraph/io/hcatalog/HCatalogVertexInputFormat.java
1419623 
  /trunk/giraph-formats-contrib/src/test/java/org/apache/giraph/io/accumulo/edgemarker/AccumuloEdgeInputFormat.java
1419623 
  /trunk/giraph-formats-contrib/src/test/java/org/apache/giraph/io/hbase/edgemarker/TableEdgeInputFormat.java
1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/benchmark/MultiGraphEdgeListVertexPageRankBenchmark.java
PRE-CREATION 
  /trunk/giraph/src/main/java/org/apache/giraph/benchmark/MultiGraphRepresentativeVertexPageRankBenchmark.java
PRE-CREATION 
  /trunk/giraph/src/main/java/org/apache/giraph/benchmark/PageRankBenchmark.java 1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/examples/LongDoubleFloatDoubleTextInputFormat.java
1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/examples/NormalizingLongDoubleFloatDoubleTextInputFormat.java
1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/examples/SimpleCheckpointVertex.java 1419623

  /trunk/giraph/src/main/java/org/apache/giraph/examples/SimplePageRankVertex.java 1419623

  /trunk/giraph/src/main/java/org/apache/giraph/examples/SimpleSuperstepVertex.java 1419623

  /trunk/giraph/src/main/java/org/apache/giraph/examples/VerifyMessage.java 1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/graph/DefaultVertexResolver.java 1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/graph/EdgeListVertex.java 1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/graph/EdgeListVertexBase.java PRE-CREATION

  /trunk/giraph/src/main/java/org/apache/giraph/graph/HashMapVertex.java 1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/graph/IntIntNullIntVertex.java 1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/graph/IntNullNullNullVertex.java 1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/graph/LongDoubleFloatDoubleEdgeListVertex.java
1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/graph/LongDoubleFloatDoubleVertex.java 1419623

  /trunk/giraph/src/main/java/org/apache/giraph/graph/LongDoubleNullDoubleVertex.java 1419623

  /trunk/giraph/src/main/java/org/apache/giraph/graph/MultiGraphEdgeListVertex.java PRE-CREATION

  /trunk/giraph/src/main/java/org/apache/giraph/graph/MultiGraphRepresentativeVertex.java
PRE-CREATION 
  /trunk/giraph/src/main/java/org/apache/giraph/graph/MutableVertex.java 1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/graph/RepresentativeVertex.java 1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/graph/RepresentativeVertexBase.java PRE-CREATION

  /trunk/giraph/src/main/java/org/apache/giraph/graph/SimpleMutableVertex.java 1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/graph/SimpleVertex.java 1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/graph/Vertex.java 1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/io/AdjacencyListTextVertexInputFormat.java
1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/io/IntIntNullIntTextInputFormat.java 1419623

  /trunk/giraph/src/main/java/org/apache/giraph/io/IntNullNullNullTextInputFormat.java 1419623

  /trunk/giraph/src/main/java/org/apache/giraph/io/JsonBase64VertexInputFormat.java 1419623

  /trunk/giraph/src/main/java/org/apache/giraph/io/JsonLongDoubleFloatDoubleVertexInputFormat.java
1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/io/LongDoubleDoubleAdjacencyListVertexInputFormat.java
1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/io/PseudoRandomEdgeInputFormat.java PRE-CREATION

  /trunk/giraph/src/main/java/org/apache/giraph/io/PseudoRandomVertexInputFormat.java 1419623

  /trunk/giraph/src/main/java/org/apache/giraph/io/TextDoubleDoubleAdjacencyListVertexInputFormat.java
1419623 
  /trunk/giraph/src/main/java/org/apache/giraph/io/TextVertexInputFormat.java 1419623 
  /trunk/giraph/src/test/java/org/apache/giraph/examples/SimpleShortestPathsVertexTest.java
1419623 
  /trunk/giraph/src/test/java/org/apache/giraph/examples/SimpleTriangleClosingVertexTest.java
1419623 
  /trunk/giraph/src/test/java/org/apache/giraph/graph/TestIntIntNullIntVertex.java 1419623

  /trunk/giraph/src/test/java/org/apache/giraph/graph/TestMutableVertex.java 1419623 
  /trunk/giraph/src/test/java/org/apache/giraph/graph/partition/TestGiraphTransferRegulator.java
1419623 
  /trunk/giraph/src/test/java/org/apache/giraph/io/TestTextDoubleDoubleAdjacencyListVertexInputFormat.java
1419623 

Diff: https://reviews.apache.org/r/8560/diff/


Testing
-------

mvn verify and the benchmarks described


Thanks,

Alessandro Presta


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message