giraph-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alessandro Presta" <alessan...@fb.com>
Subject Review Request: Range-partitioning and edge locality benchmark
Date Fri, 22 Feb 2013 18:25:53 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9562/
-----------------------------------------------------------

Review request for giraph.


Description
-------

Range-based partitioning can drastically reduce network communication when using a vertex
id space where close ids correspond to vertices likely to be connected.
We currently have an incomplete implementation of range-based partitioning that tries to be
very generic (allowing arbitrarily different partition sizes).
Talking about this with Avery, we thought that for now it's better to add a simpler version
(which tries to split as evenly as possible) and leave the current classes there in case someone
wants to implement a more complex logic.
A nice-to-have is also extending the pseudo-random formats to generate a required ratio of
partition-local edges, in order to estimate the impact of locality with benchmarks.


This addresses bug GIRAPH-535.
    https://issues.apache.org/jira/browse/GIRAPH-535


Diffs
-----

  giraph-core/src/main/java/org/apache/giraph/benchmark/AggregatorsBenchmark.java a82a9f88c7947980ae0c512981000edd494bb30f

  giraph-core/src/main/java/org/apache/giraph/benchmark/PageRankBenchmark.java 8341dce2d66e924c402c27b252db604b0b1f13be

  giraph-core/src/main/java/org/apache/giraph/benchmark/RandomMessageBenchmark.java d0d80afde3fa46b521e146dcc2c3bc5036f84052

  giraph-core/src/main/java/org/apache/giraph/benchmark/ShortestPathsBenchmark.java 52bbac4bc7f47efbb7b309d071cee6c6dd5364f6

  giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java e0aeba347564d25e39c27ed67cc1393b5ac62d99

  giraph-core/src/main/java/org/apache/giraph/io/formats/PseudoRandomEdgeInputFormat.java
90f814ceb960b1bc570106c06c7155f12b3acae4 
  giraph-core/src/main/java/org/apache/giraph/io/formats/PseudoRandomInputFormatConstants.java
PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/formats/PseudoRandomLocalEdgesHelper.java
PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/io/formats/PseudoRandomVertexInputFormat.java
f2a2c93890eb483367cb7cc37ca9b3ceeb5da2ac 
  giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java a9611d9b21a264da2dc7fdcbbd6a955a6f85fee0

  giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java bb6e38d64442bb65aeda70794905693b53126f80

  giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java bdbd467a352f27ac9cfacbd2933f0f8520ea60d9

  giraph-core/src/main/java/org/apache/giraph/partition/PartitionUtils.java e472ac64f312df815862bf364015761bd469d370

  giraph-core/src/main/java/org/apache/giraph/partition/SimpleIntRangePartitionerFactory.java
PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/partition/SimpleLongRangePartitionerFactory.java
PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/partition/SimpleRangeMasterPartitioner.java
PRE-CREATION 
  giraph-core/src/main/java/org/apache/giraph/partition/SimpleRangeWorkerPartitioner.java
PRE-CREATION 
  giraph-core/src/test/java/org/apache/giraph/io/TestJsonBase64Format.java f4d97a4aad2b151c0dce77716c32edccb541c0fb

  giraph-examples/src/test/java/org/apache/giraph/TestGraphPartitioner.java 2e12bdc3d8f8a63395b070bc7cfdb3e2474af8a1

  giraph-examples/src/test/java/org/apache/giraph/TestPartitionContext.java cdf1f65bff36a4a198a94f7ac75ef2ade8e66eba

  giraph-examples/src/test/java/org/apache/giraph/examples/TestPageRank.java 5e6159606c82a68bdc9db90139ff9ca32fdf56b1


Diff: https://reviews.apache.org/r/9562/diff/


Testing
-------

1) mvn verify

2) Example runs of PageRankBenchmark with 1B vertices, 100B edges, 200 workers, 32 compute
threads:
Range-partitioning, random edges (run with "-p 1")
—
Superstep 0 (milliseconds) 66,906
Superstep 1 (milliseconds) 72,256
Superstep 2 (milliseconds) 71,551
Range-partitioning, 75% local edges (run with "-Dpartition.vertexKeySpaceSize=1000000000 -p
1 -l 0.75")
—
Superstep 0 (milliseconds) 27,883
Superstep 1 (milliseconds) 34,693
Superstep 2 (milliseconds) 33,729


Thanks,

Alessandro Presta


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message