giraph-dev mailing list archives

From "Avery Ching" <avery.ch...@gmail.com>
Subject Re: Review Request: GIRAPH-374 Multithreading in input split loading and compute
Date Wed, 17 Oct 2012 00:42:59 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7613/
-----------------------------------------------------------

(Updated Oct. 17, 2012, 12:42 a.m.)


Review request for giraph and Maja Kabiljo.


Description
-------

Cleaned up the WorkerClient hierarchy
- WorkerClientRequestProcessor is a per-thread request cache (one for each input split
loading / compute thread)
- With RPC gone, got rid of the ugly WorkerClientServer and NettyWorkerClientServer
- Added SendPartitionCache
Made GraphState immutable for multithreading
Added multithreading for loading the input splits
Added multithreading for compute
Added thread-level debugging as an option
Added additional testing on the number of vertices and edges
Optimized HashWorkerPartitioner to use CopyOnWriteArrayList instead of a synchronized list
(the synchronized list was a bottleneck)
Added a multithreaded TestPageRank test case
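
As a rough illustration of the threading model described in the list above (one callable per compute thread, immutable shared state, and a request cache owned by each thread), here is a minimal Java sketch. It is not the code from this patch: MultithreadedComputeSketch, SuperstepState, ComputeTask and runSuperstep are hypothetical names, and a partition is modelled as a plain list of vertex ids.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch; none of these types are Giraph classes. */
final class MultithreadedComputeSketch {

  /** Immutable per-superstep state, safe to share across threads without locks. */
  static final class SuperstepState {
    final long superstep;

    SuperstepState(long superstep) {
      this.superstep = superstep;
    }
  }

  /** One task per thread: drains partitions from a shared queue. */
  static final class ComputeTask implements Callable<Long> {
    private final SuperstepState state;
    private final Queue<List<Integer>> partitions;

    ComputeTask(SuperstepState state, Queue<List<Integer>> partitions) {
      this.state = state;
      this.partitions = partitions;
    }

    @Override
    public Long call() {
      // Owned by this thread only, so filling it needs no synchronization.
      List<String> requestCache = new ArrayList<>();
      long verticesComputed = 0;
      List<Integer> partition;
      while ((partition = partitions.poll()) != null) {
        for (int vertexId : partition) {
          // Stand-in for computing a vertex and queueing an outgoing request.
          requestCache.add("s" + state.superstep + ":v" + vertexId);
          verticesComputed++;
        }
        requestCache.clear(); // stand-in for flushing the cached requests
      }
      return verticesComputed;
    }
  }

  /** Runs one superstep over all partitions and returns the vertex count. */
  static long runSuperstep(int numThreads, List<List<Integer>> partitions) {
    SuperstepState state = new SuperstepState(0);
    Queue<List<Integer>> queue = new ConcurrentLinkedQueue<>(partitions);
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Future<Long>> futures = new ArrayList<>();
      for (int i = 0; i < numThreads; i++) {
        futures.add(pool.submit(new ComputeTask(state, queue)));
      }
      long total = 0;
      for (Future<Long> future : futures) {
        total += future.get();
      }
      return total;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException("compute thread failed", e);
    } finally {
      pool.shutdown();
    }
  }
}
```

Because each task polls the shared queue until it is empty, threads that finish a small partition early simply pick up the next one. This is one plausible way to balance work and is an assumption of the sketch, not a statement about how the patch schedules partitions.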

I ran the PageRankBenchmark on 20 workers with 10M vertices, 1B edges. All supersteps take
about the same time, so I compared only superstep 0 from each test. Compute performance gains
are quite nice (with one thread it is even a little faster than trunk). Actual gains will depend
heavily on the number of cores you have and the parallelism available in the application.

Trunk
# threads  compute time (secs)   total time (secs)
1          89                    97.543

Multithreading
# threads  compute time (secs)   total time (secs)
1          86.70094              92.477
2          50.41521              57.850
4          38.07716              50.246
8          38.63188              45.940
16         22.999943             48.607
24         23.649189             45.112
32         21.412325             44.201
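
To make the scaling in these numbers easier to read, the compute times can be turned into speedup (one-thread time divided by N-thread time) and parallel efficiency (speedup divided by N). The Java sketch below does that arithmetic on the multithreaded compute column above; SpeedupSketch and its method names are made up for illustration and are not part of the patch.

```java
/** Hypothetical helper for reading the benchmark table; not part of Giraph. */
final class SpeedupSketch {

  /** Speedup relative to the single-threaded multithreading run. */
  static double speedup(double baselineSecs, double secs) {
    return baselineSecs / secs;
  }

  /** Fraction of ideal linear scaling achieved with the given thread count. */
  static double efficiency(double baselineSecs, double secs, int threads) {
    return speedup(baselineSecs, secs) / threads;
  }

  public static void main(String[] args) {
    // Compute times (secs) copied from the "Multithreading" rows above.
    int[] threads = {1, 2, 4, 8, 16, 24, 32};
    double[] computeSecs = {
      86.70094, 50.41521, 38.07716, 38.63188, 22.999943, 23.649189, 21.412325
    };
    double baseline = computeSecs[0];
    for (int i = 0; i < threads.length; i++) {
      System.out.printf("%2d threads: speedup %.2f, efficiency %.2f%n",
          threads[i],
          speedup(baseline, computeSecs[i]),
          efficiency(baseline, computeSecs[i], threads[i]));
    }
  }
}
```

With these figures the compute speedup at 32 threads comes out at roughly 4x over one thread, so efficiency per thread drops off well before the higher thread counts.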

We also saw similar gains in input split loading on an internal app. Future work can
further improve the scalability of multithreading.


This addresses bug GIRAPH-374.
    https://issues.apache.org/jira/browse/GIRAPH-374


Diffs
-----

  http://svn.apache.org/repos/asf/giraph/trunk/giraph-formats-contrib/src/main/java/org/apache/giraph/io/hbase/HBaseVertexInputFormat.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/GiraphConfiguration.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedService.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/SendMessageCache.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/SendPartitionCache.java
PRE-CREATION 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/WorkerClient.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/WorkerClientRequestProcessor.java
PRE-CREATION 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/WorkerServer.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/ChannelRotater.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyClient.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyServer.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClient.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClientRequestProcessor.java
PRE-CREATION 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClientServer.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerServer.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/handler/AddressRequestIdGenerator.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/examples/SimpleSuperstepVertex.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/AggregatorWrapper.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceMaster.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/ComputeCallable.java
PRE-CREATION 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/FinishedSuperstepStats.java
PRE-CREATION 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/GraphMapper.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/GraphState.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/InputSplitsCallable.java
PRE-CREATION 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/MutableVertex.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/SimpleMutableVertex.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/Vertex.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/partition/HashWorkerPartitioner.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/partition/PartitionStats.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/LoggerUtils.java
PRE-CREATION 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/ProgressableUtils.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/Time.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/zk/ZooKeeperExt.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/BspCase.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/TestBspBasic.java
1398568 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/TestPageRank.java
PRE-CREATION 
  http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/utils/MockUtils.java
1398568 

Diff: https://reviews.apache.org/r/7613/diff/


Testing
-------

mvn clean install
pseudo-distributed unit tests
Running on internal FB apps as well.


Thanks,

Avery Ching

