giraph-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Maja Kabiljo" <majakabi...@fb.com>
Subject Re: Review Request: GIRAPH-374 Multithreading in input split loading and compute
Date Wed, 17 Oct 2012 01:49:41 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7613/#review12498
-----------------------------------------------------------


Wow, Avery, this is a huge patch! :-)

Looks mostly good to me. I just didn't completely go through the parts which are supposed
to be copied from a place to another, like from NettyWorkerClient to NettyWorkerClientRequestProcessor,
BspServiceWorker to InputSplitsCallable, etc.

I'm wondering does this work correctly with out-of-core messages and graph. For graph at least
it seems all partitions are read prior to computation. Could you have a blocking list of partition
ids and then read them on the fly?

Have you verified that all the counters work correctly now, I believe we have no tests for
that?

You have the change from GIRAPH-357 in there, maybe that's the reason of small speed-up...

Anyway, great work, and the results seem good!


http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClient.java
<https://reviews.apache.org/r/7613/#comment26607>

    Should this part be thread-safe?



http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceMaster.java
<https://reviews.apache.org/r/7613/#comment26603>

    Why is this returning boolean now? (set to true and not used)


- Maja Kabiljo


On Oct. 17, 2012, 12:42 a.m., Avery Ching wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7613/
> -----------------------------------------------------------
> 
> (Updated Oct. 17, 2012, 12:42 a.m.)
> 
> 
> Review request for giraph and Maja Kabiljo.
> 
> 
> Description
> -------
> 
> Cleaned up the WorkerClient hierarchy
> - WorkerClientRequestProcessor is a request cache for every thread (input split loading
/ compute)
> - With RPC gone, got rid of ugly WorkerClientServer and NettyWorkerClientServer
> SendPartitionCache
> Made GraphState immutable for multi-threading
> Added multithreading for loading the input splits
> Added multithreading for compute
> Added thread-level debugging as an option
> Added additional testing on the number of vertices, edges
> Optimization on HashWorkerPartitioner to use CopyOnWriteArrayList instead of sychronized
list (this is a bottleneck)
> Added multithreaded TestPageRank test case
> 
> I ran the PageRankBenchmark on 20 workers with 10M vertices, 1B edges. All supersteps
are about the same time, so I just compared superstep 0 from every test. Compute performance
gains are quite nice (even a little faster than before with one thread). Actual gains will
depend heavily on the number of cores you have and possible parallelism of the application.
> 
> Trunk
> # threads  compute time (secs)   total time (secs)
> 1          89                    97.543
> 
> Multithreading
> 1          86.70094              92.477
> 2          50.41521              57.850
> 4          38.07716              50.246
> 8          38.63188              45.940
> 16         22.999943             48.607
> 24         23.649189             45.112
> 32         21.412325             44.201
> 
> We also saw similar gains on the input split loading on an internal app. Future work
can be to further improve the scalability of multithreading.
> 
> 
> This addresses bug GIRAPH-374.
>     https://issues.apache.org/jira/browse/GIRAPH-374
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph-formats-contrib/src/main/java/org/apache/giraph/io/hbase/HBaseVertexInputFormat.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/GiraphConfiguration.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedService.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/SendMessageCache.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/SendPartitionCache.java
PRE-CREATION 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/WorkerClient.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/WorkerClientRequestProcessor.java
PRE-CREATION 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/WorkerServer.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/ChannelRotater.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyClient.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyServer.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClient.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClientRequestProcessor.java
PRE-CREATION 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClientServer.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerServer.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/handler/AddressRequestIdGenerator.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/examples/SimpleSuperstepVertex.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/AggregatorWrapper.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceMaster.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceWorker.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/ComputeCallable.java
PRE-CREATION 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/FinishedSuperstepStats.java
PRE-CREATION 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/GraphMapper.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/GraphState.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/InputSplitsCallable.java
PRE-CREATION 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/MutableVertex.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/SimpleMutableVertex.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/Vertex.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/partition/HashWorkerPartitioner.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/partition/PartitionStats.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/LoggerUtils.java
PRE-CREATION 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/ProgressableUtils.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/Time.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/zk/ZooKeeperExt.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/BspCase.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/TestBspBasic.java
1398568 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/TestPageRank.java
PRE-CREATION 
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/utils/MockUtils.java
1398568 
> 
> Diff: https://reviews.apache.org/r/7613/diff/
> 
> 
> Testing
> -------
> 
> mvn clean install
> pseudo-distributed unittests
> Running on internal FB apps as well.
> 
> 
> Thanks,
> 
> Avery Ching
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message