Subject: Re: Review Request: GIRAPH-374 Multithreading in input split loading and compute
From: "Maja Kabiljo"
To: "Maja Kabiljo"
Cc: "Avery Ching", "giraph"
Date: Wed, 17 Oct 2012 04:32:51 -0000
Message-ID: <20121017043251.16939.82693@reviews.apache.org>
In-Reply-To: <20121017034554.16939.19036@reviews.apache.org>
References: <20121017034554.16939.19036@reviews.apache.org>
X-ReviewRequest-URL: https://reviews.apache.org/r/7613/
Content-Type: multipart/alternative; boundary="===============6321774503827828289=="
MIME-Version: 1.0

--===============6321774503827828289==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7613/#review12503
-----------------------------------------------------------

Thanks Avery, +1 from me.

- Maja Kabiljo


On Oct. 17, 2012, 3:45 a.m., Avery Ching wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7613/
> -----------------------------------------------------------
>
> (Updated Oct. 17, 2012, 3:45 a.m.)
>
>
> Review request for giraph and Maja Kabiljo.
>
>
> Description
> -------
>
> Cleaned up the WorkerClient hierarchy
>  - WorkerClientRequestProcessor is a request cache for every thread (input split loading / compute)
>  - With RPC gone, got rid of ugly WorkerClientServer and NettyWorkerClientServer
> SendPartitionCache
> Made GraphState immutable for multi-threading
> Added multithreading for loading the input splits
> Added multithreading for compute
> Added thread-level debugging as an option
> Added additional testing on the number of vertices, edges
> Optimization on HashWorkerPartitioner to use CopyOnWriteArrayList instead of a synchronized list (the synchronized list is a bottleneck)
> Added multithreaded TestPageRank test case
>
> I ran the PageRankBenchmark on 20 workers with 10M vertices, 1B edges. All supersteps take about the same time, so I just compared superstep 0 from every test. Compute performance gains are quite nice (even a little faster than before with one thread). Actual gains will depend heavily on the number of cores you have and the possible parallelism of the application.
>
> Trunk
> # threads   compute time (secs)   total time (secs)
> 1           89                    97.543
>
> Multithreading
> 1           86.70094              92.477
> 2           50.41521              57.850
> 4           38.07716              50.246
> 8           38.63188              45.940
> 16          22.999943             48.607
> 24          23.649189             45.112
> 32          21.412325             44.201
>
> We also saw similar gains on the input split loading on an internal app.
> Future work can be to further improve the scalability of multithreading.
>
>
> This addresses bug GIRAPH-374.
>     https://issues.apache.org/jira/browse/GIRAPH-374
>
>
> Diffs
> -----
>
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph-formats-contrib/src/main/java/org/apache/giraph/io/hbase/HBaseVertexInputFormat.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/GiraphConfiguration.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedService.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceMaster.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/SendMessageCache.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/SendPartitionCache.java PRE-CREATION
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/WorkerClient.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/WorkerClientRequestProcessor.java PRE-CREATION
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/WorkerServer.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/ChannelRotater.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyClient.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyServer.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClient.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClientRequestProcessor.java PRE-CREATION
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClientServer.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerServer.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/handler/AddressRequestIdGenerator.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/examples/SimpleSuperstepVertex.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/AggregatorWrapper.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceMaster.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceWorker.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/ComputeCallable.java PRE-CREATION
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/FinishedSuperstepStats.java PRE-CREATION
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/GraphMapper.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/GraphState.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/InputSplitsCallable.java PRE-CREATION
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/MutableVertex.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/SimpleMutableVertex.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/Vertex.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/partition/HashWorkerPartitioner.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/partition/PartitionStats.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/partition/PartitionStore.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/LoggerUtils.java PRE-CREATION
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/ProgressableUtils.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/Time.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/zk/ZooKeeperExt.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/BspCase.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/TestBspBasic.java 1399043
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/TestPageRank.java PRE-CREATION
>   http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/utils/MockUtils.java 1399043
>
> Diff: https://reviews.apache.org/r/7613/diff/
>
>
> Testing
> -------
>
> mvn clean install
> pseudo-distributed unit tests
> Running on internal FB apps as well.
>
>
> Thanks,
>
> Avery Ching
>
>

--===============6321774503827828289==--
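
For readers unfamiliar with the pattern the description refers to (one request cache per thread, with several threads sharing the compute work), the following is a minimal sketch and not the code from the patch. It assumes that worker threads pull partition ids from a shared queue and that each thread buffers its outgoing requests in its own cache, so no locking is needed on the hot path. The names MultithreadedComputeSketch, LocalRequestCache, ComputeTask and runSuperstep are hypothetical and do not correspond to Giraph classes; the "compute" step is a placeholder that pretends partition i produces i + 1 requests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

public class MultithreadedComputeSketch {

  /** Hypothetical per-thread request cache; only its owning thread touches it. */
  static final class LocalRequestCache {
    private long bufferedRequests;

    void add(long count) {
      bufferedRequests += count;
    }

    /** Returns the number of buffered requests and empties the cache. */
    long flush() {
      long flushed = bufferedRequests;
      bufferedRequests = 0;
      return flushed;
    }
  }

  /** Hypothetical compute task: drains partition ids from a shared queue. */
  static final class ComputeTask implements Callable<Long> {
    private final BlockingQueue<Integer> partitionIds;

    ComputeTask(BlockingQueue<Integer> partitionIds) {
      this.partitionIds = partitionIds;
    }

    @Override
    public Long call() {
      // Each task owns its cache, so add() needs no synchronization.
      LocalRequestCache cache = new LocalRequestCache();
      Integer id;
      // poll() returns null once the shared queue is empty.
      while ((id = partitionIds.poll()) != null) {
        // Placeholder for real work: partition id i "sends" i + 1 requests.
        cache.add(id + 1);
      }
      return cache.flush();
    }
  }

  /** Runs one simulated superstep and returns the total requests flushed. */
  static long runSuperstep(int numPartitions, int numThreads) throws Exception {
    BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
    for (int i = 0; i < numPartitions; i++) {
      queue.add(i);
    }
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    try {
      List<Future<Long>> futures = new ArrayList<>();
      for (int t = 0; t < numThreads; t++) {
        futures.add(pool.submit(new ComputeTask(queue)));
      }
      long total = 0;
      for (Future<Long> future : futures) {
        total += future.get(); // propagates any task failure
      }
      return total;
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(runSuperstep(100, 4));
  }
}
```

Because every partition id is taken from the queue exactly once, the total is the same for any thread count; only the wall-clock time should change, which is the quantity the benchmark table above compares.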