giraph-dev mailing list archives

From Eli Reisman <apache.mail...@gmail.com>
Subject Re: Review Request: More SendMessageCache improvements
Date Tue, 06 Nov 2012 18:44:36 GMT
Hey, out of curiosity, since I have not been able to run any cluster jobs myself
on the new trunk for some time now:

How much more memory does a job use now compared to before the recent
changes? Have the new multithreading, lists of messages, etc. made a
noticeable dent in the heap? I realize heap is a worthwhile tradeoff for
speed for you guys either way, just curious.

On Tue, Nov 6, 2012 at 10:38 AM, Eli Reisman <apache.mailbox@gmail.com> wrote:

> Hey Maja, when I redid these I was trying to minimize message
> duplication, so I checked for existing references before
> putting a new (duplicate) copy of the same VertexId or Message (etc) into
> the maps. This slowed things down a tad (not too bad for us) but saved us
> some memory (less with more frequent flushing).
>
> I think the effects of this stuff must be more visible to you guys because
> you don't get the speed benefit from the increased parallelism of lots of
> workers that I was getting. Maybe the multithreading has helped make up for
> that for you guys? But anyway, my point is: it could be that the maps aren't
> the slowdown, but all the checking I had in there. So if you ever hit a
> situation where the maps seem like a better solution, try them out again
> (but without the checking) and see if the speed comes back?
>
> Nice work!
>
>
>
> On Mon, Nov 5, 2012 at 2:59 PM, Maja Kabiljo <majakabiljo@fb.com> wrote:
>
>>
>> -----------------------------------------------------------
>> This is an automatically generated e-mail. To reply, visit:
>> https://reviews.apache.org/r/7883/
>> -----------------------------------------------------------
>>
>> Review request for giraph.
>>
>>
>> Description
>> -------
>>
>> Having a lot of maps in SendMessageCache still makes it slow, so here is
>> another step towards making it faster.
>>
>> Here are the results on PageRankBenchmark:
>> 10m vertices, 100 edges per vertex, 10 workers
>> 1 thread: Total superstep time: 54s -> 35s
>> 20m vertices, 100 edges per vertex, 12 workers
>> 4 threads: Computation time: 26s -> 17s
>>
>> Also tested on one of our real applications, speedup was a bit smaller,
>> about 20-25%.
>>
>>
>> This addresses bug GIRAPH-404.
>>     https://issues.apache.org/jira/browse/GIRAPH-404
>>
>>
>> Diffs
>> -----
>>
>>
>> http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/bsp/CentralizedServiceWorker.java 1405925
>>
>> http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/SendMessageCache.java 1405925
>>
>> http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/VertexIdMessageCollection.java 1405925
>>
>> http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/messages/DiskBackedMessageStoreByPartition.java 1405925
>>
>> http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/messages/SimpleMessageStore.java 1405925
>>
>> http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/netty/NettyWorkerClientRequestProcessor.java 1405925
>>
>> http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/comm/requests/SendWorkerMessagesRequest.java 1405925
>>
>> http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/graph/BspServiceWorker.java 1405925
>>
>> http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/PairList.java PRE-CREATION
>>
>> http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/main/java/org/apache/giraph/utils/PairListWritable.java PRE-CREATION
>>
>> http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/comm/RequestFailureTest.java 1405925
>>
>> http://svn.apache.org/repos/asf/giraph/trunk/giraph/src/test/java/org/apache/giraph/comm/RequestTest.java 1405925
>>
>> Diff: https://reviews.apache.org/r/7883/diff/
>>
>>
>> Testing
>> -------
>>
>> mvn clean verify, pseudo-distributed tests
>> PageRankBenchmark (results above)
>> Tried it out with a real application
>>
>>
>> Thanks,
>>
>> Maja Kabiljo
>>
>>
>
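
For anyone following along, the trade-off discussed in this thread (maps
with a duplicate check vs. maps without it vs. a plain list of pairs) can be
sketched roughly as below. This is only an illustration under assumptions,
not the patch under review: MessageCacheSketch, MapCache, PairListCache and
the canonical-reference lookup are hypothetical names, messages are plain
Strings, and Giraph's real SendMessageCache and PairList have their own
types and signatures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the trade-off in this thread; not Giraph code.
public class MessageCacheSketch {

  // Map-based cache: every add pays for a hash lookup. With checkDuplicates
  // on, it also looks for an equal message already held and reuses that
  // reference (an assumed stand-in for the "checking" described above).
  static final class MapCache {
    private final Map<Long, List<String>> messagesByVertex = new HashMap<>();
    private final Map<String, String> canonical = new HashMap<>();
    private final boolean checkDuplicates;

    MapCache(boolean checkDuplicates) {
      this.checkDuplicates = checkDuplicates;
    }

    void add(long vertexId, String message) {
      String stored = message;
      if (checkDuplicates) {
        // Extra lookup per message: slower, but equal messages share one object.
        String existing = canonical.putIfAbsent(message, message);
        if (existing != null) {
          stored = existing;
        }
      }
      messagesByVertex.computeIfAbsent(vertexId, id -> new ArrayList<>()).add(stored);
    }

    int messageCount() {
      int count = 0;
      for (List<String> list : messagesByVertex.values()) {
        count += list.size();
      }
      return count;
    }

    // Number of distinct message objects held, compared by reference.
    int distinctMessageObjects() {
      Map<String, Boolean> seen = new IdentityHashMap<>();
      for (List<String> list : messagesByVertex.values()) {
        for (String message : list) {
          seen.put(message, Boolean.TRUE);
        }
      }
      return seen.size();
    }
  }

  // Pair-list cache: two parallel lists, append only, no lookups at all.
  static final class PairListCache {
    private final List<Long> vertexIds = new ArrayList<>();
    private final List<String> messages = new ArrayList<>();

    void add(long vertexId, String message) {
      vertexIds.add(vertexId);
      messages.add(message);
    }

    int size() {
      return vertexIds.size();
    }

    long vertexIdAt(int index) {
      return vertexIds.get(index);
    }

    String messageAt(int index) {
      return messages.get(index);
    }
  }

  public static void main(String[] args) {
    MapCache checked = new MapCache(true);
    MapCache unchecked = new MapCache(false);
    PairListCache pairs = new PairListCache();
    for (long vertexId = 0; vertexId < 3; vertexId++) {
      // new String(...) forces a separate object per message, like a fresh copy.
      checked.add(vertexId, new String("rank"));
      unchecked.add(vertexId, new String("rank"));
      pairs.add(vertexId, new String("rank"));
    }
    System.out.println("checked map objects:   " + checked.distinctMessageObjects());
    System.out.println("unchecked map objects: " + unchecked.distinctMessageObjects());
    System.out.println("pair list entries:     " + pairs.size());
  }
}
```

The checked map keeps fewer message objects alive at the cost of an extra
lookup per add; the pair list does the least work per message but holds
every copy until it is flushed. Which one wins in practice would have to be
measured on a real job, as with the benchmark numbers quoted above.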
