giraph-user mailing list archives

From Matthew Saltz <sal...@gmail.com>
Subject Re: Multiple sendMessage calls vs. sendMessageToMultipleEdges
Date Wed, 22 Oct 2014 23:11:06 GMT
Actually, one more question: are there any disadvantages to enabling
giraph.oneToAllMsgSending? Is there any reason not to enable it by default?

Best,
Matthew
On 22/10/2014 23:28, "Matthew Saltz" <saltzm@gmail.com> wrote:

> Lukas,
>
> Thank you so much for the help. By 'the first class', you mean that
> SendMessageToAllCache is not used unless I set the property to true, right?
> I actually do have giraph.oneToAllMsgSending=true, so if that means it's
> using SendMessageToAllCache, then everything makes much more sense. So I
> guess it makes sense that case (b) would be much faster than case (a)?
> I really appreciate it. And do you have any ideas about the second
> question I asked? I think the answer is no, but I'm kind of hoping I'm
> wrong.
>
> Best,
> Matthew
>
>
>
> On Wed, Oct 22, 2014 at 11:16 PM, Lukas Nalezenec <
> lukas.nalezenec@firma.seznam.cz> wrote:
>
>>  Hi Matthew,
>>
>> See class SendMessageToAllCache. It's in the same directory as
>> SendMessageCache. The first class is not used by Giraph unless you set
>> the property giraph.oneToAllMsgSending to true.
>>
>> Lukas
>>
>>
>> On 22.10.2014 20:10, Matthew Saltz wrote:
>>
>> Hi everyone,
>>
>> I have two questions:
>>
>>  *Question 1)* I'm using release 1.1.0 and I'm really confused about the
>> fact that I'm having massive performance differences in the following
>> scenario. I need to send one message from each vertex to a subset of its
>> neighbors (all that satisfy a certain condition). For that, I see two basic
>> options:
>>
>>    a) Loop over all edges, making a call to sendMessage(target, message)
>> whenever target satisfies a condition I want, reusing the same IntWritable
>> for the target vertex by calling target.set(_)
>>    b) Loop over all edges, building up an ArrayList (or whatever) of
>> targets that satisfy the condition, and calling
>> sendMessageToMultipleEdges(targets) at the end.
>>
>>  Surprisingly, I get much, much worse performance using option (a),
>> which I would think would be much faster. So I looked in the code and
>> eventually found my way to SendMessageCache
>> <https://github.com/apache/giraph/blob/release-1.1/giraph-core/src/main/java/org/apache/giraph/comm/SendMessageCache.java>,
>> where it turns out that sendMessageToMultipleEdges ->
>> sendMessageToAllRequest(Iterator, Message) actually just loops over the
>> iterator, repeatedly calling sendMessageRequest (which is what I thought I
>> was doing in scenario (a)). I might have incorrectly traced the code though.
>> Can anyone tell me what might be going on? I'm really puzzled by this.
>>
>>  *Question 2) *Is there a good way of sending a vertex's adjacency list
>> to its neighbors, without building up your own copy of an adjacency list
>> and then sending that? I'm going through the Edge iterable and building an
>> ArrayPrimitiveWritable of ids but it would be nice if I could somehow
>> access the underlying data structure behind the iterable or just wrap the
>> iterable as a writable somehow.
>>
>>  Thanks so much for the help,
>> Matthew Saltz
>>
>>
>>
>>
>>
>
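
For readers following the thread, options (a) and (b) from Question 1 can be sketched roughly as below. This is only an illustration under stated assumptions: the Sender interface and RecordingSender class are hypothetical stand-ins so the sketch runs without Giraph, plain ints and Strings replace the Writable types, and nothing here reproduces Giraph's actual caches or networking. It only shows that both options deliver to the same targets while differing in how many send calls the vertex makes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.IntPredicate;

final class SendPatterns {
    /** Hypothetical stand-in for the two send calls discussed in the thread; NOT the Giraph API. */
    interface Sender {
        void sendMessage(int targetId, String message);
        void sendMessageToMultipleEdges(Iterator<Integer> targetIds, String message);
    }

    /** Hypothetical test double: counts calls and records which targets received a message. */
    static final class RecordingSender implements Sender {
        int singleCalls;
        int batchCalls;
        final List<Integer> delivered = new ArrayList<>();

        @Override
        public void sendMessage(int targetId, String message) {
            singleCalls++;
            delivered.add(targetId);
        }

        @Override
        public void sendMessageToMultipleEdges(Iterator<Integer> targetIds, String message) {
            batchCalls++;
            while (targetIds.hasNext()) {
                delivered.add(targetIds.next());
            }
        }
    }

    /** Option (a): one send call per neighbor that satisfies the condition. */
    static void optionA(int[] neighborIds, IntPredicate keep, String message, Sender sender) {
        for (int id : neighborIds) {
            if (keep.test(id)) {
                sender.sendMessage(id, message);
            }
        }
    }

    /** Option (b): collect the matching targets first, then make a single batched call. */
    static void optionB(int[] neighborIds, IntPredicate keep, String message, Sender sender) {
        List<Integer> targets = new ArrayList<>();
        for (int id : neighborIds) {
            if (keep.test(id)) {
                targets.add(id);
            }
        }
        sender.sendMessageToMultipleEdges(targets.iterator(), message);
    }

    public static void main(String[] args) {
        int[] neighbors = {1, 2, 3, 4};
        RecordingSender a = new RecordingSender();
        RecordingSender b = new RecordingSender();
        optionA(neighbors, id -> id % 2 == 0, "hello", a);
        optionB(neighbors, id -> id % 2 == 0, "hello", b);
        System.out.println("option (a) single calls: " + a.singleCalls);
        System.out.println("option (b) batch calls: " + b.batchCalls);
        System.out.println("same targets: " + a.delivered.equals(b.delivered));
    }
}
```

In the thread's terms, a batched call gives the framework the whole target list at once, which is presumably what lets SendMessageToAllCache help when giraph.oneToAllMsgSending is true, as Lukas suggests; the sketch itself makes no claim about performance.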
