giraph-dev mailing list archives

From "Maja Kabiljo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-388) Improve the way we keep outgoing messages
Date Mon, 29 Oct 2012 18:46:12 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486247#comment-13486247
] 

Maja Kabiljo commented on GIRAPH-388:
-------------------------------------

Eli, thank you for your comments.

Because of message flushing, we rarely have more than one message for a single
vertex in one request (which is also why GIRAPH-357 was an improvement). So having the
same vertexId several times in VertexIdMessageCollection is very rare. In my experiments
with applications like PageRank, most of the time is spent creating and querying
the message structures, on both the sender and receiver side (in SendMessageCache, SimpleMessageStore,
SendWorkerMessagesRequest). I have a few more changes there, which I'll be posting soon, that
further improve performance.
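
To make the trade-off concrete, here is a minimal sketch of the two ways of buffering
outgoing messages. It is not the Giraph code: OutgoingMessageSketch, MapBuffer and FlatBuffer
are hypothetical names, and long ids with double messages are assumed only to keep the example
short. The map-based buffer pays for a map entry and a list per destination vertex, while the
flat buffer appends (id, message) pairs and simply tolerates the rare duplicate id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only; these are hypothetical classes, not Giraph's. */
public class OutgoingMessageSketch {

    /** Map-based buffer: one map entry plus one list per destination vertex. */
    public static final class MapBuffer {
        private final Map<Long, List<Double>> byVertex = new HashMap<>();

        public void add(long vertexId, double message) {
            // Boxes the id, probes the map, and may allocate a new list.
            byVertex.computeIfAbsent(vertexId, k -> new ArrayList<>()).add(message);
        }

        /** Number of per-vertex lists that had to be allocated. */
        public int allocatedLists() {
            return byVertex.size();
        }
    }

    /** Flat buffer: parallel arrays of ids and messages; duplicate ids are allowed. */
    public static final class FlatBuffer {
        private long[] ids = new long[16];
        private double[] messages = new double[16];
        private int size;

        public void add(long vertexId, double message) {
            if (size == ids.length) {
                // Grow both arrays together so they stay index-aligned.
                ids = Arrays.copyOf(ids, size * 2);
                messages = Arrays.copyOf(messages, size * 2);
            }
            ids[size] = vertexId;
            messages[size] = message;
            size++;
        }

        public int size() {
            return size;
        }

        public long idAt(int index) {
            return ids[index];
        }

        public double messageAt(int index) {
            return messages[index];
        }
    }
}
```

With the flat layout, a combiner can no longer be applied on the sender side for free, which is
acceptable only under the assumption above that duplicate ids within one request are rare.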

As for the problem you are addressing in GIRAPH-314, I don't think we'll be able to have
one implementation that performs best in all cases. The overhead of keeping mappings
and other bookkeeping around will be too expensive for other kinds of applications. So we'll probably
end up offering something like that as an option. I would like to hear your thoughts on this.
                
> Improve the way we keep outgoing messages
> -----------------------------------------
>
>                 Key: GIRAPH-388
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-388
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
>         Attachments: GIRAPH-388.patch
>
>
> As per the discussion on GIRAPH-357, in a standard application the chances that we get to use the
> client-side combiner are very low. I experimented with the benefits we can get from not having the
> client-side combiner at all. It turns out that having a lot of maps in SendMessageCache, and then a
> collection inside each of them, really hurts performance.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
