incubator-giraph-dev mailing list archives

From "Avery Ching (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-45) Improve the way to keep outgoing messages
Date Fri, 16 Dec 2011 00:13:31 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13170613#comment-13170613 ]

Avery Ching commented on GIRAPH-45:
-----------------------------------

I think you might not need the BTree for indexing the destination vertices.  Couldn't we use
files to group the messages sent to the same partition?  If you simply dump all the received
<vertex id, messages> tuples to a partition-specific file, we can load all the tuples for a
single partition just before computing on the worker and assign them to their destinations.
I'm a little concerned that using an in-memory data structure to keep the message indices
might be expensive (i.e. one BTree per vertex in your model, if I'm understanding correctly).
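
A minimal Java sketch of the partition-file idea above. Everything here is an assumption made for illustration and is not Giraph code: the class name PartitionMessageSpill, the file naming, and the choice of long vertex ids with double messages. Tuples are appended to one file per partition, and a whole partition is loaded and grouped by destination vertex just before it is computed, so no per-vertex index is kept in memory while messages arrive.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Giraph): one append-only spill file per partition. */
class PartitionMessageSpill {
    private final Path dir;

    PartitionMessageSpill(Path dir) {
        this.dir = dir;
    }

    // Assumed naming scheme for the per-partition file.
    private Path fileFor(int partitionId) {
        return dir.resolve("partition-" + partitionId + ".msgs");
    }

    /**
     * Appends one <vertex id, message> tuple to the partition's file.
     * The file is reopened on every call to keep the sketch short; a real
     * implementation would keep one buffered stream open per partition.
     */
    void append(int partitionId, long vertexId, double message) throws IOException {
        try (DataOutputStream out = new DataOutputStream(new BufferedOutputStream(
                Files.newOutputStream(fileFor(partitionId),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND)))) {
            out.writeLong(vertexId);
            out.writeDouble(message);
        }
    }

    /** Loads every tuple of one partition and groups the messages by destination vertex. */
    Map<Long, List<Double>> load(int partitionId) throws IOException {
        Map<Long, List<Double>> byVertex = new HashMap<>();
        Path file = fileFor(partitionId);
        if (!Files.exists(file)) {
            return byVertex; // no messages were received for this partition
        }
        try (DataInputStream in = new DataInputStream(new BufferedInputStream(
                Files.newInputStream(file)))) {
            while (true) {
                long vertexId;
                try {
                    vertexId = in.readLong();
                } catch (EOFException endOfFile) {
                    break; // all tuples have been read
                }
                double message = in.readDouble();
                byVertex.computeIfAbsent(vertexId, id -> new ArrayList<>()).add(message);
            }
        }
        return byVertex;
    }
}
```

The in-memory map exists only for the partition currently being computed, which is the trade-off suggested above: sequential file I/O in exchange for not holding a message index for every vertex at once.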

Regarding the "streaming", I am not proposing to change the BSP model.  I'm talking about
sending the messages as we go along during the computation.  Currently the messages are sent
in bulk at the end of the superstep.  So rather than one bulk send, let every worker "stream"
out a batch of messages whenever it is under memory pressure, instead of sending everything
at the end.
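
The difference between a bulk send and "streaming" can be sketched as follows. This is a hedged illustration, not Giraph's messaging code: StreamingOutbox, maxBuffered, and the Consumer-based sender are hypothetical names, and a simple buffered-message count stands in for a real pressure signal. Messages are still all delivered by the end of the superstep, so the BSP model is unchanged; only the time at which batches leave the worker differs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical outgoing-message buffer (not Giraph code) that flushes early under pressure. */
class StreamingOutbox<M> {
    private final int maxBuffered;           // assumed stand-in for a "pressure" limit
    private final Consumer<List<M>> sender;  // assumed transport callback
    private List<M> buffer = new ArrayList<>();

    StreamingOutbox(int maxBuffered, Consumer<List<M>> sender) {
        this.maxBuffered = maxBuffered;
        this.sender = sender;
    }

    /** Buffers a message and streams a batch out as soon as the limit is reached. */
    void send(M message) {
        buffer.add(message);
        if (buffer.size() >= maxBuffered) {
            flush();
        }
    }

    /** Sends whatever is left; this is the only send a pure bulk strategy would do. */
    void endSuperstep() {
        flush();
    }

    private void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sender.accept(buffer);
        buffer = new ArrayList<>(); // hand the old list to the sender, start a fresh one
    }
}
```

With maxBuffered set to 2 and five calls to send, the sender receives two batches of two during the computation and one batch of one from endSuperstep.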

As for detecting memory pressure, Runtime seems to do an okay job.  If anyone knows of
anything better, that's cool too.  You can look at MemoryUtils#getRuntimeMemoryStats()
for a Runtime example.  We'll need to define limits for "memory pressure".
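
One possible way to turn Runtime figures into a "memory pressure" signal is sketched below. It uses only java.lang.Runtime and does not reproduce MemoryUtils; the class name MemoryPressure and the threshold parameter are assumptions, and choosing the actual limit is the open question mentioned above.

```java
/** Hypothetical pressure check (not Giraph code) built on java.lang.Runtime. */
class MemoryPressure {

    /** Fraction of the maximum heap currently in use, given raw Runtime figures. */
    static double usedFraction(long maxMemory, long totalMemory, long freeMemory) {
        long used = totalMemory - freeMemory; // allocated heap minus its free part
        return (double) used / (double) maxMemory;
    }

    /** Same computation, reading the figures from the running JVM. */
    static double usedFraction() {
        Runtime runtime = Runtime.getRuntime();
        return usedFraction(runtime.maxMemory(), runtime.totalMemory(), runtime.freeMemory());
    }

    /** True when heap usage is at or above the given fraction, e.g. 0.8 (an assumed limit). */
    static boolean isUnderPressure(double threshold) {
        return usedFraction() >= threshold;
    }
}
```

A worker could call MemoryPressure.isUnderPressure(0.8) before buffering more outgoing messages and flush when it returns true. The reading is approximate, since freeMemory() also counts garbage that has not been collected yet as used.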
                
> Improve the way to keep outgoing messages
> -----------------------------------------
>
>                 Key: GIRAPH-45
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-45
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Hyunsik Choi
>            Assignee: Hyunsik Choi
>
> As discussed in GIRAPH-12 (http://goo.gl/CE32U), I think there is a potential problem that
> can cause out-of-memory errors when the rate of message generation is higher than the rate
> of message flushing (or the network bandwidth).
> To overcome this problem, we need a more eager strategy for message flushing or some
> approach to spill messages to disk.
> The below link is Dmitriy's suggestion.
> https://issues.apache.org/jira/browse/GIRAPH-12?focusedCommentId=13116253&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13116253

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
