hama-dev mailing list archives

From "Thomas Jungblut (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-704) Optimization of memory usage during message processing
Date Tue, 19 Feb 2013 06:45:12 GMT

    [ https://issues.apache.org/jira/browse/HAMA-704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581054#comment-13581054 ]

Thomas Jungblut commented on HAMA-704:
--------------------------------------

bq.The problem is that we don't know the Max of Map<Vertex, List<M>>.

You can certainly run every vertex id in the vertex map through the partitioner and check
whether the resulting peer index is the same as the local one. Once that is known, you know
this is the max, so you can track it throughout the supersteps. Since vertices usually aren't
added to the map after partitioning, you can also make the map/list/set unmodifiable.
If you step over some vertices that don't belong there, kudos to the partitioning voodoo.
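
A minimal sketch of that check, in plain Java. The names here (PartitionCheck, partition,
ownedVertices) are made up for illustration, and the hash-modulo partitioner is only a
stand-in assumption, not the partitioner Hama actually uses:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PartitionCheck {
    // Hypothetical stand-in partitioner: maps a vertex id to a peer index.
    static int partition(String vertexId, int numPeers) {
        return Math.floorMod(vertexId.hashCode(), numPeers);
    }

    // Keeps only the ids the partitioner assigns to peerIndex and
    // returns them as an unmodifiable list, so the size is a fixed max.
    static List<String> ownedVertices(List<String> vertexIds, int peerIndex, int numPeers) {
        List<String> owned = new ArrayList<>();
        for (String id : vertexIds) {
            if (partition(id, numPeers) == peerIndex) {
                owned.add(id);
            }
        }
        return Collections.unmodifiableList(owned);
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("a", "b", "c", "d");
        // Vertices owned by peer 1 out of 3 peers.
        System.out.println(ownedVertices(ids, 1, 3)); // prints [a, d]
    }
}
```

The size of the returned list is the upper bound for that peer, and any later attempt to add
to it fails with an UnsupportedOperationException.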

In the case of the messaging map, it is discarded every superstep, so I don't think it creates
memory leaks, at most lots of garbage. A leak would only happen if the vertex implementation
itself starts to buffer messages on its own, because those references stay reachable and the
GC can't collect them.
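
To illustrate the difference, here is a small sketch under my own assumptions. SuperstepMessages,
runSuperstep and leakyBuffer are hypothetical names, not Hama classes. The per-superstep map
goes out of scope when the method returns, while anything a vertex copies into its own buffer
stays reachable:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SuperstepMessages {
    // Hypothetical buffer a vertex implementation might keep on its own.
    static final List<String> leakyBuffer = new ArrayList<>();

    // Runs one superstep over (vertexId, message) pairs and returns
    // the number of messages delivered.
    static int runSuperstep(List<String[]> incoming, boolean vertexBuffers) {
        // Fresh map per superstep; nothing outside this method refers to it.
        Map<String, List<String>> messages = new HashMap<>();
        for (String[] pair : incoming) {
            messages.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        int delivered = 0;
        for (List<String> msgs : messages.values()) {
            delivered += msgs.size();
            if (vertexBuffers) {
                // These references outlive the superstep, so the GC must keep them.
                leakyBuffer.addAll(msgs);
            }
        }
        return delivered; // the map itself becomes garbage after this return
    }

    public static void main(String[] args) {
        List<String[]> incoming = new ArrayList<>();
        incoming.add(new String[] {"v1", "m1"});
        incoming.add(new String[] {"v1", "m2"});
        for (int step = 0; step < 3; step++) {
            runSuperstep(incoming, true);
        }
        System.out.println(leakyBuffer.size()); // prints 6
    }
}
```

With vertexBuffers set to false the buffer stays empty no matter how many supersteps run, which
is the "garbage but no leak" case.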

bq."each Task loads 256MB vertices and uses 30GB memory"

What does the outlink distribution look like? I don't think this is a scenario of a typical
power-law distribution. Usually the factor _was_ x10, so a 256 MB sequence file creates roughly
2 gigs of memory, which is huge, but still acceptable per task.
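
For comparison, the arithmetic behind those two numbers can be sketched as follows. MemoryFactor
and its methods are hypothetical helpers, and 1 GB = 1024 MB is my assumption:

```java
class MemoryFactor {
    // Ratio of memory used to input size, both in MB.
    static double factor(double inputMb, double memoryMb) {
        return memoryMb / inputMb;
    }

    // Expected memory in MB for a given input size and blow-up factor.
    static double expectedMb(double inputMb, double blowUp) {
        return inputMb * blowUp;
    }

    public static void main(String[] args) {
        // Reported case: 256 MB of vertices using 30 GB of memory.
        System.out.println(factor(256, 30 * 1024)); // prints 120.0
        // Usual case: a x10 factor on the same 256 MB input.
        System.out.println(expectedMb(256, 10)); // prints 2560.0
    }
}
```

So the reported case is a factor of about x120 instead of the usual x10, where x10 gives about
2.5 GB, in line with the rough 2 gigs figure.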
                
> Optimization of memory usage during message processing
> ------------------------------------------------------
>
>                 Key: HAMA-704
>                 URL: https://issues.apache.org/jira/browse/HAMA-704
>             Project: Hama
>          Issue Type: Improvement
>          Components: graph
>            Reporter: Edward J. Yoon
>            Assignee: Edward J. Yoon
>            Priority: Critical
>             Fix For: 0.6.1
>
>         Attachments: HAMA-704.patch-v1, hama-704_v05.patch, localdisk.patch, mytest.patch, patch.txt, patch.txt, removeMsgMap.patch
>
>
> The <vertex, message> map seems to consume a lot of memory. We should figure out an efficient way to reduce memory usage.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
