hama-dev mailing list archives

From "Thomas Jungblut (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-596) Optimize memory usage of graph job
Date Sat, 15 Sep 2012 14:30:07 GMT

    [ https://issues.apache.org/jira/browse/HAMA-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456419#comment-13456419 ]

Thomas Jungblut commented on HAMA-596:
--------------------------------------

The hashmap that holds the vertices consumes the most memory.
If you dig deeper into a vertex, you see that its edges take up most of the space.
That was measured during partitioning.
After partitioning it gets even worse, because the actual messaging starts.
In the end, a 70 MB text file used about 600 MB for the graph, plus 400 MB for messages.
That is about 1 GB in total, roughly 14 times the size of the raw file, which is still way too much.

So how can we cut down the cost of the hashmap and of the edges? The best option would be to
solve it with HAMA-642, but I think that would degrade performance badly.
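
To make the question concrete, here is a minimal sketch of one common way to shrink edge storage: packing all adjacency lists into two primitive int arrays instead of keeping one object per vertex and per edge in a hashmap. This is not Hama's API and not what the attached patch does. The class CompactAdjacency, its methods, and the assumption that vertex IDs are dense ints from 0 to numVertices-1 are all hypothetical, chosen only for illustration.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch (not Hama code): adjacency lists packed into two
 * primitive arrays, avoiding per-vertex and per-edge object overhead.
 * Assumes vertex IDs are dense ints in the range 0..numVertices-1.
 */
final class CompactAdjacency {
    // targets[offsets[v]] .. targets[offsets[v + 1] - 1] are the neighbors of v.
    private final int[] offsets;
    private final int[] targets;

    private CompactAdjacency(int[] offsets, int[] targets) {
        this.offsets = offsets;
        this.targets = targets;
    }

    /** Builds the packed form from {source, target} pairs. */
    static CompactAdjacency fromEdges(int numVertices, int[][] edges) {
        int[] offsets = new int[numVertices + 1];
        // Count the out-degree of each source, shifted by one slot.
        for (int[] e : edges) {
            offsets[e[0] + 1]++;
        }
        // A prefix sum turns the counts into start positions.
        for (int v = 0; v < numVertices; v++) {
            offsets[v + 1] += offsets[v];
        }
        int[] targets = new int[edges.length];
        // next[v] is the next free slot in v's slice of targets.
        int[] next = Arrays.copyOf(offsets, numVertices);
        for (int[] e : edges) {
            targets[next[e[0]]++] = e[1];
        }
        return new CompactAdjacency(offsets, targets);
    }

    int degree(int v) {
        return offsets[v + 1] - offsets[v];
    }

    int[] neighbors(int v) {
        return Arrays.copyOfRange(targets, offsets[v], offsets[v + 1]);
    }

    /** Payload bytes of the two arrays, ignoring array headers. */
    long payloadBytes() {
        return 4L * (offsets.length + targets.length);
    }
}
```

With this layout the payload is 4 bytes per edge plus 4 bytes per vertex. The trade-off is that the structure is immutable once built and supports neither arbitrary vertex ID types nor edge values, and it does nothing about the memory used by messages.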

[1] http://wiki.apache.org/hama/WriteHamaGraphFile#Google_Web_dataset_.28local_mode.2C_pseudo_distributed_cluser.29
                
> Optimize memory usage of graph job
> ----------------------------------
>
>                 Key: HAMA-596
>                 URL: https://issues.apache.org/jira/browse/HAMA-596
>             Project: Hama
>          Issue Type: Improvement
>          Components: graph
>    Affects Versions: 0.5.0
>            Reporter: Edward J. Yoon
>            Assignee: Thomas Jungblut
>             Fix For: 0.6.0
>
>         Attachments: HAMA-596.patch
>
>
> This is somewhat problematic.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
