incubator-giraph-dev mailing list archives

From "Avery Ching (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-12) Investigate communication improvements
Date Thu, 29 Sep 2011 05:46:45 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117011#comment-13117011 ]

Avery Ching commented on GIRAPH-12:
-----------------------------------

If the default stack size is 1 MB and you have, for instance, 1024 workers, you are talking
about 1 GB (1024 threads * 1 MB) wasted on thread stack space per node.  The aggregate wasted
memory would be 1 GB * 1024 = 1 TB, which is a lot of memory =).
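
To make the arithmetic concrete, here is a minimal sketch of the estimate.  The class and
method names are hypothetical (they are not part of Giraph), and it assumes one thread per
peer worker with the full stack size reserved for each thread:

```java
// Hypothetical helper, not part of Giraph: estimates thread stack space
// assuming every worker starts one thread per worker in the job.
final class StackMemoryEstimate {
    private StackMemoryEstimate() {}

    /** Stack space reserved on a single worker, in bytes. */
    static long perWorkerStackBytes(int workers, long stackBytesPerThread) {
        return (long) workers * stackBytesPerThread;
    }

    /** Stack space reserved across all workers in the job, in bytes. */
    static long aggregateStackBytes(int workers, long stackBytesPerThread) {
        return (long) workers * perWorkerStackBytes(workers, stackBytesPerThread);
    }

    public static void main(String[] args) {
        long oneMb = 1L << 20;        // 1 MB stack per thread
        long sixtyFourKb = 64L << 10; // what -Xss64k would give

        // 1024 workers at 1 MB per thread: 1 GB per worker, 1 TB in aggregate.
        System.out.println(perWorkerStackBytes(1024, oneMb));
        System.out.println(aggregateStackBytes(1024, oneMb));

        // 400 workers at 64 KB per thread: 25 MB per worker.
        System.out.println(perWorkerStackBytes(400, sixtyFourKb));
    }
}
```

With the 400 workers and -Xss64k from the issue description, the same estimate gives 25 MB of
stack space per worker, which is much smaller but still grows with the square of the number
of workers in aggregate.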

The issue is that many clusters (including Yahoo!'s) are running only 32-bit JVMs.  So
if you are using 1 GB just for stack space, you only get so much left for heap (graph + messages).
I think this should help quite a bit until GIRAPH-37 is taken on.

Can you run the unit tests against a real Hadoop instance as well?  Then I'd say +1, unless
someone disagrees.
                
> Investigate communication improvements
> --------------------------------------
>
>                 Key: GIRAPH-12
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-12
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Avery Ching
>            Assignee: Hyunsik Choi
>            Priority: Minor
>         Attachments: GIRAPH-12_1.patch, GIRAPH-12_2.patch
>
>
> Currently every worker will start up a thread to communicate with every other worker.
> Hadoop RPC is used for communication.  For instance, if there are 400 workers, each worker
> will create 400 threads.  This ends up using a lot of memory, even with the option
> -Dmapred.child.java.opts="-Xss64k".
> It would be good to investigate using frameworks like Netty or rolling our own to
> improve this situation.  By moving away from Hadoop RPC, we would also make compatibility
> across different Hadoop versions easier.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
