incubator-giraph-dev mailing list archives

From "Dmitriy V. Ryaboy (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-12) Investigate communication improvements
Date Wed, 28 Sep 2011 07:10:45 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116253#comment-13116253 ]

Dmitriy V. Ryaboy commented on GIRAPH-12:
-----------------------------------------

Julien has a nice post describing how to detect low-memory conditions:
https://techblug.wordpress.com/2011/07/21/detecting-low-memory-in-java-part-2/
When that happens, the first thing to do is probably to run the combiners to try to free
some memory.
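
As a rough illustration of low-memory detection, here is a minimal sketch using the standard
java.lang.management usage-threshold notifications. It is not taken from Giraph or from the
linked post; the LowMemoryWatcher class, its method names, and the fraction parameter are
hypothetical, and the callback is where something like a combiner pass might be triggered.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import javax.management.NotificationEmitter;

/** Hypothetical helper: runs a callback when a heap pool crosses a usage fraction. */
final class LowMemoryWatcher {
    private LowMemoryWatcher() {}

    /** Threshold in bytes for a pool with the given maximum size. */
    static long thresholdBytes(long maxBytes, double fraction) {
        if (maxBytes <= 0 || fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("need maxBytes > 0 and 0 < fraction <= 1");
        }
        return (long) (maxBytes * fraction);
    }

    /**
     * Arms every heap pool that supports usage thresholds and registers the callback.
     * Returns the number of pools armed (may be 0 on some JVM configurations).
     */
    static int install(double fraction, Runnable onLowMemory) {
        int armed = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP || !pool.isUsageThresholdSupported()) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null || usage.getMax() <= 0) {
                continue; // pool is invalid or has no defined maximum
            }
            pool.setUsageThreshold(thresholdBytes(usage.getMax(), fraction));
            armed++;
        }
        // The platform MemoryMXBean emits the threshold notifications.
        NotificationEmitter emitter = (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener((notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(notification.getType())) {
                onLowMemory.run(); // e.g. run combiners, then spill or flush if still low
            }
        }, null, null);
        return armed;
    }
}
```

A caller might do LowMemoryWatcher.install(0.8, someCallback) once at worker startup. The
0.8 fraction is only an example value, not a measured recommendation.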

Assuming we still need to do something with the messages, there are two approaches that come
to mind:

1) Spill to disk and keep track of spilled messages. This is going to cost us, but it'll make
it possible to make progress when otherwise an OOM error would occur.

2) Send the messages to their intended recipients instead of spilling to disk. That will be
faster, but runs the risk that the other side is also out of memory and unable to accumulate them.
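
To make option 1 concrete, here is a minimal sketch of a buffer that spills to a temporary
file and keeps track of how many messages it spilled. Everything in it is an assumption made
for illustration: the SpillingMessageBuffer class is hypothetical, messages are plain strings,
and the spill trigger is a simple message count rather than a real low-memory signal.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical buffer: keeps messages in memory and spills them to disk when full. */
final class SpillingMessageBuffer {
    private final int maxInMemory;
    private final List<String> inMemory = new ArrayList<>();
    private final Path spillFile;
    private int spilledCount = 0;

    SpillingMessageBuffer(int maxInMemory) {
        if (maxInMemory < 1) {
            throw new IllegalArgumentException("maxInMemory must be >= 1");
        }
        this.maxInMemory = maxInMemory;
        try {
            this.spillFile = Files.createTempFile("messages", ".spill");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Buffers a message; spills the whole in-memory batch once the limit is reached. */
    void add(String message) {
        inMemory.add(message);
        if (inMemory.size() >= maxInMemory) {
            spill();
        }
    }

    /** Number of messages currently on disk. */
    int spilledCount() {
        return spilledCount;
    }

    private void spill() {
        try (DataOutputStream out = new DataOutputStream(new BufferedOutputStream(
                Files.newOutputStream(spillFile,
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND)))) {
            for (String m : inMemory) {
                out.writeUTF(m); // note: writeUTF is limited to 65535 encoded bytes per string
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        spilledCount += inMemory.size();
        inMemory.clear();
    }

    /** Returns all messages in insertion order (spilled first) and resets the buffer. */
    List<String> drain() {
        List<String> all = new ArrayList<>();
        try {
            if (spilledCount > 0) {
                try (DataInputStream in = new DataInputStream(
                        new BufferedInputStream(Files.newInputStream(spillFile)))) {
                    for (int i = 0; i < spilledCount; i++) {
                        all.add(in.readUTF());
                    }
                }
            }
            Files.deleteIfExists(spillFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        all.addAll(inMemory);
        inMemory.clear();
        spilledCount = 0;
        return all;
    }
}
```

The extra disk write and read on every spilled message is the cost mentioned in option 1.
Option 2 would replace spill() with an immediate send to the recipient.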

Either way, this ticket is more about reworking the communication code than about memory
improvements. We haven't measured how much memory individual threads take up, but I am betting
their impact is dwarfed by the buffered messages and the in-memory graph segment we are working
on, so I would be surprised if simply switching to thread pools substantially reduced memory
use.
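
For a sense of scale: if the -Xss64k setting from the issue description is honored, 400 threads
reserve roughly 400 x 64 KiB = 25 MiB of stack per worker. The sketch below shows that
arithmetic together with what "switching to thread pools" could look like using a fixed-size
ExecutorService. The PooledSender class, the pool size, and the sendToPeer callback are
hypothetical stand-ins, not the code in the attached patches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntConsumer;

/** Hypothetical sketch: a small shared pool instead of one thread per peer. */
final class PooledSender {
    private PooledSender() {}

    /** Stack space reserved by the given number of threads, in bytes. */
    static long stackBytes(int threads, int stackKiB) {
        return (long) threads * stackKiB * 1024L;
    }

    /**
     * Runs sendToPeer once per peer id (0..peers-1) on a pool of poolSize threads.
     * Returns the number of sends that completed without throwing.
     */
    static int sendToAll(int peers, int poolSize, IntConsumer sendToPeer) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int peer = 0; peer < peers; peer++) {
                final int id = peer;
                pending.add(pool.submit(() -> sendToPeer.accept(id)));
            }
            int done = 0;
            for (Future<?> f : pending) {
                try {
                    f.get();
                    done++;
                } catch (ExecutionException e) {
                    // this send failed; it is simply not counted in this sketch
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            return done;
        } finally {
            pool.shutdown();
        }
    }
}
```

With 400 peers and a pool of 8 threads, stackBytes(8, 64) is 512 KiB instead of 25 MiB. Both
numbers are small next to a heap full of buffered messages, which is the point made above.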

Let's open a separate JIRA to deal with message accumulation better, and consider this code
on merits other than memory footprint.
                
> Investigate communication improvements
> --------------------------------------
>
>                 Key: GIRAPH-12
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-12
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Avery Ching
>            Assignee: Hyunsik Choi
>            Priority: Minor
>         Attachments: GIRAPH-12_1.patch, GIRAPH-12_2.patch
>
>
> Currently every worker starts a thread to communicate with every other worker. Hadoop RPC
> is used for communication. For instance, if there are 400 workers, each worker will create
> 400 threads. This ends up using a lot of memory, even with the option
> -Dmapred.child.java.opts="-Xss64k".
> It would be good to investigate using a framework like Netty, or rolling our own, to improve
> this situation. Moving away from Hadoop RPC would also make compatibility across different
> Hadoop versions easier.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
