incubator-giraph-dev mailing list archives

From "Hyunsik Choi (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-12) Investigate communication improvements
Date Thu, 29 Sep 2011 02:16:45 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13116961#comment-13116961 ]

Hyunsik Choi commented on GIRAPH-12:
------------------------------------

Avery,

Thank you for your review. You are right: Runtime's totalMemory() and freeMemory() methods don't
measure thread stack memory. I confirmed this by testing the code below.

https://gist.github.com/1249761
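
The linked gist is not reproduced here. As a rough illustration of this kind of check, the sketch below starts a number of idle threads and prints the heap usage reported by Runtime before and after. The class name HeapProbe and its helper methods are hypothetical, not taken from the gist or from Giraph.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical probe for illustration; not the code from the linked gist.
class HeapProbe {
    // Heap in use as seen by the JVM: totalMemory() - freeMemory().
    // Both calls report on the Java heap only.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Starts `count` daemon threads that block on a latch until released.
    static List<Thread> startIdleThreads(int count, CountDownLatch release) {
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Thread t = new Thread(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.setDaemon(true);
            t.start();
            threads.add(t);
        }
        return threads;
    }

    public static void main(String[] args) throws InterruptedException {
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        long before = usedHeapBytes();
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> threads = startIdleThreads(count, release);
        long after = usedHeapBytes();
        System.out.printf("threads=%d heapBefore=%dKB heapAfter=%dKB%n",
                count, before / 1024, after / 1024);
        release.countDown();
        for (Thread t : threads) {
            t.join();
        }
    }
}
```

Running it with different thread counts and '-Xss' values, while watching the process in 'top', gives a side-by-side view of what Runtime reports and what the operating system reports.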

I looked for a way to measure the stack size of a Java application but could not find one.
So I'm still not sure how to show that the thread pool approach reduces thread stack memory.
For now, your way seems to be the only method to prove it.
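
For readers unfamiliar with the idea, here is a minimal sketch of a thread pool approach in general: work for many peers is handed to a fixed number of threads instead of one thread per peer. This is not the GIRAPH-12 patch. The class PooledSender, the method sendToPeers, and the simulated "send" are all hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not the code from the GIRAPH-12 patches.
class PooledSender {
    // Runs one simulated send per peer on at most `poolSize` threads and
    // returns the names of the threads that actually did the work.
    static Set<String> sendToPeers(int peers, int poolSize) {
        Set<String> threadsUsed = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int peer = 0; peer < peers; peer++) {
            // A real sender would write to the peer here; this only records the thread.
            pool.execute(() -> threadsUsed.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return threadsUsed;
    }

    public static void main(String[] args) {
        Set<String> used = sendToPeers(400, 8);
        System.out.println("peers=400 threadsUsed=" + used.size());
    }
}
```

With 400 peers and a pool of 8, no more than 8 thread stacks exist for sending, whatever each stack costs.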

However, I was curious how much memory overhead threads actually have, so before trying your
approach I ran some simple experiments.

I used the above source code to investigate the memory usage of threads. It was run on a
machine with an Intel i3, Ubuntu 11.10 (64-bit), and 8 GB of memory. I measured memory with
'top', which shows several columns including VIRT, RES, and SHR. We only need to focus on
RES, the resident memory, which covers all resident usage, including heap and stack (see
http://goo.gl/JE7fD).

First, I ran the above code with 1000 threads and without the JVM option '-Xss'. According
to this page (http://goo.gl/sz2qM), the default stack size ('-Xss') is 1024k on a 64-bit
Linux JVM. After all threads were created, I ran 'top' to print the memory usage:

1k threads with default thread stack size.
{noformat}
                      VIRT   RES SHR
9163 hyunsik   20   0 3366m  30m 8296 S   18  0.4   0:01.52 java
{noformat}

2k threads with default thread stack size.
{noformat}
                       VIRT   RES SHR
11223 hyunsik   20   0 4434m  46m 8340 S   40  0.6   0:04.11 java
{noformat}

With 1k and 2k threads, the program consumes only 30 MB and 46 MB of resident memory, respectively.
The memory usage of the threads is smaller than I expected. I wonder whether thread stack size is
really the main cause of the memory problem that we have faced.

Moreover, although the default stack size is 1024k, the thread stack size does not seem to affect RES.
I ran more tests with '-Xss' to investigate the effect of the thread stack size further.

1k threads with '-Xss4096k'.
{noformat}
28301 hyunsik   20   0 6380m  30m 8292 S   17  0.4   0:05.25 java
{noformat}

2k threads with '-Xss4096k'.
{noformat}
29326 hyunsik   20   0 10.1g  46m 8300 S   38  0.6   0:03.42 java
{noformat}

VIRT is clearly affected by '-Xss', but RES is not. So '-Xss' seems to set only the maximum stack
size of each thread, not the memory each thread actually uses.
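
The per-thread cost implied by the 'top' readings above can be worked out by dividing the growth between the 1k and 2k runs by the 1000 extra threads. The sketch below does that arithmetic with the reported values. The class name TopDeltas is hypothetical, and the 10.1g reading is converted assuming 1g = 1024m, so that figure is only as precise as top's rounding.

```java
// Hypothetical helper for illustration; the inputs are the 'top' readings quoted above.
class TopDeltas {
    // Per-thread growth in MB between two readings that differ by `extraThreads` threads.
    static double perThreadMb(double mbAtFewer, double mbAtMore, int extraThreads) {
        return (mbAtMore - mbAtFewer) / extraThreads;
    }

    public static void main(String[] args) {
        int extra = 1000; // 2k threads minus 1k threads

        double virtDefault = perThreadMb(3366, 4434, extra);       // default stack size
        double virtXss4096k = perThreadMb(6380, 10.1 * 1024, extra); // assumes 1g = 1024m
        double resDefault = perThreadMb(30, 46, extra);
        double resXss4096k = perThreadMb(30, 46, extra);

        System.out.printf("VIRT per thread: default=%.3f MB, -Xss4096k=%.3f MB%n",
                virtDefault, virtXss4096k);
        System.out.printf("RES per thread:  default=%.3f MB, -Xss4096k=%.3f MB%n",
                resDefault, resXss4096k);
    }
}
```

VIRT grows by roughly 1.07 MB per thread with the default setting and roughly 3.96 MB per thread with '-Xss4096k', close to the configured stack sizes. RES grows by about 0.016 MB (16 KB) per thread in both cases.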

What do you think about that?
                
> Investigate communication improvements
> --------------------------------------
>
>                 Key: GIRAPH-12
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-12
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Avery Ching
>            Assignee: Hyunsik Choi
>            Priority: Minor
>         Attachments: GIRAPH-12_1.patch, GIRAPH-12_2.patch
>
>
> Currently every worker will start up a thread to communicate with every other worker.
> Hadoop RPC is used for communication.  For instance, if there are 400 workers, each worker
> will create 400 threads.  This ends up using a lot of memory, even with the option
> -Dmapred.child.java.opts="-Xss64k".
> It would be good to investigate using frameworks like Netty or rolling our own to
> improve this situation.  By moving away from Hadoop RPC, we would also make compatibility
> across different Hadoop versions easier.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
