incubator-giraph-dev mailing list archives

From "Avery Ching (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (GIRAPH-12) Investigate communication improvements
Date Wed, 21 Sep 2011 17:57:16 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-12?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109715#comment-13109715 ]

Avery Ching edited comment on GIRAPH-12 at 9/21/11 5:56 PM:
------------------------------------------------------------

Nice results!  I'm certainly glad that performance seems comparable in this case.  A couple
of questions:

1)  In the patched version, did you stick to the 7 default cores?  Since you ran with 6 workers,
isn't one of the cores doing nothing?  Shouldn't the core count be limited by the number of
workers, even if the user specifies more?  Both for the core default and core max parameters?

2)  Is checkpointing turned off?  It appears not since superstep 2 is pretty long in comparison
to supersteps 0 and 1.  Probably would be best to also run tests without checkpointing to
isolate the communication performance.

3)  Any thoughts on how to show that the memory usage has actually gone down?  It should,
but we should verify that somehow.
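
To make question 1 concrete, here is a minimal sketch of capping both the core and max
thread counts at the number of workers.  The class, method, and parameter names are
hypothetical and are not taken from the patch or from Giraph's configuration; the 60 second
keep-alive and the unbounded queue are also just assumptions for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Giraph: caps pool sizes at the worker count.
class ClampedPool {
    /** Limit a requested thread count to the number of workers (never below 1). */
    static int clamp(int requested, int numWorkers) {
        return Math.max(1, Math.min(requested, numWorkers));
    }

    /** Build a pool whose core and max sizes never exceed the number of workers. */
    static ThreadPoolExecutor create(int coreThreads, int maxThreads, int numWorkers) {
        int core = clamp(coreThreads, numWorkers);
        // ThreadPoolExecutor rejects max < core, so keep max at least as large as core.
        int max = Math.max(core, clamp(maxThreads, numWorkers));
        return new ThreadPoolExecutor(core, max, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    public static void main(String[] args) {
        // 7 requested core threads, 16 requested max threads, 6 workers.
        ThreadPoolExecutor pool = create(7, 16, 6);
        System.out.println(pool.getCorePoolSize() + " " + pool.getMaximumPoolSize());
        pool.shutdown();
    }
}
```

With 6 workers, a request for 7 core threads would be reduced to 6, so no thread sits idle
by construction.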

In a few days, I can hopefully help to run some tests at a large scale at Yahoo! using your
changes as well.

      was (Author: aching):
    Nice results!  I'm certainly glad that performance seems comparable even though we're
not using the same number of threads.  A couple of questions:

1)  In the patched version, did you stick to the 7 default cores?  Since you ran with 6 workers,
isn't one of the cores doing nothing?  Shouldn't the core count be limited by the number of
workers, even if the user specifies more?  Both for the core default and core max parameters?

2)  Is checkpointing turned off?  It appears not since superstep 2 is pretty long in comparison
to supersteps 0 and 1.  Probably would be best to also run tests without checkpointing to
isolate the communication performance.

3)  Any thoughts on how to show that the memory usage has actually gone down?  It should,
but we should verify that somehow.

In a few days, I can hopefully help to run some tests at a large scale at Yahoo! using your
changes as well.
  
> Investigate communication improvements
> --------------------------------------
>
>                 Key: GIRAPH-12
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-12
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Avery Ching
>            Assignee: Hyunsik Choi
>            Priority: Minor
>         Attachments: GIRAPH-12_1.patch, GIRAPH-12_2.patch
>
>
> Currently every worker starts a thread to communicate with every other worker.
> Hadoop RPC is used for communication.  For instance, if there are 400 workers, each worker
> will create 400 threads.  This ends up using a lot of memory, even with the option
> -Dmapred.child.java.opts="-Xss64k".
> It would be good to investigate using a framework like Netty, or rolling our own, to
> improve this situation.  Moving away from Hadoop RPC would also make compatibility
> across different Hadoop versions easier.
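
As a rough illustration of the numbers in the description above, the sketch below multiplies
the per-worker thread count by the configured stack size.  It is back-of-the-envelope
arithmetic only: the class and method names are hypothetical, it counts reserved stack space
rather than measured usage, and it ignores heap buffers and other per-connection overhead.

```java
// Hypothetical estimate, not a measurement: stack space reserved for peer threads.
class ThreadStackEstimate {
    /** Stack bytes reserved on one worker that runs one thread per peer worker. */
    static long stackBytes(int peerThreads, long stackSizeBytes) {
        return (long) peerThreads * stackSizeBytes;
    }

    public static void main(String[] args) {
        int workers = 400;            // example from the issue description
        long xss = 64L * 1024;        // -Xss64k
        long perWorker = stackBytes(workers, xss);
        System.out.println(perWorker / (1024 * 1024) + " MB per worker");
    }
}
```

For 400 threads at 64 KB each this comes to 25 MB of reserved stack per worker, and the
figure grows linearly with the number of workers.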

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
