incubator-giraph-dev mailing list archives

From "Hyunsik Choi (JIRA)" <>
Subject [jira] [Commented] (GIRAPH-12) Investigate communication improvements
Date Mon, 26 Sep 2011 15:02:28 GMT


Hyunsik Choi commented on GIRAPH-12:

I have been thinking about question 3, that is, how we can measure memory usage while Giraph
is running.

Probably the most basic way is to use Hadoop metrics.
However, this approach requires changing the _hadoop-metrics.properties_ file, so it may not
be allowed on most large clusters, e.g., the Yahoo! cluster that Avery can access.

If that is not possible, we can implement a thread class that mimics Hadoop metrics: it
would periodically measure the memory usage of the JVM and send it to a specific remote
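
A minimal sketch of such a sampling thread, under stated assumptions: the class name MemoryMonitor, the sampling period, and the sink callback are hypothetical and are not part of Giraph or of the attached patches. Because the destination of the measurements is not specified above, the sketch only hands each sample to a caller-supplied callback; an actual remote sender would have to replace it. The heap figure is read through the standard java.lang.management API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.function.LongConsumer;

// Hypothetical helper: periodically samples JVM heap usage in a daemon thread.
class MemoryMonitor extends Thread {
    private final long periodMillis;   // assumed sampling interval
    private final LongConsumer sink;   // assumed destination for each sample
    private final MemoryMXBean memoryBean = ManagementFactory.getMemoryMXBean();

    MemoryMonitor(long periodMillis, LongConsumer sink) {
        super("memory-monitor");
        this.periodMillis = periodMillis;
        this.sink = sink;
        setDaemon(true); // do not keep the JVM alive just for monitoring
    }

    // Returns the number of heap bytes currently in use by this JVM.
    long sampleHeapUsed() {
        return memoryBean.getHeapMemoryUsage().getUsed();
    }

    @Override
    public void run() {
        try {
            while (!isInterrupted()) {
                sink.accept(sampleHeapUsed());
                Thread.sleep(periodMillis);
            }
        } catch (InterruptedException e) {
            // Interrupted while sleeping: stop sampling and let the thread exit.
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Print a few samples, then stop the monitor.
        MemoryMonitor monitor = new MemoryMonitor(
            100, used -> System.out.println("heap used (bytes): " + used));
        monitor.start();
        Thread.sleep(350);
        monitor.interrupt();
        monitor.join();
    }
}
```

Keeping the destination behind a callback means the sampling loop does not depend on any particular transport, so the same thread could log locally during testing.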

What do you think about that?

> Investigate communication improvements
> --------------------------------------
>                 Key: GIRAPH-12
>                 URL:
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>            Reporter: Avery Ching
>            Assignee: Hyunsik Choi
>            Priority: Minor
>         Attachments: GIRAPH-12_1.patch, GIRAPH-12_2.patch
> Currently every worker will start up a thread to communicate with every other worker.
> Hadoop RPC is used for communication.  For instance, if there are 400 workers, each worker
> will create 400 threads.  This ends up using a lot of memory, even with the option
> It would be good to investigate using frameworks like Netty or rolling our own to
> improve this situation.  By moving away from Hadoop RPC, we would also make compatibility
> across different Hadoop versions easier.

This message is automatically generated by JIRA.
