hama-dev mailing list archives

From Thomas Jungblut <thomas.jungb...@googlemail.com>
Subject Re: Performance on large cluster.
Date Tue, 25 Oct 2011 13:13:52 GMT
Hey Edward,

From what you describe, you are talking about some kind of speculative task
execution. This is also possible in YARN, if there are free resources.

Overall, the running time of a task is highly dependent on how much work it
has to do. So splitting the data evenly should give every task a similar
running time, and the barrier then has less of a straggler to wait for.
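
To make the barrier argument concrete, here is a rough Python sketch (not
Hama code). It assumes a task's running time is simply proportional to its
number of input records; superstep_time and split_evenly are hypothetical
helper names used only for this illustration.

```python
def superstep_time(work_per_task, cost_per_record=1.0):
    # The barrier releases only when the slowest task arrives, so the
    # superstep lasts as long as the most heavily loaded task.
    return max(work_per_task) * cost_per_record


def split_evenly(total_records, num_tasks):
    # Give each task the same share; spread any remainder one record at a time.
    base, extra = divmod(total_records, num_tasks)
    return [base + (1 if i < extra else 0) for i in range(num_tasks)]


skewed = [700, 100, 100, 100]                  # one overloaded task
even = split_evenly(sum(skewed), len(skewed))  # same total, balanced

print(superstep_time(skewed))  # 700.0
print(superstep_time(even))    # 250.0
```

Under that assumption, the same 1000 records finish the superstep almost
three times faster once no single task holds most of the data.
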
We can also reduce the time we spend in synchronization by using
asynchronous messaging.
In this case we send messages during the computation phase and, in the sync
phase, transfer only the remainder that has not yet been sent.
This should result in a performance boost, especially in messaging-heavy
jobs like SSSP.
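
As a rough illustration of the idea, here is a minimal Python sketch. It is
not the Hama messaging API: AsyncOutbox, send, sync and close are
hypothetical names, and the "transfer" is just a callable standing in for a
network send. A background thread ships messages while the task keeps
computing, so sync() only has to wait for whatever is still queued.

```python
import queue
import threading

_STOP = object()  # sentinel telling the sender thread to exit


class AsyncOutbox:
    """Hypothetical outbox that ships messages while the task computes."""

    def __init__(self, transfer):
        self._transfer = transfer  # callable that delivers one message
        self._pending = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def _drain(self):
        # Sender loop: deliver queued messages in FIFO order.
        while True:
            msg = self._pending.get()
            try:
                if msg is not _STOP:
                    self._transfer(msg)
            finally:
                self._pending.task_done()
            if msg is _STOP:
                return

    def send(self, msg):
        # Returns immediately; the transfer overlaps with computation.
        self._pending.put(msg)

    def sync(self):
        # Sync phase: block only until the not-yet-transferred rest is out.
        self._pending.join()

    def close(self):
        self._pending.put(_STOP)
        self._worker.join()


delivered = []
outbox = AsyncOutbox(delivered.append)
for vertex in range(5):
    outbox.send(("distance-update", vertex))  # sent during computation
outbox.sync()   # only the leftover backlog is flushed here
outbox.close()
print(len(delivered))  # 5
```

A single sender thread keeps message order intact; a real implementation
would also need batching and error handling, which are left out here.
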

2011/10/25 Edward J. Yoon <edwardyoon@apache.org>

> Hi,
> I heard that task scheduling will be the most important factor for
> high performance on a large cluster, since the barrier waits for the
> slowest task. What do you think about this?
> P.S. If a user runs on a YARN cluster, BSP task scheduling will be done by
> its resource management system.
> --
> Best Regards, Edward J. Yoon
> @eddieyoon

Thomas Jungblut
Berlin <thomas.jungblut@gmail.com>
