hama-user mailing list archives

From Thomas Jungblut <thomas.jungb...@googlemail.com>
Subject Re: Queries on Hama Usage
Date Fri, 23 Mar 2012 11:01:51 GMT

1: Each task has its own RPC server, so you send directly to a task rather
than to a groom.
2: BSPMessageBundle is a bundle of messages that are batched per
destination to improve transfer speed. Combiners serve the same purpose,
so a combiner returns a "message batch" when combining.
3: Hadoop is input-driven. That comes from functional programming, where
you have an input list and apply functions like map and reduce to it.
BSP is not strongly tied to functional programming, and we had no input
before. For several tasks, no input is a valid input, e.g. realtime
processing. However, you still want to control the parallelization factor
by controlling how many tasks are launched.
So it is a mixture of backward compatibility and the freedom to launch a
few tasks in a cluster without input.
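
To make point 2 more concrete, here is a minimal sketch in plain Java of
batching messages per destination and then combining each batch. It does
not use any Hama classes: BundleSketch, Message, bundlePerDestination and
sumCombine are hypothetical names made up for this illustration, and the
"host:port" peer names are placeholders. The summing combiner is only one
possible choice.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; none of these types are Hama classes.
class BundleSketch {

    // Hypothetical message: a destination peer name plus an integer payload.
    static class Message {
        final String destination;
        final int value;

        Message(String destination, int value) {
            this.destination = destination;
            this.value = value;
        }
    }

    // Group outgoing messages by destination, so that each destination
    // receives one bundle instead of many single-message transfers.
    static Map<String, List<Integer>> bundlePerDestination(List<Message> outgoing) {
        Map<String, List<Integer>> bundles = new LinkedHashMap<>();
        for (Message m : outgoing) {
            bundles.computeIfAbsent(m.destination, k -> new ArrayList<>()).add(m.value);
        }
        return bundles;
    }

    // A summing combiner: it takes a bundle and returns a (smaller) bundle,
    // here holding a single message with the sum of all payloads.
    static List<Integer> sumCombine(List<Integer> bundle) {
        int sum = 0;
        for (int v : bundle) {
            sum += v;
        }
        List<Integer> combined = new ArrayList<>();
        combined.add(sum);
        return combined;
    }

    public static void main(String[] args) {
        List<Message> outgoing = new ArrayList<>();
        outgoing.add(new Message("host1:61001", 1));
        outgoing.add(new Message("host2:61002", 10));
        outgoing.add(new Message("host1:61001", 2));
        outgoing.add(new Message("host1:61001", 3));

        Map<String, List<Integer>> bundles = bundlePerDestination(outgoing);
        for (Map.Entry<String, List<Integer>> e : bundles.entrySet()) {
            System.out.println(e.getKey() + " bundle=" + e.getValue()
                    + " combined=" + sumCombine(e.getValue()));
        }
    }
}
```

Both steps reduce transfer cost: bundling cuts the number of transfers per
destination, and combining shrinks what is inside each bundle.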

Regarding your other mail: if you want to contribute parts of a MapReduce
version, feel free to code one. I have not scheduled it for any release,
since this is just a "side-effect" example.

Hope I clarified it :)

On 23 March 2012 at 11:47, Praveen Sripati <praveensripati@gmail.com> wrote:

> Hi,
> 1. 0.4.0 introduced multiple tasks on groom servers. How does the framework
> send a message to a particular task on a groom server? If I am not wrong,
> BSPPeer.send() sends messages to all the tasks on a groom server, which is
> an overhead.
> 2. What is the difference between message combiners (0.4.0) and
> BSPMessageBundle (0.3.0)?
> 3. What is the significance of BSPJob.setNumBspTask()? I thought that in
> Hama the input will be split and a task will be spawned for each split in
> the groom server similar to Hadoop?
> Regards,
> Praveen

Thomas Jungblut
Berlin <thomas.jungblut@gmail.com>
