hama-user mailing list archives

From Ηλίας Καπουράνης <ikapo...@csd.auth.gr>
Subject Re: Selective Aggregator execution
Date Mon, 30 Sep 2013 13:04:30 GMT
If there is some sort of distributed cache we can have the list there.

On 30/9/2013 1:13 PM, Anastasis Andronidis wrote:
> I quote from the JIRA issue:
>> Ilias Kapouranis added a comment
>> I don't think it would be much of an issue.
>> 	• We have the List where we keep all the aggregators.
>> 	• When executeAggregator(int aggrIndex) is called, we move the aggrIndex to a new List (say tempList) which keeps a pair (aggrIndex,aggrClass).
>> 	• At the end of the superstep, if tempList is empty then all the aggregators will be executed, else only those which are in it.
>> 	• When all aggregators have finished, we move the pairs from tempList to the main List and we put the aggregators to their previous indexes.
>> Hope this helps.
> I totally agree that this is the case at a higher level. The problem is that the implementation is not that simple.
> Every node (a machine, let's say) that is running in the distributed environment has a BSP peer that runs as a local instance. In every BSP peer, vertices execute their code. This means that when you ask for an aggregator not to run in a specific vertex, this invocation happens on only one node. You need to sync all the other nodes so that they do not run the same aggregator, and in the end also skip the master aggregator. This is a little bit tricky, because it depends heavily on the implementation of the software you use (in this case Hama).
> Of course, if your code is exactly the same in every vertex, every peer will locally invoke the skipping of its aggregators and no sync is needed. But as that is not always the case, we need to plan for the first scenario as well.
> If you have any questions, or something is not clear, please reply.
> Cheers,
> Anastasis
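
The scheme quoted above can be sketched in Java roughly as follows. This is a minimal single-process sketch under stated assumptions, not Hama's API: the class SelectiveAggregators and the methods register, mergeRemoteRequests and endOfSuperstep are hypothetical names, and aggregators are modeled as plain Runnable. Instead of moving (aggrIndex,aggrClass) pairs between lists, it leaves the aggregators in place and tracks the requested indexes in a set, which gives the same "empty means run all" behaviour. mergeRemoteRequests only stands in for the sync step between peers; how that sync would actually be carried out (messages, a distributed cache, or something else) is left open.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper for illustration only; not part of Hama. */
class SelectiveAggregators {
    // Main list: every registered aggregator, addressed by its index.
    private final List<Runnable> aggregators = new ArrayList<>();
    // Indexes requested in the current superstep (plays the role of tempList).
    private final Set<Integer> requested = new TreeSet<>();

    /** Registers an aggregator and returns its index in the main list. */
    int register(Runnable aggregator) {
        aggregators.add(aggregator);
        return aggregators.size() - 1;
    }

    /** Called from vertex code: ask for this aggregator to run this superstep. */
    void executeAggregator(int aggrIndex) {
        checkIndex(aggrIndex);
        requested.add(aggrIndex);
    }

    /** Assumed sync step: fold in the indexes requested on other peers. */
    void mergeRemoteRequests(Collection<Integer> remoteIndexes) {
        for (int index : remoteIndexes) {
            checkIndex(index);
            requested.add(index);
        }
    }

    /**
     * Runs all aggregators if nothing was requested, otherwise only the
     * requested ones, then resets the request set for the next superstep.
     * Returns the indexes that were executed, in ascending order.
     */
    List<Integer> endOfSuperstep() {
        List<Integer> toRun = new ArrayList<>();
        if (requested.isEmpty()) {
            for (int i = 0; i < aggregators.size(); i++) {
                toRun.add(i);
            }
        } else {
            toRun.addAll(requested);
        }
        for (int index : toRun) {
            aggregators.get(index).run();
        }
        requested.clear();
        return toRun;
    }

    private void checkIndex(int aggrIndex) {
        if (aggrIndex < 0 || aggrIndex >= aggregators.size()) {
            throw new IndexOutOfBoundsException("No aggregator at index " + aggrIndex);
        }
    }
}
```

In this sketch every peer would need to see the same merged request set before calling endOfSuperstep, otherwise peers would disagree on which aggregators ran; that is exactly the sync problem described in the quoted message.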
