incubator-hama-dev mailing list archives

From "ChiaHung Lin (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-503) Chainable computations for fault tolerance
Date Sun, 05 Feb 2012 09:33:53 GMT

    [ https://issues.apache.org/jira/browse/HAMA-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13200701#comment-13200701 ]

ChiaHung Lin commented on HAMA-503:
-----------------------------------

For the first method, we can have a class such as Configurator so that the units of execution
(supersteps) can be composed through, for instance, configurator.add(superstep1).add(superstep2)...

To reuse a unit inside, e.g., a for loop, each superstep can be viewed as a command, and a `For'
class, which implements the same command interface, can be used to collect the units to be
computed repeatedly.

    // "for" is a reserved word in Java, so the variable is named "loop"
    For loop = new For(condition);
    loop.add(superstepN).add(superstepN1)...;
    configurator.add(superstep1).add(superstep2).add(loop)...;
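
To make the composition idea concrete, here is a minimal sketch of how such classes could fit
together. Everything in it is an assumption made for illustration: the Command interface, the
trace list, the BooleanSupplier condition and the run() method are hypothetical and are not
part of the existing Hama API. Only the names Configurator, For and the add() chaining come
from the proposal above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of the proposal; not the actual Hama API.
class ChainSketch {

    /** One unit of execution, viewed as a command. */
    interface Command {
        void execute();
    }

    /** A single superstep; here it only records its name so the order is visible. */
    static class Superstep implements Command {
        private final String name;
        private final List<String> trace;

        Superstep(String name, List<String> trace) {
            this.name = name;
            this.trace = trace;
        }

        @Override
        public void execute() {
            trace.add(name);
        }
    }

    /** Loop command: repeats the collected units while the condition holds. */
    static class For implements Command {
        private final BooleanSupplier condition;
        private final List<Command> body = new ArrayList<>();

        For(BooleanSupplier condition) {
            this.condition = condition;
        }

        For add(Command unit) {
            body.add(unit);
            return this; // returning this allows add(...).add(...)
        }

        @Override
        public void execute() {
            while (condition.getAsBoolean()) {
                for (Command unit : body) {
                    unit.execute();
                }
            }
        }
    }

    /** Collects the units in the order in which they should run. */
    static class Configurator {
        private final List<Command> steps = new ArrayList<>();

        Configurator add(Command unit) {
            steps.add(unit);
            return this;
        }

        void run() {
            for (Command unit : steps) {
                unit.execute();
            }
        }
    }

    public static void main(String[] args) {
        List<String> trace = new ArrayList<>();
        int[] remaining = {2}; // the loop body is executed twice

        For loop = new For(() -> remaining[0]-- > 0);
        loop.add(new Superstep("superstepN", trace))
            .add(new Superstep("superstepN1", trace));

        new Configurator()
            .add(new Superstep("superstep1", trace))
            .add(new Superstep("superstep2", trace))
            .add(loop)
            .run();

        // prints [superstep1, superstep2, superstepN, superstepN1, superstepN, superstepN1]
        System.out.println(trace);
    }
}
```

Because For implements the same interface as a superstep, the Configurator does not need to
know whether it is holding a single unit or a loop of units.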

The reason for this rework is that the framework needs a way to recover to a working state.
The original bsp() is more natural because users just write code, but it makes the recovery
process harder. For example, the framework would need to understand/parse the code within
bsp() and add checkpoints at the appropriate places. Also, if something goes wrong, users may
find it difficult to figure out where the problem stems from, because errors may occur in the
instrumentation code. However, if there are other, better mechanisms, we can use those instead.
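
As a rough illustration of why explicit units help recovery, the sketch below resumes a chain
from the last completed step after a simulated failure. It is only a sketch under assumptions:
the Superstep interface, the Checkpoint class (which stands in for durable storage) and the
run() method are hypothetical names, and a real framework would also persist and restore the
exchanged messages, which is omitted here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and not the Hama API.
class RecoverySketch {

    interface Superstep {
        void compute();
    }

    /** Stands in for durable storage of the index of the next step to run. */
    static class Checkpoint {
        int nextStep = 0;
    }

    /** Runs the chain from the checkpointed position and records progress after each step. */
    static void run(List<Superstep> chain, Checkpoint checkpoint) {
        for (int i = checkpoint.nextStep; i < chain.size(); i++) {
            chain.get(i).compute();
            // A real framework would call sync() and persist messages at this boundary.
            checkpoint.nextStep = i + 1;
        }
    }

    public static void main(String[] args) {
        List<String> trace = new ArrayList<>();
        boolean[] failOnce = {true};

        List<Superstep> chain = new ArrayList<>();
        chain.add(() -> trace.add("step0"));
        chain.add(() -> {
            if (failOnce[0]) {
                failOnce[0] = false;
                throw new IllegalStateException("simulated failure");
            }
            trace.add("step1");
        });
        chain.add(() -> trace.add("step2"));

        Checkpoint checkpoint = new Checkpoint();
        try {
            run(chain, checkpoint);
        } catch (IllegalStateException e) {
            // Recovery: resume at step1; step0 is not executed again.
            run(chain, checkpoint);
        }

        // prints [step0, step1, step2]
        System.out.println(trace);
    }
}
```

The step boundary gives the framework an obvious place to checkpoint, so no parsing or
instrumentation of user code is needed to decide where recovery should restart.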


                
> Chainable computations for fault tolerance
> ------------------------------------------
>
>                 Key: HAMA-503
>                 URL: https://issues.apache.org/jira/browse/HAMA-503
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp
>    Affects Versions: 0.4.0
>            Reporter: Thomas Jungblut
>             Fix For: 0.5.0
>
>
> Refactor bsp() to allow checkpointed messages to be recovered.
> ChiaHung Lin had a fancy idea of chaining superstep classes to make the whole recovery
> more convenient and less error-prone, or at least possible.
> A user does not define a BSP anymore; instead he defines a single superstep inside of
> a computation class. A user is able to chain these in a specific order. After each of these
> computations the framework calls sync() and exchanges the messages.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
