incubator-hama-dev mailing list archives

From "Thomas Jungblut (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-503) Chainable computations for fault tolerance
Date Sun, 04 Mar 2012 12:43:58 GMT

    [ https://issues.apache.org/jira/browse/HAMA-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221871#comment-13221871 ]

Thomas Jungblut commented on HAMA-503:
--------------------------------------

bq.That's a minor issue, not a big problem. Just thought that may avoid confusing users with
the same function name.

If we find a better name, we can rename it. But since BSP is composed of computations and
syncs, I guess this is a valid name.

Glad you mentioned kmeans; I actually wanted to write it that way.
The assignment step is its own superstep, followed by the updateCenters superstep.
The assignment step can never "escape" the while loop. Instead, the updateCenters
step overrides the "haltComputation" method, which can look like this:

{noformat}
@Override
protected boolean haltComputation(
    BSPPeer<NullWritable, NullWritable, Text, DoubleWritable> peer) {
  return converged == 0 || iterations > maxIterations;
}
{noformat}
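
To make the loop concrete, here is a minimal, self-contained sketch of the two chained supersteps. It is not the API from the attached patch: Superstep, KMeansState, AssignmentStep, UpdateCentersStep and SuperstepChain are hypothetical names, the points are 1-D, everything runs in one JVM without real peers, and "converged" is assumed to count the centers that moved in the last update.

```java
import java.util.List;

/** Hypothetical shared k-means state (1-D points to keep the sketch short). */
final class KMeansState {
  final double[] points;
  final double[] centers;
  final int[] assignment;
  final int maxIterations;
  int converged = -1; // assumption: number of centers that moved in the last update
  int iterations = 0;

  KMeansState(double[] points, double[] initialCenters, int maxIterations) {
    this.points = points;
    this.centers = initialCenters.clone();
    this.assignment = new int[points.length];
    this.maxIterations = maxIterations;
  }
}

/** Hypothetical stand-in for one chainable superstep; not the HAMA-503 class. */
abstract class Superstep {
  abstract void compute(KMeansState state);

  /** By default a superstep never ends the loop. */
  boolean haltComputation(KMeansState state) {
    return false;
  }
}

/** Assigns every point to its nearest center; never halts the loop. */
final class AssignmentStep extends Superstep {
  @Override
  void compute(KMeansState s) {
    for (int p = 0; p < s.points.length; p++) {
      int best = 0;
      for (int c = 1; c < s.centers.length; c++) {
        if (Math.abs(s.points[p] - s.centers[c]) < Math.abs(s.points[p] - s.centers[best])) {
          best = c;
        }
      }
      s.assignment[p] = best;
    }
  }
}

/** Recomputes the centers and decides whether the loop should stop. */
final class UpdateCentersStep extends Superstep {
  @Override
  void compute(KMeansState s) {
    double[] sums = new double[s.centers.length];
    int[] counts = new int[s.centers.length];
    for (int p = 0; p < s.points.length; p++) {
      sums[s.assignment[p]] += s.points[p];
      counts[s.assignment[p]]++;
    }
    int moved = 0;
    for (int c = 0; c < s.centers.length; c++) {
      if (counts[c] == 0) {
        continue; // an empty cluster keeps its old center
      }
      double updated = sums[c] / counts[c];
      if (Math.abs(updated - s.centers[c]) > 1e-9) {
        moved++;
      }
      s.centers[c] = updated;
    }
    s.converged = moved;
    s.iterations++;
  }

  @Override
  boolean haltComputation(KMeansState s) {
    return s.converged == 0 || s.iterations > s.maxIterations;
  }
}

final class SuperstepChain {
  /** Runs the chain in order until a step asks to halt; returns the supersteps executed. */
  static int run(List<Superstep> chain, KMeansState state) {
    int executed = 0;
    while (!chain.isEmpty()) {
      for (Superstep step : chain) {
        step.compute(state);
        executed++;
        // A real framework would call sync() and exchange messages here.
        if (step.haltComputation(state)) {
          return executed;
        }
      }
    }
    return executed;
  }
}
```

Only the updateCenters step overrides haltComputation, so the assignment step alone can never end the loop, which matches the description above.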

The big problem with kmeans is that the supersteps share state (the centers). But since
the "partial" centers are sent between the supersteps as messages, a failed task can
reconstruct that state from its messages. However, a failed task cannot compare against
the previous state of the centers, since that was only held in RAM. In this case we can
either assume that they have converged (and skip the step) or compare against the initial
means from the input split.
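
A rough sketch of that recovery path, under stated assumptions: PartialCenter and CenterRecovery are hypothetical names, each checkpointed message is assumed to carry a per-center partial sum and count, and the points are 1-D. It implements the second option (compare against the initial means from the input split when the previous centers are lost); the first option would simply skip the comparison and treat the step as converged.

```java
import java.util.List;

/** Hypothetical checkpointed message: one peer's partial sum and count for one center. */
final class PartialCenter {
  final int centerIndex;
  final double sum;
  final long count;

  PartialCenter(int centerIndex, double sum, long count) {
    this.centerIndex = centerIndex;
    this.sum = sum;
    this.count = count;
  }
}

final class CenterRecovery {
  /** Rebuilds the centers from the recovered messages; empty clusters fall back to the initial center. */
  static double[] rebuild(List<PartialCenter> messages, double[] initialCenters) {
    double[] sums = new double[initialCenters.length];
    long[] counts = new long[initialCenters.length];
    for (PartialCenter m : messages) {
      sums[m.centerIndex] += m.sum;
      counts[m.centerIndex] += m.count;
    }
    double[] centers = new double[initialCenters.length];
    for (int c = 0; c < centers.length; c++) {
      centers[c] = counts[c] == 0 ? initialCenters[c] : sums[c] / counts[c];
    }
    return centers;
  }

  /**
   * Counts the centers that moved. After a failure the previous centers are gone
   * (passed as null here), so the initial centers from the input split are used instead.
   */
  static int countMoved(double[] rebuilt, double[] previousOrNull, double[] initialCenters) {
    double[] reference = previousOrNull != null ? previousOrNull : initialCenters;
    int moved = 0;
    for (int c = 0; c < rebuilt.length; c++) {
      if (Math.abs(rebuilt[c] - reference[c]) > 1e-9) {
        moved++;
      }
    }
    return moved;
  }
}
```

Comparing against the initial means can only overestimate the movement, so in the worst case the recovered task runs one extra iteration instead of stopping too early.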

In one of the cleanup methods you would write the assignments onto disk.
                
> Chainable computations for fault tolerance
> ------------------------------------------
>
>                 Key: HAMA-503
>                 URL: https://issues.apache.org/jira/browse/HAMA-503
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp
>    Affects Versions: 0.4.0
>            Reporter: Thomas Jungblut
>            Assignee: Thomas Jungblut
>             Fix For: 0.5.0
>
>         Attachments: HAMA-503.patch
>
>
> Refactor bsp() to allow checkpointed messages to be recovered.
> ChiaHung Lin had a fancy idea of chaining superstep classes to make the whole recovery
> more convenient and less error prone, or at least possible.
> A user does not define a BSP anymore; instead he defines a single superstep inside of
> a computation class. A user is able to chain these in a specific order. After each of
> these computations the framework calls sync() and exchanges the messages.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
