Date: Tue, 7 Feb 2012 16:26:59 +0000 (UTC)
From: "Thomas Jungblut (Commented) (JIRA)"
To: hama-dev@incubator.apache.org
Reply-To: hama-dev@incubator.apache.org
Message-ID: <882586981.8989.1328632019201.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1763960788.11496.1328366273928.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HAMA-503) Chainable computations for fault tolerance

[ 
https://issues.apache.org/jira/browse/HAMA-503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202510#comment-13202510 ]

Thomas Jungblut commented on HAMA-503:
--------------------------------------

Hey Lin,

I have made a bit of an "interface".

For a superstep:
https://github.com/thomasjungblut/thomasjungblut-common/blob/master/src/de/jungblut/bsp/ft/Superstep.java

For the BSP that can handle faults:
https://github.com/thomasjungblut/thomasjungblut-common/blob/master/src/de/jungblut/bsp/ft/FaultTolerantBSP.java

The idea behind it is that you init a task with a start superstep. This is an index into the array of user-defined supersteps.
When a fault happens, we inject the index of the superstep that failed into the new task, so at runtime it starts computation from that point.

I have not really tried to build a real-world BSP example with it, so the Superstep class may not be a good interface.
What do you think?

> Chainable computations for fault tolerance
> ------------------------------------------
>
>                 Key: HAMA-503
>                 URL: https://issues.apache.org/jira/browse/HAMA-503
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp
>    Affects Versions: 0.4.0
>            Reporter: Thomas Jungblut
>             Fix For: 0.5.0
>
>
> Refactor bsp() to allow checkpointed messages to be recovered.
> ChiaHung Lin had a fancy idea of chaining superstep classes to make the whole recovery more convenient and less error-prone, or at least possible.
> A user does not define a BSP anymore; instead he defines a single superstep inside a computation class. A user is able to chain these in a specific order. After each of these computations the framework calls sync() and exchanges the messages.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
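
The restart idea from the comment above (an array of user-defined supersteps plus a start index injected into the new task) can be sketched in plain Java. This is only an illustration under stated assumptions: Step, StepFailedException, runFrom and runWithRecovery are hypothetical names, not the Superstep or FaultTolerantBSP classes linked above and not part of the Hama API. The sketch also does not restore checkpointed messages; it only shows how a replacement task could resume at the failed index.

```java
import java.util.ArrayList;
import java.util.List;

public class SuperstepChainSketch {

    /** Hypothetical stand-in for one user-defined superstep. */
    interface Step {
        void compute(List<String> log) throws Exception;
    }

    /** Carries the index of the superstep that failed. */
    static class StepFailedException extends Exception {
        final int failedIndex;

        StepFailedException(int failedIndex, Throwable cause) {
            super("superstep " + failedIndex + " failed", cause);
            this.failedIndex = failedIndex;
        }
    }

    /** Runs the chain starting at startIndex and reports the index of a failing step. */
    static void runFrom(Step[] steps, int startIndex, List<String> log)
            throws StepFailedException {
        for (int i = startIndex; i < steps.length; i++) {
            try {
                steps[i].compute(log);
            } catch (Exception e) {
                throw new StepFailedException(i, e);
            }
            // A real framework would call sync() here and exchange messages.
        }
    }

    /** Simulates task restarts: each new attempt starts at the index that failed before. */
    static List<String> runWithRecovery(Step[] steps, int maxAttempts) {
        List<String> log = new ArrayList<>();
        int start = 0;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                runFrom(steps, start, log);
                return log;
            } catch (StepFailedException e) {
                start = e.failedIndex; // "inject" the failed index into the next task
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        boolean[] failedOnce = {false};
        Step[] steps = {
            log -> log.add("step0"),
            log -> {
                if (!failedOnce[0]) {
                    failedOnce[0] = true;
                    throw new RuntimeException("simulated fault");
                }
                log.add("step1");
            },
            log -> log.add("step2")
        };
        System.out.println(runWithRecovery(steps, 3)); // prints [step0, step1, step2]
    }
}
```

In this sketch step0 runs only once even though step1 fails on its first attempt, which is the point of carrying the start index into the new task.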