hama-dev mailing list archives

From "Thomas Jungblut (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HAMA-557) Implement Checkpointing service in Hama
Date Wed, 01 Aug 2012 11:09:02 GMT

    [ https://issues.apache.org/jira/browse/HAMA-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426522#comment-13426522 ]

Thomas Jungblut edited comment on HAMA-557 at 8/1/12 11:07 AM:
---------------------------------------------------------------

Great, can we fix the client so that it doesn't print out superstep numbers lower than the previous one?

This seems to be very annoying.
+ A counter for restarted tasks (both complete rollbacks and simple task restarts) would be cool ;)
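
To illustrate the request above, here is a minimal sketch of a client-side reporter that suppresses superstep numbers lower than or equal to the last one printed and keeps separate counters for simple task restarts and complete rollbacks. The class `SuperstepProgressReporter` and all of its methods are hypothetical names invented for this example; they are not part of Hama's client API, and the output format is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not Hama's actual client code.
final class SuperstepProgressReporter {
    private long highestReported = -1;   // highest superstep printed so far
    private long taskRestarts = 0;       // single-task restarts observed
    private long fullRollbacks = 0;      // complete rollbacks observed
    private final List<String> printed = new ArrayList<>();

    // Prints the superstep only if it is higher than anything printed before.
    // Returns true when a line was emitted, false when it was suppressed.
    boolean onSuperstep(long superstep) {
        if (superstep <= highestReported) {
            return false; // replayed after recovery: stay silent
        }
        highestReported = superstep;
        String line = "superstep " + superstep;
        printed.add(line);
        System.out.println(line);
        return true;
    }

    void onTaskRestart() { taskRestarts++; }

    void onFullRollback() { fullRollbacks++; }

    long restartedTasks() { return taskRestarts + fullRollbacks; }

    long taskRestarts() { return taskRestarts; }

    long fullRollbacks() { return fullRollbacks; }

    List<String> printedLines() { return printed; }
}
```

With this shape, a job that rolls back from superstep 5 to superstep 3 would print nothing again until superstep 6, while the counters still record that a recovery happened.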
                
      was (Author: thomas.jungblut):
    Great, can we fix the client so that it doesn't print out superstep numbers lower than the previous one?

This seems to be very annoying.
                  
> Implement Checkpointing service in Hama
> ---------------------------------------
>
>                 Key: HAMA-557
>                 URL: https://issues.apache.org/jira/browse/HAMA-557
>             Project: Hama
>          Issue Type: Sub-task
>          Components: bsp core
>    Affects Versions: 0.6.0
>            Reporter: Suraj Menon
>            Assignee: Suraj Menon
>             Fix For: 0.6.0
>
>         Attachments: HAMA-505-557-610-611-v1.patch, HAMA-505-557-610-611-v2.patch, HAMA-557-ft-framework.patch
>
>
> Implement checkpointing service in Apache Hama. My patches for HAMA-533 and HAMA-534 are blocked on this.
> - Checkpointing should be done as messages are either sent or received. I prefer while receiving messages, as we can achieve some parallelism with asynchronous messages. Please comment if you differ.
> - BSPMaster should hold the checkpoint status for each task. Checkpoint status includes superstep count and file information for which checkpointing is complete.
> - MessageManager should notify Checkpointer of a new message at BSPPeer.
> - Implement/Reuse MessageBundle class as splitClass in BSPPeerImpl for recovery in initInput.
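
The second bullet of the quoted description (a master-side record of checkpoint status per task, made of a superstep count and file information) could be sketched roughly as follows. This is only an illustration under stated assumptions: `CheckpointStatus` and `CheckpointRegistry` are hypothetical names, not classes from the attached patches or from Hama, and the file information is reduced to a plain path string.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical value object: the last completed checkpoint of one task.
final class CheckpointStatus {
    final long superstep;   // superstep for which checkpointing is complete
    final String filePath;  // assumed: location of the checkpoint file

    CheckpointStatus(long superstep, String filePath) {
        this.superstep = superstep;
        this.filePath = filePath;
    }
}

// Hypothetical master-side registry; not the BSPMaster implementation.
final class CheckpointRegistry {
    private final int numTasks;
    private final Map<Integer, CheckpointStatus> latestByTask = new HashMap<>();

    CheckpointRegistry(int numTasks) {
        this.numTasks = numTasks;
    }

    // Called when a task reports a completed checkpoint.
    // Reports older than the recorded one are ignored.
    synchronized void report(int taskId, long superstep, String filePath) {
        CheckpointStatus previous = latestByTask.get(taskId);
        if (previous == null || superstep > previous.superstep) {
            latestByTask.put(taskId, new CheckpointStatus(superstep, filePath));
        }
    }

    synchronized CheckpointStatus statusOf(int taskId) {
        return latestByTask.get(taskId);
    }

    // Latest superstep that every task has checkpointed, if any.
    // Empty until all tasks have reported at least once.
    synchronized OptionalLong commonSuperstep() {
        if (latestByTask.size() < numTasks) {
            return OptionalLong.empty();
        }
        return latestByTask.values().stream()
                .mapToLong(status -> status.superstep)
                .min();
    }
}
```

The minimum across tasks is used as the candidate recovery point because a complete rollback needs a superstep that all tasks can restore. The sketch keeps only the newest file per task, so actually rolling back to `commonSuperstep()` assumes the older checkpoint files for that superstep are still retained somewhere.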

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
