hama-dev mailing list archives

From "ChiaHung Lin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HAMA-411) Support checkpoint based on HDFS
Date Mon, 11 Jul 2011 02:31:59 GMT

    [ https://issues.apache.org/jira/browse/HAMA-411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062843#comment-13062843 ]

ChiaHung Lin commented on HAMA-411:
-----------------------------------

With the BSP model, we can take checkpoints when the computation reaches barrier synchronization,
since the barrier forms a consistent global state. So if a user configures a checkpoint
every 3 supersteps, then after a task failure the computation can roll back to the global state
saved a few supersteps ago.
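
To illustrate the rollback idea, here is a minimal sketch in Python. It is a toy simulation and not Hama code: the names last_checkpoint and run, the in-memory checkpoints dictionary (standing in for HDFS), and the three-task state are all assumptions made up for this example.

```python
import copy


def last_checkpoint(superstep, interval):
    """Most recent checkpointed superstep at or before `superstep`."""
    return (superstep // interval) * interval


def run(num_supersteps, interval, fail_at=None):
    """Simulate a BSP job that checkpoints every `interval` supersteps.

    `fail_at` is a hypothetical superstep at which one task fails once.
    Returns the final state and a list of (failed_step, restored_step).
    """
    # One counter per hypothetical task; real state would live in each peer.
    state = {"superstep": 0, "values": [0, 0, 0]}
    # In-memory stand-in for checkpoints that would be written to HDFS.
    checkpoints = {0: copy.deepcopy(state)}
    rollbacks = []
    failed = False

    while state["superstep"] < num_supersteps:
        step = state["superstep"] + 1

        if step == fail_at and not failed:
            # A task fails during this superstep: restore the last
            # consistent global state and redo the lost supersteps.
            failed = True
            restore = last_checkpoint(state["superstep"], interval)
            state = copy.deepcopy(checkpoints[restore])
            rollbacks.append((step, restore))
            continue

        # Local computation of every task for this superstep.
        state["values"] = [v + 1 for v in state["values"]]
        state["superstep"] = step

        # Barrier reached: the global state is consistent, so it is
        # safe to checkpoint here.
        if step % interval == 0:
            checkpoints[step] = copy.deepcopy(state)

    return state, rollbacks


if __name__ == "__main__":
    final_state, rollbacks = run(num_supersteps=9, interval=3, fail_at=8)
    print(final_state, rollbacks)
```

With a checkpoint every 3 supersteps and a failure during superstep 8, the simulation restores the state saved at superstep 6 and re-executes supersteps 7 and 8 before continuing.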

The drawback of such a global checkpoint is that, as the number of processes involved in the
computation increases, rolling back to a consistent global state becomes more costly.

> Support checkpoint based on HDFS
> --------------------------------
>
>                 Key: HAMA-411
>                 URL: https://issues.apache.org/jira/browse/HAMA-411
>             Project: Hama
>          Issue Type: New Feature
>          Components: bsp
>            Reporter: Thomas Jungblut
>
> We need to add checkpointing to Hama to deal with fault in future. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
