hama-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Fault Tolerance in Hama
Date Mon, 29 Feb 2016 23:05:23 GMT
Internally, the framework checkpoints the messages transferred among
BSP tasks during the BSP synchronization period.

If a user wants to checkpoint anything else, they should use the
HDFS APIs directly.
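
A minimal sketch of what user-side checkpointing of extra state could look
like. It is not Hama's internal mechanism. To stay runnable without a cluster
it uses java.nio.file on the local file system as a stand-in; in a real job
the same write-then-rename pattern would go through Hadoop's
org.apache.hadoop.fs.FileSystem (create, rename, open). The State class, the
file naming, and the task id are all hypothetical.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class StateCheckpoint {

    // Hypothetical user state: the superstep reached and a running sum.
    public static final class State {
        public final long superstep;
        public final double partialSum;

        public State(long superstep, double partialSum) {
            this.superstep = superstep;
            this.partialSum = partialSum;
        }
    }

    // Assumed naming scheme: one checkpoint file per task.
    static Path fileFor(Path dir, String taskId) {
        return dir.resolve("state-" + taskId + ".ckpt");
    }

    // Write to a temporary file first, then rename it over the old checkpoint,
    // so an interrupted write does not corrupt the previous checkpoint.
    // (With HDFS: FileSystem.create for the write, FileSystem.rename for the move.)
    public static void save(Path dir, String taskId, State s) throws IOException {
        Files.createDirectories(dir);
        Path tmp = dir.resolve("state-" + taskId + ".tmp");
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(tmp))) {
            out.writeLong(s.superstep);
            out.writeDouble(s.partialSum);
        }
        Files.move(tmp, fileFor(dir, taskId), StandardCopyOption.REPLACE_EXISTING);
    }

    // Returns the last saved state, or null if this task has no checkpoint yet.
    // (With HDFS: FileSystem.exists and FileSystem.open.)
    public static State load(Path dir, String taskId) throws IOException {
        Path file = fileFor(dir, taskId);
        if (!Files.exists(file)) {
            return null;
        }
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            return new State(in.readLong(), in.readDouble());
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("hama-ckpt-demo");
        save(dir, "task-0", new State(3, 42.5));
        State restored = load(dir, "task-0");
        System.out.println(restored.superstep + " " + restored.partialSum);
    }
}
```

A task would call save() at the end of a superstep and load() when it
starts, resuming from the restored state if one is found.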

On Mon, Feb 29, 2016 at 11:15 PM, Behroz Sikander <behroz89@gmail.com> wrote:
> Ok. So, Hama does support FT but it is not thoroughly tested.
>
> Btw, how can a user checkpoint, or does Hama do that internally? Is there any
> method exposed through BSPPeer?
>
> Regards,
> Behroz
>
> On Mon, Feb 29, 2016 at 2:03 PM, Edward J. Yoon <edwardyoon@apache.org>
> wrote:
>
>> If I remember correctly, the framework first changes the job status to
>> "recovering", and then simply restarts all the tasks from the
>> last checkpoint. It works well, but I only tested simple jobs (no
>> input/output) on my cluster (see also HAMA-973).
>>
>> To write a fully fault-tolerant application from the user side, every
>> piece of state in the BSP program needs to be written to disk. So, some
>> people discussed and introduced a new Superstep API that provides a more
>> abstract interface, like Pregel.
>>
>>
>> On Mon, Feb 29, 2016 at 8:09 PM, Behroz Sikander <behroz89@gmail.com>
>> wrote:
>> > Hi,
>> > Just a quick question: is Hama fault tolerant? What happens if a Hama
>> > task fails?
>> >
>> > Regards,
>> > Behroz
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>>



-- 
Best Regards, Edward J. Yoon
