hadoop-user mailing list archives

From Hassan Asghar <haxxanasg...@gmail.com>
Subject Re: how to implement checkpointing and fault tolerance based fault tolerance in hadoop
Date Tue, 27 Jun 2017 19:30:41 GMT
Thanks, I'll check it out.

On Tue, 27 Jun 2017 at 10:22 PM, Jasson Chenwei <ynjassionchen@gmail.com>
wrote:

> hi, Hassan
>
> Actually, I haven't found any attempt in the community to implement
> checkpoint- or replication-based fault tolerance yet.
> I think the reason is that the overhead is much larger than the gain, given
> that each map task only runs for about 30 to 40 seconds. However, I have read
> some academic papers that propose such fault tolerance approaches.
>
> Check out this:
> http://ieeexplore.ieee.org/document/7161515/
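
The overhead-versus-gain argument above can be made concrete with a back-of-the-envelope model. This is only a sketch under stated assumptions: the 40-second task length comes from the message, while the failure probability, checkpoint interval, checkpoint cost, and both function names are hypothetical values chosen for illustration, and the model assumes at most one failure per task.

```python
def expected_runtime_reexec(task_s, p_fail):
    """Expected runtime when a failed task is simply re-launched from scratch."""
    # Assumption: at most one failure, uniformly distributed over the task,
    # so on average half of the task's work is lost and has to be redone.
    return task_s + p_fail * (task_s / 2)


def expected_runtime_checkpoint(task_s, p_fail, interval_s, ckpt_cost_s):
    """Expected runtime when the task checkpoints every interval_s seconds."""
    # Every task pays for all of its checkpoints, whether or not it fails.
    overhead = (task_s / interval_s) * ckpt_cost_s
    # On failure, only the work since the last checkpoint is redone
    # (half an interval on average).
    return task_s + overhead + p_fail * (interval_s / 2)


# Hypothetical numbers: 40 s task, 1% failure chance per task,
# a checkpoint every 10 s, each checkpoint costing 2 s.
reexec = expected_runtime_reexec(40, 0.01)
ckpt = expected_runtime_checkpoint(40, 0.01, 10, 2)
print(f"re-execution: {reexec:.2f} s, checkpointing: {ckpt:.2f} s")
```

With these assumed numbers, re-execution costs about 40.2 s per task on average and checkpointing about 48.05 s, because the checkpoint overhead is paid by every task while the saving applies only to the rare failed one. Longer tasks or higher failure rates would shift the balance.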
>
>
>
> Wei
>
>
> On Mon, Jun 26, 2017 at 9:16 PM, Hassan Asghar <haxxanasghar@gmail.com>
> wrote:
>
>> Thank you for the clarification. I am talking about fault tolerance in
>> MapReduce. Is there any algorithm implemented in it for checkpointing and
>> replication?
>>
>> On Mon, 26 Jun 2017 at 3:32 AM, Jasson Chenwei <ynjassionchen@gmail.com>
>> wrote:
>>
>>> hi, Hassan.
>>>
>>> First, YARN (the scheduler) doesn't provide any fault tolerance
>>> techniques, but applications (e.g., MapReduce or Spark) do.
>>>
>>> For MapReduce, fault tolerance is based on task re-execution: failed
>>> tasks are simply re-launched (speculative execution additionally launches
>>> backup copies of slow tasks).
>>> Spark provides a checkpoint API, which users can leverage in
>>> their code by designating which RDDs should be checkpointed and when.
>>> For more details about Spark checkpointing, you can refer to the Spark
>>> docs. If checkpointing is not enabled, it falls back to re-executing
>>> tasks, the same as MapReduce.
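
The difference between re-launching a failed task from scratch and resuming it from a checkpoint can be illustrated with a toy model. This is a minimal sketch in plain Python, not Hadoop or Spark code: `run_task`, the injected failure, and the in-memory "checkpoint" are all hypothetical stand-ins (a real system would write checkpoints to stable storage such as HDFS).

```python
def run_task(records, fail_at=None, checkpoint_every=None, max_attempts=4):
    """Toy task that sums `records`, optionally crashing once on the first attempt.

    fail_at: record index at which the first attempt crashes (fault injection).
    checkpoint_every: save (position, partial sum) every N records;
        None means plain re-execution from the beginning.
    Returns (result, total number of records processed across all attempts).
    """
    saved_pos, saved_sum = 0, 0   # last checkpoint; stands in for stable storage
    processed = 0
    for attempt in range(max_attempts):
        # Resume from the last checkpoint (position 0 if none was taken).
        pos, total = saved_pos, saved_sum
        try:
            while pos < len(records):
                if attempt == 0 and fail_at is not None and pos == fail_at:
                    raise RuntimeError("injected failure")
                total += records[pos]
                pos += 1
                processed += 1
                if checkpoint_every and pos % checkpoint_every == 0:
                    saved_pos, saved_sum = pos, total
            return total, processed
        except RuntimeError:
            continue  # re-launch the task
    raise RuntimeError("task failed after max_attempts")


records = list(range(100))
print(run_task(records, fail_at=75))                       # re-execution only
print(run_task(records, fail_at=75, checkpoint_every=10))  # with checkpoints
```

Both calls return the same sum, but the re-executed task processes 175 records in total (75 lost plus 100 on the retry), while the checkpointed one processes 105 because it resumes from record 70. The model ignores the cost of writing the checkpoints themselves.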
>>>
>>>
>>>
>>> Wei
>>>
>>> On Sat, Jun 24, 2017 at 8:44 PM, Hassan Asghar <haxxanasghar@gmail.com>
>>> wrote:
>>>
>>>> Dear users,
>>>>
>>>> I am performing a comparative study of different fault tolerance
>>>> techniques. My question is: how can we implement checkpointing-based
>>>> and replication-based fault tolerance in Hadoop? Is there any patch
>>>> already implemented that I can use in my Hadoop cluster?
>>>>
>>>>
>>>> Regards,
>>>> Hassan Asghar
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: user-unsubscribe@hadoop.apache.org
>>>> For additional commands, e-mail: user-help@hadoop.apache.org
>>>>
>>>>
>>> --
>> Dear,
>>
>> Best Regards
>> ------------------
>> Hassan Asghar
>> *M*: *+923400400374*
>> *E*: *haxxanasghar@gmail.com*
>>
>
> --
Dear,

Best Regards
------------------
Hassan Asghar
*M*:*+923400400374*
*E*: *haxxanasghar@gmail.com*
