apex-dev mailing list archives

From Gaurav Gupta <gau...@datatorrent.com>
Subject Re: Why is Async checkpointing made default?
Date Thu, 26 Nov 2015 08:08:16 GMT
Tim,

The trigger to external systems is sent by the operator in the checkpointed() callback, not
by the Storage Agent. Not sure how the suggested solution will solve Chetan’s use case.

Thanks
- Gaurav

> On Nov 25, 2015, at 11:58 PM, Timothy Farkas <tim@datatorrent.com> wrote:
> 
> Gaurav,
> 
> Chandni's method would address your point. Or you can copy the state
> wherever you want (even asynchronously) from the checkpointed callback.
> 
> On Wed, Nov 25, 2015 at 11:47 PM, Chandni Singh <chandni@datatorrent.com>
> wrote:
> 
>> Another approach for Chetan's use case can be to extend AsyncFSStorageAgent
>> and perform the function after copy to hdfs is completed.
>> Benefit is that the function will be performed asynchronously (with copy to
>> hdfs) and will not block operator's thread.
>> 
>> Chandni
>> 
>> On Wed, Nov 25, 2015 at 11:33 PM, Timothy Farkas <tim@datatorrent.com>
>> wrote:
>> 
>>> Chetan, your use case is not valid: if checkpointed is called after the
>>> operator state is stored to local disk, a similar storage function could be
>>> performed. It is not necessary to wait for that same state to be
>>> asynchronously moved to hdfs. Please provide an example of a valid use case
>>> :)
>>> 
>>> +1 for fixing this bug/regression as Thomas suggested.
>>> 
>>> On Wed, Nov 25, 2015 at 5:35 PM, Chandni Singh <chandni@datatorrent.com>
>>> wrote:
>>> 
>>>> Semver was broken with Async checkpointing. The behavior was changed, as
>>>> pointed out before in the discussion. Also, making it difficult for operator
>>>> developers doesn't give us anything.
>>>> 
>>>> +1 for fixing it in the way Thomas suggested.
>>>> 
>>>> On Wed, Nov 25, 2015 at 4:07 PM, Chetan Narsude (cnarsude) <
>>>> cnarsude@cisco.com> wrote:
>>>> 
>>>>> Yes - a few, but I cannot share the details - protected under NDA - ping me
>>>>> in private and I can probably give you more generic details on
>>>>> similar cooked up examples.
>>>>> 
>>>>> The part that follows “e.g.” below is an example that probably is
>>>>> sufficient to infer the use case logically, I think. I shared that to
>>>>> exemplify how changing the semantics will break semver.
>>>>> 
>>>>> —
>>>>> Chetan
>>>>> 
>>>>> On 11/25/15, 3:51 PM, "Thomas Weise" <thomas@datatorrent.com> wrote:
>>>>> 
>>>>>> Do you have a specific example?
>>>>>> 
>>>>>> I see this happening in committed(), but not in checkpointed() where the
>>>>>> checkpoint remains intermediate, whether it was copied to HDFS or not.
>>>>>> 
>>>>>> 
>>>>>> On Wed, Nov 25, 2015 at 3:42 PM, Chetan Narsude (cnarsude) <
>>>>>> cnarsude@cisco.com> wrote:
>>>>>> 
>>>>>>>> 
>>>>>>>> Until we have this, how about we restore the previous behavior
>>>>>>>> temporarily?
>>>>>>>> Calling checkpointed() immediately does not seem to pose any practical
>>>>>>>> issue but ensures that the code that was written under this assumption is
>>>>>>>> not broken.
>>>>>>> 
>>>>>>> We can't do it. It would be incorrect. It breaks all the other code that
>>>>>>> (unassumingly) correctly complied with the semantics, e.g. an operator which
>>>>>>> informs interested parties that the checkpointed data is available for
>>>>>>> immediate consumption from storage.
>>>>>>> 
>>>>>>> --
>>>>>>> Chetan
>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>> 

