apex-dev mailing list archives

From Munagala Ramanath <...@datatorrent.com>
Subject Re: Checkpointing behaviour
Date Fri, 20 Nov 2015 19:24:20 GMT
To reinforce Tim's point, here is a quote from
https://docs.datatorrent.com/application_development/#checkpointing

--------------
In case of an operator that has an application window size that is
larger than the size of the streaming window, the checkpointing by
default still happens at same intervals as with other operators. To
align checkpointing with application window boundary, the application
developer should set the attribute “CHECKPOINT_WINDOW_COUNT” to
“APPLICATION_WINDOW_COUNT”. This ensures that the checkpoint happens
at the end of the application window and not within that window. Such
operators now treat the application window as an atomic computation
unit. The downside is that it does need the upstream buffer server to
keep tuples for the entire application window.
---------------
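
To make the quoted passage concrete, here is a minimal sketch (not engine code) that lists which periodic checkpoints would land inside an application window when nothing is aligned. It assumes application windows are contiguous and start at the first streaming window, and it uses the default of 60 streaming windows per checkpoint mentioned later in this thread. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; AppWindowAlignment is a hypothetical name, not an Apex class.
class AppWindowAlignment {

    /**
     * Returns the completed-streaming-window counts at which a checkpoint taken
     * every checkpointInterval windows falls strictly inside an application
     * window of appWindowCount streaming windows.
     */
    static List<Integer> misalignedCheckpoints(int checkpointInterval,
                                               int appWindowCount,
                                               int totalWindows) {
        List<Integer> misaligned = new ArrayList<>();
        for (int completed = checkpointInterval;
             completed <= totalWindows;
             completed += checkpointInterval) {
            // A checkpoint sits on an application window boundary only when the
            // number of completed streaming windows is a multiple of appWindowCount.
            if (completed % appWindowCount != 0) {
                misaligned.add(completed);
            }
        }
        return misaligned;
    }

    public static void main(String[] args) {
        // Application window of 8 streaming windows, checkpoint every 60.
        System.out.println(misalignedCheckpoints(60, 8, 480));
        // Application window of 10 divides 60, so every checkpoint is on a boundary.
        System.out.println(misalignedCheckpoints(60, 10, 480));
    }
}
```

In this model, an application window of 8 leaves every other checkpoint in the middle of an application window, which is the situation the CHECKPOINT_WINDOW_COUNT setting is meant to avoid.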

Ram

On Fri, Nov 20, 2015 at 11:12 AM, Timothy Farkas <tim@datatorrent.com> wrote:
> Hi Bhupesh,
>
> 1. Yes, the checkpoint will be delayed until the end of the checkpoint window
> (not the app window).
> 2. For the case OperatorContext.APPLICATION_WINDOW_COUNT =
> OperatorContext.CHECKPOINT_WINDOW_COUNT = 1, I would expect the checkpoint
> to happen at the end of window 59, not 58. Sounds like something is wrong
> there.
>    For the case OperatorContext.APPLICATION_WINDOW_COUNT =
> OperatorContext.CHECKPOINT_WINDOW_COUNT > 1, your observations are correct.
> 3. APPLICATION_WINDOW_COUNT plays no role in checkpointing; when a
> checkpoint happens is determined entirely by CHECKPOINT_WINDOW_COUNT.
> However, in a lot of cases operators need APPLICATION_WINDOW_COUNT
> to equal CHECKPOINT_WINDOW_COUNT in order to be fault tolerant.
>
> Thanks,
> Tim
>
> On Fri, Nov 20, 2015 at 11:03 AM, Bhupesh Chawda <bhupesh@datatorrent.com>
> wrote:
>
>> Hi All,
>>
>> I have some confusion regarding the attributes which control the
>> checkpointing in Apex.
>>
>> As per my understanding, there are three attributes which play a role in
>> deciding when to perform checkpointing:
>>
>>    1. OperatorContext.APPLICATION_WINDOW_COUNT (How many streaming windows
>>    form an app window)
>>    2. OperatorContext.CHECKPOINT_WINDOW_COUNT (Indicates the optimal
>>    checkpoint boundary)
>>    3. DagContext.CHECKPOINT_WINDOW_COUNT (how many streaming windows form a
>>    checkpoint window)
>>
>> I have the following doubts:
>>
>>    - Let's consider the case where checkpointing always happens at an
>>    Application window boundary. This implies
>>    OperatorContext.APPLICATION_WINDOW_COUNT =
>>    OperatorContext.CHECKPOINT_WINDOW_COUNT. The engine will always insert a
>>    "checkpoint marker" tuple after the DagContext.CHECKPOINT_WINDOW_COUNT
>>    streaming windows are done. However, if this marker arrives at a time
>>    when the application window is still not over, then the checkpoint will
>>    be delayed till the end of the application window. Is this understanding
>>    correct?
>>
>>
>>    - One inconsistency that I have observed: When
>>    OperatorContext.APPLICATION_WINDOW_COUNT =
>>    OperatorContext.CHECKPOINT_WINDOW_COUNT = 1, the first checkpoint
>>    happens after the 58th window instead of the 59th window (0 indexed).
>>    The following checkpoints however happen at the intended 118th, 178th
>>    etc. windows. This is for the default case of 60 windows checkpoint. In
>>    other cases, when OperatorContext.APPLICATION_WINDOW_COUNT =
>>    OperatorContext.CHECKPOINT_WINDOW_COUNT > 1 (and a perfect divisor of
>>    DagContext.CHECKPOINT_WINDOW_COUNT), it happens at the 59th window,
>>    119th window and so on. Is this behaviour expected?
>>
>>
>>    - Moving on to the other case of allowing checkpointing within
>>    application windows; here OperatorContext.APPLICATION_WINDOW_COUNT may
>>    not be equal to OperatorContext.CHECKPOINT_WINDOW_COUNT. Similar to the
>>    previous case, the "checkpoint marker" tuple arrives at the end of
>>    DagContext.CHECKPOINT_WINDOW_COUNT streaming windows. In this case,
>>    there is no restriction on when not to checkpoint. So what is the role
>>    of OperatorContext.CHECKPOINT_WINDOW_COUNT in this case?
>>
>>
>> Thanks.
>> -Bhupesh
>>
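
The timing rule from Tim's points 1 and 3 can be sketched as a small simulation. This is a simplified model under stated assumptions, not the engine's implementation: a checkpoint is requested after every DAG-level checkpoint interval of streaming windows, and the operator performs it at the end of its next checkpoint window, i.e. once the number of completed streaming windows is a multiple of its own CHECKPOINT_WINDOW_COUNT. APPLICATION_WINDOW_COUNT deliberately does not appear. The class and method names are hypothetical, and window indices are 0-based as in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; CheckpointTiming is a hypothetical name, not an Apex class.
class CheckpointTiming {

    /**
     * Returns the 0-indexed streaming windows after which a checkpoint is taken.
     *
     * dagCount: streaming windows between checkpoint requests (DAG level).
     * opCount:  streaming windows per checkpoint window of the operator.
     */
    static List<Integer> checkpointWindows(int dagCount, int opCount, int totalWindows) {
        List<Integer> taken = new ArrayList<>();
        boolean pending = false;
        for (int w = 0; w < totalWindows; w++) {
            int completed = w + 1;
            // A checkpoint request arrives after every dagCount streaming windows.
            if (completed % dagCount == 0) {
                pending = true;
            }
            // The operator honours a pending request only at the end of one of
            // its own checkpoint windows.
            if (pending && completed % opCount == 0) {
                taken.add(w);
                pending = false;
            }
        }
        return taken;
    }

    public static void main(String[] args) {
        System.out.println(checkpointWindows(60, 1, 180)); // operator count of 1
        System.out.println(checkpointWindows(60, 4, 180)); // 4 divides 60
        System.out.println(checkpointWindows(60, 7, 200)); // 7 does not divide 60
    }
}
```

Under this model, operator counts of 1 and 4 both checkpoint after windows 59, 119 and 179, which matches what Tim expects, while a count of 7 delays each checkpoint to the next multiple of 7 (after windows 62, 125 and 181). The model does not reproduce the checkpoint after window 58 reported above.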
