apex-dev mailing list archives

From Gaurav Gupta <gau...@datatorrent.com>
Subject Re: Checkpointing behaviour
Date Fri, 20 Nov 2015 19:38:50 GMT
Bhupesh,

You can also find details in an earlier discussion on a similar topic:
http://mail-archives.apache.org/mod_mbox/incubator-apex-dev/201511.mbox/%3CCAMaZnT6Nn6fqKV+gP-fjEcMGt9y6By1XhUVjDo1Y==Apc_=21A@mail.gmail.com%3E


Thanks
- Gaurav

> On Nov 20, 2015, at 11:24 AM, Munagala Ramanath <ram@datatorrent.com> wrote:
> 
> To reinforce Tim's point, here is a quote from
> https://docs.datatorrent.com/application_development/#checkpointing
> 
> --------------
> In case of an operator that has an application window size that is
> larger than the size of the streaming window, the checkpointing by
> default still happens at same intervals as with other operators. To
> align checkpointing with application window boundary, the application
> developer should set the attribute “CHECKPOINT_WINDOW_COUNT” to
> “APPLICATION_WINDOW_COUNT”. This ensures that the checkpoint happens
> at the end of the application window and not within that window. Such
> operators now treat the application window as an atomic computation
> unit. The downside is that it does need the upstream buffer server to
> keep tuples for the entire application window.
> ---------------
> 
> Ram
> 
> On Fri, Nov 20, 2015 at 11:12 AM, Timothy Farkas <tim@datatorrent.com> wrote:
>> Hi Bhupesh,
>> 
>> 1. Yes, the checkpoint will be delayed until the end of the checkpoint
>> window (not the app window).
>> 2. For the case OperatorContext.APPLICATION_WINDOW_COUNT =
>> OperatorContext.CHECKPOINT_WINDOW_COUNT = 1, I would expect the checkpoint
>> to happen at the end of window 59, not 58. It sounds like something is
>> wrong there.
>>    For the case OperatorContext.APPLICATION_WINDOW_COUNT =
>> OperatorContext.CHECKPOINT_WINDOW_COUNT > 1, your observations are correct.
>> 3. APPLICATION_WINDOW_COUNT plays no role in checkpointing; when a
>> checkpoint happens is determined entirely by CHECKPOINT_WINDOW_COUNT.
>> However, in many cases operators need APPLICATION_WINDOW_COUNT to equal
>> CHECKPOINT_WINDOW_COUNT in order to be fault tolerant.
>> 
>> Thanks,
>> Tim
>> 
>> On Fri, Nov 20, 2015 at 11:03 AM, Bhupesh Chawda <bhupesh@datatorrent.com>
>> wrote:
>> 
>>> Hi All,
>>> 
>>> I have some confusion regarding the attributes which control the
>>> checkpointing in Apex.
>>> 
>>> As per my understanding, there are three attributes which play a role in
>>> deciding when to perform checkpointing:
>>> 
>>>   1. OperatorContext.APPLICATION_WINDOW_COUNT (How many streaming windows
>>>   form an app window)
>>>   2. OperatorContext.CHECKPOINT_WINDOW_COUNT (Indicates the optimal
>>>   checkpoint boundary)
>>>   3. DagContext.CHECKPOINT_WINDOW_COUNT (how many streaming windows form a
>>>   checkpoint window)
>>> 
>>> I have the following doubts:
>>> 
>>>   - Let's consider the case where checkpointing always happens at an
>>>   Application window boundary. This implies
>>>   OperatorContext.APPLICATION_WINDOW_COUNT =
>>>   OperatorContext.CHECKPOINT_WINDOW_COUNT. The engine will always insert a
>>>   "checkpoint marker" tuple after the DagContext.CHECKPOINT_WINDOW_COUNT
>>>   streaming windows are done. However, if this marker arrives at a time
>>>   when the application window is still not over, then the checkpoint will
>>>   be delayed till the end of the application window. Is this understanding
>>>   correct?
>>> 
>>> 
>>>   - One inconsistency that I have observed: When
>>>   OperatorContext.APPLICATION_WINDOW_COUNT =
>>>   OperatorContext.CHECKPOINT_WINDOW_COUNT = 1, the first checkpoint
>>>   happens after the 58th window instead of the 59th window (0-indexed).
>>>   The following checkpoints however happen at the intended 118th, 178th
>>>   etc. windows. This is for the default case of a 60-window checkpoint.
>>>   In other cases, when OperatorContext.APPLICATION_WINDOW_COUNT =
>>>   OperatorContext.CHECKPOINT_WINDOW_COUNT > 1 (and a perfect divisor of
>>>   DagContext.CHECKPOINT_WINDOW_COUNT), it happens at the 59th window,
>>>   119th window and so on. Is this behaviour expected?
>>> 
>>> 
>>>   - Moving on to the other case of allowing checkpointing within
>>>   application windows; here OperatorContext.APPLICATION_WINDOW_COUNT may
>>>   not be equal to OperatorContext.CHECKPOINT_WINDOW_COUNT. Similar to the
>>>   previous case, the "checkpoint marker" tuple arrives at the end of
>>>   DagContext.CHECKPOINT_WINDOW_COUNT streaming windows. In this case,
>>>   there is no restriction on when not to checkpoint. So what is the role
>>>   of OperatorContext.CHECKPOINT_WINDOW_COUNT in this case?
>>> 
>>> 
>>> Thanks.
>>> -Bhupesh
>>> 
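
The deferral Tim describes in point 1 can be illustrated with a toy model.
This is only a sketch, not the Apex engine's code: the class and method
names are hypothetical, the parameters merely stand in for
DagContext.CHECKPOINT_WINDOW_COUNT, OperatorContext.CHECKPOINT_WINDOW_COUNT
and OperatorContext.APPLICATION_WINDOW_COUNT, and it assumes a checkpoint is
requested every dagInterval streaming windows and then taken at the next
boundary that is a multiple of the operator's checkpoint window count.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of checkpoint timing; it does not use the Apex API.
class CheckpointModel {

    /**
     * Returns the 0-indexed streaming windows after which a checkpoint is taken.
     *
     * dagInterval   stands in for DagContext.CHECKPOINT_WINDOW_COUNT
     * opWindowCount stands in for OperatorContext.CHECKPOINT_WINDOW_COUNT
     * totalWindows  number of streaming windows to simulate
     */
    static List<Integer> checkpointWindows(int dagInterval, int opWindowCount, int totalWindows) {
        List<Integer> result = new ArrayList<>();
        boolean pending = false; // a checkpoint was requested but not yet taken
        for (int w = 0; w < totalWindows; w++) {
            int completed = w + 1; // streaming windows finished so far
            if (completed % dagInterval == 0) {
                pending = true; // assumed: the "checkpoint marker" arrives here
            }
            // Assumed: the checkpoint waits for the end of the operator's checkpoint window.
            if (pending && completed % opWindowCount == 0) {
                result.add(w);
                pending = false;
            }
        }
        return result;
    }

    /** True if a checkpoint after the given window would fall inside an application window. */
    static boolean splitsApplicationWindow(int checkpointWindow, int appWindowCount) {
        return (checkpointWindow + 1) % appWindowCount != 0;
    }

    public static void main(String[] args) {
        System.out.println(checkpointWindows(60, 1, 180)); // prints [59, 119, 179]
        System.out.println(checkpointWindows(60, 7, 180)); // prints [62, 125]
        System.out.println(splitsApplicationWindow(59, 7)); // prints true
    }
}
```

In this model, an operator with a 7-window application window and the
default checkpoint setting would be checkpointed after window 59, in the
middle of an application window; setting its checkpoint window count to 7 as
well moves the checkpoint to window 62, which is an application window
boundary.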

