apex-dev mailing list archives

From Ashwin Chandra Putta <ashwinchand...@gmail.com>
Subject Re: operator recovery window
Date Tue, 15 Dec 2015 19:41:38 GMT
Tim,

Thanks, that is pretty much in line with what I was thinking. I have a
slightly different thought, though, on picking the checkpoint based on
downstream operators. For A, should it not be "the checkpoint with the
largest window id that is less than or equal to the checkpoint with the
largest common window id (instead of the largest window id) among all the
operators downstream of A"?

For example,

Say the DAG is A -> B -> C -> D, the checkpoint window count is 5, and
the largest checkpoints are as follows.

A - 30
B - 25
C - 20
D - 15

Does A recover at 25 (checkpoint with largest window id) or 15 (checkpoint
with largest common window id)?
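
To make the two readings concrete, here is a minimal Python sketch of both
rules applied to the example above. It is only an illustration of the
question, not Apex's actual recovery code. The names (checkpoints_up_to,
recover_largest_downstream, recover_largest_common) are hypothetical, and it
assumes each operator still holds a checkpoint at every multiple of the
checkpoint window count up to its largest one.

```python
from bisect import bisect_right

# Hypothetical helper: all checkpoint window ids an operator holds,
# assuming one checkpoint every `interval` windows up to `latest`.
def checkpoints_up_to(latest, interval=5):
    return list(range(interval, latest + 1, interval))

# Linear DAG A -> B -> C -> D with the largest checkpoints from the example.
dag_order = ["A", "B", "C", "D"]
checkpoints = {
    "A": checkpoints_up_to(30),
    "B": checkpoints_up_to(25),
    "C": checkpoints_up_to(20),
    "D": checkpoints_up_to(15),
}

def largest_at_or_below(own, bound):
    """Largest checkpoint in the sorted list `own` that is <= bound, or None."""
    i = bisect_right(own, bound)
    return own[i - 1] if i else None

def downstream_of(op):
    return dag_order[dag_order.index(op) + 1:]

def recover_largest_downstream(op):
    """Reading 1: bound is the largest window id among downstream checkpoints."""
    downstream = downstream_of(op)
    if not downstream:  # output operator: most recent checkpoint
        return checkpoints[op][-1]
    bound = max(checkpoints[d][-1] for d in downstream)
    return largest_at_or_below(checkpoints[op], bound)

def recover_largest_common(op):
    """Reading 2: bound is the largest window id common to all downstream operators."""
    downstream = downstream_of(op)
    if not downstream:
        return checkpoints[op][-1]
    common = set.intersection(*(set(checkpoints[d]) for d in downstream))
    if not common:
        return None
    return largest_at_or_below(checkpoints[op], max(common))

print(recover_largest_downstream("A"))  # reading 1
print(recover_largest_common("A"))      # reading 2
```

Under these assumptions the first rule gives 25 for A and the second gives
15, which is exactly the difference I am asking about.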

Also, regarding recovering at the committed window id: is it not possible
in the following scenario? All operators have checkpointed at 30 and
received the committed window callback, and then an operator fails before
any operator checkpoints further. In that case, the recovery window is 30,
right?
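
A small sketch of that scenario, again only as an illustration and not the
platform's real logic. It assumes "committed" means the largest window id
that every operator has checkpointed, as described in my original question,
and that checkpoints exist at every multiple of the checkpoint window count.
The function name committed_window is hypothetical.

```python
def committed_window(latest_checkpoints, interval=5):
    """Largest window id checkpointed by every operator (assumed definition)."""
    per_operator = [
        set(range(interval, latest + 1, interval))
        for latest in latest_checkpoints.values()
    ]
    common = set.intersection(*per_operator)
    return max(common) if common else None

# Scenario from this mail: every operator has checkpointed at 30.
all_at_30 = {"A": 30, "B": 30, "C": 30, "D": 30}
committed = committed_window(all_at_30)

# If an operator fails now, its most recent checkpoint is also the
# committed window, so the two coincide in this sketch.
recovers_at_committed = all(w == committed for w in all_at_30.values())

# Earlier example for comparison: A=30, B=25, C=20, D=15.
staggered = {"A": 30, "B": 25, "C": 20, "D": 15}
committed_staggered = committed_window(staggered)
```

Here committed is 30 and every operator's latest checkpoint is also 30,
while in the staggered example the committed window would be 15.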

Regards,
Ashwin.

On Mon, Dec 14, 2015 at 11:58 PM, Timothy Farkas <tim@datatorrent.com>
wrote:

> Hi Ashwin,
>
> The recovery checkpoint for operator A is computed by taking the checkpoint
> with the largest window id that is less than or equal to the checkpoint
> with the largest window id among all the operators downstream of A. The
> output operators in a DAG will always recover to their most recent
> checkpoint. The input operator of the DAG may recover to the earliest
> checkpoint. Operators between the input and output operators could recover
> to a window in between.
>
> I don't think you can ever recover to a committed window; the earliest I
> think you can recover to is the window after the committed window (I may be
> wrong on this).
>
> On Mon, Dec 14, 2015 at 11:05 PM, Ashwin Chandra Putta <
> ashwinchandrap@gmail.com> wrote:
>
> > In the Apex architecture there is the concept of checkpointing and the
> > concept of a window being committed once all operators have crossed a
> > common checkpoint.
> >
> > So, in which scenarios does a given operator recover at the last
> > checkpoint window vs. the last committed window vs. some other checkpoint
> > window in between?
> > --
> >
> > Regards,
> > Ashwin.
> >
>



