apex-dev mailing list archives

From Thomas Weise <tho...@datatorrent.com>
Subject Re: Why is Async checkpointing made default?
Date Mon, 23 Nov 2015 05:44:34 GMT
Alternatively, I would ask why the checkpointed callback needs to wait until
the data has been copied to HDFS instead of being invoked upon completion of
the state serialization.

Thomas


On Sun, Nov 22, 2015 at 9:41 PM, Chandni Singh <chandni@datatorrent.com>
wrote:

> Gaurav,
>
> My question is about why Async was made the default when it changed the
> semantics of operator callbacks. Your response doesn't answer that.
>
> In a way we broke backward compatibility.
>
> Chandni
>
> On Sun, Nov 22, 2015 at 9:22 PM, Gaurav Gupta <gaurav@datatorrent.com>
> wrote:
>
> > The idea behind Async checkpointing is to unblock the operator while the
> > state is being transferred to HDFS.
> > Just to clarify: beginWindow(x) -> endWindow(x) -> checkpointed(x-1)
> > is the ideal sequence, but if HDFS is slow, or transferring the state to
> > HDFS is slow for some other reason, this sequence may not hold.
> >
> > Can your use case be addressed by
> > https://malhar.atlassian.net/browse/APEX-78?
> >
> > Thanks
> > - Gaurav
> >
> > > On Nov 22, 2015, at 3:56 PM, Chandni Singh <chandni@datatorrent.com>
> > wrote:
> > >
> > > With Async checkpointing, the checkpointed callback in
> > > CheckpointListener is called for a previous window, that is,
> > > beginWindow(x) -> endWindow(x) -> checkpointed(x-1)
> > >
> > > This feature was newly introduced. With synchronous checkpointing, the
> > > behavior was always
> > > beginWindow(x) -> endWindow(x) -> checkpointed (x)
> > >
> > > A lot of operators were written before asynchronous checkpointing was
> > > introduced, and a few of them may rely on the sequencing guaranteed by
> > > synchronous checkpointing.
> > >
> > > So why was Async checkpointing made the default?
> > >
> > > With how Async checkpointing works today, the complexity of handling
> > > transient state in the checkpointed callback falls on every operator.
> > > For example, let's say that earlier I had a transient map which I
> > > cleared every time checkpointed was called. With async checkpointing,
> > > this simple task becomes a lot more complicated.
> > >
> > > I think Async checkpointing broke the semantics of operator callbacks
> > > and should NOT be the default.
> >
> >
>
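
The transient-map case raised in the original message can be illustrated with a small, framework-free sketch. It assumes the checkpointed callback receives the ID of the window that was checkpointed, as the sequences quoted in this thread suggest. The class `WindowScopedBuffer` and all of its methods are hypothetical names chosen for illustration; they only mirror the callback names used above and do not implement any Apex interface. Instead of clearing the whole map on every callback, the sketch keys the transient state by window ID and drops only the entries at or below the checkpointed window, so state from window x survives a late checkpointed(x-1).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch: not an Apex operator, just the bookkeeping idea.
final class WindowScopedBuffer {
    // Transient state, keyed by the ID of the window that produced it.
    private final TreeMap<Long, List<String>> pending = new TreeMap<>();
    private long currentWindowId;

    void beginWindow(long windowId) {
        currentWindowId = windowId;
    }

    void process(String tuple) {
        pending.computeIfAbsent(currentWindowId, id -> new ArrayList<>()).add(tuple);
    }

    void endWindow() {
        // Nothing to flush in this sketch.
    }

    // Drop state only for windows up to and including the checkpointed one.
    // Clearing the whole map here would also discard newer windows when the
    // callback arrives late.
    void checkpointed(long windowId) {
        pending.headMap(windowId, true).clear();
    }

    boolean hasPending(long windowId) {
        return pending.containsKey(windowId);
    }

    int pendingWindowCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        WindowScopedBuffer op = new WindowScopedBuffer();
        op.beginWindow(1); op.process("a"); op.endWindow();
        op.beginWindow(2); op.process("b"); op.endWindow();
        op.checkpointed(1); // async ordering: arrives after endWindow(2)
        System.out.println("window 1 pending: " + op.hasPending(1));
        System.out.println("window 2 pending: " + op.hasPending(2));
    }
}
```

With synchronous ordering the same code behaves like the plain "clear everything" version, since no newer window exists when the callback fires. The cost is the extra per-window bookkeeping that the original message objects to.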
