aurora-dev mailing list archives

From Bill Farner <wfar...@apache.org>
Subject Re: Checkpoint
Date Fri, 20 Nov 2015 15:08:24 GMT
Framework failover timeout is orthogonal.  Slave checkpointing is, IIRC,
completely slave side.  Framework failover timeout just decides when the
master will consider the framework gone (and effectively commence rm -rf of
its tasks) after the framework disconnects.
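
To make the distinction concrete, here is a toy model of the master-side rule described above. It is only a sketch under stated assumptions, not Mesos code: `FrameworkState` and `master_considers_framework_gone` are hypothetical names, and the exact comparison at the timeout boundary is a guess. The point it illustrates is that the decision depends only on how long the framework has been disconnected, and the checkpoint setting is not an input.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FrameworkState:
    """Hypothetical master-side view of one framework (illustration only)."""
    failover_timeout_secs: float
    # Time at which the framework disconnected; None while it is connected.
    disconnected_at: Optional[float] = None


def master_considers_framework_gone(fw: FrameworkState, now: float) -> bool:
    """Return True once the framework has been disconnected for at least
    its failover timeout. Checkpointing plays no part in this decision."""
    if fw.disconnected_at is None:
        return False  # still connected, nothing to time out
    return now - fw.disconnected_at >= fw.failover_timeout_secs


if __name__ == "__main__":
    fw = FrameworkState(failover_timeout_secs=60.0)
    fw.disconnected_at = 100.0
    print(master_considers_framework_gone(fw, now=130.0))  # within the timeout
    print(master_considers_framework_gone(fw, now=200.0))  # past the timeout
```

If the framework re-registers before the timeout elapses, the model would reset `disconnected_at` to `None` and its tasks would be left alone.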

On Thursday, November 19, 2015, <meghdoot_b@yahoo.com.invalid> wrote:

> I started digging into the code and found the same as well. Thx for
> confirming, Bill.
>
> General question: does the framework failover timeout feature only work if
> the framework also sets checkpoint? Or is the checkpoint feature strictly
> slave-side, so that the Mesos master keeps tasks running regardless of the
> checkpoint flag value, as long as the timeout is set and the framework
> comes back in time? I guess I can check the Mesos code.
>
> Thx
>
> Sent from my iPhone
>
> > On Nov 19, 2015, at 11:15 PM, Bill Farner <wfarner@apache.org> wrote:
> >
> > It was Aurora that drove this requirement, and Aurora only operates in
> > this mode.
> >
> >
> https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/mesos/CommandLineDriverSettingsModule.java#L130-L131
> >
> >> On Thu, Nov 19, 2015 at 11:07 PM, <meghdoot_b@yahoo.com.invalid> wrote:
> >>
> >> I am guessing Aurora does not use the Mesos checkpoint feature, where tasks
> >> can keep running even if the slave is stopped (for an upgrade, say).
> >> Can this be supported (optionally), if not today, especially as part of
> >> custom executor support?
> >> Mesos slaves have had checkpointing enabled by default for a while, but
> >> the framework needs to set it as well for the feature to work.
> >>
> >> Thx
> >>
> >> Sent from my iPhone
>
