aurora-dev mailing list archives

From Bill Farner <wfar...@apache.org>
Subject Re: Health Check Disabler Discussion
Date Fri, 10 Oct 2014 22:26:26 GMT
I'm generally in #1, but could land somewhere in between.  I think the idea
of using mtime came up, which I like more than parsing the snooze file and
giving full control.  I'd be fine with expiring this file at mtime +
SNOOZE_TIMEOUT (constant).  This fails closed, is relatively simple to
implement, and doesn't allow the user to snooze for an unreasonable amount
of time.
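
A minimal sketch of the mtime-based expiry described above, for illustration
only. It is not the code under review: the function name is_snoozed, the
SNOOZE_TIMEOUT_SECS value, and the optional now argument are assumptions made
for this example. A missing or unreadable snooze file is treated as "not
snoozed", so health checks keep running when anything goes wrong.

```python
import os
import time

# Hypothetical constant; the thread does not specify an actual timeout value.
SNOOZE_TIMEOUT_SECS = 600


def is_snoozed(snooze_path, timeout_secs=SNOOZE_TIMEOUT_SECS, now=None):
    """Return True while the snooze file exists and has not yet expired.

    The snooze expires at mtime + timeout_secs. Any failure to stat the
    file (missing file, permission error) counts as "not snoozed", so the
    health checker fails closed and keeps checking.
    """
    now = time.time() if now is None else now
    try:
        mtime = os.stat(snooze_path).st_mtime
    except OSError:
        return False
    return now < mtime + timeout_secs
```

With this shape, touching the file again restarts the window, and the user
cannot extend a single snooze beyond the constant because the file contents
are never read.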

-=Bill

On Fri, Oct 10, 2014 at 2:48 PM, Joshua Cohen <jcohen@twopensource.com>
wrote:

> I'm in camp #2. I don't feel that it adds a significant amount of
> complexity to the health check logic, and it provides a substantial
> safeguard against users shooting themselves in the foot by accidentally
> leaving a health check snoozed.
>
> On Fri, Oct 10, 2014 at 2:32 PM, Maxim Khutornenko <maxim@apache.org>
> wrote:
>
> > +1 to #1. Disabling health checks is like signing a waiver where
> > all health check guarantees are off.
> >
> > On Fri, Oct 10, 2014 at 2:23 PM, David Pan <david.pan2@gmail.com> wrote:
> > > Hi Aurora,
> > >
> > > I am currently working on a feature that allows for health checks to be
> > > disabled temporarily for a running instance of a job.  The code review
> > > can be found at https://reviews.apache.org/r/26383/.  The idea is that
> > > the presence of a special "snooze file" in the task's sandbox will
> > > trigger the disabling of the health checks.
> > >
> > > Currently, the code reviewers have split off into two camps:
> > > 1. One set of reviewers believes that simplicity is key.  Disable the
> > > health checks if the snooze file is present, enable them otherwise.
> > >
> > > 2. The other set of reviewers believes that there should be a snooze
> > > duration.  The timer starts when the snooze file is touched.  After the
> > > snooze duration is exhausted, the snooze file should be deleted by the
> > > health checker, and health checks resume.  This is useful if the
> > > process that initially disabled the health checks dies unexpectedly,
> > > and is no longer there to re-enable the health checks.
> > >
> > > I would like to invite anyone interested to voice your opinions and
> > > chime in.
> > >
> > > Thanks,
> > >
> > > David Pan
> >
>
