aurora-dev mailing list archives

From David McLaughlin <dmclaugh...@apache.org>
Subject Heartbeat mechanism auditing
Date Thu, 29 Jan 2015 22:07:26 GMT
Hi all,

We have reached a bit of a stalemate over the implementation of the pulse
RPC in the scheduler.

As a brief overview of this feature: the pulse RPC is designed so that an
external service can monitor the new in-scheduler updates reliably. This
external service could be doing something like keeping an eye on
application level alerts and pausing the update if things slip into a bad
state. The purpose of the pulse is to make sure the update does not
continue if it's not being monitored (i.e. the external service might have
failed) by requiring positive acknowledgement at a given time interval.
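
To make the idea concrete, here is a rough sketch of the pulse check in
Python. It is not the code in the review: the class name PulseMonitor, the
injected clock, and the choice to treat an update that has never been pulsed
as blocked are all assumptions made for illustration.

```python
import time


class PulseMonitor:
    """Hypothetical sketch: tracks whether an update was pulsed recently."""

    def __init__(self, interval_secs, clock=time.monotonic):
        self._interval = interval_secs
        self._clock = clock          # injectable so the logic can be tested
        self._last_pulse = None      # assumption: no pulse yet means blocked

    def pulse(self):
        """Record a positive acknowledgement from the external service."""
        self._last_pulse = self._clock()

    def is_blocked(self):
        """True if no pulse arrived within the configured interval."""
        if self._last_pulse is None:
            return True
        return self._clock() - self._last_pulse > self._interval
```

The updater would consult is_blocked() before acting on the next instance
and simply do nothing while it returns True, resuming once a pulse arrives.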

The implementation is in this review: https://reviews.apache.org/r/30225/

The contention is around whether the "blocked" condition deserves its own
explicit state in the update state machine, and whether this is important
enough to block the review. Currently, only the scheduler knows that an
update is blocked: the update still shows as UPDATING/ROLLING_FORWARD in the
UI, and any history that the update was blocked is lost, because we only
track the current state.
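
A small Python sketch of the two options may help frame the discussion. The
state name BLOCKED, the UpdateRecord class, and the idea of appending each
transition to a history list are hypothetical and are not taken from the
review; they only illustrate what an explicit state would make observable.

```python
from enum import Enum


class UpdateState(Enum):
    ROLLING_FORWARD = "ROLLING_FORWARD"
    BLOCKED = "BLOCKED"  # hypothetical name for the explicit state


class UpdateRecord:
    """Hypothetical record that logs every state transition."""

    def __init__(self):
        self.state = UpdateState.ROLLING_FORWARD
        self.history = [self.state]

    def transition(self, new_state):
        # Only real changes are recorded, so repeated checks add no noise.
        if new_state is not self.state:
            self.state = new_state
            self.history.append(new_state)


def visible_state_implicit(blocked):
    """Option A: blocking is internal, so observers always see the same state."""
    return UpdateState.ROLLING_FORWARD


def visible_state_explicit(record, blocked):
    """Option B: blocking is a state transition, visible and recorded."""
    target = UpdateState.BLOCKED if blocked else UpdateState.ROLLING_FORWARD
    record.transition(target)
    return record.state
```

With option A the UI cannot distinguish a blocked update from a healthy one.
With option B the record's history would show that the update was blocked and
later resumed.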

If you have any opinions on this feature, please feel free to chime in to
the RB!

Thanks,
David
