aurora-dev mailing list archives

From Jordan Ly <>
Subject [Design Doc] Hot Standby in Replicas to Reduce Failover Time
Date Thu, 31 Aug 2017 02:18:51 GMT
Hi everyone,

Following up on the discussion here:

I've created a design document detailing the implementation of a "hot
standby" mechanism in which scheduler followers eagerly read and
apply entries from the replicated log. The goal of this change is
that, in the event of a failover, the newly elected leader will not
have to replay as many entries to rebuild its state and can thus start
serving traffic faster.

I have a working prototype of the above design running on a test
cluster. Please feel free to comment on the doc!

This document references a current proposal in Mesos by Ilya Pronin


Jordan Ly
