aurora-dev mailing list archives

From "Erb, Stephan" <Stephan....@blue-yonder.com>
Subject Re: [Design Doc] Hot Standby in Replicas to Reduce Failover Time
Date Mon, 04 Sep 2017 17:31:51 GMT
Thanks for the detailed design document and the in-depth walkthrough [1]! 
Your proposal seems to be sound. (But be warned, I don’t have much experience in this part
of Aurora or Mesos :-))

[1] https://docs.google.com/presentation/d/1fQMfNLaRex9rJyq3h08HIujtpULoYnpFFV7-P6p6Zt0/edit#slide=id.p4

On 31.08.17, 04:18, "Jordan Ly" <jordan.ly8@gmail.com> wrote:

    Hi everyone,
    
    Following up on the discussion here:
    https://lists.apache.org/thread.html/e31d7dbcb054ed570f969ae2043eadfc090383edfe0751cec59b29d3@%3Cdev.aurora.apache.org%3E
    
    I've created a design document detailing the implementation of a "hot
    standby" mechanism where scheduler followers would eagerly read and
    apply entries from the replicated log. The goal of this change is
    that, in the event of a failover, the newly elected leader will not
    have to replay as many entries to rebuild its state and thus can start
    serving traffic faster.
    
    https://docs.google.com/document/d/1DOtKA4-vrtxat1MaUYMQ6Y1iXhA8ob6Mfztzt-R1Oss/edit?usp=sharing
    
    I have a working prototype of the above design running on a test
    cluster. Please feel free to comment on the doc!
    
    This document references a current proposal in Mesos by Ilya Pronin
    here: https://lists.apache.org/thread.html/1b8fd10e151054a85c9ea3dc808f7fecb9a87fe5f5e87b10caa46e2a@%3Cdev.mesos.apache.org%3E
    
    Cheers,
    
    Jordan Ly
    
