mesos-dev mailing list archives

From "Ben Mahler" <benjamin.mah...@gmail.com>
Subject Review Request 20981: Updated the Registrar to abort permanently upon encountering a Failure.
Date Thu, 01 May 2014 21:52:11 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20981/
-----------------------------------------------------------

Review request for mesos, Benjamin Hindman and Vinod Kone.


Bugs: MESOS-1274
    https://issues.apache.org/jira/browse/MESOS-1274


Repository: mesos-git


Description
-------

A backed-up Master (one with many items in its queue) can have many operations enqueued in the
Registrar.

In this case, the Master does not abort until it processes the initial failure. In the interim,
however, subsequent operations may still be performed against the Registrar. This can lead to
fighting between Masters if a "demoted" Master re-attempts to acquire log leadership. The
scenario can occur when the "demoted" Master has a large queue and the demotion event sits
towards the back of that queue.

It would be preferable to ensure that after losing log leadership, the "demoted" master does
not try to re-acquire log leadership and write to the log.

This is the motivation for this patch.
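
The intended behavior can be illustrated with a minimal sketch. This is not the code from the
diff: `SketchRegistrar`, `WriteResult`, and the `write` callback are hypothetical names standing
in for the Registrar and the replicated log. The sketch only shows the idea of latching the first
failure so that later queued operations fail immediately and never reach the log.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical result of one attempted write to the log (not a Mesos type).
struct WriteResult {
  bool ok;
  std::string error;
};

// Hypothetical stand-in for the Registrar; illustrates "abort permanently".
class SketchRegistrar {
public:
  // `write` is an assumed callback that attempts one write to the log.
  explicit SketchRegistrar(std::function<WriteResult(const std::string&)> write)
    : write_(std::move(write)), aborted_(false) {
    assert(write_);  // A callable must be supplied.
  }

  WriteResult apply(const std::string& operation) {
    // After the first failure, fail fast without touching the log again,
    // so no attempt is made to re-acquire log leadership.
    if (aborted_) {
      return {false, "Registrar aborted: " + error_};
    }

    WriteResult result = write_(operation);
    if (!result.ok) {
      aborted_ = true;        // Latch the first failure permanently.
      error_ = result.error;
    }
    return result;
  }

  bool aborted() const { return aborted_; }

private:
  std::function<WriteResult(const std::string&)> write_;
  bool aborted_;
  std::string error_;
};
```

With this shape, operations that were already enqueued behind the failing one return an error
without invoking the `write` callback again.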


Diffs
-----

  src/master/registrar.cpp e4b0b3968a7a50bc951717160769daf2c1850b65 
  src/tests/registrar_tests.cpp 917a470f326523fbf11e245f4156fc8ce1d974d5 

Diff: https://reviews.apache.org/r/20981/diff/


Testing
-------

Added a test and improved the existing tests.


Thanks,

Ben Mahler

