mesos-dev mailing list archives

From "Jie Yu" <yujie....@gmail.com>
Subject Re: Review Request 14631: Catch-up Replicated Log 1: decoupled coordinator logics and made them asynchronous.
Date Wed, 23 Oct 2013 20:04:56 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14631/
-----------------------------------------------------------

(Updated Oct. 23, 2013, 8:04 p.m.)


Review request for mesos and Benjamin Hindman.


Changes
-------

Addressed BenH's comments. This patch now only decouples the coordinator logic; it does not
libprocessify it (that will come in the next patch). Here are a few changes I made:

1) Passed Network and Replica using the new shared pointer abstraction.
2) Renamed 'id' to 'proposal' in protobuf and everywhere else.
3) Modified tests accordingly.
4) Used two zookeeper::Group instances, one for Network and one for log (for membership renewal).

Tests:

bin/mesos-tests.sh --gtest_filter=CoordinatorTest.*:LogTest.*:ReplicaTest.* --gtest_repeat=100


Summary (updated)
-----------------

Catch-up Replicated Log 1: decoupled coordinator logics and made them asynchronous.


Repository: mesos-git


Description
-------

This is the first patch of a series of patches that implement a catch-up mechanism for replicated
log. See the following ticket for more details:
https://issues.apache.org/jira/browse/MESOS-736

Here is a brief summary of this patch (sorry that we were not able to break it into smaller
patches):

1) Pulled the original Coordinator logic out and divided it into several Paxos phases (see
src/log/consensus.hpp). Instead of using blocking semantics, we implemented all of the logic
asynchronously.

2) To ensure the liveness of a catch-up, we implemented retry logic that bumps the proposal
number. This also requires a slight change to the existing replica protocol.

3) Made the "fill" operation independent of the underlying replica. Instead, introduced a
catchup function (see src/log/catchup.hpp) to make sure the underlying local replica has learned
each write.

4) Modified the log tests to adapt to the new semantics (see (3) above).
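
To illustrate the retry idea in (2), here is a minimal sketch, not the code in this patch. It is
written synchronously for brevity, whereas the patch implements the phases asynchronously. The
names Acceptor, PromiseResponse, RetryResult and promiseWithRetry are hypothetical, and the
"highest seen proposal + 1" bumping rule is an assumption made for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical response to a promise request (not the protobuf in log.proto).
struct PromiseResponse {
  bool okay;          // True if the proposal number was accepted.
  uint64_t proposal;  // Highest proposal number the acceptor has promised.
};

// Hypothetical stand-in for a replica: remembers the highest promised proposal.
class Acceptor {
public:
  explicit Acceptor(uint64_t promised) : promised_(promised) {}

  PromiseResponse promise(uint64_t proposal) {
    if (proposal <= promised_) {
      return PromiseResponse{false, promised_};  // Reject stale proposals.
    }
    promised_ = proposal;
    return PromiseResponse{true, promised_};
  }

private:
  uint64_t promised_;
};

// Outcome of the retry loop.
struct RetryResult {
  bool ok;            // True if some proposal number was eventually promised.
  uint64_t proposal;  // The proposal number used in the last attempt.
};

// Retries the promise phase. On rejection, bumps the proposal number past the
// one reported by the acceptor (assumed rule), up to maxAttempts attempts.
RetryResult promiseWithRetry(Acceptor& acceptor, uint64_t proposal, int maxAttempts) {
  assert(maxAttempts > 0);
  for (int attempt = 0; attempt < maxAttempts; ++attempt) {
    PromiseResponse response = acceptor.promise(proposal);
    if (response.okay) {
      return RetryResult{true, proposal};
    }
    if (attempt + 1 < maxAttempts) {
      proposal = response.proposal + 1;  // Bump and try again.
    }
  }
  return RetryResult{false, proposal};
}
```

In this sketch, a caller that starts with a stale proposal number is rejected once, learns the
acceptor's highest promised number, and succeeds on the next attempt, so a single rejection does
not stall the catch-up.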

This is joint work with Yan Xu.


Diffs (updated)
-----

  src/Makefile.am a2d8242 
  src/log/catchup.hpp PRE-CREATION 
  src/log/catchup.cpp PRE-CREATION 
  src/log/consensus.hpp PRE-CREATION 
  src/log/consensus.cpp PRE-CREATION 
  src/log/coordinator.hpp 3f6fb7c 
  src/log/coordinator.cpp 6e6466f 
  src/log/log.hpp 77edc7a 
  src/log/network.hpp d34cf78 
  src/log/replica.hpp d1f5ead 
  src/log/replica.cpp 59a6ff3 
  src/messages/log.proto 3d5859f 
  src/tests/log_tests.cpp ff5f86c 

Diff: https://reviews.apache.org/r/14631/diff/


Testing
-------

bin/mesos-tests.sh --gtest_filter=*CoordinatorTest*:*LogTest*:*ReplicaTest* --gtest_repeat=100


Thanks,

Jie Yu

