mesos-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
Subject svn commit: r1633476 [3/3] - in /mesos/site: publish/ publish/documentation/ publish/documentation/latest/ publish/documentation/latest/reconciliation/ publish/documentation/latest/running-torque-or-mpi-on-mesos/ publish/documentation/reconciliation/ s...
Date Tue, 21 Oct 2014 22:22:37 GMT
Modified: mesos/site/source/documentation/
--- mesos/site/source/documentation/ (original)
+++ mesos/site/source/documentation/ Tue Oct 21 22:22:36 2014
@@ -4,6 +4,11 @@ layout: documentation
 # Documentation
+## Mesos Fundamentals
+* [Mesos Architecture](/documentation/latest/mesos-architecture/) providing an overview of
Mesos concepts.
+* [Video and Slides of Mesos Presentations](/documentation/latest/mesos-presentations/)
 ## Running Mesos
 * [Configuration](/documentation/latest/configuration/) for command-line arguments.
@@ -20,17 +25,17 @@ layout: documentation
  * [Mesos frameworks](/documentation/latest/mesos-frameworks/) for a list of apps built on
top of Mesos, and instructions on how to run them.
-## Developing Mesos Frameworks and Applications
+## Developing Mesos Frameworks
 * [Framework Development Guide](/documentation/latest/app-framework-development-guide/) describes
how to build applications on top of Mesos.
-* [Mesos Architecture](/documentation/latest/mesos-architecture/) providing an overview of
Mesos concepts.
+* [Reconciliation](/documentation/latest/reconciliation/) for ensuring a framework's state
remains eventually consistent in the face of failures.
 * [Javadoc](/api/latest/java/) documents the Mesos Java API.
 * [Developer Tools](/documentation/latest/tools/) for hacking on Mesos or writing frameworks.
 ## Contributing to Mesos
-* [Committer's Guide](/documentation/latest/committers-guide/): A guiding document for etiquette
in reviews and commits.
-* [Code Internals](/documentation/latest/mesos-code-internals/) overview of the codebase
and internal organization.
+* [Committer's Guide](/documentation/latest/committers-guide/) a guiding document for etiquette
in reviews and commits.
+* [Code Internals](/documentation/latest/mesos-code-internals/) an overview of the codebase
and internal organization.
 * [C++ Style Guide](/documentation/latest/mesos-c++-style-guide/)
 * [Developers Guide](/documentation/latest/mesos-developers-guide/) includes resources for
developers contributing to Mesos and the process of submitting patches for review.
 * [Development Road Map](/documentation/latest/mesos-roadmap/)
@@ -39,5 +44,4 @@ layout: documentation
 ## More Info about Mesos
 * [Powered by Mesos](/documentation/latest/powered-by-mesos/) lists organizations and software
that are powered by Apache Mesos.
-* [Video and Slides of Mesos Presentations](/documentation/latest/mesos-presentations/)
 * Academic Papers and Project History
\ No newline at end of file

Added: mesos/site/source/documentation/latest/
--- mesos/site/source/documentation/latest/ (added)
+++ mesos/site/source/documentation/latest/ Tue Oct 21 22:22:36 2014
@@ -0,0 +1,108 @@
+# Reconciliation
+There's no getting around it, **frameworks on Mesos are distributed systems**.
+**Distributed systems must deal with failures**, and partitions (the two are
+indistinguishable from a system's perspective).
+Concretely, what does this mean for frameworks? Mesos uses an actor-like
+**message passing programming model, in which messages are delivered
+at-most-once**. (Exceptions to this include task status updates, most of
+which are delivered at-least-once through the use of acknowledgements).
+**The messages passed between the master and the framework are therefore
+susceptible to be dropped, in the presence of failures**.
+When these non-reliable messages are dropped, inconsistent state can arise
+between the framework and Mesos.
+As a simple example, consider a launch task request sent by a framework.
+There are many ways that failures can lead to the loss of the task, for
+* Framework fails after persisting its intent to launch the task, but
+before the launch task message was sent.
+* Master fails before receiving the message.
+* Master fails after receiving the message, but before sending it to the
+In these cases, the framework believes the task to be staging, but the
+task is unknown to Mesos. To cope with such situations, **task state must be
+reconciled between the framework and Mesos whenever a failure is detected**.
+## Detecting Failures
+It is the responsibility of Mesos (scheduler driver / Master) to ensure
+that the framework is notified when a disconnection, and subsequent
+(re-)registration occurs. At this point, the scheduler should perform
+task state reconciliation.
+## Task Reconciliation
+**Tasks must be reconciled explicitly by the framework after a failure.**
+This is because the scheduler driver does not persist any task information.
+In the future, the scheduler driver (or a pure-language mesos library) could
+perform task reconciliation seamlessly under the covers on behalf of the
+So, for now, let's look at how one needs to implement task state
+reconciliation in a framework scheduler.
+### API
+Frameworks send a list of `TaskStatus` messages to the master:
+  // Allows the framework to query the status for non-terminal tasks.
+  // This causes the master to send back the latest task status for
+  // each task in 'statuses', if possible. Tasks that are no longer
+  // known will result in a TASK_LOST update. If statuses is empty,
+  // then the master will send the latest status for each task
+  // currently known.
+  message Reconcile {
+    repeated TaskStatus statuses = 1; // Should be non-terminal only.
+  }
+Currently, the master will only examine two fields in `TaskStatus`:
+* `TaskID`: This is required.
+* `SlaveID`: Optional, leads to faster reconciliation in the presence of
+slaves that are transitioning between states.
+### Algorithm
+The technique for performing reconciliation should reconcile all non-terminal
+tasks, until an update is received for each task, using exponential backoff:
+1. let `start = now()`
+2. let `remaining = { T ϵ tasks | T is non-terminal }`
+3. Perform reconciliation: `reconcile(remaining)`
+4. Wait for status updates to arrive (use truncated exponential backoff). For each update,
note the time of arrival.
+5. let `remaining = { T ϵ remaining | T.last_update_arrival() < start }`
+6. If `remaining` is non-empty, go to 3.
+This reconciliation algorithm **must** be run after each (re-)registration.
+* When waiting for updates to arrive, **use a truncated exponential backoff**.
+This will avoid a snowball effect in the case of the driver or master being
+backed up.
+* Implicit reconciliation (passing an empty list) can also be used
+periodically, As a defense against data loss in the framework.
+* It is beneficial to ensure that only 1 reconciliation is in progress at a
+time, to avoid a snowball effect in the face of many re-registrations.
+If another reconciliation should be started while one is in-progress,
+then the previous reconciliation algorithm should stop running.
+## Offer Reconciliation
+Offers are reconciled automatically after a failure:
+* Offers do not persist beyond the lifetime of a Master.
+* If a disconnection occurs, offers are no longer valid.
+* Offers are rescinded and regenerated each time the framework (re-)registers.
\ No newline at end of file

View raw message