Subject [4/8] accumulo git commit: ACCUMULO-3436 Add implementation details section to manual
Date Fri, 19 Dec 2014 18:44:45 GMT
ACCUMULO-3436 Add implementation details section to manual

Only subsection of this is FATE: overview on why it exists,
very general details on how it works, and how users/admins can
interact with it via the shell.


Branch: refs/heads/master
Commit: c7f450258bdfcfa248533d1d45f8963c4cd0bdca
Parents: 67998ed
Author: Josh Elser <>
Authored: Fri Dec 19 13:23:39 2014 -0500
Committer: Josh Elser <>
Committed: Fri Dec 19 13:23:39 2014 -0500

+\chapter{Implementation Details}
+\section{Fault-Tolerant Executor (FATE)}
+Accumulo must implement a number of distributed, multi-step operations to support
+the client API. Creating a new table is a simple example of an atomic client call
+which requires multiple steps in the implementation: get a unique table ID, configure
+default table permissions, populate information in ZooKeeper to record the table's
+existence, create directories in HDFS for the table's data, etc. Implementing these
+steps in a way that is tolerant to node failure and other concurrent operations is
+very difficult to achieve. Accumulo includes a Fault-Tolerant Executor (FATE) which
+is widely used server-side to implement the client API safely and correctly.
+FATE is the implementation detail which ensures that a table being created when the
+Master dies will still be successfully created once another Master process is started.
+This alleviates the need for any external tools to correct some bad state -- Accumulo can
+undo the failure and self-heal without any external intervention.
+FATE consists of three primary components: a repeatable, persisted operation (REPO), a storage
+layer for REPOs, and an execution system to run REPOs. Accumulo uses ZooKeeper as the storage
+layer for FATE and the Accumulo Master acts as the execution system to run REPOs.
+The important characteristic of REPOs is that they are implemented in a way that is idempotent:
+every operation must be able to undo or replay a partial execution of itself. Requiring the
+implementation of the operation to support this functionality greatly simplifies the execution
+of these operations. This property is also what guarantees safety in light of failure conditions.
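The relationship between the three components can be illustrated with a minimal Java sketch. Every name in it (`FateSketch`, `Repo`, `Store`, `run`, and the three step classes) is hypothetical and chosen for illustration; it is not Accumulo's actual API. An in-memory map stands in for ZooKeeper, a plain loop stands in for the Master, and table creation is reduced to three idempotent steps.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: every name here is hypothetical, not Accumulo's real API.
class FateSketch {

    /** One repeatable, persisted step of a multi-step operation. */
    interface Repo {
        /**
         * Performs this step against the shared state and returns the next
         * step, or null when the operation is finished. Must be idempotent.
         */
        Repo call(Map<String, String> state);
    }

    /** Stand-in for the storage layer (ZooKeeper in Accumulo). */
    static class Store {
        private final Map<Long, Repo> nextStep = new HashMap<>();

        void push(long txid, Repo step) { nextStep.put(txid, step); }
        Repo top(long txid) { return nextStep.get(txid); }
        void finish(long txid) { nextStep.remove(txid); }
    }

    /** Stand-in for the execution system (the Master in Accumulo). */
    static void run(Store store, long txid, Map<String, String> state) {
        Repo step;
        while ((step = store.top(txid)) != null) {
            Repo next = step.call(state);
            // Progress is recorded only after the step completes, so a crash
            // before this point causes the same step to be replayed.
            if (next == null) {
                store.finish(txid);
            } else {
                store.push(txid, next);
            }
        }
    }

    static class AllocateTableId implements Repo {
        public Repo call(Map<String, String> state) {
            state.putIfAbsent("tableId", "t1"); // a replay keeps the first ID
            return new RecordInZooKeeper();
        }
    }

    static class RecordInZooKeeper implements Repo {
        public Repo call(Map<String, String> state) {
            state.put("zk/" + state.get("tableId"), "exists");
            return new CreateDirectories();
        }
    }

    static class CreateDirectories implements Repo {
        public Repo call(Map<String, String> state) {
            state.put("hdfs/" + state.get("tableId"), "created");
            return null; // last step
        }
    }
}
```

To simulate a Master failure, call `store.top(txid).call(state)` once without recording the result and then invoke `run`: the first step is executed a second time, and because each step is idempotent the final state is the same as in a failure-free run.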
+Sometimes, it is useful to inspect the current FATE operations, both pending and executing.
+For example, a command that is not completing could be blocked on the execution of another
+operation. Accumulo provides a shell command to interact with FATE.
+The \texttt{fate} shell command accepts a number of arguments for different functionality:
+\texttt{list}/\texttt{print}, \texttt{fail}, \texttt{delete}.
+Without any additional arguments, the \texttt{list}/\texttt{print} command will print all operations that still exist
+in the FATE store (ZooKeeper). This will include active, pending, and completed operations (completed
+operations are lazily removed from the store). Each operation includes a unique ``transaction ID'', the
+state of the operation (e.g. \texttt{NEW}, \texttt{IN\_PROGRESS}, \texttt{FAILED}), any locks the
+transaction actively holds and any locks it is waiting to acquire.
+This option can also accept transaction IDs which will restrict the list of transactions shown.
+The \texttt{fail} command can be used to manually fail a FATE transaction and requires a transaction ID
+as an argument. Failing an operation is not a normal procedure and should only be performed
+by an administrator who understands the implications of why they are failing the operation.
+The \texttt{delete} command requires a transaction ID and will delete any locks that the transaction
+holds. Like the \texttt{fail} command, this command should only be used in extreme circumstances
+by an administrator who understands the implications of the command they are about to
+invoke. It is not normal to invoke this command.
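The semantics of the three administrative actions described above can be modeled with a small in-memory Java sketch. All names (`FateAdminSketch`, `Tx`, `State`, the lock strings) and the line format returned by `print` are assumptions made for illustration; they do not reproduce the shell's real output or Accumulo's classes. Removing the transaction record in `delete` is likewise an assumption, since the text above only states that the transaction's locks are deleted.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative model only: names and output format are hypothetical.
class FateAdminSketch {

    enum State { NEW, IN_PROGRESS, FAILED }

    /** What the listing reports for each transaction. */
    static class Tx {
        final long txid;
        State state;
        final List<String> heldLocks = new ArrayList<>();
        final List<String> waitingLocks = new ArrayList<>();

        Tx(long txid, State state) {
            this.txid = txid;
            this.state = state;
        }
    }

    // Stand-in for the FATE store, ordered by transaction ID.
    private final Map<Long, Tx> store = new TreeMap<>();

    void add(Tx tx) { store.put(tx.txid, tx); }

    /** list/print: all transactions, or only those whose IDs are given. */
    List<String> print(long... txids) {
        Set<Long> filter = new HashSet<>();
        for (long id : txids) {
            filter.add(id);
        }
        List<String> lines = new ArrayList<>();
        for (Tx tx : store.values()) {
            if (filter.isEmpty() || filter.contains(tx.txid)) {
                lines.add("txid: " + Long.toHexString(tx.txid) + " state: " + tx.state
                    + " held: " + tx.heldLocks + " waiting: " + tx.waitingLocks);
            }
        }
        return lines;
    }

    /** fail: marks the transaction as FAILED; returns false if the ID is unknown. */
    boolean fail(long txid) {
        Tx tx = store.get(txid);
        if (tx == null) {
            return false;
        }
        tx.state = State.FAILED;
        return true;
    }

    /** delete: releases the held locks and (by assumption) drops the record. */
    boolean delete(long txid) {
        Tx tx = store.remove(txid);
        if (tx == null) {
            return false;
        }
        tx.heldLocks.clear();
        return true;
    }
}
```

In this model, a transaction that is stuck waiting on a lock shows up in `print` with a non-empty `waiting` list, which is the kind of situation the listing is meant to help diagnose.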

