fluo-commits mailing list archives

From ktur...@apache.org
Subject [incubator-fluo-website] branch gh-pages updated: Recipes 110 docs (#68)
Date Thu, 29 Jun 2017 17:30:05 GMT
This is an automated email from the ASF dual-hosted git repository.

kturner pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/incubator-fluo-website.git


The following commit(s) were added to refs/heads/gh-pages by this push:
     new 7b7a197  Recipes 110 docs (#68)
7b7a197 is described below

commit 7b7a197a95bc4c64777d98eb13c17a2edde7b614
Author: Keith Turner <keith@deenlo.com>
AuthorDate: Thu Jun 29 13:30:03 2017 -0400

    Recipes 110 docs (#68)
---
 _config.yml                                        |   2 +-
 .../1.1.0-incubating/accumulo-export-queue.md      | 104 +++++++
 .../fluo-recipes/1.1.0-incubating/combine-queue.md | 211 ++++++++++++++
 docs/fluo-recipes/1.1.0-incubating/export-queue.md | 303 +++++++++++++++++++++
 docs/fluo-recipes/1.1.0-incubating/index.md        | 103 +++++++
 docs/fluo-recipes/1.1.0-incubating/recording-tx.md |  72 +++++
 docs/fluo-recipes/1.1.0-incubating/row-hasher.md   | 122 +++++++++
 .../fluo-recipes/1.1.0-incubating/serialization.md |  76 ++++++
 docs/fluo-recipes/1.1.0-incubating/spark.md        |  19 ++
 .../1.1.0-incubating/table-optimization.md         |  65 +++++
 docs/fluo-recipes/1.1.0-incubating/testing.md      |  14 +
 docs/fluo-recipes/1.1.0-incubating/transient.md    |  83 ++++++
 docs/index.md                                      |   2 +
 13 files changed, 1175 insertions(+), 1 deletion(-)

diff --git a/_config.yml b/_config.yml
index ff2dfc3..135393a 100644
--- a/_config.yml
+++ b/_config.yml
@@ -43,7 +43,7 @@ color: default
 
 # Fluo specific settings
 latest_fluo_release: "1.1.0-incubating"
-latest_recipes_release: "1.0.0-incubating"
+latest_recipes_release: "1.1.0-incubating"
 
 # Sets links to external API
 api_base: "https://javadoc.io/doc/org.apache.fluo"
diff --git a/docs/fluo-recipes/1.1.0-incubating/accumulo-export-queue.md b/docs/fluo-recipes/1.1.0-incubating/accumulo-export-queue.md
new file mode 100644
index 0000000..380daa5
--- /dev/null
+++ b/docs/fluo-recipes/1.1.0-incubating/accumulo-export-queue.md
@@ -0,0 +1,104 @@
+---
+layout: recipes-doc
+title: Accumulo Export Queue Specialization
+version: 1.1.0-incubating
+---
+## Background
+
+The [Export Queue Recipe][1] provides a generic foundation for building an export mechanism to any
+external data store. The [AccumuloExporter] provides an [Exporter] for writing to
+Accumulo. [AccumuloExporter] is located in the `fluo-recipes-accumulo` module and provides the
+following functionality:
+
+ * Safely batches writes to Accumulo made by multiple transactions exporting data.
+ * Stores Accumulo connection information in Fluo configuration, making it accessible by Export
+   Observers running on other nodes.
+ * Provides utility code that makes it easier to code common Accumulo export patterns.
+
+## Example Use
+
+Exporting to Accumulo is easy. Follow the steps below:
+
+1. First, implement [AccumuloTranslator].  Your implementation translates exported
+   objects to Accumulo Mutations. For example, the `SimpleTranslator` class below translates String
+   key/values into mutations for Accumulo.  This step is optional; a lambda could
+   be used in step 3 instead of creating a class.
+
+   ```java
+   public class SimpleTranslator implements AccumuloTranslator<String,String> {
+
+     @Override
+     public void translate(SequencedExport<String, String> export, Consumer<Mutation> consumer) {
+       Mutation m = new Mutation(export.getKey());
+       m.put("cf", "cq", export.getSequence(), export.getValue());
+       consumer.accept(m);
+     }
+   }
+
+   ```
+
+2. Configure an `ExportQueue` and the export table prior to initializing Fluo.
+
+   ```java
+
+   FluoConfiguration fluoConfig = ...;
+
+   String instance =       // Name of Accumulo instance exporting to
+   String zookeepers =     // ZooKeepers used by Accumulo instance exporting to
+   String user =           // Accumulo username, user that can write to exportTable
+   String password =       // Accumulo user password
+   String exportTable =    // Name of table to export to
+
+   // Set properties for table to export to in Fluo app configuration.
+   AccumuloExporter.configure(EXPORT_QID).instance(instance, zookeepers)
+       .credentials(user, password).table(exportTable).save(fluoConfig);
+
+   // Set properties for export queue in Fluo app configuration
+   ExportQueue.configure(EXPORT_QID).keyType(String.class).valueType(String.class)
+       .buckets(119).save(fluoConfig);
+
+   // Initialize Fluo using fluoConfig
+   ```
+
+3.  In the application's `ObserverProvider`, register an observer that will process exports and write
+    them to Accumulo using [AccumuloExporter].  Also, register observers that add to the export
+    queue.
+
+    ```java
+    public class MyObserverProvider implements ObserverProvider {
+
+      @Override
+      public void provide(Registry obsRegistry, Context ctx) {
+        SimpleConfiguration appCfg = ctx.getAppConfiguration();
+
+        ExportQueue<String, String> expQ = ExportQueue.getInstance(EXPORT_QID, appCfg);
+
+        // Register an observer that will process entries on the export queue and write them to the
+        // Accumulo table configured earlier. SimpleTranslator from step 1 is passed here; a lambda
+        // could have been used instead.
+        expQ.registerObserver(obsRegistry,
+            new AccumuloExporter<>(EXPORT_QID, appCfg, new SimpleTranslator()));
+
+        // An example observer created using a lambda that adds to the export queue.
+        obsRegistry.forColumn(OBS_COL, WEAK).useObserver((tx,row,col) -> {
+          // Read some data and do some work
+
+          // Add results to export queue
+          String key =    // key that identifies export
+          String value =  // object to export
+          expQ.add(tx, key, value);
+        });
+      }
+    }
+    ```
+
+## Other use cases
+
+The `getTranslator()` method in [AccumuloReplicator] creates a specialized [AccumuloTranslator] for replicating a Fluo table to Accumulo.
+
+[1]: /docs/fluo-recipes/1.1.0-incubating/export-queue/
+[Exporter]: {{ site.api_static }}/fluo-recipes-core/1.1.0-incubating/org/apache/fluo/recipes/core/export/function/Exporter.html
+[AccumuloExporter]: {{ site.api_static }}/fluo-recipes-accumulo/1.1.0-incubating/org/apache/fluo/recipes/accumulo/export/function/AccumuloExporter.html
+[AccumuloTranslator]: {{ site.api_static }}/fluo-recipes-accumulo/1.1.0-incubating/org/apache/fluo/recipes/accumulo/export/function/AccumuloTranslator.html
+[AccumuloReplicator]: {{ site.api_static }}/fluo-recipes-accumulo/1.1.0-incubating/org/apache/fluo/recipes/accumulo/export/AccumuloReplicator.html
+
diff --git a/docs/fluo-recipes/1.1.0-incubating/combine-queue.md b/docs/fluo-recipes/1.1.0-incubating/combine-queue.md
new file mode 100644
index 0000000..08d4e58
--- /dev/null
+++ b/docs/fluo-recipes/1.1.0-incubating/combine-queue.md
@@ -0,0 +1,211 @@
+---
+layout: recipes-doc
+title: Combine Queue Recipe
+version: 1.1.0-incubating
+---
+## Background
+
+When many transactions try to modify the same keys, collisions will occur.  Too many collisions
+cause transactions to fail and throughput to nose dive.  For example, consider [phrasecount],
+which has many transactions processing documents.  Each transaction counts the phrases in a document
+and then updates global phrase counts.  Since each transaction attempts to update many phrases,
+the probability of collisions is high.
+
+## Solution
+
+The [combine queue recipe][CombineQueue] provides a reusable solution for updating many keys while
+avoiding collisions.  The recipe also organizes updates into batches in order to improve throughput.
+
+This recipe queues updates to keys for other transactions to process. In the phrase count example,
+transactions processing documents queue updates but do not actually update the counts.  Below is an
+example of computing phrase counts using this recipe.
+
+ * TX1 queues a `+1` update for the phrase `we want lambdas now`
+ * TX2 queues a `+1` update for the phrase `we want lambdas now`
+ * TX3 reads the updates and current value for the phrase `we want lambdas now`.  There is no current value and the updates sum to 2, so a new value of 2 is written.
+ * TX4 queues a `+2` update for the phrase `we want lambdas now`
+ * TX5 queues a `-1` update for the phrase `we want lambdas now`
+ * TX6 reads the updates and current value for the phrase `we want lambdas now`.  The current value is 2 and the updates sum to 1, so a new value of 3 is written.
+
+Transactions processing updates have the ability to make additional updates.
+For example, in addition to updating the current value for a phrase, the new
+value could also be placed on an export queue to update an external database.
+
+### Buckets
+
+A simple implementation of this recipe would have an update queue for each key.  However, the
+implementation is slightly more complex.  Each update queue is placed in a bucket, and transactions
+process all of the updates in a bucket.  This allows more efficient processing of updates for the
+following reasons:
+
+ * When updates are queued, notifications are made per bucket (instead of per key).
+ * The transaction doing the update can scan the entire bucket reading updates, which avoids a seek for each key being updated.
+ * The transaction can also request a batch lookup to get the current value of all the keys being updated.
+ * Any additional actions taken on update (like adding something to an export queue) can also be batched.
+ * Data is organized to make reading existing values for keys in a bucket more efficient.
+
+Which bucket a key goes to is decided using hash and modulus so that multiple updates for a key go
+to the same bucket.
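As an illustrative aside, hash-and-modulus bucket assignment can be sketched in plain Java. This `BucketAssigner` is hypothetical, not the actual CombineQueue code:

```java
// Illustrative sketch only; not the actual CombineQueue implementation.
// Hashing a key and taking a modulus sends every update for that key to one bucket.
public class BucketAssigner {
  static int bucketFor(String key, int numBuckets) {
    // floorMod keeps the result non-negative even when hashCode() is negative
    return Math.floorMod(key.hashCode(), numBuckets);
  }

  public static void main(String[] args) {
    // Repeated updates for the same key always map to the same bucket
    System.out.println(bucketFor("we want lambdas now", 119)
        == bucketFor("we want lambdas now", 119)); // prints "true"
  }
}
```

Because a String's hash code is stable, all updates for a given phrase land in the same bucket, which is what lets one transaction process them together.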
+
+The initial number of tablets to create when applying table optimizations can be controlled by
+setting the buckets-per-tablet option when configuring a combine queue.  For example, if you
+have 20 tablet servers and 1000 buckets and want 2 tablets per tserver initially, then set buckets
+per tablet to 1000/(2*20)=25.
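The arithmetic above can be written as a small helper (hypothetical, for illustration only):

```java
// Hypothetical helper showing the buckets-per-tablet arithmetic from the text.
public class BucketMath {
  /** Buckets per tablet = total buckets / (tservers * desired tablets per tserver). */
  static int bucketsPerTablet(int totalBuckets, int tservers, int tabletsPerTserver) {
    return totalBuckets / (tservers * tabletsPerTserver);
  }

  public static void main(String[] args) {
    // 1000 buckets, 20 tservers, 2 tablets per tserver
    System.out.println(bucketsPerTablet(1000, 20, 2)); // prints 25
  }
}
```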
+
+## Example Use
+
+The following code snippets show how to use this recipe for wordcount.  The first step is to
+configure it before initializing Fluo.  An ID is needed when configuring.  This ID is used in two
+ways.  First, the ID is used as a row prefix in the table.  Therefore nothing else should use that
+row range in the table.  Second, the ID is used in generating configuration keys.
+
+The following snippet shows how to configure a combine queue.
+
+```java
+    FluoConfiguration fluoConfig = ...;
+
+    // Set application properties for the combine queue.  These properties are read later by
+    // the observers running on each worker.
+    CombineQueue.configure(WcObserverProvider.ID)
+        .keyType(String.class).valueType(Long.class).buckets(119).save(fluoConfig);
+
+    fluoConfig.setObserverProvider(WcObserverProvider.class);
+
+    // initialize Fluo using fluoConfig
+```
+
+Assume the following observer is triggered when a document is updated.  It examines new
+and old document content and determines changes in word counts.  These changes are pushed to a
+combine queue.
+
+```java
+public class DocumentObserver implements StringObserver {
+  // word count combine queue
+  private CombineQueue<String, Long> wccq;
+
+  public static final Column NEW_COL = new Column("content", "new");
+  public static final Column CUR_COL = new Column("content", "current");
+
+  public DocumentObserver(CombineQueue<String, Long> wccq) {
+    this.wccq = wccq;
+  }
+
+  @Override
+  public void process(TransactionBase tx, String row, Column col) {
+
+    Preconditions.checkArgument(col.equals(NEW_COL));
+
+    String newContent = tx.gets(row, NEW_COL);
+    String currentContent = tx.gets(row, CUR_COL, "");
+
+    Map<String, Long> newWordCounts = getWordCounts(newContent);
+    Map<String, Long> currentWordCounts = getWordCounts(currentContent);
+
+    // determine changes in word counts between old and new document content
+    Map<String, Long> changes = calculateChanges(newWordCounts, currentWordCounts);
+
+    // queue updates to word counts for processing by other transactions
+    wccq.add(tx, changes);
+
+    // update the current content and delete the new content
+    tx.set(row, CUR_COL, newContent);
+    tx.delete(row, NEW_COL);
+  }
+
+  private static Map<String, Long> getWordCounts(String doc) {
+    // TODO extract words from doc
+  }
+
+  private static Map<String, Long> calculateChanges(Map<String, Long> newCounts,
+      Map<String, Long> currCounts) {
+    Map<String, Long> changes = new HashMap<>();
+
+    // guava Maps class
+    MapDifference<String, Long> diffs = Maps.difference(currCounts, newCounts);
+
+    // compute the diffs for words that changed
+    changes.putAll(Maps.transformValues(diffs.entriesDiffering(),
+        vDiff -> vDiff.rightValue() - vDiff.leftValue()));
+
+    // add all new words
+    changes.putAll(diffs.entriesOnlyOnRight());
+
+    // subtract all words no longer present
+    changes.putAll(Maps.transformValues(diffs.entriesOnlyOnLeft(), l -> l * -1));
+
+    return changes;
+  }
+}
+```
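The `getWordCounts` helper above is left as a TODO. A minimal, hypothetical implementation (naive whitespace splitting, with no punctuation handling or stemming) might look like:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the getWordCounts TODO above; real tokenization
// would need to handle punctuation, stemming, etc.
public class WordCounts {
  static Map<String, Long> getWordCounts(String doc) {
    Map<String, Long> counts = new HashMap<>();
    if (doc == null || doc.isEmpty()) {
      return counts;
    }
    // Split on runs of whitespace and tally each word
    for (String word : doc.toLowerCase().split("\\s+")) {
      if (!word.isEmpty()) {
        counts.merge(word, 1L, Long::sum);
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    System.out.println(getWordCounts("we want lambdas now we want"));
  }
}
```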
+
+Each combine queue has two extension points: a [combiner][Combiner] and a [change
+observer][ChangeObserver].  The combine queue configures a Fluo observer to process queued
+updates.  When processing updates the two extension points are called.  The code below shows
+how to use these extension points.
+
+A change observer can do additional processing when a batch of key/values is updated.  Below,
+updates are queued for export to an external database.  The export is given the new and old
+values, allowing it to delete the old value if needed.
+
+```java
+public class WcObserverProvider implements ObserverProvider {
+
+  public static final String ID = "wc";
+
+  @Override
+  public void provide(Registry obsRegistry, Context ctx) {
+
+    ExportQueue<String, MyDatabaseExport> exportQ = createExportQueue(ctx);
+
+    // Create a combine queue for computing word counts.
+    CombineQueue<String, Long> wcMap = CombineQueue.getInstance(ID, ctx.getAppConfiguration());
+
+    // Register observer that updates the Combine Queue
+    obsRegistry.forColumn(DocumentObserver.NEW_COL, STRONG).useObserver(new DocumentObserver(wcMap));
+
+    // Used to join new and existing values for a key. The lambda sums all values and returns
+    // Optional.empty() when the sum is zero. Returning Optional.empty() causes the key/value to be
+    // deleted. Could have used the built in SummingCombiner.
+    Combiner<String, Long> combiner = input -> input.stream().reduce(Long::sum).filter(l -> l != 0);
+
+    // Called when the value of a key changes. The lambda exports these changes to an external
+    // database. Make sure to read ChangeObserver's javadoc.
+    ChangeObserver<String, Long> changeObs = (tx, changes) -> {
+      for (Change<String, Long> update : changes) {
+        String word = update.getKey();
+        Optional<Long> oldVal = update.getOldValue();
+        Optional<Long> newVal = update.getNewValue();
+
+        // Queue an export to let an external database know the word count has changed.
+        exportQ.add(tx, word, new MyDatabaseExport(oldVal, newVal));
+      }
+    };
+
+    // Register an observer that handles updates to the CombineQueue. This observer will use the
+    // combiner and changeObs defined above.
+    wcMap.registerObserver(obsRegistry, combiner, changeObs);
+  }
+}
+```
+
+## Guarantees
+
+This recipe makes two important guarantees about updates for a key when it
+calls `process()` on a [ChangeObserver].
+
+ * The new value reported for an update will be derived from combining all
+   updates that were committed before the transaction that is processing updates
+   started.  The implementation may have to make multiple passes over queued
+   updates to achieve this.  In the situation where TX1 queues a `+1` and later
+   TX2 queues a `-1` for the same key, there is no need to worry about only seeing
+   the `-1` processed.  A transaction that started processing updates after TX2
+   committed would process both.
+ * The old value will always be what was reported as the new value in the
+   previous transaction that called `ChangeObserver.process()`.
+
+[phrasecount]: https://github.com/fluo-io/phrasecount
+[CombineQueue]: {{ site.api_static }}/fluo-recipes-core/1.1.0-incubating/org/apache/fluo/recipes/core/combine/CombineQueue.html
+[ChangeObserver]: {{ site.api_static }}/fluo-recipes-core/1.1.0-incubating/org/apache/fluo/recipes/core/combine/ChangeObserver.html
+[Combiner]: {{ site.api_static }}/fluo-recipes-core/1.1.0-incubating/org/apache/fluo/recipes/core/combine/Combiner.html
diff --git a/docs/fluo-recipes/1.1.0-incubating/export-queue.md b/docs/fluo-recipes/1.1.0-incubating/export-queue.md
new file mode 100644
index 0000000..3ee379f
--- /dev/null
+++ b/docs/fluo-recipes/1.1.0-incubating/export-queue.md
@@ -0,0 +1,303 @@
+---
+layout: recipes-doc
+title: Export Queue Recipe
+version: 1.1.0-incubating
+---
+## Background
+
+Fluo is not suited for servicing low latency queries for two reasons. First, the implementation of
+transactions is designed for throughput.  To get throughput, transactions recover lazily from
+failures and may wait on other transactions.  Both of these design decisions can
+lead to delays of individual transactions, but do not negatively impact throughput.   The second
+reason is that Fluo observers executing transactions will likely cause a large number of random
+accesses.  This could lead to high response time variability for an individual random access.  This
+variability would not impede throughput but would impede the goal of low latency.
+
+One way to make data transformed by Fluo available for low latency queries is
+to export that data to another system.  For example Fluo could run on
+cluster A, continually transforming a large data set, and exporting data to
+Accumulo tables on cluster B.  The tables on cluster B would service user
+queries.  Fluo Recipes has built in support for [exporting to Accumulo][3],
+however this recipe can be used to export to systems other than Accumulo, like
+Redis, Elasticsearch, MySQL, etc.
+
+Exporting data from Fluo is easy to get wrong, which is why this recipe exists.
+To understand what can go wrong, consider the following example observer
+transaction.
+
+```java
+public class MyObserver implements StringObserver {
+
+    static final Column UPDATE_COL = new Column("meta", "numUpdates");
+    static final Column COUNTER_COL = new Column("meta", "counter1");
+
+    // Represents a query system external to Fluo that is updated by Fluo
+    QuerySystem querySystem;
+
+    @Override
+    public void process(TransactionBase tx, String row, Column col) {
+
+        int oldCount = Integer.parseInt(tx.gets(row, COUNTER_COL, "0"));
+        int numUpdates = Integer.parseInt(tx.gets(row, UPDATE_COL, "0"));
+        int newCount = oldCount + numUpdates;
+
+        tx.set(row, COUNTER_COL, "" + newCount);
+        tx.delete(row, UPDATE_COL);
+
+        //Build an inverted index in the query system, based on count from the
+        //meta:counter1 column in fluo.  Do this by creating rows for the
+        //external query system based on the count.
+        String oldCountRow = String.format("%06d", oldCount);
+        String newCountRow = String.format("%06d", newCount);
+
+        //add a new entry to the inverted index
+        querySystem.insertRow(newCountRow, row);
+        //remove the old entry from the inverted index
+        querySystem.deleteRow(oldCountRow, row);
+    }
+}
+```
+
+The above example would keep the external index up to date beautifully as long
+as the following conditions are met.
+
+  * Threads executing transactions always complete successfully.
+  * Only a single thread ever responds to a notification.
+
+However these conditions are not guaranteed by Fluo.  Multiple threads may
+attempt to process a notification concurrently (only one may succeed).  Also at
+any point in time a transaction may fail (for example the computer executing it
+may reboot).   Both of these problems will occur and will lead to corruption of
+the external index in the example.  The inverted index and Fluo will become
+inconsistent.  The inverted index will end up with multiple entries (that are
+never cleaned up) for a single entity even though the intent is to have only one.
+
+The root of the problem in the example above is that it exports uncommitted
+data.  There is no guarantee that setting the column `<row>:meta:counter1` to
+`newCount` will succeed until the transaction is successfully committed.
+However, `newCountRow` is derived from `newCount` and written to the external query
+system before the transaction is committed (note: for observers, the
+transaction is committed by the framework after `process(...)` is called).  So
+if the transaction fails, the next time it runs it could compute a completely
+different value for `newCountRow` (and it would not delete what was written by the
+failed transaction).
+
+## Solution
+
+The simple solution to the problem of exporting uncommitted data is to only
+export committed data.  There are multiple ways to accomplish this.  This
+recipe offers a reusable implementation of one method.  This recipe has the
+following elements:
+
+ * An export queue that transactions can add key/values to.  Only if the transaction commits successfully will the key/value end up in the queue.  A Fluo application can have multiple export queues, each one must have a unique id.
+ * When a key/value is added to the export queue, it is given a sequence number.  This sequence number is based on the transaction's start timestamp.
+ * Each export queue is configured with an observer that processes key/values that were successfully committed to the queue.
+ * When key/values in an export queue are processed, they are deleted so the export queue does not keep any long term data.
+ * Key/values in an export queue are placed in buckets.  This is done so that all of the updates in a bucket can be processed in a single transaction.  This allows an efficient implementation of this recipe in Fluo.  It can also lead to efficiency in a system being exported to, if the system can benefit from batching updates.  The number of buckets in an export queue is configurable.
+
+There are three requirements for using this recipe:
+
+ * Must configure export queues before initializing a Fluo application.
+ * Transactions adding to an export queue must get an instance of the queue using its unique QID.
+ * Must create a class or lambda that implements [Exporter][1] in order to process exports.
+
+## Example Use
+
+This example shows how to incrementally build an inverted index in an external query system using an
+export queue.  The class below is a simple POJO used as the value for the export queue.
+
+```java
+class CountUpdate {
+  public int oldCount;
+  public int newCount;
+
+  public CountUpdate(int oc, int nc) {
+    this.oldCount = oc;
+    this.newCount = nc;
+  }
+}
+```
+
+The following code shows how to configure an export queue.  This code
+modifies the FluoConfiguration object with options needed for the export queue.
+This FluoConfiguration object should be used to initialize the Fluo
+application.
+
+```java
+public class FluoApp {
+
+  // export queue id "ici" means inverted count index
+  public static final String EQ_ID = "ici";
+
+  static final Column UPDATE_COL = new Column("meta", "numUpdates");
+  static final Column COUNTER_COL = new Column("meta", "counter1");
+
+  public static class AppObserverProvider implements ObserverProvider {
+    @Override
+    public void provide(Registry obsRegistry, Context ctx) {
+      ExportQueue<String, CountUpdate> expQ =
+          ExportQueue.getInstance(EQ_ID, ctx.getAppConfiguration());
+
+      // register observer that will queue data to export
+      obsRegistry.forColumn(UPDATE_COL, STRONG).useObserver(new MyObserver(expQ));
+
+      // register observer that will export queued data
+      expQ.registerObserver(obsRegistry, new CountExporter());
+    }
+  }
+
+  /**
+   * Call this method before initializing Fluo.
+   *
+   * @param fluoConfig the configuration object that will be used to initialize Fluo
+   */
+  public static void preInit(FluoConfiguration fluoConfig) {
+
+    // Set properties for export queue in Fluo app configuration
+    ExportQueue.configure(EQ_ID)
+        .keyType(String.class)
+        .valueType(CountUpdate.class)
+        .buckets(1009)
+        .bucketsPerTablet(10)
+        .save(fluoConfig);
+
+    fluoConfig.setObserverProvider(AppObserverProvider.class);
+  }
+}
+```
+
+Below is an updated version of the observer from above that now uses an export
+queue.
+
+```java
+public class MyObserver implements StringObserver {
+
+  private ExportQueue<String, CountUpdate> exportQueue;
+
+  public MyObserver(ExportQueue<String, CountUpdate> exportQueue) {
+    this.exportQueue = exportQueue;
+  }
+
+  @Override
+  public void process(TransactionBase tx, String row, Column col) {
+
+    int oldCount = Integer.parseInt(tx.gets(row, FluoApp.COUNTER_COL, "0"));
+    int numUpdates = Integer.parseInt(tx.gets(row, FluoApp.UPDATE_COL, "0"));
+    int newCount = oldCount + numUpdates;
+
+    tx.set(row, FluoApp.COUNTER_COL, "" + newCount);
+    tx.delete(row, FluoApp.UPDATE_COL);
+
+    // Because the update to the export queue is part of the transaction,
+    // either the update to meta:counter1 is made and an entry is added to
+    // the export queue or neither happens.
+    exportQueue.add(tx, row, new CountUpdate(oldCount, newCount));
+  }
+}
+```
+
+The export queue will call the `export()` method on the class below to process entries queued for
+export.  It is possible the call to `export()` can fail partway through and/or be called multiple
+times.  In the case of failures, the export consumer will be called again with the same data.
+It is possible for the same export entry to be processed on multiple computers at different times.
+This can cause exports to arrive out of order.  The purpose of the sequence number is to help
+systems receiving out-of-order and redundant data.
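As a hedged sketch of how a receiving system might use the sequence number, it can track the highest sequence applied per key and drop stale or duplicate updates. This `SequencedReceiver` is hypothetical and not part of Fluo Recipes:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical receiver illustrating last-writer-wins by sequence number.
public class SequencedReceiver {
  // Highest sequence number applied so far, per key
  private final Map<String, Long> appliedSeq = new HashMap<>();
  private final Map<String, String> data = new HashMap<>();

  /** Applies an update only if its sequence number is newer than any seen for the key. */
  boolean apply(String key, String value, long seq) {
    Long last = appliedSeq.get(key);
    if (last != null && seq <= last) {
      return false; // stale or duplicate update; ignore it
    }
    appliedSeq.put(key, seq);
    data.put(key, value);
    return true;
  }

  String get(String key) {
    return data.get(key);
  }
}
```

With this pattern, an update that arrives late or is redelivered after a failure is simply ignored, so reprocessing the same export entries is harmless.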
+
+```java
+public class CountExporter implements Exporter<String, CountUpdate> {
+  // represents the external query system we want to update from Fluo
+  QuerySystem querySystem;
+
+  @Override
+  public void export(Iterator<SequencedExport<String, CountUpdate>> exports) {
+    BatchUpdater batchUpdater = querySystem.getBatchUpdater();
+
+    while (exports.hasNext()) {
+      SequencedExport<String, CountUpdate> export = exports.next();
+      String row = export.getKey();
+      CountUpdate uc = export.getValue();
+      long seqNum = export.getSequence();
+
+      String oldCountRow = String.format("%06d", uc.oldCount);
+      String newCountRow = String.format("%06d", uc.newCount);
+
+      // add a new entry to the inverted index
+      batchUpdater.insertRow(newCountRow, row, seqNum);
+      // remove the old entry from the inverted index
+      batchUpdater.deleteRow(oldCountRow, row, seqNum);
+    }
+
+    // flush all of the updates to the external query system
+    batchUpdater.close();
+  }
+}
+```
+
+## Schema
+
+Each export queue stores its data in the Fluo table in a contiguous row range.
+This row range is defined by using the export queue id as a row prefix for all
+data in the export queue.  So the row range defined by the export queue id
+should not be used by anything else.
+
+All data stored in an export queue is [transient](transient.md). When an export
+queue is configured, it will recommend split points using the [table
+optimization process](table-optimization.md).  The number of splits generated
+by this process can be controlled by setting the number of buckets per tablet
+when configuring an export queue.
+
+## Concurrency
+
+Additions to the export queue will never collide.  If two transactions add the
+same key at around the same time and successfully commit, then two entries with
+different sequence numbers will always be added to the queue.  The sequence
+number is based on the start timestamp of the transactions.
+
+If the key used to add items to the export queue is deterministically derived
+from something the transaction is writing to, then that will cause a collision.
+For example consider the following interleaving of two transactions adding to
+the same export queue in a manner that will collide. Note, TH1 is shorthand for
+thread 1, ek() is a function that creates the export key, and ev() is a function
+that creates the export value.
+
+ 1. TH1 : key1 = ek(`row1`,`fam1:qual1`)
+ 1. TH1 : val1 = ev(tx1.get(`row1`,`fam1:qual1`), tx1.get(`rowA`,`fam1:qual2`))
+ 1. TH1 : exportQueueA.add(tx1, key1, val1)
+ 1. TH2 : key2 = ek(`row1`,`fam1:qual1`)
+ 1. TH2 : val2 = ev(tx2.get(`row1`,`fam1:qual1`), tx2.get(`rowB`,`fam1:qual2`))
+ 1. TH2 : exportQueueA.add(tx2, key2, val2)
+ 1. TH1 : tx1.set(`row1`,`fam1:qual1`, val1)
+ 1. TH2 : tx2.set(`row1`,`fam1:qual1`, val2)
+
+In the example above only one transaction will succeed because both are setting
+`row1 fam1:qual1`.  Since adding to the export queue is part of the
+transaction, only the transaction that succeeds will add something to the
+queue.  If the function ek() in the example is deterministic, then both
+transactions would have been trying to add the same key to the export queue.
+
+With the above method, we know that transactions adding entries to the queue for
+the same key must have executed [serially][2]. Knowing that transactions which
+added the same key did not overlap in time makes reasoning about those export
+entries very simple.
+
+The example below is a slight modification of the example above.  In this
+example both transactions will successfully add entries to the queue using the
+same key.  Both transactions succeed because they are writing to different
+cells (`rowB fam1:qual2` and `rowA fam1:qual2`).  This approach makes it more
+difficult to reason about export entries with the same key, because the
+transactions adding those entries could have overlapped in time.  This is an
+example of write skew mentioned in the Percolator paper.
+
+ 1. TH1 : key1 = ek(`row1`,`fam1:qual1`)
+ 1. TH1 : val1 = ev(tx1.get(`row1`,`fam1:qual1`), tx1.get(`rowA`,`fam1:qual2`))
+ 1. TH1 : exportQueueA.add(tx1, key1, val1)
+ 1. TH2 : key2 = ek(`row1`,`fam1:qual1`)
+ 1. TH2 : val2 = ev(tx2.get(`row1`,`fam1:qual1`), tx2.get(`rowB`,`fam1:qual2`))
+ 1. TH2 : exportQueueA.add(tx2, key2, val2)
+ 1. TH1 : tx1.set(`rowA`,`fam1:qual2`, val1)
+ 1. TH2 : tx2.set(`rowB`,`fam1:qual2`, val2)
+
+[1]: {{ site.api_static }}/fluo-recipes-core/1.1.0-incubating/org/apache/fluo/recipes/core/export/function/Exporter.html
+[2]: https://en.wikipedia.org/wiki/Serializability
+[3]: /docs/fluo-recipes/1.1.0-incubating/accumulo-export-queue/
+
diff --git a/docs/fluo-recipes/1.1.0-incubating/index.md b/docs/fluo-recipes/1.1.0-incubating/index.md
new file mode 100644
index 0000000..bec214c
--- /dev/null
+++ b/docs/fluo-recipes/1.1.0-incubating/index.md
@@ -0,0 +1,103 @@
+---
+layout: recipes-doc
+title: Fluo Recipes 1.1.0-incubating Documentation
+version: 1.1.0-incubating
+---
+**Fluo Recipes are common code for [Apache Fluo][fluo] application developers.**
+
+Fluo Recipes build on the [Fluo API][fluo-api] to offer additional functionality to
+developers. They are published separately from Fluo on their own release schedule.
+This allows Fluo Recipes to iterate and innovate faster than Fluo (which will maintain
+a more minimal API on a slower release cycle). Fluo Recipes offers code to implement
+common patterns on top of Fluo's API.  It also offers glue code to external libraries
+like Spark and Kryo.
+
+### Documentation
+
+Recipes are documented below and in the [Recipes API docs][recipes-api].
+
+* [Combine Queue][combine-q] - A recipe for concurrently updating many keys while avoiding
+  collisions.
+* [Export Queue][export-q] - A recipe for exporting data from Fluo to external systems.
+* [Row Hash Prefix][row-hasher] - A recipe for spreading data evenly in a row prefix.
+* [RecordingTransaction][recording-tx] - A wrapper for a Fluo transaction that records all transaction
+operations to a log which can be used to export data from Fluo.
+* [Testing][testing] - Code to help write Fluo integration tests.
+
+Recipes have common needs that are broken down into the following reusable components.
+
+* [Serialization][serialization] - Common code for serializing POJOs. 
+* [Transient Ranges][transient] - Standardized process for dealing with transient data.
+* [Table optimization][optimization] - Standardized process for optimizing the Fluo table.
+* [Spark integration][spark] - Spark+Fluo integration code.
+
+### Usage
+
+The Fluo Recipes project publishes multiple jars to Maven Central for each release.
+The `fluo-recipes-core` jar is the primary jar. It is where most recipes live and where
+they are placed by default if they have minimal dependencies beyond the Fluo API.
+
+Fluo Recipes with dependencies that bring in many transitive dependencies publish
+their own jar. For example, recipes that depend on Apache Spark are published in the
+`fluo-recipes-spark` jar.  If you don't plan on using code in the `fluo-recipes-spark`
+jar, you should avoid including it in your pom.xml to avoid a transitive dependency on
+Spark.
+
+Below is a sample Maven POM containing all possible Fluo Recipes dependencies:
+
+```xml
+  <properties>
+    <fluo-recipes.version>1.1.0-incubating</fluo-recipes.version>
+  </properties>
+
+  <dependencies>
+    <!-- Required. Contains recipes that only depend on the Fluo API -->
+    <dependency>
+      <groupId>org.apache.fluo</groupId>
+      <artifactId>fluo-recipes-core</artifactId>
+      <version>${fluo-recipes.version}</version>
+    </dependency>
+    <!-- Optional. Serialization code that depends on Kryo -->
+    <dependency>
+      <groupId>org.apache.fluo</groupId>
+      <artifactId>fluo-recipes-kryo</artifactId>
+      <version>${fluo-recipes.version}</version>
+    </dependency>
+    <!-- Optional. Common code for using Fluo with Accumulo -->
+    <dependency>
+      <groupId>org.apache.fluo</groupId>
+      <artifactId>fluo-recipes-accumulo</artifactId>
+      <version>${fluo-recipes.version}</version>
+    </dependency>
+    <!-- Optional. Common code for using Fluo with Spark -->
+    <dependency>
+      <groupId>org.apache.fluo</groupId>
+      <artifactId>fluo-recipes-spark</artifactId>
+      <version>${fluo-recipes.version}</version>
+    </dependency>
+    <!-- Optional. Common code for writing Fluo integration tests -->
+    <dependency>
+      <groupId>org.apache.fluo</groupId>
+      <artifactId>fluo-recipes-test</artifactId>
+      <version>${fluo-recipes.version}</version>
+      <scope>test</scope>
+    </dependency>
+  </dependencies>
+```
+
+[fluo]: https://fluo.apache.org/
+[fluo-api]: /api
+[recipes-api]: /api
+[combine-q]: /docs/fluo-recipes/1.1.0-incubating/combine-queue/
+[export-q]: /docs/fluo-recipes/1.1.0-incubating/export-queue/
+[recording-tx]: /docs/fluo-recipes/1.1.0-incubating/recording-tx/
+[serialization]: /docs/fluo-recipes/1.1.0-incubating/serialization/
+[transient]: /docs/fluo-recipes/1.1.0-incubating/transient/
+[optimization]: /docs/fluo-recipes/1.1.0-incubating/table-optimization/
+[row-hasher]: /docs/fluo-recipes/1.1.0-incubating/row-hasher/
+[spark]: /docs/fluo-recipes/1.1.0-incubating/spark/
+[testing]: /docs/fluo-recipes/1.1.0-incubating/testing/
diff --git a/docs/fluo-recipes/1.1.0-incubating/recording-tx.md b/docs/fluo-recipes/1.1.0-incubating/recording-tx.md
new file mode 100644
index 0000000..83814bd
--- /dev/null
+++ b/docs/fluo-recipes/1.1.0-incubating/recording-tx.md
@@ -0,0 +1,72 @@
+---
+layout: recipes-doc
+title: RecordingTransaction recipe
+version: 1.1.0-incubating
+---
+A `RecordingTransaction` is an implementation of `Transaction` that logs all transaction operations
+(i.e., GET, SET, or DELETE) to a `TxLog` object for later use, such as exporting data.  The code below
+shows how a RecordingTransaction is created by wrapping a Transaction object:
+
+```java
+RecordingTransactionBase rtx = RecordingTransactionBase.wrap(tx);
+```
+
+A predicate function can be passed to the `wrap` method to select which log entries to record.  The code
+below only records log entries whose column family is `meta`:
+
+```java
+RecordingTransactionBase rtx = RecordingTransactionBase.wrap(tx,
+                               le -> le.getColumn().getFamily().toString().equals("meta"));
+```
+
+After creating a RecordingTransaction, users can use it as they would use a Transaction object.
+
+```java
+Bytes value = rtx.get(Bytes.of("r1"), new Column("cf1", "cq1"));
+```
+
+While SET and DELETE operations are always recorded to the log, GET operations are only recorded if a
+value was found at the requested row/column.  Also, if a GET method returns an iterator, only the
+entries actually retrieved from the iterator are logged.  GET operations are logged because they are
+needed to determine the changes made by the transaction.
+ 
+When you are done operating on the transaction, you can retrieve the TxLog using the following code:
+
+```java
+TxLog myTxLog = rtx.getTxLog();
+```
+
+Below is example code of how a RecordingTransaction can be used in an observer to record all operations
+performed by the transaction in a TxLog.  In this example, a GET (if data exists) and SET operation
+will be logged.  This TxLog can be added to an export queue and later used to export updates from 
+Fluo.
+
+```java
+public class MyObserver extends AbstractObserver {
+
+    private static final TypeLayer TYPEL = new TypeLayer(new StringEncoder());
+    
+    private ExportQueue<Bytes, TxLog> exportQueue;
+
+    @Override
+    public void process(TransactionBase tx, Bytes row, Column col) {
+
+        // create recording transaction (rtx)
+        RecordingTransactionBase rtx = RecordingTransactionBase.wrap(tx);
+        
+        // use rtx to create a typed transaction & perform operations
+        TypedTransactionBase ttx = TYPEL.wrap(rtx);
+        int count = ttx.get().row(row).fam("meta").qual("counter1").toInteger(0);
+        ttx.mutate().row(row).fam("meta").qual("counter1").set(count+1);
+        
+        // when finished performing operations, retrieve transaction log
+        TxLog txLog = rtx.getTxLog();
+
+        // add txLog to exportQueue if not empty
+        if (!txLog.isEmpty()) {
+          // do not pass rtx to exportQueue.add()
+          exportQueue.add(tx, row, txLog);
+        }
+    }
+}
+```
diff --git a/docs/fluo-recipes/1.1.0-incubating/row-hasher.md b/docs/fluo-recipes/1.1.0-incubating/row-hasher.md
new file mode 100644
index 0000000..e38dc85
--- /dev/null
+++ b/docs/fluo-recipes/1.1.0-incubating/row-hasher.md
@@ -0,0 +1,122 @@
+---
+layout: recipes-doc
+title: Row hash prefix recipe
+version: 1.1.0-incubating
+---
+## Background
+
+Transactions are implemented in Fluo using conditional mutations.  Conditional
+mutations require server-side processing on tservers.  If data is not spread
+evenly, it can cause some tservers to execute more conditional mutations than
+others.  These tservers doing more work can become a bottleneck.  Most real
+world data is not uniform and can cause this problem.
+
+Before the Fluo [Webindex example][1] started using this recipe it suffered
+from this problem.  The example was using reverse dns encoded URLs for row keys
+like `p:com.cnn/story1.html`.  This made certain portions of the table more
+popular, which in turn made some tservers do much more work.  This uneven
+distribution of work led to lower throughput and uneven performance.  Using
+this recipe made those problems go away.
+
+## Solution
+
+This recipe provides code to help add a hash of the row as a prefix of the row.
+Using this recipe, rows are structured like the following.
+
+```
+<prefix>:<fixed len row hash>:<user row>
+```
+
+The recipe also provides code to help generate split points and configure
+balancing of the prefix.
+
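The layout above can be illustrated with a small, hypothetical sketch.  This is not RowHasher's actual hash function or encoding (the class name, CRC32 hash, and base-36 padding below are all illustrative assumptions); it only shows how a fixed-length hash between the prefix and the user row spreads rows while remaining reversible.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch of the <prefix>:<fixed len row hash>:<user row> layout.
public class HashPrefixSketch {

  // prepend a fixed-length, base-36 hash of the row to spread rows evenly
  public static String addHash(String prefix, String row) {
    CRC32 crc = new CRC32();
    crc.update(row.getBytes(StandardCharsets.UTF_8));
    // 4 base-36 chars => 36^4 = 1679616 buckets
    String hash = Long.toString(crc.getValue() % 1679616, 36);
    StringBuilder padded = new StringBuilder();
    for (int i = hash.length(); i < 4; i++) {
      padded.append('0');
    }
    padded.append(hash);
    return prefix + ":" + padded + ":" + row;
  }

  // strip the prefix and hash to recover the original row
  public static String removeHash(String hashedRow) {
    return hashedRow.substring(hashedRow.indexOf(':', hashedRow.indexOf(':') + 1) + 1);
  }

  public static void main(String[] args) {
    String hashed = addHash("p", "org.wikipedia/accumulo");
    System.out.println(hashed);
    System.out.println(removeHash(hashed));
  }
}
```

Real applications should use the RowHasher API shown in the example below; the point here is only that any stable, fixed-length hash placed after the prefix achieves the even spread.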
+## Example Use
+
+```java
+import org.apache.fluo.api.config.FluoConfiguration;
+import org.apache.fluo.api.data.Bytes;
+import org.apache.fluo.recipes.core.data.RowHasher;
+
+public class RowHasherExample {
+
+
+  private static final RowHasher PAGE_ROW_HASHER = new RowHasher("p");
+
+  // Provide one place to obtain row hasher.
+  public static RowHasher getPageRowHasher() {
+    return PAGE_ROW_HASHER;
+  }
+
+  public static void main(String[] args) {
+    RowHasher pageRowHasher = getPageRowHasher();
+
+    String revUrl = "org.wikipedia/accumulo";
+
+    // Add a hash prefix to the row. Use this hashedRow in your transaction
+    Bytes hashedRow = pageRowHasher.addHash(revUrl);
+    System.out.println("hashedRow      : " + hashedRow);
+
+    // Remove the prefix. This can be used by transactions dealing with the hashed row.
+    Bytes orig = pageRowHasher.removeHash(hashedRow);
+    System.out.println("orig           : " + orig);
+
+
+    // Generate table optimizations for the recipe. This can be called when setting up an
+    // application that uses a hashed row.
+    int numTablets = 20;
+
+    // The following code would normally be called before initializing Fluo. This code
+    // registers table optimizations for your prefix+hash.
+    FluoConfiguration conf = new FluoConfiguration();
+    RowHasher.configure(conf, PAGE_ROW_HASHER.getPrefix(), numTablets);
+
+    // Normally you would not call the following code, it would be called automatically for you by
+    // TableOperations.optimizeTable(). Calling this code here to show what table optimization will
+    // be generated.
+    TableOptimizations tableOptimizations = new RowHasher.Optimizer()
+        .getTableOptimizations(PAGE_ROW_HASHER.getPrefix(), conf.getAppConfiguration());
+    System.out.println("Balance config : " + tableOptimizations.getTabletGroupingRegex());
+    System.out.println("Splits         : ");
+    tableOptimizations.getSplits().forEach(System.out::println);
+    System.out.println();
+  }
+}
+```
+
+The example program above prints the following.
+
+```
+hashedRow      : p:1yl0:org.wikipedia/accumulo
+orig           : org.wikipedia/accumulo
+Balance config : (\Qp:\E).*
+Splits         : 
+p:1sst
+p:3llm
+p:5eef
+p:7778
+p:9001
+p:assu
+p:clln
+p:eeeg
+p:g779
+p:i002
+p:jssv
+p:lllo
+p:neeh
+p:p77a
+p:r003
+p:sssw
+p:ullp
+p:weei
+p:y77b
+p:~
+```
+
+The split points are used to create tablets in the Accumulo table used by Fluo.
+Data and computation will be spread very evenly across these tablets.  The
+balancing config will spread the tablets evenly across the tablet servers,
+which spreads the computation evenly.  See the [table optimizations][2]
+documentation for information on how to apply the optimizations.
+ 
+[1]: https://github.com/fluo-io/webindex
+[2]: /docs/fluo-recipes/1.1.0-incubating/table-optimization/
diff --git a/docs/fluo-recipes/1.1.0-incubating/serialization.md b/docs/fluo-recipes/1.1.0-incubating/serialization.md
new file mode 100644
index 0000000..01c53fd
--- /dev/null
+++ b/docs/fluo-recipes/1.1.0-incubating/serialization.md
@@ -0,0 +1,76 @@
+---
+layout: recipes-doc
+title: Serializing Data
+version: 1.1.0-incubating
+---
+Various Fluo Recipes deal with POJOs and need to serialize them.  The
+serialization mechanism is configurable and defaults to using [Kryo][1].
+
+## Custom Serialization
+
+In order to use a custom serialization method, two steps need to be taken.  The
+first step is to implement [SimpleSerializer][2].  The second step is to
+configure Fluo Recipes to use the custom implementation.  This needs to be done
+before initializing Fluo.  Below is an example of how to do this.
+
+```java
+  FluoConfiguration fluoConfig = ...;
+  //assume MySerializer implements SimpleSerializer
+  SimpleSerializer.setSerializer(fluoConfig, MySerializer.class);
+  //initialize Fluo using fluoConfig
+```
+
+## Kryo Factory
+
+If using the default Kryo serializer implementation, then creating a
+KryoFactory implementation can lead to smaller serialization size.  When Kryo
+serializes an object graph, it will by default include the fully qualified
+names of the classes in the serialized data.  This can be avoided by
+[registering classes][3] that will be serialized.  Registration is done by
+creating a KryoFactory and then configuring Fluo Recipes to use it.   The
+example below shows how to do this.
+
+For example, assume the POJOs named `Node` and `Edge` will be serialized and
+need to be registered with Kryo.  This could be done by creating a KryoFactory
+like the following.
+
+```java
+
+package com.foo;
+
+import com.esotericsoftware.kryo.Kryo;
+import com.esotericsoftware.kryo.pool.KryoFactory;
+
+import com.foo.data.Edge;
+import com.foo.data.Node;
+
+public class MyKryoFactory implements KryoFactory {
+  @Override
+  public Kryo create() {
+    Kryo kryo = new Kryo();
+    
+    //Explicitly assign each class a unique id here to ensure it is stable over
+    //time and across environments with different dependencies.
+    kryo.register(Node.class, 9);
+    kryo.register(Edge.class, 10);
+    
+    //instruct kryo that these are the only classes we expect to be serialized
+    kryo.setRegistrationRequired(true);
+    
+    return kryo;
+  }
+}
+```
+
+Fluo Recipes must be configured to use this factory.  The following code shows
+how to do this.
+
+```java
+  FluoConfiguration fluoConfig = ...;
+  KryoSimplerSerializer.setKryoFactory(fluoConfig, MyKryoFactory.class);
+  //initialize Fluo using fluoConfig
+```
+
+[1]: https://github.com/EsotericSoftware/kryo
+[2]: {{ site.api_static }}/fluo-recipes-core/1.1.0-incubating/org/apache/fluo/recipes/core/serialization/SimpleSerializer.html
+[3]: https://github.com/EsotericSoftware/kryo#registration
diff --git a/docs/fluo-recipes/1.1.0-incubating/spark.md b/docs/fluo-recipes/1.1.0-incubating/spark.md
new file mode 100644
index 0000000..9f93a3a
--- /dev/null
+++ b/docs/fluo-recipes/1.1.0-incubating/spark.md
@@ -0,0 +1,19 @@
+---
+layout: recipes-doc
+title: Apache Spark helper code
+version: 1.1.0-incubating
+---
+Fluo Recipes has some helper code for [Apache Spark][spark].  Most of the helper code is currently
+related to bulk importing data into Accumulo.  This is useful for initializing a new Fluo table with
+historical data via Spark.  The Spark helper code is found at
+[org/apache/fluo/recipes/spark/][sdir].
+
+For information on using Spark to load data into Fluo, check out this [blog post][blog].
+
+If you know of other Spark+Fluo integration code that would be useful, then please consider [opening
+an issue](https://github.com/apache/fluo-recipes/issues/new).
+
+[spark]: https://spark.apache.org
+[sdir]: {{ site.api_base }}/fluo-recipes-spark/1.1.0-incubating/
+[blog]: https://fluo.apache.org/blog/2016/12/22/spark-load/
+
diff --git a/docs/fluo-recipes/1.1.0-incubating/table-optimization.md b/docs/fluo-recipes/1.1.0-incubating/table-optimization.md
new file mode 100644
index 0000000..54f192a
--- /dev/null
+++ b/docs/fluo-recipes/1.1.0-incubating/table-optimization.md
@@ -0,0 +1,65 @@
+---
+layout: recipes-doc
+title: Fluo Table Optimization
+version: 1.1.0-incubating
+---
+## Background
+
+Recipes may need to make Accumulo-specific table modifications for optimal
+performance.  Configuring the [Accumulo tablet balancer][3] and adding splits are
+two optimizations that are currently done.  Offering a standard way to do these
+optimizations makes it easier to use recipes correctly.  These optimizations
+are optional.  You could skip them for integration testing, but would probably
+want to use them in production.
+
+## Java Example
+
+```java
+FluoConfiguration fluoConf = ...
+
+//the export queue's configure method registers the table optimizations it needs
+ExportQueue.configure(fluoConf, ...);
+
+//CollisionFreeMap.configure() registers the table optimizations it needs
+CollisionFreeMap.configure(fluoConf, ...);
+
+//configure optimizations for a prefixed hash range of a table
+RowHasher.configure(fluoConf, ...);
+
+//initialize Fluo
+FluoFactory.newAdmin(fluoConf).initialize(...)
+
+//Automatically optimize the Fluo table for all configured recipes
+TableOperations.optimizeTable(fluoConf);
+```
+
+[TableOperations][2] is provided in the Accumulo module of Fluo Recipes.
+
+## Command Example
+
+Fluo Recipes provides an easy way to optimize a Fluo table for configured
+recipes from the command line.  This should be done after configuring recipes
+and initializing Fluo.  Below are example commands for initializing in this way.
+
+```bash
+
+#create application 
+fluo new app1
+
+#configure application
+
+#initialize Fluo
+fluo init app1
+
+#optimize table for all configured recipes
+fluo exec app1 org.apache.fluo.recipes.accumulo.cmds.OptimizeTable
+```
+
+## Table optimization registry
+
+Recipes register themselves by calling [TableOptimizations.registerOptimization()][1].  Anyone can use
+this mechanism; it is not limited to existing recipes.
+
+[1]: {{ site.api_static }}/fluo-recipes-core/1.1.0-incubating/org/apache/fluo/recipes/core/common/TableOptimizations.html
+[2]: {{ site.api_static }}/fluo-recipes-accumulo/1.1.0-incubating/org/apache/fluo/recipes/accumulo/ops/TableOperations.html
+[3]: http://accumulo.apache.org/blog/2015/03/20/balancing-groups-of-tablets.html
diff --git a/docs/fluo-recipes/1.1.0-incubating/testing.md b/docs/fluo-recipes/1.1.0-incubating/testing.md
new file mode 100644
index 0000000..bc598d6
--- /dev/null
+++ b/docs/fluo-recipes/1.1.0-incubating/testing.md
@@ -0,0 +1,14 @@
+---
+layout: recipes-doc
+title: Testing
+version: 1.1.0-incubating
+---
+Fluo includes MiniFluo which makes it possible to write an integration test that
+runs against a real Fluo instance.  Fluo Recipes provides the following utility
+code for writing an integration test.
+
+ * [FluoITHelper][1] - A class with utility methods for comparing expected data with what's in Fluo.
+ * [AccumuloExportITBase][2] - A base class for writing an integration test that exports data from Fluo to an external Accumulo table.
+
+[1]: {{ site.api_static }}/fluo-recipes-test/1.1.0-incubating/org/apache/fluo/recipes/test/FluoITHelper.html
+[2]: {{ site.api_static }}/fluo-recipes-test/1.1.0-incubating/org/apache/fluo/recipes/test/AccumuloExportITBase.html
diff --git a/docs/fluo-recipes/1.1.0-incubating/transient.md b/docs/fluo-recipes/1.1.0-incubating/transient.md
new file mode 100644
index 0000000..1930270
--- /dev/null
+++ b/docs/fluo-recipes/1.1.0-incubating/transient.md
@@ -0,0 +1,83 @@
+---
+layout: recipes-doc
+title: Transient data
+version: 1.1.0-incubating
+---
+## Background
+
+Some recipes store transient data in a portion of the Fluo table.  Transient
+data is data that is continually being added and deleted; transient data
+ranges contain no long-term data.  In Fluo, when data is deleted a delete
+marker is inserted, but the data is actually still there.  If nothing is done,
+over time these transient ranges of the table will accumulate many more delete
+markers than actual data, and processing transient data will get increasingly
+slower.
+
+These delete markers can be cleaned up by forcing Accumulo to compact the
+Fluo table, which will run Fluo's garbage collection iterator.  However,
+compacting the entire table just to clean up these ranges is overkill.
+Alternatively, Accumulo supports compacting ranges of a table, so a good
+solution to the delete marker problem is to periodically compact just the
+transient ranges.
+
+Fluo Recipes provides helper code to deal with transient data ranges in a
+standard way.
+
+## Registering Transient Ranges
+
+Recipes like [Export Queue](export-queue.md) will automatically register
+transient ranges when configured.  If you would like to register your own
+transient ranges, use [TransientRegistry][1].  Below is a simple example of
+using this.
+
+```java
+FluoConfiguration fluoConfig = ...;
+TransientRegistry transientRegistry = new TransientRegistry(fluoConfig.getAppConfiguration());
+transientRegistry.addTransientRange(new RowRange(startRow, endRow));
+
+//Initialize Fluo using fluoConfig. This will store the registered ranges in
+//zookeeper, making them available on any node later.
+```
+
+## Compacting Transient Ranges
+
+Although you may never need to register transient ranges directly, you will
+need to periodically compact transient ranges if using a recipe that registers
+them.  Using [TableOperations][2] this can be done with one line of Java code
+like the following.
+
+```java
+FluoConfiguration fluoConfig = ...;
+TableOperations.compactTransient(fluoConfig);
+```
+
+Fluo recipes provides an easy way to compact transient ranges from the command line using the `fluo exec` command as follows:
+
+```
+fluo exec <app name> org.apache.fluo.recipes.accumulo.cmds.CompactTransient [<interval> [<multiplier>]]
+```
+
+If no arguments are specified, the command will call `compactTransient()` once.
+If `<interval>` is specified, the command will run forever, compacting the
+transient ranges and sleeping `<interval>` seconds between compactions.
+
+In the case where Fluo is backed up processing data, a transient range could
+have a lot of data queued, and compacting it too frequently would be
+counterproductive.  To avoid this, the `CompactTransient` command considers
+the time it took to compact a range when deciding when to compact that range
+next.  This is where the `<multiplier>` argument comes in: the time to sleep
+between compactions of a range is determined as follows (if not specified,
+the multiplier defaults to 3).
+
+```java
+   sleepTime = Math.max(compactTime * multiplier, interval);
+```
+
+For example, assume a Fluo application has two transient ranges.  Also assume
+CompactTransient is run with an interval of 600 and a multiplier of 10.  If the
+first range takes 20 seconds to compact, then it will be compacted again in 600
+seconds.  If the second range takes 80 seconds to compact, then it will be
+compacted again in 800 seconds.
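The sleep-time rule above can be sketched in plain Java.  The class, method, and argument names below are illustrative assumptions, not CompactTransient's actual internals.

```java
// Sketch of how CompactTransient could space out compactions of a range.
public class CompactTransientTiming {

  // A range that takes long to compact is compacted less often: sleep at
  // least <interval> seconds, but longer if the compaction itself was slow.
  public static long sleepTime(long compactTimeSecs, long multiplier, long intervalSecs) {
    return Math.max(compactTimeSecs * multiplier, intervalSecs);
  }

  public static void main(String[] args) {
    // interval of 600 and multiplier of 10, as in the example above
    System.out.println(sleepTime(20, 10, 600)); // 600
    System.out.println(sleepTime(80, 10, 600)); // 800
  }
}
```

With these inputs the two ranges sleep 600 and 800 seconds respectively, matching the worked example.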
+
+[1]: {{ site.api_static }}/fluo-recipes-core/1.1.0-incubating/org/apache/fluo/recipes/core/common/TransientRegistry.html
+[2]: {{ site.api_static }}/fluo-recipes-accumulo/1.1.0-incubating/org/apache/fluo/recipes/accumulo/ops/TableOperations.html
diff --git a/docs/index.md b/docs/index.md
index 63dab6d..b7ace5f 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -17,6 +17,7 @@ For a general overview of Fluo, take the [Fluo tour](/tour/).
 
 ### Apache Fluo Recipes documentation
 
+* [1.1.0-incubating][recipes-1.1] - June 22, 2017
 * [1.0.0-incubating][recipes-1.0] - October 28, 2016
 
 Documentation for releases before joining Apache have been [archived](archive).
@@ -25,4 +26,5 @@ Documentation for releases before joining Apache have been [archived](archive).
 [Apache Fluo Recipes]: https://github.com/apache/fluo-recipes
 [fluo-1.1]: /docs/fluo/1.1.0-incubating/
 [fluo-1.0]: /docs/fluo/1.0.0-incubating/
+[recipes-1.1]: /docs/fluo-recipes/1.1.0-incubating/
 [recipes-1.0]: /docs/fluo-recipes/1.0.0-incubating/
