Return-Path: X-Original-To: apmail-accumulo-commits-archive@www.apache.org Delivered-To: apmail-accumulo-commits-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 1422F1189B for ; Tue, 17 Jun 2014 06:11:58 +0000 (UTC) Received: (qmail 90829 invoked by uid 500); 17 Jun 2014 06:11:57 -0000 Delivered-To: apmail-accumulo-commits-archive@accumulo.apache.org Received: (qmail 90796 invoked by uid 500); 17 Jun 2014 06:11:57 -0000 Mailing-List: contact commits-help@accumulo.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@accumulo.apache.org Delivered-To: mailing list commits@accumulo.apache.org Received: (qmail 90787 invoked by uid 99); 17 Jun 2014 06:11:57 -0000 Received: from tyr.zones.apache.org (HELO tyr.zones.apache.org) (140.211.11.114) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 17 Jun 2014 06:11:57 +0000 Received: by tyr.zones.apache.org (Postfix, from userid 65534) id 93DE0941E82; Tue, 17 Jun 2014 06:11:57 +0000 (UTC) Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: elserj@apache.org To: commits@accumulo.apache.org Message-Id: <9d0404f021034af9ba0eb1298144f719@git.apache.org> X-Mailer: ASF-Git Admin Mailer Subject: git commit: ACCUMULO-2901 Document cases where replication won't work well Date: Tue, 17 Jun 2014 06:11:57 +0000 (UTC) Repository: accumulo Updated Branches: refs/heads/master 358c13e66 -> 3ed4b2796 ACCUMULO-2901 Document cases where replication won't work well the architecture serves itself well for a certain set of problems, but is not a universal solution in its current implementation. provide a warning for some of those edge cases to manage user expectations and not mislead them. Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/3ed4b279 Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/3ed4b279 Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/3ed4b279 Branch: refs/heads/master Commit: 3ed4b2796ac95808af7aef62d93e6607cedc4d62 Parents: 358c13e Author: Josh Elser Authored: Mon Jun 16 23:10:24 2014 -0700 Committer: Josh Elser Committed: Mon Jun 16 23:10:24 2014 -0700 ---------------------------------------------------------------------- docs/src/main/asciidoc/chapters/replication.txt | 49 ++++++++++++++++++++ 1 file changed, 49 insertions(+) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/accumulo/blob/3ed4b279/docs/src/main/asciidoc/chapters/replication.txt ---------------------------------------------------------------------- diff --git a/docs/src/main/asciidoc/chapters/replication.txt b/docs/src/main/asciidoc/chapters/replication.txt index 03bf67b..8755e24 100644 --- a/docs/src/main/asciidoc/chapters/replication.txt +++ b/docs/src/main/asciidoc/chapters/replication.txt @@ -312,3 +312,52 @@ Finally, we can enable replication on this table. ---- root@primary> config -t my_table -s table.replication=true ---- + +=== Extra considerations for use + +While this feature is intended for general-purpose use, its implementation does carry some baggage. Like any software, +replication is a feature that operates well within some set of use cases but is not meant to support all use cases. +For the benefit of the users, we can enumerate these cases. + +==== Latency + +As previously mentioned, the replication feature uses the Write-Ahead Log files for a number of reasons, one of which +is to prevent the need for data to be written to RFiles before it is available to be replicated. While this can help +reduce the latency for a batch of Mutations that have been written to Accumulo, the latency is at least seconds to tens +of seconds for replication once ingest is active. For a table which replication has just been enabled on, this is likely +to take a few minutes before replication will begin. + +Once ingest is active and flowing into the system at a regular rate, replication should be occurring at a similar rate, +given sufficient computing resources. Replication attempts to copy data at a rate that is to be considered low latency +but is not a replacement for custom indexing code which can ensure near real-time referential integrity on secondary indexes. + +==== Table-Configured Iterators + +Accumulo Iterators tend to be a heavy hammer which can be used to solve a variety of problems. In general, it is highly +recommended that Iterators which are applied at major compaction time are both idempotent and associative due to the +non-determinism in which some set of files for a Tablet might be compacted. In practice, this translates to common patterns, +such as aggregation, which are implemented in a manner resilient to duplication (such as using a Set instead of a List). + +Due to the asynchronous nature of replication and the expectation that hardware failures and network partitions will exist, +it is generally not recommended to not configure replication on a table which has Iterators set which are not idempotent. +While the replication implementation can make some simple assertions to try to avoid re-replication of data, it is not +presently guaranteed that all data will only be sent to a peer once. Data will be replicated at least once. Typically, +this is not a problem as the VersioningIterator will automaticaly deduplicate this over-replication because they will +have the same timestamp; however, certain Combiners may result in inaccurate aggregations. + +As a concrete example, consider a table which has the SummingCombiner configured to sum all values for +multiple versions of the same Key. For some key, consider a set of numeric values that are written to a table on the +primary: [1, 2, 3]. On the primary, all of these are successfully written and thus the current value for the given key +would be 6, (1 + 2 + 3). Consider, however, that each of these updates to the peer were done independently (because +other data was also included in the write-ahead log that needed to be replicated). The update with a value of 1 was +successfully replicated, and then we attempted to replicate the update with a value of 2 but the remote server never +responded. The primary does not know whether the update with a value of 2 was actually applied or not, so the +only recourse is to re-send the update. After we receive confirmation that the update with a value of 2 was replicated, +we will then replicate the update with 3. If the peer did never apply the first update of '2', the summation is accurate. +If the update was applied but the acknowledgement was lost for some reason (system failure, network partition), the +update will be resent to the peer. Because addition is non-idempotent, we have created an inconsistency between the +primary and peer. As such, the SummingCombiner wouldn't be recommended on a table being replicated. + +While there are changes that could be made to the replication implementation which could attempt to mitigate this risk, +presently, it is not recommended to configure Iterators or Combiners which are not idempotent to support cases where +inaccuracy of aggregations is not acceptable.