From: mwalch@apache.org
To: commits@accumulo.apache.org
Reply-To: dev@accumulo.apache.org
Date: Tue, 06 Dec 2016 19:14:45 -0000
Subject: [06/11] accumulo git commit: ACCUMULO-4531 Removed constraints.html, lgroups.html, timestamps.html

ACCUMULO-4531 Removed constraints.html, lgroups.html, timestamps.html

* Similar docs exist in 'Table Configuration' section of user manual

Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/89c03484
Tree: 
http://git-wip-us.apache.org/repos/asf/accumulo/tree/89c03484
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/89c03484

Branch: refs/heads/master
Commit: 89c03484df5a1d6f2a9bc62e74b863ffdc921327
Parents: 52d526b
Author: Mike Walch
Authored: Tue Dec 6 09:49:52 2016 -0500
Committer: Mike Walch
Committed: Tue Dec 6 14:04:53 2016 -0500

----------------------------------------------------------------------
 .../asciidoc/chapters/table_configuration.txt |   3 +-
 docs/src/main/resources/constraints.html      |  50 ------
 docs/src/main/resources/lgroups.html          |  45 ------
 docs/src/main/resources/timestamps.html       | 160 -------------------
 4 files changed, 2 insertions(+), 256 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/89c03484/docs/src/main/asciidoc/chapters/table_configuration.txt
----------------------------------------------------------------------
diff --git a/docs/src/main/asciidoc/chapters/table_configuration.txt b/docs/src/main/asciidoc/chapters/table_configuration.txt
index fa2b16c..a3e4dcd 100644
--- a/docs/src/main/asciidoc/chapters/table_configuration.txt
+++ b/docs/src/main/asciidoc/chapters/table_configuration.txt
@@ -214,7 +214,6 @@ has the following method
 [source,java]
 connector.tableOperations.create(String tableName, boolean limitVersion);
 
-
 ===== Logical Time
 
 Accumulo 1.2 introduces the concept of logical time. This ensures that timestamps
@@ -230,6 +229,7 @@ A table can be configured to use logical timestamps at creation time as follows:
  user@myinstance> createtable -tl logical
 
 ===== Deletes
+
 Deletes are special keys in Accumulo that get sorted along with all the other data.
 When a delete key is inserted, Accumulo will not show anything that has a timestamp
 less than or equal to the delete key. During major compaction, any keys
@@ -237,6 +237,7 @@ older than a delete key are omitted from the new file created, and the omitted k
 are removed from disk as part of the regular garbage collection process.
 
 ==== Filters
+
 When scanning over a set of key-value pairs it is possible to apply an arbitrary
 filtering policy through the use of a Filter. Filters are types of iterators that
 return only key-value pairs that satisfy the filter logic. Accumulo has a few built-in filters

http://git-wip-us.apache.org/repos/asf/accumulo/blob/89c03484/docs/src/main/resources/constraints.html
----------------------------------------------------------------------
diff --git a/docs/src/main/resources/constraints.html b/docs/src/main/resources/constraints.html
deleted file mode 100644
index d6e5037..0000000
--- a/docs/src/main/resources/constraints.html
+++ /dev/null
@@ -1,50 +0,0 @@
-
-
-
-Accumulo Constraints
-
-
-
-
-

-Apache Accumulo Documentation : Constraints
-
-Accumulo supports constraints. Constraints are applied to mutations at ingest time.
-
-Implementing a new constraint is a snap. Simply write a Java class that
-implements org.apache.accumulo.core.constraints.Constraint.
-
-To deploy a new constraint, package it as a jar and put the jar in accumulo/lib/ext.
-
-After creating a constraint, set a table-specific property to use it. The following example adds two constraints to table foo. In the example, com.test.ExampleConstraint and com.test.AnotherConstraint are class names.
-

-user@instance:9999 perDayCounts> createtable foo
-user@instance:9999 foo> config -t foo -s table.constraint.1=com.test.ExampleConstraint
-user@instance:9999 foo> config -t foo -s table.constraint.2=com.test.AnotherConstraint
-user@instance:9999 foo> config -t foo -f constraint
----------+------------------------------------------+-----------------------------------------
-SCOPE    | NAME                                     | VALUE
----------+------------------------------------------+-----------------------------------------
-table    | table.constraint.1...................... | com.test.ExampleConstraint
-table    | table.constraint.2...................... | com.test.AnotherConstraint
----------+------------------------------------------+-----------------------------------------
-user@instance:9999 foo>
-
-
-
-
http://git-wip-us.apache.org/repos/asf/accumulo/blob/89c03484/docs/src/main/resources/lgroups.html
----------------------------------------------------------------------
diff --git a/docs/src/main/resources/lgroups.html b/docs/src/main/resources/lgroups.html
deleted file mode 100644
index 3d2bc0e..0000000
--- a/docs/src/main/resources/lgroups.html
+++ /dev/null
@@ -1,45 +0,0 @@
-
-
-
-Accumulo Locality Groups
-
-
-
-
-Apache Accumulo Documentation : Locality Groups
-
-
-Accumulo supports locality groups similar to those described in the Bigtable paper. Locality groups allow vertical partitioning of data by column family. This allows users to configure their tables such that scans over a subset of column families are much faster. The Accumulo locality group model has the following features.
-
-
-  • There is a default locality group that holds all column families not in a declared locality group.
-  • There is no requirement to declare locality groups or column families at table creation.
-  • The locality group configuration can be changed on the fly.
-
-
-
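The default-group rule in the list above can be pictured with a small stand-alone model. This is only a sketch under stated assumptions: it does not use the Accumulo API, and the class name, the `groupFor` helper, the `(default)` label, and the group and column family names are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of locality group lookup; this is not the Accumulo API.
final class LocalityGroupSketch {

    // Illustrative label for the implicit default group (an assumption of this sketch).
    static final String DEFAULT_GROUP = "(default)";

    // Returns the declared group that contains the column family,
    // or the default group when no declared group lists it.
    static String groupFor(Map<String, Set<String>> groups, String columnFamily) {
        for (Map.Entry<String, Set<String>> group : groups.entrySet()) {
            if (group.getValue().contains(columnFamily)) {
                return group.getKey();
            }
        }
        return DEFAULT_GROUP;
    }

    public static void main(String[] args) {
        // Made-up group and column family names, for illustration only.
        Map<String, Set<String>> groups = new LinkedHashMap<>();
        groups.put("metadata", new HashSet<>(Arrays.asList("domain", "link")));
        groups.put("content", new HashSet<>(Arrays.asList("body")));

        System.out.println(groupFor(groups, "link"));   // a declared group
        System.out.println(groupFor(groups, "image"));  // not declared, so the default group
    }
}
```

Because undeclared families fall through to the default group, nothing has to be declared up front, which is what lets the configuration change after the table exists.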

-When the locality group configuration for a table is changed, it has no effect on existing data. All minor and major compactions that occur after the change will organize data into the new locality group structure. As data is written into a table, it will cause minor and major compactions to occur. Over time this will result in all data being organized according to the new locality groups. If all data must be reorganized into the new locality groups immediately, force a full major compaction of the table with the compact command in the shell.
-
-There are two ways to manipulate locality groups: via the shell or through
-the Java API. From the shell, use the getgroups and setgroups commands. Through
-the API, TableOperations has the methods setLocalityGroups() and getLocalityGroups().
-
-To limit scans to a set of locality groups, use the fetchColumnFamily()
-function on Scanner or BatchScanner. From the shell, use scan with the -c option.
-
-
-
http://git-wip-us.apache.org/repos/asf/accumulo/blob/89c03484/docs/src/main/resources/timestamps.html
----------------------------------------------------------------------
diff --git a/docs/src/main/resources/timestamps.html b/docs/src/main/resources/timestamps.html
deleted file mode 100644
index 9c240d2..0000000
--- a/docs/src/main/resources/timestamps.html
+++ /dev/null
@@ -1,160 +0,0 @@
-
-
-
-Accumulo Timestamps
-
-
-
-
-Apache Accumulo Documentation : Timestamps
-
-
-Everything inserted into Accumulo has a timestamp. If the user does not
-set it, then the system will set the timestamp. The timestamp is the last
-thing Accumulo sorts on. So when two keys have the same row, column family,
-column qualifier, and column visibility, the timestamps of the two keys are
-compared.
-
-Timestamps are sorted in descending order, so the most recent data comes
-first. When a table is created in Accumulo, by default it has a versioning
-iterator that shows only the most recent version. In the example below, two
-entries with the same row and column are inserted. The scan after that shows
-only the most recent version. However, when the versioning iterator
-configuration is changed, both are seen. When data is inserted with a lower
-timestamp than existing data, it will fall behind the existing data and may
-not be seen, depending on the versioning settings. This is why the insert
-made with a timestamp of 500 is not seen in the scan below.
-

-root@ac12> createtable foo
-root@ac12 foo>
-root@ac12 foo>
-root@ac12 foo> insert r1 cf1 cq1 value1
-root@ac12 foo> insert r1 cf1 cq1 value2
-root@ac12 foo> scan -st
-r1 cf1:cq1 [] 1279906856203    value2
-root@ac12 foo> config -t foo -f iterator
----------+---------------------------------------------+-----------------------------------------------------------------------------------------------------
-SCOPE    | NAME                                        | VALUE
----------+---------------------------------------------+-----------------------------------------------------------------------------------------------------
-table    | table.iterator.majc.vers .................. | 20,org.apache.accumulo.core.iterators.VersioningIterator
-table    | table.iterator.majc.vers.opt.maxVersions .. | 1
-table    | table.iterator.minc.vers .................. | 20,org.apache.accumulo.core.iterators.VersioningIterator
-table    | table.iterator.minc.vers.opt.maxVersions .. | 1
-table    | table.iterator.scan.vers .................. | 20,org.apache.accumulo.core.iterators.VersioningIterator
-table    | table.iterator.scan.vers.opt.maxVersions .. | 1
----------+---------------------------------------------+-----------------------------------------------------------------------------------------------------
-root@ac12 foo> config -t foo -s table.iterator.scan.vers.opt.maxVersions=3
-root@ac12 foo> config -t foo -s table.iterator.minc.vers.opt.maxVersions=3
-root@ac12 foo> config -t foo -s table.iterator.majc.vers.opt.maxVersions=3
-root@ac12 foo> scan -st
-r1 cf1:cq1 [] 1279906856203    value2
-r1 cf1:cq1 [] 1279906853170    value1
-root@ac12 foo> insert -t 600 r1 cf1 cq1 value3
-root@ac12 foo> insert -t 500 r1 cf1 cq1 value4
-root@ac12 foo> scan -st
-r1 cf1:cq1 [] 1279906856203    value2
-r1 cf1:cq1 [] 1279906853170    value1
-r1 cf1:cq1 [] 600    value3
-root@ac12 foo>
-
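The shell session above can be approximated with a small stand-alone model of the sort order and the maxVersions cutoff. This is a sketch under stated assumptions, not Accumulo code: the `Entry` class, the `ORDER` comparator, and the `visible` helper are hypothetical names, and column visibility is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of key ordering and versioning; this is not the Accumulo API.
final class VersioningSketch {

    static final class Entry {
        final String row, family, qualifier, value;
        final long timestamp;

        Entry(String row, String family, String qualifier, long timestamp, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }

        String column() {
            return row + " " + family + ":" + qualifier;
        }
    }

    // Row, family, and qualifier ascending; timestamp compared last and descending.
    static final Comparator<Entry> ORDER = (a, b) -> {
        int c = a.row.compareTo(b.row);
        if (c == 0) c = a.family.compareTo(b.family);
        if (c == 0) c = a.qualifier.compareTo(b.qualifier);
        if (c == 0) c = Long.compare(b.timestamp, a.timestamp); // newest first
        return c;
    };

    // Keeps at most maxVersions entries per column, newest first.
    static List<String> visible(List<Entry> entries, int maxVersions) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(ORDER);
        List<String> out = new ArrayList<>();
        String previousColumn = null;
        int seen = 0;
        for (Entry e : sorted) {
            if (!e.column().equals(previousColumn)) {
                previousColumn = e.column();
                seen = 0;
            }
            if (seen++ < maxVersions) {
                out.add(e.timestamp + " " + e.value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("r1", "cf1", "cq1", 1279906853170L, "value1"));
        entries.add(new Entry("r1", "cf1", "cq1", 1279906856203L, "value2"));
        entries.add(new Entry("r1", "cf1", "cq1", 600L, "value3"));
        entries.add(new Entry("r1", "cf1", "cq1", 500L, "value4"));

        System.out.println(visible(entries, 1));
        System.out.println(visible(entries, 3));
    }
}
```

With a limit of 3 the entry at timestamp 500 sorts fourth and is dropped, which is the same outcome the last scan in the session shows.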
-
-
-Deletes are special keys in Accumulo that get sorted along with all the other
-data. When a delete key is inserted, Accumulo will not show anything that has
-a timestamp less than or equal to the delete key. In the example below, an
-insert is made with timestamp 5 and then a delete is inserted with timestamp 3.
-The scan after that shows that the delete marker does not hide the key. However,
-when a delete is inserted with timestamp 5, nothing can be seen. Once a
-delete marker is inserted, it is there until a full major compaction occurs.
-That is why the insert made after the delete cannot be seen. The insert after
-the flush and compact commands can be seen because the delete marker is gone.
-The flush forced a minor compaction and compact forced a full major compaction.
-

-root@ac12> createtable bar
-root@ac12 bar> insert -t 5 r1 cf1 cq1 val1
-root@ac12 bar> scan -st
-r1 cf1:cq1 [] 5    val1
-root@ac12 bar> delete -t 3 r1 cf1 cq1
-root@ac12 bar> scan
-r1 cf1:cq1 []    val1
-root@ac12 bar> scan -st
-r1 cf1:cq1 [] 5    val1
-root@ac12 bar> delete -t 5 r1 cf1 cq1
-root@ac12 bar> scan -st
-root@ac12 bar> insert -t 5 r1 cf1 cq1 val2
-root@ac12 bar> scan -st
-root@ac12 bar> flush -t bar
-23 14:01:36,587 [shell.Shell] INFO : Flush of table bar initiated...
-root@ac12 bar> compact -t bar
-23 14:02:00,042 [shell.Shell] INFO : Compaction of table bar scheduled for 20100723140200EDT
-root@ac12 bar> insert -t 5 r1 cf1 cq1 val1
-root@ac12 bar> scan
-r1 cf1:cq1 []    val1
-
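The delete behavior in the session above can be modeled for a single column in a few lines. This is a sketch under stated assumptions rather than Accumulo's implementation: the class and method names are hypothetical, only timestamps are tracked (values and versioning are ignored), and `fullMajorCompaction` stands in for the flush plus compact sequence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-column model of delete markers; this is not the Accumulo API.
final class DeleteMarkerSketch {
    private final List<Long> insertTimestamps = new ArrayList<>();
    private final List<Long> deleteTimestamps = new ArrayList<>();

    void insert(long timestamp) {
        insertTimestamps.add(timestamp);
    }

    void delete(long timestamp) {
        deleteTimestamps.add(timestamp);
    }

    // A key is hidden when some delete marker has a timestamp >= the key's timestamp.
    boolean isHidden(long timestamp) {
        for (long deleteTimestamp : deleteTimestamps) {
            if (timestamp <= deleteTimestamp) {
                return true;
            }
        }
        return false;
    }

    // Returns the timestamps of the keys a scan would show.
    List<Long> scan() {
        List<Long> visible = new ArrayList<>();
        for (long timestamp : insertTimestamps) {
            if (!isHidden(timestamp)) {
                visible.add(timestamp);
            }
        }
        return visible;
    }

    // A full major compaction drops the hidden keys and then the delete markers.
    void fullMajorCompaction() {
        insertTimestamps.removeIf(timestamp -> isHidden(timestamp));
        deleteTimestamps.clear();
    }

    public static void main(String[] args) {
        DeleteMarkerSketch column = new DeleteMarkerSketch();
        column.insert(5);
        column.delete(3);
        System.out.println(column.scan()); // delete at 3 does not hide the key at 5
        column.delete(5);
        column.insert(5);
        System.out.println(column.scan()); // delete at 5 hides both keys at 5
        column.fullMajorCompaction();
        column.insert(5);
        System.out.println(column.scan()); // marker is gone, the new key is visible
    }
}
```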
-
-If two inserts are made into Accumulo with the same row, column, and
-timestamp, then the behavior is non-deterministic.
-
-Accumulo 1.2 introduces the concept of logical time. This ensures that
-timestamps set by Accumulo always move forward. There have been many problems
-caused by tablet servers with different system times. In the case where a
-tablet server's time is in the future, tablets hosted on that tablet server and
-then migrated will have future timestamps in their data. This can cause newer
-keys to fall behind existing keys, which can result in seeing older data or not
-seeing data if a new key falls behind an old delete. Logical time prevents
-this by ensuring that timestamps set by Accumulo never go backwards, on a
-per-tablet basis. So if a tablet server's time is a year in the future, then any
-tablet hosted there will generate timestamps a year in the future even when
-later hosted on a server with the correct time. Logical time can be configured
-on a per-table basis to either set time in milliseconds or to use a per-tablet
-counter. The per-tablet counter gives unique, one-up timestamps on a
-per-mutation basis. When using time in milliseconds, if two things arrive
-within the same millisecond then both receive the same timestamp.
-
-The example below shows a table created using a per-tablet counter for
-timestamps. Two inserts are made; the first gets timestamp 0, the second 1.
-After that the table is split into two tablets and two more inserts are made.
-These inserts get the same timestamp because they are made on different
-tablets. When the original tablet is split into two, the two child tablets
-inherit the next timestamp of their parent and start from there. So do not
-expect this configuration to offer unique timestamps across a table. Its only
-purpose is to uniquely order events within a tablet.
-

-root@ac12 foo> createtable -tl logical
-root@ac12 logical> insert 000892 person name "John Doe"
-root@ac12 logical> insert 003042 person name "Jane Doe"
-root@ac12 logical> scan -st
-000892 person:name [] 0    John Doe
-003042 person:name [] 1    Jane Doe
-root@ac12 logical>
-root@ac12 logical> addsplits -t logical 002000
-root@ac12 logical> insert 003042 person address "123 Somewhere"
-root@ac12 logical> insert 000892 person address "123 Nowhere"
-root@ac12 logical> scan -st
-000892 person:address [] 2    123 Nowhere
-000892 person:name [] 0    John Doe
-003042 person:address [] 2    123 Somewhere
-003042 person:name [] 1    Jane Doe
-root@ac12 logical>
-
-
- - -
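The per-tablet counter and its behavior across a split, as shown in the session above, can be sketched as follows. This is an illustrative model only: `Tablet`, `assignTimestamp`, and `split` are hypothetical names and not Accumulo classes or methods.

```java
// Hypothetical model of a per-tablet logical time counter; this is not the Accumulo API.
final class LogicalTimeSketch {

    static final class Tablet {
        private long nextTimestamp;

        Tablet(long nextTimestamp) {
            this.nextTimestamp = nextTimestamp;
        }

        // Each mutation gets the next counter value, so timestamps only move forward.
        long assignTimestamp() {
            return nextTimestamp++;
        }

        // Both children inherit the parent's next timestamp and count independently from there.
        Tablet[] split() {
            return new Tablet[] { new Tablet(nextTimestamp), new Tablet(nextTimestamp) };
        }
    }

    public static void main(String[] args) {
        Tablet parent = new Tablet(0);
        System.out.println(parent.assignTimestamp()); // first insert
        System.out.println(parent.assignTimestamp()); // second insert

        Tablet[] children = parent.split();
        System.out.println(children[0].assignTimestamp()); // insert into the first child
        System.out.println(children[1].assignTimestamp()); // insert into the second child
    }
}
```

The two child tablets hand out the same value after the split, which is why the counter orders events within a tablet but does not give unique timestamps across the table.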