accumulo-commits mailing list archives

From mwa...@apache.org
Subject [6/7] accumulo-examples git commit: ACCUMULO-4511 Adding examples from Accumulo repo
Date Fri, 09 Dec 2016 17:12:19 GMT
http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/docs/mapred.md
----------------------------------------------------------------------
diff --git a/docs/mapred.md b/docs/mapred.md
new file mode 100644
index 0000000..2768a5d
--- /dev/null
+++ b/docs/mapred.md
@@ -0,0 +1,154 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+# Apache Accumulo MapReduce Example
+
+This example uses MapReduce and Accumulo to compute word counts for a set of
+documents. This is accomplished using a map-only MapReduce job and an
+Accumulo table with combiners.
+
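+A hedged sketch of the map-only job's mapper logic (not the repo's exact
+WordCount source): each word becomes a Mutation with a count column whose
+value is "1", and the SummingCombiner configured below adds those values up
+at scan and compaction time. The date qualifier here simply mirrors the scan
+output shown later.
+
+    import java.io.IOException;
+    import org.apache.accumulo.core.data.Mutation;
+    import org.apache.hadoop.io.LongWritable;
+    import org.apache.hadoop.io.Text;
+    import org.apache.hadoop.mapreduce.Mapper;
+
+    public class WordCountMapper extends Mapper<LongWritable, Text, Text, Mutation> {
+      @Override
+      protected void map(LongWritable key, Text line, Context context)
+          throws IOException, InterruptedException {
+        for (String word : line.toString().split("\\s+")) {
+          if (word.isEmpty())
+            continue;
+          Mutation m = new Mutation(word);         // row = the word itself
+          m.put("count", "20080906", "1");         // the combiner sums these "1"s
+          context.write(new Text("wordCount"), m); // Text key names the output table
+        }
+      }
+    }
+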
+To run this example, you will need a directory in HDFS containing text files.
+The Accumulo README will be used to show how to run this example.
+
+    $ hadoop fs -copyFromLocal /path/to/accumulo/README.md /user/username/wc/Accumulo.README
+    $ hadoop fs -ls /user/username/wc
+    Found 1 items
+    -rw-r--r--   2 username supergroup       9359 2009-07-15 17:54 /user/username/wc/Accumulo.README
+
+The first part of running this example is to create a table with a combiner
+for the column family count.
+
+    $ accumulo shell -u username -p password
+    Shell - Apache Accumulo Interactive Shell
+    - version: 1.5.0
+    - instance name: instance
+    - instance id: 00000000-0000-0000-0000-000000000000
+    -
+    - type 'help' for a list of available commands
+    -
+    username@instance> createtable wordCount
+    username@instance wordCount> setiter -class org.apache.accumulo.core.iterators.user.SummingCombiner -p 10 -t wordCount -majc -minc -scan
+    SummingCombiner interprets Values as Longs and adds them together. A variety of encodings (variable length, fixed length, or string) are available
+    ----------> set SummingCombiner parameter all, set to true to apply Combiner to every column, otherwise leave blank. if true, columns option will be ignored.: false
+    ----------> set SummingCombiner parameter columns, <col fam>[:<col qual>]{,<col fam>[:<col qual>]} escape non-alphanum chars using %<hex>.: count
+    ----------> set SummingCombiner parameter lossy, if true, failed decodes are ignored. Otherwise combiner will error on failed decodes (default false): <TRUE|FALSE>: false
+    ----------> set SummingCombiner parameter type, <VARLEN|FIXEDLEN|STRING|fullClassName>: STRING
+    username@instance wordCount> quit
+
+After creating the table, run the word count MapReduce job.
+
+    $ tool.sh target/accumulo-examples.jar org.apache.accumulo.examples.mapreduce.WordCount -i instance -z zookeepers  --input /user/username/wc -t wordCount -u username -p password
+
+    11/02/07 18:20:11 INFO input.FileInputFormat: Total input paths to process : 1
+    11/02/07 18:20:12 INFO mapred.JobClient: Running job: job_201102071740_0003
+    11/02/07 18:20:13 INFO mapred.JobClient:  map 0% reduce 0%
+    11/02/07 18:20:20 INFO mapred.JobClient:  map 100% reduce 0%
+    11/02/07 18:20:22 INFO mapred.JobClient: Job complete: job_201102071740_0003
+    11/02/07 18:20:22 INFO mapred.JobClient: Counters: 6
+    11/02/07 18:20:22 INFO mapred.JobClient:   Job Counters
+    11/02/07 18:20:22 INFO mapred.JobClient:     Launched map tasks=1
+    11/02/07 18:20:22 INFO mapred.JobClient:     Data-local map tasks=1
+    11/02/07 18:20:22 INFO mapred.JobClient:   FileSystemCounters
+    11/02/07 18:20:22 INFO mapred.JobClient:     HDFS_BYTES_READ=10487
+    11/02/07 18:20:22 INFO mapred.JobClient:   Map-Reduce Framework
+    11/02/07 18:20:22 INFO mapred.JobClient:     Map input records=255
+    11/02/07 18:20:22 INFO mapred.JobClient:     Spilled Records=0
+    11/02/07 18:20:22 INFO mapred.JobClient:     Map output records=1452
+
+After the MapReduce job completes, query the Accumulo table to see word
+counts.
+
+    $ accumulo shell -u username -p password
+    username@instance> table wordCount
+    username@instance wordCount> scan -b the
+    the count:20080906 []    75
+    their count:20080906 []    2
+    them count:20080906 []    1
+    then count:20080906 []    1
+    there count:20080906 []    1
+    these count:20080906 []    3
+    this count:20080906 []    6
+    through count:20080906 []    1
+    time count:20080906 []    3
+    time. count:20080906 []    1
+    to count:20080906 []    27
+    total count:20080906 []    1
+    tserver, count:20080906 []    1
+    tserver.compaction.major.concurrent.max count:20080906 []    1
+    ...
+
+Another example to look at is
+org.apache.accumulo.examples.mapreduce.UniqueColumns. This example
+computes the unique set of columns in a table and shows how a MapReduce job
+can directly read a table's files from HDFS.
+
+One more example available is
+org.apache.accumulo.examples.mapreduce.TokenFileWordCount.
+The TokenFileWordCount example works exactly the same as the WordCount example
+explained above, except that it uses a token file rather than giving the
+password directly to the MapReduce job (this avoids having the password
+displayed in the job's configuration, which is world-readable).
+
+To create a token file, use the create-token utility:
+
+    $ accumulo create-token
+
+It defaults to creating a PasswordToken, but you can specify the token class
+with -tc (requires the fully qualified class name). Based on the token class,
+it will prompt you for each property required to create the token.
+
+The last value it prompts for is a local filename to save to. If this file
+exists, it will append the new token to the end. Multiple tokens can exist in
+a file, but only the first one for each user will be recognized.
+
+Rather than waiting for the prompts, you can specify some options when calling
+create-token. For example:
+
+    $ accumulo create-token -u root -p secret -f root.pw
+
+This creates a token file containing a PasswordToken for
+user 'root' with password 'secret', saved to 'root.pw'.
+
+This local file needs to be uploaded to HDFS to be used with the
+MapReduce job. For example, if the file were 'root.pw' in the local directory:
+
+    $ hadoop fs -put root.pw root.pw
+
+This puts 'root.pw' in the user's home directory in HDFS.
+
+Because the basic WordCount example uses Opts (which extends
+ClientOnRequiredTable) to parse its arguments, you can use a token file with
+it by running the same command as above, replacing the password option with
+the token file option (use -tf instead of -p).
+
+    $ tool.sh target/accumulo-examples.jar org.apache.accumulo.examples.mapreduce.WordCount -i instance -z zookeepers --input /user/username/wc -t wordCount -u username -tf tokenfile
+
+In the above examples, username was 'root' and tokenfile was 'root.pw'.
+
+However, if you don't want to use the Opts class to parse arguments,
+TokenFileWordCount is an example of using the token file manually.
+
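+Internally this amounts to something like the following hedged sketch, which
+assumes the three-argument setConnectorInfo overload that accepts a token
+file path:
+
+    import java.io.IOException;
+    import org.apache.accumulo.core.client.AccumuloSecurityException;
+    import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat;
+    import org.apache.hadoop.mapreduce.Job;
+
+    public class TokenFileSketch {
+      // Point the input format at a token file in HDFS instead of embedding
+      // a password in the world-readable job configuration.
+      static Job configure(String principal, String tokenFile)
+          throws IOException, AccumuloSecurityException {
+        Job job = Job.getInstance();
+        AccumuloInputFormat.setConnectorInfo(job, principal, tokenFile);
+        return job;
+      }
+    }
+
+To run the example:
+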
+    $ tool.sh target/accumulo-examples.jar org.apache.accumulo.examples.mapreduce.TokenFileWordCount instance zookeepers username tokenfile /user/username/wc wordCount
+
+The results should be the same as the WordCount example except that the
+authentication token was not stored in the configuration. It was instead
+stored in a file that the map-reduce job pulled into the distributed cache.
+(If you ran either of these on the same table right after the
+WordCount example, then the resulting counts should just double.)
+
+
+
+

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/docs/maxmutation.md
----------------------------------------------------------------------
diff --git a/docs/maxmutation.md b/docs/maxmutation.md
new file mode 100644
index 0000000..d091adb
--- /dev/null
+++ b/docs/maxmutation.md
@@ -0,0 +1,49 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+# Apache Accumulo MaxMutation Constraints Example
+
+This is an example of how to limit the size of mutations that will be accepted
+into a table. Under the default configuration, Accumulo does not limit the
+size of mutations that can be ingested. Poorly behaved writers might
+inadvertently create mutations so large that they cause the tablet servers to
+run out of memory. A simple constraint can be added to a table to reject very
+large mutations.
+
+    $ accumulo shell -u username -p password
+
+    Shell - Apache Accumulo Interactive Shell
+    -
+    - version: 1.5.0
+    - instance name: instance
+    - instance id: 00000000-0000-0000-0000-000000000000
+    -
+    - type 'help' for a list of available commands
+    -
+    username@instance> createtable test_ingest
+    username@instance test_ingest> config -t test_ingest -s table.constraint.1=org.apache.accumulo.examples.constraints.MaxMutationSize
+    username@instance test_ingest>
+
+
+Now the table will reject any mutation that is larger than 1/256th of the
+working memory of the tablet server.  The following command attempts to ingest
+a single row with 10000 columns, which exceeds the memory limit. Depending on
+the amount of Java heap your tservers are given, you may have to increase the
+number of columns to see the failure.
+
+    $ accumulo org.apache.accumulo.test.TestIngest -i instance -z zookeepers -u username -p password --rows 1 --cols 10000 
+    ERROR : Constraint violates : ConstraintViolationSummary(constrainClass:org.apache.accumulo.examples.constraints.MaxMutationSize, violationCode:0, violationDescription:mutation exceeded maximum size of 188160, numberOfViolatingMutations:1)
+
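+For reference, here is a minimal sketch of such a size-limiting constraint,
+assuming the 1.x Constraint SPI; the shipped MaxMutationSize class may differ
+in detail.
+
+    import java.util.Collections;
+    import java.util.List;
+    import org.apache.accumulo.core.constraints.Constraint;
+    import org.apache.accumulo.core.data.Mutation;
+
+    public class MaxMutationSizeSketch implements Constraint {
+      // 1/256th of the tablet server's maximum heap, per the text above.
+      static final long MAX_SIZE = Runtime.getRuntime().maxMemory() >> 8;
+
+      @Override
+      public String getViolationDescription(short violationCode) {
+        return "mutation exceeded maximum size of " + MAX_SIZE;
+      }
+
+      @Override
+      public List<Short> check(Environment env, Mutation mutation) {
+        if (mutation.estimatedMemoryUsed() < MAX_SIZE)
+          return null; // null means no violations
+        return Collections.singletonList((short) 0);
+      }
+    }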

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/docs/regex.md
----------------------------------------------------------------------
diff --git a/docs/regex.md b/docs/regex.md
new file mode 100644
index 0000000..a53ec25
--- /dev/null
+++ b/docs/regex.md
@@ -0,0 +1,57 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+# Apache Accumulo Regex Example
+
+This example uses MapReduce and Accumulo to find items using regular expressions.
+This is accomplished using a map-only MapReduce job and a scan-time iterator.
+
+To run this example you will need some data in a table. The following will
+put a trivial amount of data into Accumulo using the Accumulo shell:
+
+    $ accumulo shell -u username -p password
+    Shell - Apache Accumulo Interactive Shell
+    - version: 1.5.0
+    - instance name: instance
+    - instance id: 00000000-0000-0000-0000-000000000000
+    -
+    - type 'help' for a list of available commands
+    -
+    username@instance> createtable input
+    username@instance> insert dogrow dogcf dogcq dogvalue
+    username@instance> insert catrow catcf catcq catvalue
+    username@instance> quit
+
+The RegexExample class sets an iterator on the scanner. This does pattern
+matching against each key/value in Accumulo and only returns matching items.
+It does this in parallel and stores the results in files in HDFS.
+
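+For reference, a hedged sketch of configuring the same scan-time iterator
+directly on a scanner through the Java API:
+
+    import org.apache.accumulo.core.client.IteratorSetting;
+    import org.apache.accumulo.core.client.ScannerBase;
+    import org.apache.accumulo.core.iterators.user.RegExFilter;
+
+    public class RegexScanConfig {
+      // Attach a RegExFilter so only keys whose row matches rowRegex are
+      // returned by the tablet servers.
+      public static void configure(ScannerBase scanner, String rowRegex) {
+        IteratorSetting setting = new IteratorSetting(50, "regex", RegExFilter.class);
+        // arguments: row, column family, column qualifier, value, orFields
+        RegExFilter.setRegexs(setting, rowRegex, null, null, null, false);
+        scanner.addScanIterator(setting);
+      }
+    }
+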
+The following will search for any rows in the input table that start with "dog":
+
+    $ tool.sh target/accumulo-examples.jar org.apache.accumulo.examples.mapreduce.RegexExample -u user -p passwd -i instance -t input --rowRegex 'dog.*' --output /tmp/output
+
+    $ hadoop fs -ls /tmp/output
+    Found 3 items
+    -rw-r--r--   1 username supergroup          0 2013-01-10 14:11 /tmp/output/_SUCCESS
+    drwxr-xr-x   - username supergroup          0 2013-01-10 14:10 /tmp/output/_logs
+    -rw-r--r--   1 username supergroup         51 2013-01-10 14:10 /tmp/output/part-m-00000
+
+We can see the output of our MapReduce job:
+
+    $ hadoop fs -text /tmp/output/part-m-00000
+    dogrow dogcf:dogcq [] 1357844987994 false	dogvalue
+
+

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/docs/reservations.md
----------------------------------------------------------------------
diff --git a/docs/reservations.md b/docs/reservations.md
new file mode 100644
index 0000000..2199fe2
--- /dev/null
+++ b/docs/reservations.md
@@ -0,0 +1,66 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+# Apache Accumulo Reservations Example
+
+This example shows running a simple reservation system implemented using
+conditional mutations. This system guarantees that only one concurrent user can
+reserve a resource. The example's reserve command allows multiple users to be
+specified. When this is done, it creates a separate reservation thread for each
+user. In the example below, threads are spun up for alice, bob, eve, mallory,
+and trent to reserve room06 on 20140101. Bob ends up getting the reservation,
+and everyone else is put on a wait list. The example code will take any string
+for what, when, and who.
+
+    $ ./bin/runex reservations.ARS
+    >connect test16 localhost root secret ars
+      connected
+    >
+      Commands :
+        reserve <what> <when> <who> {who}
+        cancel <what> <when> <who>
+        list <what> <when>
+    >reserve room06 20140101 alice bob eve mallory trent
+                       bob : RESERVED
+                   mallory : WAIT_LISTED
+                     alice : WAIT_LISTED
+                     trent : WAIT_LISTED
+                       eve : WAIT_LISTED
+    >list room06 20140101
+      Reservation holder : bob
+      Wait list : [mallory, alice, trent, eve]
+    >cancel room06 20140101 alice
+    >cancel room06 20140101 bob
+    >list room06 20140101
+      Reservation holder : mallory
+      Wait list : [trent, eve]
+    >quit
+
+Scanning the table in the Accumulo shell after running the example shows the
+following:
+
+    root@test16> table ars
+    root@test16 ars> scan
+    room06:20140101 res:0001 []    mallory
+    room06:20140101 res:0003 []    trent
+    room06:20140101 res:0004 []    eve
+    room06:20140101 tx:seq []    6
+
+The tx:seq column is incremented for each update to the row allowing for
+detection of concurrent changes. For an update to go through, the sequence
+number must not have changed since the data was read. If it does change,
+the conditional mutation will fail and the example code will retry.
+
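+A hedged sketch of that compare-and-set pattern using the conditional writer
+API (the slot naming here is hypothetical, not the example's exact scheme):
+
+    import org.apache.accumulo.core.client.AccumuloException;
+    import org.apache.accumulo.core.client.AccumuloSecurityException;
+    import org.apache.accumulo.core.client.ConditionalWriter;
+    import org.apache.accumulo.core.data.Condition;
+    import org.apache.accumulo.core.data.ConditionalMutation;
+
+    public class ReserveSketch {
+      // Returns true only if tx:seq still held the value we read, i.e. no
+      // concurrent writer modified the row between our read and this write.
+      static boolean tryReserve(ConditionalWriter writer, String row, long seq, String who)
+          throws AccumuloException, AccumuloSecurityException {
+        ConditionalMutation cm = new ConditionalMutation(row,
+            new Condition("tx", "seq").setValue(Long.toString(seq)));
+        cm.put("tx", "seq", Long.toString(seq + 1));    // bump the sequence number
+        cm.put("res", String.format("%04d", seq), who); // hypothetical slot naming
+        return writer.write(cm).getStatus() == ConditionalWriter.Status.ACCEPTED;
+      }
+    }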

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/docs/rgbalancer.md
----------------------------------------------------------------------
diff --git a/docs/rgbalancer.md b/docs/rgbalancer.md
new file mode 100644
index 0000000..7f897c0
--- /dev/null
+++ b/docs/rgbalancer.md
@@ -0,0 +1,159 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+# Apache Accumulo Balancer Example
+
+For some data access patterns, it's important to spread groups of tablets within
+a table out evenly.  Accumulo has a balancer that can do this using a regular
+expression to group tablets. This example shows how this balancer spreads 4
+groups of tablets within a table evenly across 17 tablet servers.
+
+Below, a table is created and splits are added.  For this example we would like
+all of the tablets where the split point has the same two digits to be on
+different tservers.  This gives us four groups of tablets: 01, 02, 03, and 04.
+
+    root@accumulo> createtable testRGB
+    root@accumulo testRGB> addsplits -t testRGB 01b 01m 01r 01z  02b 02m 02r 02z 03b 03m 03r 03z 04a 04b 04c 04d 04e 04f 04g 04h 04i 04j 04k 04l 04m 04n 04o 04p
+    root@accumulo testRGB> tables -l
+    accumulo.metadata    =>        !0
+    accumulo.replication =>      +rep
+    accumulo.root        =>        +r
+    testRGB              =>         2
+    trace                =>         1
+
+After adding the splits we look at the locations in the metadata table.
+
+    root@accumulo testRGB> scan -t accumulo.metadata -b 2; -e 2< -c loc
+    2;01b loc:34a5f6e086b000c []    ip-10-1-2-25:9997
+    2;01m loc:34a5f6e086b000c []    ip-10-1-2-25:9997
+    2;01r loc:14a5f6e079d0011 []    ip-10-1-2-15:9997
+    2;01z loc:14a5f6e079d000f []    ip-10-1-2-13:9997
+    2;02b loc:34a5f6e086b000b []    ip-10-1-2-26:9997
+    2;02m loc:14a5f6e079d000c []    ip-10-1-2-28:9997
+    2;02r loc:14a5f6e079d0012 []    ip-10-1-2-27:9997
+    2;02z loc:14a5f6e079d0012 []    ip-10-1-2-27:9997
+    2;03b loc:14a5f6e079d000d []    ip-10-1-2-21:9997
+    2;03m loc:14a5f6e079d000e []    ip-10-1-2-20:9997
+    2;03r loc:14a5f6e079d000d []    ip-10-1-2-21:9997
+    2;03z loc:14a5f6e079d000e []    ip-10-1-2-20:9997
+    2;04a loc:34a5f6e086b000b []    ip-10-1-2-26:9997
+    2;04b loc:14a5f6e079d0010 []    ip-10-1-2-17:9997
+    2;04c loc:14a5f6e079d0010 []    ip-10-1-2-17:9997
+    2;04d loc:24a5f6e07d3000c []    ip-10-1-2-16:9997
+    2;04e loc:24a5f6e07d3000d []    ip-10-1-2-29:9997
+    2;04f loc:24a5f6e07d3000c []    ip-10-1-2-16:9997
+    2;04g loc:24a5f6e07d3000a []    ip-10-1-2-14:9997
+    2;04h loc:14a5f6e079d000c []    ip-10-1-2-28:9997
+    2;04i loc:34a5f6e086b000d []    ip-10-1-2-19:9997
+    2;04j loc:34a5f6e086b000d []    ip-10-1-2-19:9997
+    2;04k loc:24a5f6e07d30009 []    ip-10-1-2-23:9997
+    2;04l loc:24a5f6e07d3000b []    ip-10-1-2-22:9997
+    2;04m loc:24a5f6e07d30009 []    ip-10-1-2-23:9997
+    2;04n loc:24a5f6e07d3000b []    ip-10-1-2-22:9997
+    2;04o loc:34a5f6e086b000a []    ip-10-1-2-18:9997
+    2;04p loc:24a5f6e07d30008 []    ip-10-1-2-24:9997
+    2< loc:24a5f6e07d30008 []    ip-10-1-2-24:9997
+
+Below, the information above is rearranged to show which tablet groups are on
+each tserver.  The four tablets in group 03 are on two tservers; ideally those
+tablets would be spread across 4 tservers.  Note that the default tablet (2<)
+was categorized as group 04 below.
+
+    ip-10-1-2-13:9997 01
+    ip-10-1-2-14:9997 04
+    ip-10-1-2-15:9997 01
+    ip-10-1-2-16:9997 04 04
+    ip-10-1-2-17:9997 04 04
+    ip-10-1-2-18:9997 04
+    ip-10-1-2-19:9997 04 04
+    ip-10-1-2-20:9997 03 03
+    ip-10-1-2-21:9997 03 03
+    ip-10-1-2-22:9997 04 04
+    ip-10-1-2-23:9997 04 04
+    ip-10-1-2-24:9997 04 04
+    ip-10-1-2-25:9997 01 01
+    ip-10-1-2-26:9997 02 04
+    ip-10-1-2-27:9997 02 02
+    ip-10-1-2-28:9997 02 04
+    ip-10-1-2-29:9997 04
+
+To remedy this situation, the RegexGroupBalancer is configured with the
+commands below.  The configured regular expression selects the first two digits
+from a tablet's end row as the group id.  Tablets that don't match and the
+default tablet are configured to be in group 04.
+
+    root@accumulo testRGB> config -t testRGB -s table.custom.balancer.group.regex.pattern=(\\d\\d).*
+    root@accumulo testRGB> config -t testRGB -s table.custom.balancer.group.regex.default=04
+    root@accumulo testRGB> config -t testRGB -s table.balancer=org.apache.accumulo.server.master.balancer.RegexGroupBalancer
+
+After waiting a little bit, look at the tablet locations again and all is good.
+
+    root@accumulo testRGB> scan -t accumulo.metadata -b 2; -e 2< -c loc
+    2;01b loc:34a5f6e086b000a []    ip-10-1-2-18:9997
+    2;01m loc:34a5f6e086b000c []    ip-10-1-2-25:9997
+    2;01r loc:14a5f6e079d0011 []    ip-10-1-2-15:9997
+    2;01z loc:14a5f6e079d000f []    ip-10-1-2-13:9997
+    2;02b loc:34a5f6e086b000b []    ip-10-1-2-26:9997
+    2;02m loc:14a5f6e079d000c []    ip-10-1-2-28:9997
+    2;02r loc:34a5f6e086b000d []    ip-10-1-2-19:9997
+    2;02z loc:14a5f6e079d0012 []    ip-10-1-2-27:9997
+    2;03b loc:24a5f6e07d3000d []    ip-10-1-2-29:9997
+    2;03m loc:24a5f6e07d30009 []    ip-10-1-2-23:9997
+    2;03r loc:14a5f6e079d000d []    ip-10-1-2-21:9997
+    2;03z loc:14a5f6e079d000e []    ip-10-1-2-20:9997
+    2;04a loc:34a5f6e086b000b []    ip-10-1-2-26:9997
+    2;04b loc:34a5f6e086b000c []    ip-10-1-2-25:9997
+    2;04c loc:14a5f6e079d0010 []    ip-10-1-2-17:9997
+    2;04d loc:14a5f6e079d000e []    ip-10-1-2-20:9997
+    2;04e loc:24a5f6e07d3000d []    ip-10-1-2-29:9997
+    2;04f loc:24a5f6e07d3000c []    ip-10-1-2-16:9997
+    2;04g loc:24a5f6e07d3000a []    ip-10-1-2-14:9997
+    2;04h loc:14a5f6e079d000c []    ip-10-1-2-28:9997
+    2;04i loc:14a5f6e079d0011 []    ip-10-1-2-15:9997
+    2;04j loc:34a5f6e086b000d []    ip-10-1-2-19:9997
+    2;04k loc:14a5f6e079d0012 []    ip-10-1-2-27:9997
+    2;04l loc:14a5f6e079d000f []    ip-10-1-2-13:9997
+    2;04m loc:24a5f6e07d30009 []    ip-10-1-2-23:9997
+    2;04n loc:24a5f6e07d3000b []    ip-10-1-2-22:9997
+    2;04o loc:34a5f6e086b000a []    ip-10-1-2-18:9997
+    2;04p loc:14a5f6e079d000d []    ip-10-1-2-21:9997
+    2< loc:24a5f6e07d30008 []    ip-10-1-2-24:9997
+
+Once again, the data above is transformed to make it easier to see which groups
+are on tservers.  The transformed data below shows that all groups are now
+evenly spread.
+
+    ip-10-1-2-13:9997 01 04
+    ip-10-1-2-14:9997    04
+    ip-10-1-2-15:9997 01 04
+    ip-10-1-2-16:9997    04
+    ip-10-1-2-17:9997    04
+    ip-10-1-2-18:9997 01 04
+    ip-10-1-2-19:9997 02 04
+    ip-10-1-2-20:9997 03 04
+    ip-10-1-2-21:9997 03 04
+    ip-10-1-2-22:9997    04
+    ip-10-1-2-23:9997 03 04
+    ip-10-1-2-24:9997    04
+    ip-10-1-2-25:9997 01 04
+    ip-10-1-2-26:9997 02 04
+    ip-10-1-2-27:9997 02 04
+    ip-10-1-2-28:9997 02 04
+    ip-10-1-2-29:9997 03 04
+
+If you need this functionality but a regular expression does not meet your
+needs, then extend GroupBalancer.  This allows you to specify a partitioning
+function in Java.  Use the RegexGroupBalancer source as an example.
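+
+As a hedged illustration, the partitioning logic such a subclass supplies
+might look like the following; the exact SPI hook and types vary across
+Accumulo versions, so treat this as a sketch of the logic only.
+
+    public class DigitGroupPartitioner {
+      // Map a tablet's end row to a group id: the first two digits when
+      // present, otherwise the configured default group.
+      static String groupFor(String endRow) {
+        if (endRow != null && endRow.matches("\\d\\d.*"))
+          return endRow.substring(0, 2); // e.g. "03" for end row "03m"
+        return "04";                     // default group, matching the config above
+      }
+    }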

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/docs/rowhash.md
----------------------------------------------------------------------
diff --git a/docs/rowhash.md b/docs/rowhash.md
new file mode 100644
index 0000000..57a5383
--- /dev/null
+++ b/docs/rowhash.md
@@ -0,0 +1,59 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+# Apache Accumulo RowHash Example
+
+This example shows a simple MapReduce job that reads from an Accumulo table and
+writes back into that table.
+
+To run this example you will need some data in a table. The following will
+put a trivial amount of data into Accumulo using the Accumulo shell:
+
+    $ accumulo shell -u username -p password
+    Shell - Apache Accumulo Interactive Shell
+    - version: 1.5.0
+    - instance name: instance
+    - instance id: 00000000-0000-0000-0000-000000000000
+    -
+    - type 'help' for a list of available commands
+    -
+    username@instance> createtable input
+    username@instance> insert a-row cf cq value
+    username@instance> insert b-row cf cq value
+    username@instance> quit
+
+The RowHash class will insert a hash for each row in the database if it contains a
+specified column. Here's how you run the MapReduce job:
+
+    $ tool.sh target/accumulo-examples.jar org.apache.accumulo.examples.mapreduce.RowHash -u user -p passwd -i instance -t input --column cf:cq
+
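+A hedged sketch of the map logic (not the exact RowHash source): for each
+matching key/value pair, write back an MD5 digest, base64 encoded, under a
+parallel column family.
+
+    import java.io.IOException;
+    import java.security.MessageDigest;
+    import java.security.NoSuchAlgorithmException;
+    import java.util.Base64;
+    import org.apache.accumulo.core.data.Key;
+    import org.apache.accumulo.core.data.Mutation;
+    import org.apache.accumulo.core.data.Value;
+    import org.apache.hadoop.io.Text;
+    import org.apache.hadoop.mapreduce.Mapper;
+
+    public class RowHashMapper extends Mapper<Key, Value, Text, Mutation> {
+      @Override
+      protected void map(Key key, Value value, Context context)
+          throws IOException, InterruptedException {
+        try {
+          byte[] digest = MessageDigest.getInstance("MD5").digest(value.get());
+          Mutation m = new Mutation(key.getRow());
+          m.put(key.getColumnFamily() + "-HASHTYPE", key.getColumnQualifier() + "-MD5BASE64",
+              Base64.getEncoder().encodeToString(digest));
+          context.write(null, m); // null table name => the job's default output table
+        } catch (NoSuchAlgorithmException e) {
+          throw new IOException(e);
+        }
+      }
+    }
+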
+Now we can scan the table and see the hashes:
+
+    $ accumulo shell -u username -p password
+    Shell - Apache Accumulo Interactive Shell
+    - version: 1.5.0
+    - instance name: instance
+    - instance id: 00000000-0000-0000-0000-000000000000
+    -
+    - type 'help' for a list of available commands
+    -
+    username@instance> scan -t input
+    a-row cf:cq []    value
+    a-row cf-HASHTYPE:cq-MD5BASE64 []    IGPBYI1uC6+AJJxC4r5YBA==
+    b-row cf:cq []    value
+    b-row cf-HASHTYPE:cq-MD5BASE64 []    IGPBYI1uC6+AJJxC4r5YBA==
+    username@instance>
+

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/docs/sample.md
----------------------------------------------------------------------
diff --git a/docs/sample.md b/docs/sample.md
new file mode 100644
index 0000000..9e5d429
--- /dev/null
+++ b/docs/sample.md
@@ -0,0 +1,191 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+# Apache Accumulo Sampling Example
+
+Basic Sampling Example
+----------------------
+
+Accumulo supports building a set of sample data that can be efficiently
+accessed by scanners.  What data is included in the sample set is configurable.
+Below, some data representing documents is inserted.
+
+    root@instance sampex> createtable sampex
+    root@instance sampex> insert 9255 doc content 'abcde'
+    root@instance sampex> insert 9255 doc url file://foo.txt
+    root@instance sampex> insert 8934 doc content 'accumulo scales'
+    root@instance sampex> insert 8934 doc url file://accumulo_notes.txt
+    root@instance sampex> insert 2317 doc content 'milk, eggs, bread, parmigiano-reggiano'
+    root@instance sampex> insert 2317 doc url file://groceries/9.txt
+    root@instance sampex> insert 3900 doc content 'EC2 ate my homework'
+    root@instance sampex> insert 3900 doc uril file://final_project.txt
+
+Below, the table sampex is configured to build a sample set.  The configuration
+causes Accumulo to include any row where `murmur3_32(row) % 3 == 0` in the
+table's sample data.
+
+    root@instance sampex> config -t sampex -s table.sampler.opt.hasher=murmur3_32
+    root@instance sampex> config -t sampex -s table.sampler.opt.modulus=3
+    root@instance sampex> config -t sampex -s table.sampler=org.apache.accumulo.core.client.sample.RowSampler
+
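+The same sampler configuration can be applied through the Java API; a hedged
+sketch, assuming a Connector named conn:
+
+    import org.apache.accumulo.core.client.Connector;
+    import org.apache.accumulo.core.client.sample.RowSampler;
+    import org.apache.accumulo.core.client.sample.SamplerConfiguration;
+
+    public class SampleConfigSketch {
+      // Rows where murmur3_32(row) % 3 == 0 land in the sample set.
+      static void configure(Connector conn) throws Exception {
+        SamplerConfiguration sc = new SamplerConfiguration(RowSampler.class.getName());
+        sc.addOption("hasher", "murmur3_32");
+        sc.addOption("modulus", "3");
+        conn.tableOperations().setSamplerConfiguration("sampex", sc);
+      }
+    }
+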
+Below, attempting to scan the sample returns an error.  This is because data
+was inserted before the sample set was configured.
+
+    root@instance sampex> scan --sample
+    2015-09-09 12:21:50,643 [shell.Shell] ERROR: org.apache.accumulo.core.client.SampleNotPresentException: Table sampex(ID:2) does not have sampling configured or built
+
+To remedy this problem, the following command flushes in-memory data and
+compacts any files that do not contain the correct sample data.
+
+    root@instance sampex> compact -t sampex --sf-no-sample
+
+After the compaction, the sample scan works.  
+
+    root@instance sampex> scan --sample
+    2317 doc:content []    milk, eggs, bread, parmigiano-reggiano
+    2317 doc:url []    file://groceries/9.txt
+
+The commands below show that updates to data in the sample are seen when
+scanning the sample.
+
+    root@instance sampex> insert 2317 doc content 'milk, eggs, bread, parmigiano-reggiano, butter'
+    root@instance sampex> scan --sample
+    2317 doc:content []    milk, eggs, bread, parmigiano-reggiano, butter
+    2317 doc:url []    file://groceries/9.txt
+
+In order to make scanning the sample fast, sample data is partitioned as data is
+written to Accumulo.  This means that if the sample configuration is changed,
+data written previously was partitioned using different criteria.  Accumulo
+will detect this situation and fail sample scans.  The commands below show this
+failure and fixing the problem with a compaction.
+
+    root@instance sampex> config -t sampex -s table.sampler.opt.modulus=2
+    root@instance sampex> scan --sample
+    2015-09-09 12:22:51,058 [shell.Shell] ERROR: org.apache.accumulo.core.client.SampleNotPresentException: Table sampex(ID:2) does not have sampling configured or built
+    root@instance sampex> compact -t sampex --sf-no-sample
+    2015-09-09 12:23:07,242 [shell.Shell] INFO : Compaction of table sampex started for given range
+    root@instance sampex> scan --sample
+    2317 doc:content []    milk, eggs, bread, parmigiano-reggiano
+    2317 doc:url []    file://groceries/9.txt
+    3900 doc:content []    EC2 ate my homework
+    3900 doc:uril []    file://final_project.txt
+    9255 doc:content []    abcde
+    9255 doc:url []    file://foo.txt
+
+The example above is replicated in a Java program using the Accumulo API.
+Below is the program name and the command to run it.
+
+    ./bin/runex sample.SampleExample -i instance -z localhost -u root -p secret
+
+The commands below look under the hood to give some insight into how this
+feature works.  The commands determine what files the sampex table is using.
+
+    root@instance sampex> tables -l
+    accumulo.metadata    =>        !0
+    accumulo.replication =>      +rep
+    accumulo.root        =>        +r
+    sampex               =>         2
+    trace                =>         1
+    root@instance sampex> scan -t accumulo.metadata -c file -b 2 -e 2<
+    2< file:hdfs://localhost:10000/accumulo/tables/2/default_tablet/A000000s.rf []    702,8
+
+Below shows running `accumulo rfile-info` on the file above.  This shows the
+RFile has a normal default locality group and a sample default locality group.
+The output also shows the configuration used to create the sample locality
+group.  The sample configuration within an RFile must match the table's sample
+configuration for sample scans to work.
+
+    $ accumulo rfile-info hdfs://localhost:10000/accumulo/tables/2/default_tablet/A000000s.rf
+    Reading file: hdfs://localhost:10000/accumulo/tables/2/default_tablet/A000000s.rf
+    RFile Version            : 8
+    
+    Locality group           : <DEFAULT>
+    	Start block            : 0
+    	Num   blocks           : 1
+    	Index level 0          : 35 bytes  1 blocks
+    	First key              : 2317 doc:content [] 1437672014986 false
+    	Last key               : 9255 doc:url [] 1437672014875 false
+    	Num entries            : 8
+    	Column families        : [doc]
+    
+    Sample Configuration     :
+    	Sampler class          : org.apache.accumulo.core.client.sample.RowSampler
+    	Sampler options        : {hasher=murmur3_32, modulus=2}
+
+    Sample Locality group    : <DEFAULT>
+    	Start block            : 0
+    	Num   blocks           : 1
+    	Index level 0          : 36 bytes  1 blocks
+    	First key              : 2317 doc:content [] 1437672014986 false
+    	Last key               : 9255 doc:url [] 1437672014875 false
+    	Num entries            : 6
+    	Column families        : [doc]
+    
+    Meta block     : BCFile.index
+          Raw size             : 4 bytes
+          Compressed size      : 12 bytes
+          Compression type     : gz
+
+    Meta block     : RFile.index
+          Raw size             : 309 bytes
+          Compressed size      : 176 bytes
+          Compression type     : gz
+
+
+Shard Sampling Example
+----------------------
+
+The [shard example][shard] shows how to index and search files using Accumulo.  That
+example indexes documents into a table named `shard`.  The indexing scheme used
+in that example places the document name in the column qualifier.  A useful
+sample of this indexing scheme should contain all data for any document in the
+sample.   To accomplish this, the following commands build a sample for the
+shard table based on the column qualifier.
+
+    root@instance shard> config -t shard -s table.sampler.opt.hasher=murmur3_32
+    root@instance shard> config -t shard -s table.sampler.opt.modulus=101
+    root@instance shard> config -t shard -s table.sampler.opt.qualifier=true
+    root@instance shard> config -t shard -s table.sampler=org.apache.accumulo.core.client.sample.RowColumnSampler
+    root@instance shard> compact -t shard --sf-no-sample -w
+    2015-07-23 15:00:09,280 [shell.Shell] INFO : Compacting table ...
+    2015-07-23 15:00:10,134 [shell.Shell] INFO : Compaction of table shard completed for given range
+
+After enabling sampling, the command below counts the number of documents in
+the sample containing the words `import` and `int`.     
+
+    $ ./bin/runex shard.Query --sample -i instance16 -z localhost -t shard -u root -p secret import int | fgrep '.java' | wc
+         11      11    1246
+
+The command below counts the total number of documents containing the words
+`import` and `int`.
+
+    $ ./bin/runex shard.Query -i instance16 -z localhost -t shard -u root -p secret import int | fgrep '.java' | wc
+       1085    1085  118175
+
+The count of 11 out of 1085 total is around what would be expected for a modulus
+of 101.  Querying the sample first provides a quick way to estimate how much data
+the real query will bring back.
+
+Another way sample data could be used with the shard example is with a
+specialized iterator.  In the examples source code there is an iterator named
+CutoffIntersectingIterator.  This iterator first checks how many documents are
+found in the sample data.  If too many documents are found in the sample data,
+it returns nothing.  Otherwise it proceeds to query the full data set.
+To experiment with this iterator, use the following command.  The
+`--sampleCutoff` option below will cause the query to return nothing if, based
+on the sample, it appears the query would return more than 1000 documents.
+
+    $ ./bin/runex shard.Query --sampleCutoff 1000 -i instance16 -z localhost -t shard -u root -p secret import int | fgrep '.java' | wc

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/docs/shard.md
----------------------------------------------------------------------
diff --git a/docs/shard.md b/docs/shard.md
new file mode 100644
index 0000000..b9460c7
--- /dev/null
+++ b/docs/shard.md
@@ -0,0 +1,66 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+# Apache Accumulo Shard Example
+
+Accumulo has an iterator called the intersecting iterator which supports querying a term index that is partitioned by
+document, or "sharded". This example shows how to use the intersecting iterator through these four programs:
+
+ * Index.java - Indexes a set of text files into an Accumulo table.
+ * Query.java - Finds documents containing a given set of terms.
+ * Reverse.java - Reads the index table and writes a map of documents to terms into another table.
+ * ContinuousQuery.java - Uses the table populated by Reverse.java to select N random terms per document. Then it continuously and randomly queries those terms.
+
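+For reference, a hedged sketch of the core of the query side (essentially
+what Query.java sets up):
+
+    import java.util.Collections;
+    import org.apache.accumulo.core.client.BatchScanner;
+    import org.apache.accumulo.core.client.IteratorSetting;
+    import org.apache.accumulo.core.data.Range;
+    import org.apache.accumulo.core.iterators.user.IntersectingIterator;
+    import org.apache.hadoop.io.Text;
+
+    public class ShardQuerySketch {
+      // Intersect the given terms on a BatchScanner over the sharded index;
+      // matching document ids come back in the column qualifier.
+      static void configure(BatchScanner bs, String... terms) {
+        Text[] termTexts = new Text[terms.length];
+        for (int i = 0; i < terms.length; i++)
+          termTexts[i] = new Text(terms[i]);
+        IteratorSetting ii = new IteratorSetting(20, "ii", IntersectingIterator.class);
+        IntersectingIterator.setColumnFamilies(ii, termTexts);
+        bs.addScanIterator(ii);
+        bs.setRanges(Collections.singleton(new Range()));
+      }
+    }
+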
+To run these example programs, create two tables like below.
+
+    username@instance> createtable shard
+    username@instance shard> createtable doc2term
+
+After creating the tables, index some files. The following command indexes all of the Java files in the Accumulo source code.
+
+    $ cd /local/username/workspace/accumulo/
+    $ find core/src server/src -name "*.java" | xargs ./bin/runex shard.Index -i instance -z zookeepers -t shard -u username -p password --partitions 30
+
+The following command queries the index to find all files containing 'foo' and 'bar'.
+
+    $ ./bin/runex shard.Query -i instance -z zookeepers -t shard -u username -p password foo bar
+    /local/username/workspace/accumulo/src/core/src/test/java/accumulo/core/security/ColumnVisibilityTest.java
+    /local/username/workspace/accumulo/src/core/src/test/java/accumulo/core/client/mock/MockConnectorTest.java
+    /local/username/workspace/accumulo/src/core/src/test/java/accumulo/core/security/VisibilityEvaluatorTest.java
+    /local/username/workspace/accumulo/src/server/src/main/java/accumulo/test/functional/RowDeleteTest.java
+    /local/username/workspace/accumulo/src/server/src/test/java/accumulo/server/logger/TestLogWriter.java
+    /local/username/workspace/accumulo/src/server/src/main/java/accumulo/test/functional/DeleteEverythingTest.java
+    /local/username/workspace/accumulo/src/core/src/test/java/accumulo/core/data/KeyExtentTest.java
+    /local/username/workspace/accumulo/src/server/src/test/java/accumulo/server/constraints/MetadataConstraintsTest.java
+    /local/username/workspace/accumulo/src/core/src/test/java/accumulo/core/iterators/WholeRowIteratorTest.java
+    /local/username/workspace/accumulo/src/server/src/test/java/accumulo/server/util/DefaultMapTest.java
+    /local/username/workspace/accumulo/src/server/src/test/java/accumulo/server/tabletserver/InMemoryMapTest.java
+
+In order to run ContinuousQuery, we need to run Reverse.java to populate doc2term.
+
+    $ ./bin/runex shard.Reverse -i instance -z zookeepers --shardTable shard --doc2Term doc2term -u username -p password
+
+Below, ContinuousQuery is run using 5 terms. It selects 5 random terms from each
+document, then continually picks one random set of 5 terms and queries with it.
+It prints the number of matching documents and the time in seconds.
+
+    $ ./bin/runex shard.ContinuousQuery -i instance -z zookeepers --shardTable shard --doc2Term doc2term -u username -p password --terms 5
+    [public, core, class, binarycomparable, b] 2  0.081
+    [wordtodelete, unindexdocument, doctablename, putdelete, insert] 1  0.041
+    [import, columnvisibilityinterpreterfactory, illegalstateexception, cv, columnvisibility] 1  0.049
+    [getpackage, testversion, util, version, 55] 1  0.048
+    [for, static, println, public, the] 55  0.211
+    [sleeptime, wrappingiterator, options, long, utilwaitthread] 1  0.057
+    [string, public, long, 0, wait] 12  0.132

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/docs/tabletofile.md
----------------------------------------------------------------------
diff --git a/docs/tabletofile.md b/docs/tabletofile.md
new file mode 100644
index 0000000..af69114
--- /dev/null
+++ b/docs/tabletofile.md
@@ -0,0 +1,59 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+# Apache Accumulo Table-to-File Example
+
+This example uses MapReduce to extract specified columns from an existing table.
+
+To run this example you will need some data in a table. The following will
+put a trivial amount of data into Accumulo using the Accumulo shell:
+
+    $ accumulo shell -u username -p password
+    Shell - Apache Accumulo Interactive Shell
+    - version: 1.5.0
+    - instance name: instance
+    - instance id: 00000000-0000-0000-0000-000000000000
+    -
+    - type 'help' for a list of available commands
+    -
+    username@instance> createtable input
+    username@instance> insert dog cf cq dogvalue
+    username@instance> insert cat cf cq catvalue
+    username@instance> insert junk family qualifier junkvalue
+    username@instance> quit
+
+The TableToFile class configures a map-only job to read the specified columns and
+write the key/value pairs to a file in HDFS.
+
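+A hedged sketch of the map logic (the real TableToFile may format its output
+lines differently):
+
+    import java.io.IOException;
+    import org.apache.accumulo.core.data.Key;
+    import org.apache.accumulo.core.data.Value;
+    import org.apache.hadoop.io.NullWritable;
+    import org.apache.hadoop.io.Text;
+    import org.apache.hadoop.mapreduce.Mapper;
+
+    public class TableToFileMapper extends Mapper<Key, Value, NullWritable, Text> {
+      @Override
+      protected void map(Key key, Value value, Context context)
+          throws IOException, InterruptedException {
+        // Render each matching key/value pair as one line of text for HDFS.
+        String line = key.getRow() + " " + key.getColumnFamily() + ":"
+            + key.getColumnQualifier() + " [" + key.getColumnVisibility() + "]\t" + value;
+        context.write(NullWritable.get(), new Text(line));
+      }
+    }
+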
+The following will extract the rows containing the column "cf:cq":
+
+    $ tool.sh target/accumulo-examples.jar org.apache.accumulo.examples.mapreduce.TableToFile -u user -p passwd -i instance -t input --columns cf:cq --output /tmp/output
+
+    $ hadoop fs -ls /tmp/output
+    -rw-r--r--   1 username supergroup          0 2013-01-10 14:44 /tmp/output/_SUCCESS
+    drwxr-xr-x   - username supergroup          0 2013-01-10 14:44 /tmp/output/_logs
+    drwxr-xr-x   - username supergroup          0 2013-01-10 14:44 /tmp/output/_logs/history
+    -rw-r--r--   1 username supergroup       9049 2013-01-10 14:44 /tmp/output/_logs/history/job_201301081658_0011_1357847072863_username_TableToFile%5F1357847071434
+    -rw-r--r--   1 username supergroup      26172 2013-01-10 14:44 /tmp/output/_logs/history/job_201301081658_0011_conf.xml
+    -rw-r--r--   1 username supergroup         50 2013-01-10 14:44 /tmp/output/part-m-00000
+
+We can see the output of our MapReduce job:
+
+    $ hadoop fs -text /tmp/output/part-m-00000
+    cat cf:cq []	catvalue
+    dog cf:cq []	dogvalue
+    $
+

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/docs/terasort.md
----------------------------------------------------------------------
diff --git a/docs/terasort.md b/docs/terasort.md
new file mode 100644
index 0000000..6038b97
--- /dev/null
+++ b/docs/terasort.md
@@ -0,0 +1,50 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+# Apache Accumulo Terasort Example
+
+This example uses MapReduce to generate random input data that will
+be sorted by storing it into Accumulo. It uses data very similar to the
+Hadoop terasort benchmark.
+
+Run this example with arguments describing the amount of data:
+
+    $ tool.sh target/accumulo-examples.jar org.apache.accumulo.examples.mapreduce.TeraSortIngest \
+    -i instance -z zookeepers -u user -p password \
+    --count 10 \
+    --minKeySize 10 \
+    --maxKeySize 10 \
+    --minValueSize 78 \
+    --maxValueSize 78 \
+    --table sort \
+    --splits 10 \
+
+After the MapReduce job completes, scan the data:
+
+    $ accumulo shell -u username -p password
+    username@instance> scan -t sort
+    +l-$$OE/ZH c:         4 []    GGGGGGGGGGWWWWWWWWWWMMMMMMMMMMCCCCCCCCCCSSSSSSSSSSIIIIIIIIIIYYYYYYYYYYOOOOOOOO
+    ,C)wDw//u= c:        10 []    CCCCCCCCCCSSSSSSSSSSIIIIIIIIIIYYYYYYYYYYOOOOOOOOOOEEEEEEEEEEUUUUUUUUUUKKKKKKKK
+    75@~?'WdUF c:         1 []    IIIIIIIIIIYYYYYYYYYYOOOOOOOOOOEEEEEEEEEEUUUUUUUUUUKKKKKKKKKKAAAAAAAAAAQQQQQQQQ
+    ;L+!2rT~hd c:         8 []    MMMMMMMMMMCCCCCCCCCCSSSSSSSSSSIIIIIIIIIIYYYYYYYYYYOOOOOOOOOOEEEEEEEEEEUUUUUUUU
+    LsS8)|.ZLD c:         5 []    OOOOOOOOOOEEEEEEEEEEUUUUUUUUUUKKKKKKKKKKAAAAAAAAAAQQQQQQQQQQGGGGGGGGGGWWWWWWWW
+    M^*dDE;6^< c:         9 []    UUUUUUUUUUKKKKKKKKKKAAAAAAAAAAQQQQQQQQQQGGGGGGGGGGWWWWWWWWWWMMMMMMMMMMCCCCCCCC
+    ^Eu)<n#kdP c:         3 []    YYYYYYYYYYOOOOOOOOOOEEEEEEEEEEUUUUUUUUUUKKKKKKKKKKAAAAAAAAAAQQQQQQQQQQGGGGGGGG
+    le5awB.$sm c:         6 []    WWWWWWWWWWMMMMMMMMMMCCCCCCCCCCSSSSSSSSSSIIIIIIIIIIYYYYYYYYYYOOOOOOOOOOEEEEEEEE
+    q__[fwhKFg c:         7 []    EEEEEEEEEEUUUUUUUUUUKKKKKKKKKKAAAAAAAAAAQQQQQQQQQQGGGGGGGGGGWWWWWWWWWWMMMMMMMM
+    w[o||:N&H, c:         2 []    QQQQQQQQQQGGGGGGGGGGWWWWWWWWWWMMMMMMMMMMCCCCCCCCCCSSSSSSSSSSIIIIIIIIIIYYYYYYYY
+
+Of course, a real benchmark would ingest millions of entries.

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/docs/visibility.md
----------------------------------------------------------------------
diff --git a/docs/visibility.md b/docs/visibility.md
new file mode 100644
index 0000000..0f6f35c
--- /dev/null
+++ b/docs/visibility.md
@@ -0,0 +1,131 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+# Apache Accumulo Visibility, Authorizations, and Permissions Example
+
+## Creating a new user
+
+    root@instance> createuser username
+    Enter new password for 'username': ********
+    Please confirm new password for 'username': ********
+    root@instance> user username
+    Enter password for user username: ********
+    username@instance> createtable vistest
+    06 10:48:47,931 [shell.Shell] ERROR: org.apache.accumulo.core.client.AccumuloSecurityException: Error PERMISSION_DENIED - User does not have permission to perform this action
+    username@instance> userpermissions
+    System permissions:
+
+    Table permissions (accumulo.metadata): Table.READ
+    username@instance>
+
+A user does not by default have permission to create a table.
+
+## Granting permissions to a user
+
+    username@instance> user root
+    Enter password for user root: ********
+    root@instance> grant -s System.CREATE_TABLE -u username
+    root@instance> user username
+    Enter password for user username: ********
+    username@instance> createtable vistest
+    username@instance> userpermissions
+    System permissions: System.CREATE_TABLE
+
+    Table permissions (accumulo.metadata): Table.READ
+    Table permissions (vistest): Table.READ, Table.WRITE, Table.BULK_IMPORT, Table.ALTER_TABLE, Table.GRANT, Table.DROP_TABLE
+    username@instance vistest>
+
+## Inserting data with visibilities
+
+Visibilities are boolean AND (&) and OR (|) combinations of authorization
+tokens. Authorization tokens are arbitrary strings taken from a restricted
+ASCII character set. Parentheses are required to specify order of operations
+in visibilities.
+
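+For reference, a hedged sketch of the same kind of insert through the Java
+API, assuming a BatchWriter open on the vistest table:
+
+    import org.apache.accumulo.core.client.BatchWriter;
+    import org.apache.accumulo.core.client.MutationsRejectedException;
+    import org.apache.accumulo.core.data.Mutation;
+    import org.apache.accumulo.core.security.ColumnVisibility;
+
+    public class VisibilityInsertSketch {
+      // ColumnVisibility parses the expression up front, so a malformed one
+      // (like mixing | and & without parentheses) fails fast.
+      static void insert(BatchWriter writer) throws MutationsRejectedException {
+        Mutation m = new Mutation("row");
+        m.put("f3", "q3", new ColumnVisibility("(apple&carrot)|broccoli|spinach"), "v3");
+        writer.addMutation(m);
+      }
+    }
+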
+    username@instance vistest> insert row f1 q1 v1 -l A
+    username@instance vistest> insert row f2 q2 v2 -l A&B
+    username@instance vistest> insert row f3 q3 v3 -l apple&carrot|broccoli|spinach
+    06 11:19:01,432 [shell.Shell] ERROR: org.apache.accumulo.core.util.BadArgumentException: cannot mix | and & near index 12
+    apple&carrot|broccoli|spinach
+                ^
+    username@instance vistest> insert row f3 q3 v3 -l (apple&carrot)|broccoli|spinach
+    username@instance vistest>
+
+## Scanning with authorizations
+
+Authorizations are sets of authorization tokens. Each Accumulo user has
+authorizations and each Accumulo scan has authorizations. Scan authorizations
+are only allowed to be a subset of the user's authorizations. By default, a
+user's authorizations set is empty.
+
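+In the Java API, scan authorizations are passed when creating a scanner; a
+hedged sketch, assuming a Connector named conn:
+
+    import java.util.Map.Entry;
+    import org.apache.accumulo.core.client.Connector;
+    import org.apache.accumulo.core.client.Scanner;
+    import org.apache.accumulo.core.client.TableNotFoundException;
+    import org.apache.accumulo.core.data.Key;
+    import org.apache.accumulo.core.data.Value;
+    import org.apache.accumulo.core.security.Authorizations;
+
+    public class AuthScanSketch {
+      // The requested authorizations must be a subset of the user's, or the
+      // scan fails with BAD_AUTHORIZATIONS.
+      static void scanWithA(Connector conn) throws TableNotFoundException {
+        Scanner scanner = conn.createScanner("vistest", new Authorizations("A"));
+        for (Entry<Key,Value> entry : scanner)
+          System.out.println(entry.getKey() + " -> " + entry.getValue());
+      }
+    }
+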
+    username@instance vistest> scan
+    username@instance vistest> scan -s A
+    06 11:43:14,951 [shell.Shell] ERROR: java.lang.RuntimeException: org.apache.accumulo.core.client.AccumuloSecurityException: Error BAD_AUTHORIZATIONS - The user does not have the specified authorizations assigned
+    username@instance vistest>
+
+## Setting authorizations for a user
+
+    username@instance vistest> setauths -s A
+    06 11:53:42,056 [shell.Shell] ERROR: org.apache.accumulo.core.client.AccumuloSecurityException: Error PERMISSION_DENIED - User does not have permission to perform this action
+    username@instance vistest>
+
+A user cannot set authorizations unless the user has the System.ALTER_USER permission.
+The root user has this permission.
+
+    username@instance vistest> user root
+    Enter password for user root: ********
+    root@instance vistest> setauths -s A -u username
+    root@instance vistest> user username
+    Enter password for user username: ********
+    username@instance vistest> scan -s A
+    row f1:q1 [A]    v1
+    username@instance vistest> scan
+    row f1:q1 [A]    v1
+    username@instance vistest>
+
+The default authorizations for a scan are the user's entire set of authorizations.
+
+    username@instance vistest> user root
+    Enter password for user root: ********
+    root@instance vistest> setauths -s A,B,broccoli -u username
+    root@instance vistest> user username
+    Enter password for user username: ********
+    username@instance vistest> scan
+    row f1:q1 [A]    v1
+    row f2:q2 [A&B]    v2
+    row f3:q3 [(apple&carrot)|broccoli|spinach]    v3
+    username@instance vistest> scan -s B
+    username@instance vistest>
+
+If you want, you can limit a user to inserting only data which they themselves
+can read. This is set with the following constraint.
+
+    username@instance vistest> user root
+    Enter password for user root: ******
+    root@instance vistest> config -t vistest -s table.constraint.1=org.apache.accumulo.core.security.VisibilityConstraint
+    root@instance vistest> user username
+    Enter password for user username: ********
+    username@instance vistest> insert row f4 q4 v4 -l spinach
+        Constraint Failures:
+            ConstraintViolationSummary(constrainClass:org.apache.accumulo.core.security.VisibilityConstraint, violationCode:2, violationDescription:User does not have authorization on column visibility, numberOfViolatingMutations:1)
+    username@instance vistest> insert row f4 q4 v4 -l spinach|broccoli
+    username@instance vistest> scan
+    row f1:q1 [A]    v1
+    row f2:q2 [A&B]    v2
+    row f3:q3 [(apple&carrot)|broccoli|spinach]    v3
+    row f4:q4 [spinach|broccoli]    v4
+    username@instance vistest>
+

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/pom.xml
----------------------------------------------------------------------
diff --git a/pom.xml b/pom.xml
new file mode 100644
index 0000000..133b740
--- /dev/null
+++ b/pom.xml
@@ -0,0 +1,185 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+
+  <parent>
+    <groupId>org.apache</groupId>
+    <artifactId>apache</artifactId>
+    <version>18</version>
+  </parent>
+
+  <groupId>org.apache.accumulo</groupId>
+  <artifactId>accumulo-examples</artifactId>
+  <version>2.0.0-SNAPSHOT</version>
+  <packaging>jar</packaging>
+
+  <name>Apache Accumulo Examples</name>
+  <description>Example code and corresponding documentation for using Apache Accumulo</description>
+
+  <properties>
+    <accumulo.version>1.8.0</accumulo.version>
+    <hadoop.version>2.6.4</hadoop.version>
+    <maven.compiler.source>1.8</maven.compiler.source>
+    <maven.compiler.target>1.8</maven.compiler.target>
+  </properties>
+
+  <dependencyManagement>
+    <dependencies>
+      <dependency>
+        <groupId>com.google.guava</groupId>
+        <artifactId>guava</artifactId>
+        <version>14.0.1</version>
+      </dependency>
+    </dependencies>
+  </dependencyManagement>
+
+  <build>
+    <pluginManagement>
+      <plugins>
+        <plugin>
+          <!-- Allows us to get the apache-ds bundle artifacts -->
+          <groupId>org.apache.felix</groupId>
+          <artifactId>maven-bundle-plugin</artifactId>
+          <version>3.0.1</version>
+        </plugin>
+      </plugins>
+    </pluginManagement>
+    <plugins>
+      <plugin>
+        <artifactId>maven-compiler-plugin</artifactId>
+        <version>3.1</version>
+        <configuration>
+          <source>${maven.compiler.source}</source>
+          <target>${maven.compiler.target}</target>
+          <optimize>true</optimize>
+          <encoding>UTF-8</encoding>
+        </configuration>
+      </plugin>
+      <plugin>
+        <!-- Allows us to get the apache-ds bundle artifacts -->
+        <groupId>org.apache.felix</groupId>
+        <artifactId>maven-bundle-plugin</artifactId>
+        <extensions>true</extensions>
+        <inherited>true</inherited>
+      </plugin>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-failsafe-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>run-integration-tests</id>
+            <goals>
+              <goal>integration-test</goal>
+              <goal>verify</goal>
+            </goals>
+          </execution>
+        </executions>
+      </plugin>
+      <plugin>
+        <groupId>org.codehaus.mojo</groupId>
+        <artifactId>exec-maven-plugin</artifactId>
+        <version>1.5.0</version>
+        <configuration>
+          <cleanupDaemonThreads>false</cleanupDaemonThreads>
+        </configuration>
+      </plugin>
+    </plugins>
+  </build>
+
+  <dependencies>
+    <dependency>
+      <groupId>com.beust</groupId>
+      <artifactId>jcommander</artifactId>
+      <version>1.48</version>
+    </dependency>
+    <dependency>
+      <groupId>com.google.auto.service</groupId>
+      <artifactId>auto-service</artifactId>
+      <version>1.0-rc2</version>
+      <optional>true</optional>
+    </dependency>
+    <dependency>
+      <groupId>commons-cli</groupId>
+      <artifactId>commons-cli</artifactId>
+      <version>1.2</version>
+    </dependency>
+    <dependency>
+      <groupId>commons-configuration</groupId>
+      <artifactId>commons-configuration</artifactId>
+      <version>1.6</version>
+    </dependency>
+    <dependency>
+      <groupId>jline</groupId>
+      <artifactId>jline</artifactId>
+      <version>2.11</version>
+    </dependency>
+    <dependency>
+      <groupId>log4j</groupId>
+      <artifactId>log4j</artifactId>
+      <version>1.2.17</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.accumulo</groupId>
+      <artifactId>accumulo-core</artifactId>
+      <version>${accumulo.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.accumulo</groupId>
+      <artifactId>accumulo-fate</artifactId>
+      <version>${accumulo.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.accumulo</groupId>
+      <artifactId>accumulo-shell</artifactId>
+      <version>${accumulo.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.accumulo</groupId>
+      <artifactId>accumulo-test</artifactId>
+      <version>${accumulo.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.accumulo</groupId>
+      <artifactId>accumulo-tracer</artifactId>
+      <version>${accumulo.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-client</artifactId>
+      <version>${hadoop.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.htrace</groupId>
+      <artifactId>htrace-core</artifactId>
+      <version>3.1.0-incubating</version>
+    </dependency>
+    <dependency>
+      <groupId>commons-httpclient</groupId>
+      <artifactId>commons-httpclient</artifactId>
+      <version>3.1</version>
+      <scope>test</scope>
+    </dependency>
+    <dependency>
+      <groupId>junit</groupId>
+      <artifactId>junit</artifactId>
+      <version>4.12</version>
+      <scope>test</scope>
+    </dependency>
+  </dependencies>
+</project>

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/src/main/java/org/apache/accumulo/examples/client/CountingVerifyingReceiver.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/accumulo/examples/client/CountingVerifyingReceiver.java b/src/main/java/org/apache/accumulo/examples/client/CountingVerifyingReceiver.java
new file mode 100644
index 0000000..51fc370
--- /dev/null
+++ b/src/main/java/org/apache/accumulo/examples/client/CountingVerifyingReceiver.java
@@ -0,0 +1,64 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.examples.client;
+
+import static java.nio.charset.StandardCharsets.UTF_8;
+
+import java.util.Arrays;
+import java.util.HashMap;
+
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Value;
+import org.apache.hadoop.io.Text;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Internal class used to verify validity of data read.
+ */
+class CountingVerifyingReceiver {
+  private static final Logger log = LoggerFactory.getLogger(CountingVerifyingReceiver.class);
+
+  long count = 0;
+  int expectedValueSize = 0;
+  HashMap<Text,Boolean> expectedRows;
+
+  CountingVerifyingReceiver(HashMap<Text,Boolean> expectedRows, int expectedValueSize) {
+    this.expectedRows = expectedRows;
+    this.expectedValueSize = expectedValueSize;
+  }
+
+  public void receive(Key key, Value value) {
+
+    String row = key.getRow().toString();
+    long rowid = Integer.parseInt(row.split("_")[1]);
+
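+    // regenerate the value this row id should carry, then compare it with what was read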
+    byte[] expectedValue = RandomBatchWriter.createValue(rowid, expectedValueSize);
+
+    if (!Arrays.equals(expectedValue, value.get())) {
+      log.error("Got unexpected value for " + key + " expected : " + new String(expectedValue, UTF_8) + " got : " + new String(value.get(), UTF_8));
+    }
+
+    if (!expectedRows.containsKey(key.getRow())) {
+      log.error("Got unexpected key " + key);
+    } else {
+      expectedRows.put(key.getRow(), true);
+    }
+
+    count++;
+  }
+}

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/src/main/java/org/apache/accumulo/examples/client/Flush.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/accumulo/examples/client/Flush.java b/src/main/java/org/apache/accumulo/examples/client/Flush.java
new file mode 100644
index 0000000..1227b36
--- /dev/null
+++ b/src/main/java/org/apache/accumulo/examples/client/Flush.java
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.examples.client;
+
+import org.apache.accumulo.core.cli.ClientOnRequiredTable;
+import org.apache.accumulo.core.client.Connector;
+
+/**
+ * Simple example for using tableOperations() (like create, delete, flush, etc.).
+ */
+public class Flush {
+
+  public static void main(String[] args) {
+    ClientOnRequiredTable opts = new ClientOnRequiredTable();
+    opts.parseArgs(Flush.class.getName(), args);
+    try {
+      Connector connector = opts.getConnector();
+      connector.tableOperations().flush(opts.getTableName(), null, null, true);
+    } catch (Exception e) {
+      throw new RuntimeException(e);
+    }
+  }
+}

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/src/main/java/org/apache/accumulo/examples/client/RandomBatchScanner.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/accumulo/examples/client/RandomBatchScanner.java b/src/main/java/org/apache/accumulo/examples/client/RandomBatchScanner.java
new file mode 100644
index 0000000..9b0c519
--- /dev/null
+++ b/src/main/java/org/apache/accumulo/examples/client/RandomBatchScanner.java
@@ -0,0 +1,194 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.examples.client;
+
+import static org.apache.accumulo.examples.client.RandomBatchWriter.abs;
+
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map.Entry;
+import java.util.Random;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.accumulo.core.cli.BatchScannerOpts;
+import org.apache.accumulo.core.cli.ClientOnRequiredTable;
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.BatchScanner;
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Range;
+import org.apache.accumulo.core.data.Value;
+import org.apache.hadoop.io.Text;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.beust.jcommander.Parameter;
+
+/**
+ * Simple example for reading random batches of data from Accumulo.
+ */
+public class RandomBatchScanner {
+  private static final Logger log = LoggerFactory.getLogger(RandomBatchScanner.class);
+
+  /**
+   * Generate a number of ranges, each covering a single random row.
+   *
+   * @param num
+   *          the number of ranges to generate
+   * @param min
+   *          the minimum row that will be generated
+   * @param max
+   *          the maximum row that will be generated
+   * @param r
+   *          a random number generator
+   * @param ranges
+   *          a set in which to store the generated ranges
+   * @param expectedRows
+   *          a map in which to store the rows covered by the ranges (initially mapped to false)
+   */
+  static void generateRandomQueries(int num, long min, long max, Random r, HashSet<Range> ranges, HashMap<Text,Boolean> expectedRows) {
+    log.info(String.format("Generating %,d random queries...", num));
+    while (ranges.size() < num) {
+      long rowid = (abs(r.nextLong()) % (max - min)) + min;
+
+      Text row1 = new Text(String.format("row_%010d", rowid));
+
+      Range range = new Range(new Text(row1));
+      ranges.add(range);
+      expectedRows.put(row1, false);
+    }
+
+    log.info("finished");
+  }
+
+  /**
+   * Prints a count of the number of rows mapped to false.
+   *
+   * @return boolean indicating "were all the rows found?"
+   */
+  private static boolean checkAllRowsFound(HashMap<Text,Boolean> expectedRows) {
+    int count = 0;
+    boolean allFound = true;
+    for (Entry<Text,Boolean> entry : expectedRows.entrySet())
+      if (!entry.getValue())
+        count++;
+
+    if (count > 0) {
+      log.warn("Did not find " + count + " rows");
+      allFound = false;
+    }
+    return allFound;
+  }
+
+  /**
+   * Generates a number of random queries, verifies that the key/value pairs returned were in the queried ranges and that the values were generated by
+   * {@link RandomBatchWriter#createValue(long, int)}. Prints information about the results.
+   *
+   * @param num
+   *          the number of queries to generate
+   * @param min
+   *          the min row to query
+   * @param max
+   *          the max row to query
+   * @param evs
+   *          the expected size of the values
+   * @param r
+   *          a random number generator
+   * @param tsbr
+   *          a batch scanner
+   * @return boolean indicating "did the queries go fine?"
+   */
+  static boolean doRandomQueries(int num, long min, long max, int evs, Random r, BatchScanner tsbr) {
+
+    HashSet<Range> ranges = new HashSet<>(num);
+    HashMap<Text,Boolean> expectedRows = new HashMap<>();
+
+    generateRandomQueries(num, min, max, r, ranges, expectedRows);
+
+    tsbr.setRanges(ranges);
+
+    CountingVerifyingReceiver receiver = new CountingVerifyingReceiver(expectedRows, evs);
+
+    long t1 = System.currentTimeMillis();
+
+    for (Entry<Key,Value> entry : tsbr) {
+      receiver.receive(entry.getKey(), entry.getValue());
+    }
+
+    long t2 = System.currentTimeMillis();
+
+    log.info(String.format("%6.2f lookups/sec %6.2f secs%n", num / ((t2 - t1) / 1000.0), ((t2 - t1) / 1000.0)));
+    log.info(String.format("num results : %,d%n", receiver.count));
+
+    return checkAllRowsFound(expectedRows);
+  }
+
+  public static class Opts extends ClientOnRequiredTable {
+    @Parameter(names = "--min", description = "miniumum row that will be generated")
+    long min = 0;
+    @Parameter(names = "--max", description = "maximum ow that will be generated")
+    long max = 0;
+    @Parameter(names = "--num", required = true, description = "number of ranges to generate")
+    int num = 0;
+    @Parameter(names = "--size", required = true, description = "size of the value to write")
+    int size = 0;
+    @Parameter(names = "--seed", description = "seed for pseudo-random number generator")
+    Long seed = null;
+  }
+
+  /**
+   * Scans a specified number of random ranges in Accumulo using a {@link BatchScanner}. Completes the scans twice to compare the times for a fresh query with
+   * those for a repeated query which has cached metadata and connections already established.
+   */
+  public static void main(String[] args) throws AccumuloException, AccumuloSecurityException, TableNotFoundException {
+    Opts opts = new Opts();
+    BatchScannerOpts bsOpts = new BatchScannerOpts();
+    opts.parseArgs(RandomBatchScanner.class.getName(), args, bsOpts);
+
+    Connector connector = opts.getConnector();
+    BatchScanner batchReader = connector.createBatchScanner(opts.getTableName(), opts.auths, bsOpts.scanThreads);
+    batchReader.setTimeout(bsOpts.scanTimeout, TimeUnit.MILLISECONDS);
+
+    Random r;
+    if (opts.seed == null)
+      r = new Random();
+    else
+      r = new Random(opts.seed);
+
+    // do one cold
+    boolean status = doRandomQueries(opts.num, opts.min, opts.max, opts.size, r, batchReader);
+
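+    // hint a GC between runs so garbage from the cold run doesn't skew the hot timing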
+    System.gc();
+    System.gc();
+    System.gc();
+
+    if (opts.seed == null)
+      r = new Random();
+    else
+      r = new Random(opts.seed);
+
+    // do one hot (connections already established, metadata table cached)
+    status = status && doRandomQueries(opts.num, opts.min, opts.max, opts.size, r, batchReader);
+
+    batchReader.close();
+    if (!status) {
+      System.exit(1);
+    }
+  }
+}

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/src/main/java/org/apache/accumulo/examples/client/RandomBatchWriter.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/accumulo/examples/client/RandomBatchWriter.java b/src/main/java/org/apache/accumulo/examples/client/RandomBatchWriter.java
new file mode 100644
index 0000000..b1f0d74
--- /dev/null
+++ b/src/main/java/org/apache/accumulo/examples/client/RandomBatchWriter.java
@@ -0,0 +1,178 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.examples.client;
+
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map.Entry;
+import java.util.Random;
+import java.util.Set;
+
+import org.apache.accumulo.core.cli.BatchWriterOpts;
+import org.apache.accumulo.core.cli.ClientOnRequiredTable;
+import org.apache.accumulo.core.client.AccumuloException;
+import org.apache.accumulo.core.client.AccumuloSecurityException;
+import org.apache.accumulo.core.client.BatchWriter;
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.MutationsRejectedException;
+import org.apache.accumulo.core.client.TableNotFoundException;
+import org.apache.accumulo.core.client.security.SecurityErrorCode;
+import org.apache.accumulo.core.data.Mutation;
+import org.apache.accumulo.core.data.TabletId;
+import org.apache.accumulo.core.data.Value;
+import org.apache.accumulo.core.security.ColumnVisibility;
+import org.apache.hadoop.io.Text;
+
+import com.beust.jcommander.Parameter;
+
+/**
+ * Simple example for writing random data to Accumulo.
+ *
+ * The rows of the entries will be randomly generated numbers between a specified min and max (prefixed by "row_"). The column family will be "foo" and the
+ * column qualifier will be "1". The values will be random byte arrays of a specified size.
+ */
+public class RandomBatchWriter {
+
+  /**
+   * Creates a random byte array of specified size using the specified seed.
+   *
+   * @param rowid
+   *          the seed to use for the random number generator
+   * @param dataSize
+   *          the size of the array
+   * @return a random byte array
+   */
+  public static byte[] createValue(long rowid, int dataSize) {
+    Random r = new Random(rowid);
+    byte[] value = new byte[dataSize];
+
+    r.nextBytes(value);
+
+    // transform to printable chars
+    for (int j = 0; j < value.length; j++) {
+      value[j] = (byte) (((0xff & value[j]) % 92) + ' ');
+    }
+
+    return value;
+  }
+
+  /**
+   * Creates a mutation on a specified row with column family "foo", column qualifier "1", specified visibility, and a random value of specified size.
+   *
+   * @param rowid
+   *          the row of the mutation
+   * @param dataSize
+   *          the size of the random value
+   * @param visibility
+   *          the visibility of the entry to insert
+   * @return a mutation
+   */
+  public static Mutation createMutation(long rowid, int dataSize, ColumnVisibility visibility) {
+    Text row = new Text(String.format("row_%010d", rowid));
+
+    Mutation m = new Mutation(row);
+
+    // create a random value that is a function of the
+    // row id for verification purposes
+    byte[] value = createValue(rowid, dataSize);
+
+    m.put(new Text("foo"), new Text("1"), visibility, new Value(value));
+
+    return m;
+  }
+
+  static class Opts extends ClientOnRequiredTable {
+    @Parameter(names = "--num", required = true)
+    int num = 0;
+    @Parameter(names = "--min")
+    long min = 0;
+    @Parameter(names = "--max")
+    long max = Long.MAX_VALUE;
+    @Parameter(names = "--size", required = true, description = "size of the value to write")
+    int size = 0;
+    @Parameter(names = "--vis", converter = VisibilityConverter.class)
+    ColumnVisibility visibility = new ColumnVisibility("");
+    @Parameter(names = "--seed", description = "seed for pseudo-random number generator")
+    Long seed = null;
+  }
+
+  public static long abs(long l) {
+    l = Math.abs(l); // abs(Long.MIN_VALUE) == Long.MIN_VALUE...
+    if (l < 0)
+      return 0;
+    return l;
+  }
+
+  /**
+   * Writes a specified number of entries to Accumulo using a {@link BatchWriter}.
+   */
+  public static void main(String[] args) throws AccumuloException, AccumuloSecurityException, TableNotFoundException {
+    Opts opts = new Opts();
+    BatchWriterOpts bwOpts = new BatchWriterOpts();
+    opts.parseArgs(RandomBatchWriter.class.getName(), args, bwOpts);
+    if ((opts.max - opts.min) < 1L * opts.num) { // right-side multiplied by 1L to convert to long in a way that doesn't trigger FindBugs
+      System.err.println(String.format("You must specify a min and a max that allow for at least num possible values. "
+          + "For example, you requested %d rows, but a min of %d and a max of %d (exclusive), which only allows for %d rows.", opts.num, opts.min, opts.max,
+          (opts.max - opts.min)));
+      System.exit(1);
+    }
+    Random r;
+    if (opts.seed == null)
+      r = new Random();
+    else {
+      r = new Random(opts.seed);
+    }
+    Connector connector = opts.getConnector();
+    BatchWriter bw = connector.createBatchWriter(opts.getTableName(), bwOpts.getBatchWriterConfig());
+
+    // reuse the ColumnVisibility object to improve performance
+    ColumnVisibility cv = opts.visibility;
+
+    // Generate num unique row ids in the given range
+    HashSet<Long> rowids = new HashSet<>(opts.num);
+    while (rowids.size() < opts.num) {
+      rowids.add((abs(r.nextLong()) % (opts.max - opts.min)) + opts.min);
+    }
+    for (long rowid : rowids) {
+      Mutation m = createMutation(rowid, opts.size, cv);
+      bw.addMutation(m);
+    }
+
+    try {
+      bw.close();
+    } catch (MutationsRejectedException e) {
+      if (e.getSecurityErrorCodes().size() > 0) {
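+        // regroup the security error codes from per-tablet to per-table for a readable message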
+        HashMap<String,Set<SecurityErrorCode>> tables = new HashMap<>();
+        for (Entry<TabletId,Set<SecurityErrorCode>> ke : e.getSecurityErrorCodes().entrySet()) {
+          String tableId = ke.getKey().getTableId().toString();
+          Set<SecurityErrorCode> secCodes = tables.get(tableId);
+          if (secCodes == null) {
+            secCodes = new HashSet<>();
+            tables.put(tableId, secCodes);
+          }
+          secCodes.addAll(ke.getValue());
+        }
+        System.err.println("ERROR : Not authorized to write to tables : " + tables);
+      }
+
+      if (e.getConstraintViolationSummaries().size() > 0) {
+        System.err.println("ERROR : Constraint violations occurred : " + e.getConstraintViolationSummaries());
+      }
+      System.exit(1);
+    }
+  }
+}

http://git-wip-us.apache.org/repos/asf/accumulo-examples/blob/d96c6d96/src/main/java/org/apache/accumulo/examples/client/ReadWriteExample.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/accumulo/examples/client/ReadWriteExample.java b/src/main/java/org/apache/accumulo/examples/client/ReadWriteExample.java
new file mode 100644
index 0000000..0e63370
--- /dev/null
+++ b/src/main/java/org/apache/accumulo/examples/client/ReadWriteExample.java
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.accumulo.examples.client;
+
+import java.util.Map.Entry;
+import java.util.SortedSet;
+import java.util.TreeSet;
+
+import org.apache.accumulo.core.cli.ClientOnDefaultTable;
+import org.apache.accumulo.core.cli.ScannerOpts;
+import org.apache.accumulo.core.client.BatchWriter;
+import org.apache.accumulo.core.client.BatchWriterConfig;
+import org.apache.accumulo.core.client.Connector;
+import org.apache.accumulo.core.client.Durability;
+import org.apache.accumulo.core.client.Scanner;
+import org.apache.accumulo.core.client.impl.DurabilityImpl;
+import org.apache.accumulo.core.data.Key;
+import org.apache.accumulo.core.data.Mutation;
+import org.apache.accumulo.core.data.Value;
+import org.apache.accumulo.core.security.Authorizations;
+import org.apache.accumulo.core.security.ColumnVisibility;
+import org.apache.accumulo.core.util.ByteArraySet;
+import org.apache.hadoop.io.Text;
+
+import com.beust.jcommander.IStringConverter;
+import com.beust.jcommander.Parameter;
+
+public class ReadWriteExample {
+  // defaults
+  private static final String DEFAULT_AUTHS = "LEVEL1,GROUP1";
+  private static final String DEFAULT_TABLE_NAME = "test";
+
+  private Connector conn;
+
+  static class DurabilityConverter implements IStringConverter<Durability> {
+    @Override
+    public Durability convert(String value) {
+      return DurabilityImpl.fromString(value);
+    }
+  }
+
+  static class Opts extends ClientOnDefaultTable {
+    @Parameter(names = {"-C", "--createtable"}, description = "create table before doing anything")
+    boolean createtable = false;
+    @Parameter(names = {"-D", "--deletetable"}, description = "delete table when finished")
+    boolean deletetable = false;
+    @Parameter(names = {"-c", "--create"}, description = "create entries before any deletes")
+    boolean createEntries = false;
+    @Parameter(names = {"-r", "--read"}, description = "read entries after any creates/deletes")
+    boolean readEntries = false;
+    @Parameter(names = {"-d", "--delete"}, description = "delete entries after any creates")
+    boolean deleteEntries = false;
+    @Parameter(names = {"--durability"}, description = "durability used for writes (none, log, flush or sync)", converter = DurabilityConverter.class)
+    Durability durability = Durability.DEFAULT;
+
+    public Opts() {
+      super(DEFAULT_TABLE_NAME);
+      auths = new Authorizations(DEFAULT_AUTHS.split(","));
+    }
+  }
+
+  // hidden constructor
+  private ReadWriteExample() {}
+
+  private void execute(Opts opts, ScannerOpts scanOpts) throws Exception {
+    conn = opts.getConnector();
+
+    // add the authorizations to the user
+    Authorizations userAuthorizations = conn.securityOperations().getUserAuthorizations(opts.getPrincipal());
+    ByteArraySet auths = new ByteArraySet(userAuthorizations.getAuthorizations());
+    auths.addAll(opts.auths.getAuthorizations());
+    if (!auths.isEmpty())
+      conn.securityOperations().changeUserAuthorizations(opts.getPrincipal(), new Authorizations(auths));
+
+    // create table
+    if (opts.createtable) {
+      SortedSet<Text> partitionKeys = new TreeSet<>();
+      for (int i = Byte.MIN_VALUE; i < Byte.MAX_VALUE; i++)
+        partitionKeys.add(new Text(new byte[] {(byte) i}));
+      conn.tableOperations().create(opts.getTableName());
+      conn.tableOperations().addSplits(opts.getTableName(), partitionKeys);
+    }
+
+    // send mutations
+    createEntries(opts);
+
+    // read entries
+    if (opts.readEntries) {
+      // Note that the user must have been granted the specified scan authorizations
+      // by an administrator before this scan will return any data
+      Scanner scanner = conn.createScanner(opts.getTableName(), opts.auths);
+      scanner.setBatchSize(scanOpts.scanBatchSize);
+      for (Entry<Key,Value> entry : scanner)
+        System.out.println(entry.getKey().toString() + " -> " + entry.getValue().toString());
+    }
+
+    // delete table
+    if (opts.deletetable)
+      conn.tableOperations().delete(opts.getTableName());
+  }
+
+  private void createEntries(Opts opts) throws Exception {
+    if (opts.createEntries || opts.deleteEntries) {
+      BatchWriterConfig cfg = new BatchWriterConfig();
+      cfg.setDurability(opts.durability);
+      BatchWriter writer = conn.createBatchWriter(opts.getTableName(), cfg);
+      ColumnVisibility cv = new ColumnVisibility(opts.auths.toString().replace(',', '|'));
+
+      Text cf = new Text("datatypes");
+      Text cq = new Text("xml");
+      byte[] row = {'h', 'e', 'l', 'l', 'o', '\0'};
+      byte[] value = {'w', 'o', 'r', 'l', 'd', '\0'};
+
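+      // overwriting the trailing byte with the loop index yields ten distinct rows and values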
+      for (int i = 0; i < 10; i++) {
+        row[row.length - 1] = (byte) i;
+        Mutation m = new Mutation(new Text(row));
+        if (opts.deleteEntries) {
+          m.putDelete(cf, cq, cv);
+        }
+        if (opts.createEntries) {
+          value[value.length - 1] = (byte) i;
+          m.put(cf, cq, cv, new Value(value));
+        }
+        writer.addMutation(m);
+      }
+      writer.close();
+    }
+  }
+
+  public static void main(String[] args) throws Exception {
+    ReadWriteExample rwe = new ReadWriteExample();
+    Opts opts = new Opts();
+    ScannerOpts scanOpts = new ScannerOpts();
+    opts.parseArgs(ReadWriteExample.class.getName(), args, scanOpts);
+    rwe.execute(opts, scanOpts);
+  }
+}

