accumulo-commits mailing list archives

From mwa...@apache.org
Subject [accumulo-website] branch master updated: ACCUMULO-4784 Updating docs with Connector builder (#56)
Date Tue, 20 Feb 2018 19:04:19 GMT
This is an automated email from the ASF dual-hosted git repository.

mwalch pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/accumulo-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 9ce0c29  ACCUMULO-4784 Updating docs with Connector builder (#56)
9ce0c29 is described below

commit 9ce0c2901d6342354ca5a0180e544e0bea30969f
Author: Mike Walch <mwalch@apache.org>
AuthorDate: Tue Feb 20 14:04:17 2018 -0500

    ACCUMULO-4784 Updating docs with Connector builder (#56)
---
 .../administration/configuration-management.md     |  22 +-
 _docs-2-0/administration/in-depth-install.md       |  28 +--
 _docs-2-0/administration/kerberos.md               |  20 +-
 _docs-2-0/administration/properties.md             |  78 +++----
 _docs-2-0/administration/ssl.md                    |  28 +--
 _docs-2-0/administration/tracing.md                |  13 +-
 _docs-2-0/development/client-properties.md         |  40 ++++
 _docs-2-0/development/development_tools.md         |  10 +-
 _docs-2-0/development/mapreduce.md                 |  30 +--
 _docs-2-0/development/proxy.md                     |   4 +-
 _docs-2-0/development/security.md                  |   6 +-
 _docs-2-0/getting-started/clients.md               | 236 +++++++++++++--------
 12 files changed, 300 insertions(+), 215 deletions(-)

diff --git a/_docs-2-0/administration/configuration-management.md b/_docs-2-0/administration/configuration-management.md
index 0e6ffe2..69584f3 100644
--- a/_docs-2-0/administration/configuration-management.md
+++ b/_docs-2-0/administration/configuration-management.md
@@ -4,9 +4,17 @@ category: administration
 order: 2
 ---
 
-## Setting Configuration
+Configuration is managed differently for Accumulo clients and servers.
 
-Accumulo is configured using [properties][props] whose values can be set in the following locations (with increasing precedence):
+## Client Configuration
+
+Accumulo clients are configured when [the Connector is built][client-conn], either using builder methods or an `accumulo-client.properties`
+file containing [client properties][client-props].
+
+## Server Configuration
+
+Accumulo services (i.e. master, tablet server, monitor, etc.) are configured using [server properties][props] whose values can be
+set in the following locations (with increasing precedence):
 
 1. Default values
 2. accumulo-site.xml (overrides defaults)
@@ -18,19 +26,19 @@ The configuration locations above are described in detail below.
 
 ### Default values
 
-All [properties][props] have a default value that is listed for each property on the [properties][props] page. Default values are set in the source code.
+All [server properties][props] have a default value that is listed for each property on the [properties][props] page. Default values are set in the source code.
 While default values have the lowest precedence, they are usually optimal.  However, there are cases where a change can increase query and ingest performance.
 
 ### accumulo-site.xml
 
-Setting [properties][props] in accumulo-site.xml will override their default value. If you are running Accumulo on a cluster, any updates to accumulo-site.xml must
+Setting [server properties][props] in accumulo-site.xml will override their default value. If you are running Accumulo on a cluster, any updates to accumulo-site.xml must
 be synced across the cluster. Accumulo processes (master, tserver, etc) read their local accumulo-site.xml on start up so processes must be restarted to apply changes.
 Certain properties can only be set in accumulo-site.xml. These properties have **zk mutable: no** in their description. Setting properties in accumulo-site.xml allows you
 to configure tablet servers with different settings.
 
 ### Zookeeper
 
-Many [properties][props] can be set in Zookeeper using the Accumulo API or shell. These properties can identified by **zk mutable: yes** in their description on
+Many [server properties][props] can be set in Zookeeper using the Accumulo API or shell. These properties can be identified by **zk mutable: yes** in their description on
 the [properties page][props]. Zookeeper properties can be applied on a per-table or system-wide basis. Per-table properties take precedence over system-wide
 properties. While most properties set in Zookeeper take effect immediately, some require a restart of the process which is indicated in **zk mutable** section
 of their description.
@@ -68,7 +76,7 @@ are two areas in which there aren't any fail safes built into the API that can p
 While these properties have the ability to add some much needed dynamic configuration tools, use cases which might fall
 into these warnings should be reconsidered.
 
-## Viewing Configuration
+## Viewing Server Configuration
 
 Accumulo's current configuration can be viewed in the shell using the `config` command.
 
@@ -104,5 +112,7 @@ default  | table.compaction.minor.logs.threshold ..... | 3
 default  | table.failures.ignore ..................... | false
 ```
 
+[client-conn]: {{ page.docs_baseurl }}/getting-started/clients#connecting
+[client-props]: {{ page.docs_baseurl }}/development/client-properties
 [props]: {{ page.docs_baseurl }}/administration/properties
 [tableprops]: {{ page.docs_baseurl }}/administration/properties#table_prefix
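As a quick illustration of the precedence order described in the changes above, here is a minimal Java sketch using plain `java.util.Properties`. This is not Accumulo's actual resolution code, and the property value in each layer is illustrative; it only shows how a higher-precedence source (per-table ZooKeeper setting) wins over accumulo-site.xml and the defaults:

```java
import java.util.Properties;

// Sketch of server-property precedence: defaults < accumulo-site.xml
// < system-wide (ZooKeeper) < per-table (ZooKeeper).
public class PrecedenceSketch {
    // Later sources in the argument list have higher precedence and override earlier ones.
    static String resolve(String key, Properties... sourcesLowToHigh) {
        String value = null;
        for (Properties p : sourcesLowToHigh) {
            if (p.containsKey(key)) {
                value = p.getProperty(key); // higher-precedence source wins
            }
        }
        return value;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("table.file.max", "15");      // illustrative value

        Properties siteXml = new Properties();
        siteXml.setProperty("table.file.max", "20");       // overrides the default

        Properties perTable = new Properties();
        perTable.setProperty("table.file.max", "30");      // highest precedence

        String resolved = resolve("table.file.max", defaults, siteXml, perTable);
        if (!"30".equals(resolved)) {
            throw new AssertionError("expected 30, got " + resolved);
        }
        System.out.println("table.file.max = " + resolved);
    }
}
```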
diff --git a/_docs-2-0/administration/in-depth-install.md b/_docs-2-0/administration/in-depth-install.md
index dab8c74..94bcc14 100644
--- a/_docs-2-0/administration/in-depth-install.md
+++ b/_docs-2-0/administration/in-depth-install.md
@@ -294,27 +294,11 @@ will expect the KeyStore in the same location.
 
 ### Client Configuration
 
-In version 1.6.0, Accumulo included a new type of configuration file known as a client
-configuration file. One problem with the traditional "site.xml" file that is prevalent
-through Hadoop is that it is a single file used by both clients and servers. This makes
-it very difficult to protect secrets that are only meant for the server processes while
-allowing the clients to connect to the servers.
+Accumulo clients are configured differently than Accumulo servers. Clients are
+configured when [an Accumulo Connector is created][client-conn] using Java builder methods
+or an `accumulo-client.properties` file containing [client properties][client-props].
 
-The client configuration file is a subset of the information stored in accumulo-site.xml
-meant only for consumption by clients of Accumulo. By default, Accumulo checks a number
-of locations for a client configuration by default:
-
-* `/path/to/accumulo/conf/client.conf`
-* `/etc/accumulo/client.conf`
-* `/etc/accumulo/conf/client.conf`
-* `~/.accumulo/config`
-
-These files are [Java Properties files](https://en.wikipedia.org/wiki/.properties). These files
-can currently contain information about ZooKeeper servers, RPC properties (such as SSL or SASL
-connectors), distributed tracing properties. Valid properties are defined by the [ClientProperty](https://github.com/apache/accumulo/blob/f1d0ec93d9f13ff84844b5ac81e4a7b383ced467/core/src/main/java/org/apache/accumulo/core/client/ClientConfiguration.java#L54)
-enum contained in the client API.
-
-#### Custom Table Tags
+### Custom Table Tags
 
 Accumulo has the ability for users to add custom tags to tables.  This allows
 applications to set application-level metadata about a table.  These tags can be
@@ -328,7 +312,7 @@ very sensitive to an excessive number of nodes and the sizes of the nodes. Appli
 which leverage the user of custom properties should take these warnings into
 consideration. There is no enforcement of these warnings via the API.
 
-#### Configuring the ClassLoader
+### Configuring the ClassLoader
 
 Accumulo builds its Java classpath in `accumulo-env.sh`.  After an Accumulo application has started, it will load classes from the locations
 specified in the deprecated [general.classpaths] property. Additionally, Accumulo will load classes from the locations specified in the
@@ -724,3 +708,5 @@ mailing lists at https://accumulo.apache.org for more info.
 [general.classpaths]: {{ page.docs_baseurl }}/administration/properties#general_classpaths
 [general.dynamic.classpaths]: {{ page.docs_baseurl }}/administration/properties#general_dynamic_classpaths
 [general.vfs.classpaths]: {{ page.docs_baseurl }}/administration/properties#general_vfs_classpaths
+[client-conn]: {{ page.docs_baseurl }}/getting-started/clients#connecting
+[client-props]: {{ page.docs_baseurl }}/development/client-properties
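The `accumulo-client.properties` file referenced above is a standard Java properties file, so it can be read with `java.util.Properties`. A minimal sketch, assuming the common client property names `instance.name`, `instance.zookeepers`, and `auth.principal` (see the client-properties page for the authoritative list):

```java
import java.io.StringReader;
import java.util.Properties;

// Sketch of loading an accumulo-client.properties file. The contents are
// inlined here for a self-contained example; on disk the format is identical.
public class ClientPropsSketch {
    public static void main(String[] args) throws Exception {
        String fileContents = String.join("\n",
            "instance.name=myinstance",
            "instance.zookeepers=zk1:2181,zk2:2181",
            "auth.principal=myuser");

        Properties clientProps = new Properties();
        clientProps.load(new StringReader(fileContents)); // same parser used for the file on disk

        if (!"myinstance".equals(clientProps.getProperty("instance.name"))) {
            throw new AssertionError("unexpected instance.name");
        }
        System.out.println("zookeepers = " + clientProps.getProperty("instance.zookeepers"));
    }
}
```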
diff --git a/_docs-2-0/administration/kerberos.md b/_docs-2-0/administration/kerberos.md
index 29529e4..dc89a43 100644
--- a/_docs-2-0/administration/kerberos.md
+++ b/_docs-2-0/administration/kerberos.md
@@ -342,16 +342,14 @@ Valid starting       Expires              Service principal
 
 #### Configuration
 
-The second thing clients need to do is to set up their client configuration file. By
-default, this file is stored in `~/.accumulo/config` or `/path/to/accumulo/client.conf`.
-Accumulo utilities also allow you to provide your own copy of this file in any location
-using the `--config-file` command line option.
+The second thing clients need to do is configure Kerberos when an Accumulo Connector is
+created.  This can be done using Connector builder methods or by setting the properties
+below in `accumulo-client.properties`, which can be provided to Accumulo utilities using
+the `--config-file` command line option.
 
-Three items need to be set to enable access to Accumulo:
-
-* `instance.rpc.sasl.enabled`=_true_
-* `rpc.sasl.qop`=_auth_
-* `kerberos.server.primary`=_accumulo_
+* [sasl.enabled] = true
+* [sasl.qop] = auth
+* [sasl.kerberos.server.primary] = accumulo
 
 Each of these properties *must* match the configuration of the accumulo servers; this is
 required to set up the SASL transport.
@@ -603,3 +601,7 @@ java.lang.AssertionError: AuthenticationToken should not be null
 **A**: This indicates that the Monitor has not been able to successfully log in a client-side user to read from the `trace` table. Accumulo allows the TraceServer to rely on the property `general.kerberos.keytab` as a fallback when logging in the trace user if the `trace.token.property.keytab` property isn't defined. Some earlier versions of Accumulo did not do this same fallback for the Monitor's use of the trace user. The end result is that if you configure `general.kerberos.keytab` an [...]
 
 Ensure you have set `trace.token.property.keytab` to point to a keytab for the principal defined in `trace.user` in the `accumulo-site.xml` file for the Monitor, since that should work in all versions of Accumulo.
+
+[sasl.enabled]: {{ page.docs_baseurl }}/development/client-properties#sasl_enabled
+[sasl.qop]: {{ page.docs_baseurl }}/development/client-properties#sasl_qop
+[sasl.kerberos.server.primary]: {{ page.docs_baseurl }}/development/client-properties#sasl_kerberos_server_primary
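Putting the three SASL properties together, a hypothetical `accumulo-client.properties` fragment for a Kerberos-enabled client might look like this (the values shown must match the servers' configuration, as the text above notes):

```
# Hypothetical accumulo-client.properties fragment enabling Kerberos (SASL)
sasl.enabled=true
sasl.qop=auth
sasl.kerberos.server.primary=accumulo
```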
diff --git a/_docs-2-0/administration/properties.md b/_docs-2-0/administration/properties.md
index 815eb5c..c0d567b 100644
--- a/_docs-2-0/administration/properties.md
+++ b/_docs-2-0/administration/properties.md
@@ -1,19 +1,21 @@
 ---
-title: Configuration Properties
+title: Server Properties
 category: administration
 order: 3
 ---
 
 <!-- WARNING: Do not edit this file. It is a generated file that is copied from Accumulo build (from core/target/generated-docs) -->
 
+Below are properties set in `accumulo-site.xml` or the Accumulo shell that configure Accumulo servers (i.e. tablet server, master, etc.):
+
 | Property | Description |
 |--------------|-------------|
 | <a name="gc_prefix" class="prop"></a> **gc.*** | Properties in this category affect the behavior of the accumulo garbage collector. |
-| <a name="gc_cycle_delay" class="prop"></a> gc.cycle.delay | Time between garbage collection cycles. In each cycle, old files no longer in use are removed from the filesystem.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `5m` |
-| <a name="gc_cycle_start" class="prop"></a> gc.cycle.start | Time to wait before attempting to garbage collect any old files.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `30s` |
+| <a name="gc_cycle_delay" class="prop"></a> gc.cycle.delay | Time between garbage collection cycles. In each cycle, old RFiles or write-ahead logs no longer in use are removed from the filesystem.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `5m` |
+| <a name="gc_cycle_start" class="prop"></a> gc.cycle.start | Time to wait before attempting to garbage collect any old RFiles or write-ahead logs.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `30s` |
 | <a name="gc_file_archive" class="prop"></a> gc.file.archive | Archive any files/directories instead of moving to the HDFS trash or deleting.<br>**type:** BOOLEAN, **zk mutable:** yes, **default value:** `false` |
 | <a name="gc_port_client" class="prop"></a> gc.port.client | The listening port for the garbage collector's monitor service<br>**type:** PORT, **zk mutable:** yes but requires restart of the gc, **default value:** `9998` |
-| <a name="gc_threads_delete" class="prop"></a> gc.threads.delete | The number of threads used to delete files<br>**type:** COUNT, **zk mutable:** yes, **default value:** `16` |
+| <a name="gc_threads_delete" class="prop"></a> gc.threads.delete | The number of threads used to delete RFiles and write-ahead logs<br>**type:** COUNT, **zk mutable:** yes, **default value:** `16` |
 | <a name="gc_trace_percent" class="prop"></a> gc.trace.percent | Percent of gc cycles to trace<br>**type:** FRACTION, **zk mutable:** yes, **default value:** `0.01` |
 | <a name="gc_trash_ignore" class="prop"></a> gc.trash.ignore | Do not use the Trash, even if it is configured.<br>**type:** BOOLEAN, **zk mutable:** yes, **default value:** `false` |
 | <a name="general_prefix" class="prop"></a> **general.*** | Properties in this category affect the behavior of accumulo overall, but do not have to be consistent throughout a cloud. |
@@ -53,12 +55,12 @@ order: 3
 | <a name="instance_zookeeper_timeout" class="prop"></a> instance.zookeeper.timeout | Zookeeper session timeout; max value when represented as milliseconds should be no larger than 2147483647<br>**type:** TIMEDURATION, **zk mutable:** no, **default value:** `30s` |
 | <a name="master_prefix" class="prop"></a> **master.*** | Properties in this category affect the behavior of the master server |
 | <a name="master_bulk_rename_threadpool_size" class="prop"></a> master.bulk.rename.threadpool.size | The number of threads to use when moving user files to bulk ingest directories under accumulo control<br>**type:** COUNT, **zk mutable:** yes, **default value:** `20` |
-| <a name="master_bulk_retries" class="prop"></a> master.bulk.retries | The number of attempts to bulk-load a file before giving up.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `3` |
-| <a name="master_bulk_threadpool_size" class="prop"></a> master.bulk.threadpool.size | The number of threads to use when coordinating a bulk-import.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `5` |
+| <a name="master_bulk_retries" class="prop"></a> master.bulk.retries | The number of attempts to bulk import an RFile before giving up.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `3` |
+| <a name="master_bulk_threadpool_size" class="prop"></a> master.bulk.threadpool.size | The number of threads to use when coordinating a bulk import.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `5` |
 | <a name="master_bulk_timeout" class="prop"></a> master.bulk.timeout | The time to wait for a tablet server to process a bulk import request<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `5m` |
 | <a name="master_bulk_tserver_regex" class="prop"></a> master.bulk.tserver.regex | Regular expression that defines the set of Tablet Servers that will perform bulk imports<br>**type:** STRING, **zk mutable:** yes, **default value:** empty |
-| <a name="master_fate_threadpool_size" class="prop"></a> master.fate.threadpool.size | The number of threads used to run FAult-Tolerant Executions. These are primarily table operations like merge.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `4` |
-| <a name="master_lease_recovery_interval" class="prop"></a> master.lease.recovery.interval | The amount of time to wait after requesting a WAL file to be recovered<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `5s` |
+| <a name="master_fate_threadpool_size" class="prop"></a> master.fate.threadpool.size | The number of threads used to run fault-tolerant executions (FATE). These are primarily table operations like merge.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `4` |
+| <a name="master_lease_recovery_interval" class="prop"></a> master.lease.recovery.interval | The amount of time to wait after requesting a write-ahead log to be recovered<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `5s` |
 | <a name="master_metadata_suspendable" class="prop"></a> master.metadata.suspendable | Allow tablets for the accumulo.metadata table to be suspended via table.suspend.duration.<br>**type:** BOOLEAN, **zk mutable:** yes, **default value:** `false` |
 | <a name="master_port_client" class="prop"></a> master.port.client | The port used for handling client connections on the master<br>**type:** PORT, **zk mutable:** yes but requires restart of the master, **default value:** `9999` |
 | <a name="master_recovery_delay" class="prop"></a> master.recovery.delay | When a tablet server's lock is deleted, it takes time for it to completely quit. This delay gives it time before log recoveries begin.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `10s` |
@@ -72,7 +74,7 @@ order: 3
 | <a name="master_server_threads_minimum" class="prop"></a> master.server.threads.minimum | The minimum number of threads to use to handle incoming requests.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `20` |
 | <a name="master_status_threadpool_size" class="prop"></a> master.status.threadpool.size | The number of threads to use when fetching the tablet server status for balancing.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `1` |
 | <a name="master_tablet_balancer" class="prop"></a> master.tablet.balancer | The balancer class that accumulo will use to make tablet assignment and migration decisions.<br>**type:** CLASSNAME, **zk mutable:** yes, **default value:** `org.apache.accumulo.server.master.balancer.TableLoadBalancer` |
-| <a name="master_walog_closer_implementation" class="prop"></a> master.walog.closer.implementation | A class that implements a mechansim to steal write access to a file<br>**type:** CLASSNAME, **zk mutable:** yes, **default value:** `org.apache.accumulo.server.master.recovery.HadoopLogCloser` |
+| <a name="master_walog_closer_implementation" class="prop"></a> master.walog.closer.implementation | A class that implements a mechanism to steal write access to a write-ahead log<br>**type:** CLASSNAME, **zk mutable:** yes, **default value:** `org.apache.accumulo.server.master.recovery.HadoopLogCloser` |
 | <a name="monitor_prefix" class="prop"></a> **monitor.*** | Properties in this category affect the behavior of the monitor web server. |
 | <a name="monitor_banner_background" class="prop"></a> monitor.banner.background | **Deprecated.** ~~The background color of the banner text displayed on the monitor page.~~<br>~~**type:** STRING~~, ~~**zk mutable:** yes~~, ~~**default value:** `#304065`~~ |
 | <a name="monitor_banner_color" class="prop"></a> monitor.banner.color | **Deprecated.** ~~The color of the banner text displayed on the monitor page.~~<br>~~**type:** STRING~~, ~~**zk mutable:** yes~~, ~~**default value:** `#c4c4c4`~~ |
@@ -129,27 +131,27 @@ order: 3
 | <a name="table_bloom_error_rate" class="prop"></a> table.bloom.error.rate | Bloom filter error rate.<br>**type:** FRACTION, **zk mutable:** yes, **default value:** `0.5%` |
 | <a name="table_bloom_hash_type" class="prop"></a> table.bloom.hash.type | The bloom filter hash type<br>**type:** STRING, **zk mutable:** yes, **default value:** `murmur` |
 | <a name="table_bloom_key_functor" class="prop"></a> table.bloom.key.functor | A function that can transform the key prior to insertion and check of bloom filter. org.apache.accumulo.core.file.keyfunctor.RowFunctor,,org.apache.accumulo.core.file.keyfunctor.ColumnFamilyFunctor, and org.apache.accumulo.core.file.keyfunctor.ColumnQualifierFunctor are allowable values. One can extend any of the above mentioned classes to perform specialized parsing of the key. <br>**type:** CLASSNAME, **zk  [...]
-| <a name="table_bloom_load_threshold" class="prop"></a> table.bloom.load.threshold | This number of seeks that would actually use a bloom filter must occur before a file's bloom filter is loaded. Set this to zero to initiate loading of bloom filters when a file is opened.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `1` |
+| <a name="table_bloom_load_threshold" class="prop"></a> table.bloom.load.threshold | This number of seeks that would actually use a bloom filter must occur before an RFile's bloom filter is loaded. Set this to zero to initiate loading of bloom filters when an RFile is opened.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `1` |
 | <a name="table_bloom_size" class="prop"></a> table.bloom.size | Bloom filter size, as number of keys.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `1048576` |
-| <a name="table_cache_block_enable" class="prop"></a> table.cache.block.enable | Determines whether file block cache is enabled.<br>**type:** BOOLEAN, **zk mutable:** yes, **default value:** `false` |
-| <a name="table_cache_index_enable" class="prop"></a> table.cache.index.enable | Determines whether index cache is enabled.<br>**type:** BOOLEAN, **zk mutable:** yes, **default value:** `true` |
+| <a name="table_cache_block_enable" class="prop"></a> table.cache.block.enable | Determines whether data block cache is enabled for a table.<br>**type:** BOOLEAN, **zk mutable:** yes, **default value:** `false` |
+| <a name="table_cache_index_enable" class="prop"></a> table.cache.index.enable | Determines whether index block cache is enabled for a table.<br>**type:** BOOLEAN, **zk mutable:** yes, **default value:** `true` |
 | <a name="table_classpath_context" class="prop"></a> table.classpath.context | Per table classpath context<br>**type:** STRING, **zk mutable:** yes, **default value:** empty |
-| <a name="table_compaction_major_everything_idle" class="prop"></a> table.compaction.major.everything.idle | After a tablet has been idle (no mutations) for this time period it may have all of its files compacted into one. There is no guarantee an idle tablet will be compacted. Compactions of idle tablets are only started when regular compactions are not running. Idle compactions only take place for tablets that have one or more files.<br>**type:** TIMEDURATION, **zk mutable:** yes, **d [...]
-| <a name="table_compaction_major_ratio" class="prop"></a> table.compaction.major.ratio | minimum ratio of total input size to maximum input file size for running a major compactionWhen adjusting this property you may want to also adjust table.file.max. Want to avoid the situation where only merging minor compactions occur.<br>**type:** FRACTION, **zk mutable:** yes, **default value:** `3` |
+| <a name="table_compaction_major_everything_idle" class="prop"></a> table.compaction.major.everything.idle | After a tablet has been idle (no mutations) for this time period it may have all of its RFiles compacted into one. There is no guarantee an idle tablet will be compacted. Compactions of idle tablets are only started when regular compactions are not running. Idle compactions only take place for tablets that have one or more RFiles.<br>**type:** TIMEDURATION, **zk mutable:** yes, * [...]
+| <a name="table_compaction_major_ratio" class="prop"></a> table.compaction.major.ratio | Minimum ratio of total input size to maximum input RFile size for running a major compaction. When adjusting this property you may want to also adjust table.file.max. This helps avoid the situation where only merging minor compactions occur.<br>**type:** FRACTION, **zk mutable:** yes, **default value:** `3` |
 | <a name="table_compaction_minor_idle" class="prop"></a> table.compaction.minor.idle | After a tablet has been idle (no mutations) for this time period it may have its in-memory map flushed to disk in a minor compaction. There is no guarantee an idle tablet will be compacted.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `5m` |
 | <a name="table_compaction_minor_logs_threshold" class="prop"></a> table.compaction.minor.logs.threshold | When there are more than this many write-ahead logs against a tablet, it will be minor compacted. See comment for property tserver.memory.maps.max<br>**type:** COUNT, **zk mutable:** yes, **default value:** `3` |
-| <a name="table_compaction_minor_merge_file_size_max" class="prop"></a> table.compaction.minor.merge.file.size.max | The max file size used for a merging minor compaction. The default value of 0 disables a max file size.<br>**type:** BYTES, **zk mutable:** yes, **default value:** `0` |
+| <a name="table_compaction_minor_merge_file_size_max" class="prop"></a> table.compaction.minor.merge.file.size.max | The max RFile size used for a merging minor compaction. The default value of 0 disables a max file size.<br>**type:** BYTES, **zk mutable:** yes, **default value:** `0` |
 | <a name="table_constraint_prefix" class="prop"></a> **table.constraint.*** | Properties in this category are per-table properties that add constraints to a table. These properties start with the category prefix, followed by a number, and their values correspond to a fully qualified Java class that implements the Constraint interface.<br>For example:<br>table.constraint.1 = org.apache.accumulo.core.constraints.MyCustomConstraint<br>and:<br>table.constraint.2 = my.package.constraints.MyS [...]
 | <a name="table_custom_prefix" class="prop"></a> **table.custom.*** | Prefix to be used for user defined arbitrary properties. |
 | <a name="table_durability" class="prop"></a> table.durability | The durability used to write to the write-ahead log. Legal values are: none, which skips the write-ahead log; log, which sends the data to the write-ahead log, but does nothing to make it durable; flush, which pushes data to the file system; and sync, which ensures the data is written to disk.<br>**type:** DURABILITY, **zk mutable:** yes, **default value:** `sync` |
 | <a name="table_failures_ignore" class="prop"></a> table.failures.ignore | If you want queries for your table to hang or fail when data is missing from the system, then set this to false. When this set to true missing data will be reported but queries will still run possibly returning a subset of the data.<br>**type:** BOOLEAN, **zk mutable:** yes, **default value:** `false` |
-| <a name="table_file_blocksize" class="prop"></a> table.file.blocksize | Overrides the hadoop dfs.block.size setting so that files have better query performance. The maximum value for this is 2147483647<br>**type:** BYTES, **zk mutable:** yes, **default value:** `0B` |
-| <a name="table_file_compress_blocksize" class="prop"></a> table.file.compress.blocksize | Similar to the hadoop io.seqfile.compress.blocksize setting, so that files have better query performance. The maximum value for this is 2147483647. (This setting is the size threshold prior to compression, and applies even compression is disabled.)<br>**type:** BYTES, **zk mutable:** yes, **default value:** `100K` |
-| <a name="table_file_compress_blocksize_index" class="prop"></a> table.file.compress.blocksize.index | Determines how large index blocks can be in files that support multilevel indexes. The maximum value for this is 2147483647. (This setting is the size threshold prior to compression, and applies even compression is disabled.)<br>**type:** BYTES, **zk mutable:** yes, **default value:** `128K` |
-| <a name="table_file_compress_type" class="prop"></a> table.file.compress.type | One of gz,snappy,lzo,none<br>**type:** STRING, **zk mutable:** yes, **default value:** `gz` |
-| <a name="table_file_max" class="prop"></a> table.file.max | Determines the max # of files each tablet in a table can have. When adjusting this property you may want to consider adjusting table.compaction.major.ratio also. Setting this property to 0 will make it default to tserver.scan.files.open.max-1, this will prevent a tablet from having more files than can be opened. Setting this property low may throttle ingest and increase query performance.<br>**type:** COUNT, **zk mutable:** ye [...]
-| <a name="table_file_replication" class="prop"></a> table.file.replication | Determines how many replicas to keep of a tables' files in HDFS. When this value is LTE 0, HDFS defaults are used.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `0` |
-| <a name="table_file_summary_maxSize" class="prop"></a> table.file.summary.maxSize | The maximum size summary that will be stored. The number of files that had summary data exceeding this threshold is reported by Summary.getFileStatistics().getLarge().  When adjusting this consider the expected number files with summaries on each tablet server and the summary cache size.<br>**type:** BYTES, **zk mutable:** yes, **default value:** `256K` |
+| <a name="table_file_blocksize" class="prop"></a> table.file.blocksize | The HDFS block size used when writing RFiles. When set to 0B, the value/defaults of HDFS property 'dfs.block.size' will be used.<br>**type:** BYTES, **zk mutable:** yes, **default value:** `0B` |
+| <a name="table_file_compress_blocksize" class="prop"></a> table.file.compress.blocksize | The maximum size of data blocks in RFiles before they are compressed and written.<br>**type:** BYTES, **zk mutable:** yes, **default value:** `100K` |
+| <a name="table_file_compress_blocksize_index" class="prop"></a> table.file.compress.blocksize.index | The maximum size of index blocks in RFiles before they are compressed and written.<br>**type:** BYTES, **zk mutable:** yes, **default value:** `128K` |
+| <a name="table_file_compress_type" class="prop"></a> table.file.compress.type | Compression algorithm used on index and data blocks before they are written. Possible values: gz, snappy, lzo, none<br>**type:** STRING, **zk mutable:** yes, **default value:** `gz` |
+| <a name="table_file_max" class="prop"></a> table.file.max | The maximum number of RFiles each tablet in a table can have. When adjusting this property you may want to consider adjusting table.compaction.major.ratio also. Setting this property to 0 will make it default to tserver.scan.files.open.max-1, which will prevent a tablet from having more RFiles than can be opened. Setting this property low may throttle ingest and increase query performance.<br>**type:** COUNT, **zk mutable:** ye [...]
+| <a name="table_file_replication" class="prop"></a> table.file.replication | The number of replicas for a table's RFiles in HDFS. When set to 0, HDFS defaults are used.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `0` |
+| <a name="table_file_summary_maxSize" class="prop"></a> table.file.summary.maxSize | The maximum size summary that will be stored. The number of RFiles that had summary data exceeding this threshold is reported by Summary.getFileStatistics().getLarge().  When adjusting this consider the expected number of RFiles with summaries on each tablet server and the summary cache size.<br>**type:** BYTES, **zk mutable:** yes, **default value:** `256K` |
 | <a name="table_file_type" class="prop"></a> table.file.type | Change the type of file a table writes<br>**type:** STRING, **zk mutable:** yes, **default value:** `rf` |
 | <a name="table_formatter" class="prop"></a> table.formatter | The Formatter class to apply on results in the shell<br>**type:** STRING, **zk mutable:** yes, **default value:** `org.apache.accumulo.core.util.format.DefaultFormatter` |
 | <a name="table_group_prefix" class="prop"></a> **table.group.*** | Properties in this category are per-table properties that define locality groups in a table. These properties start with the category prefix, followed by a name, followed by a period, and followed by a property for that group.<br>For example table.group.group1=x,y,z sets the column families for a group called group1. Once configured, group1 can be enabled by adding it to the list of groups in the table.groups.enabled pr [...]
@@ -163,12 +165,12 @@ order: 3
 | <a name="table_majc_compaction_strategy_opts_prefix" class="prop"></a> **table.majc.compaction.strategy.opts.*** | Properties in this category are used to configure the compaction strategy. |
 | <a name="table_replication" class="prop"></a> table.replication | Is replication enabled for the given table<br>**type:** BOOLEAN, **zk mutable:** yes, **default value:** `false` |
 | <a name="table_replication_target_prefix" class="prop"></a> **table.replication.target.*** | Enumerate a mapping of other systems which this table should replicate their data to. The key suffix is the identifying cluster name and the value is an identifier for a location on the target system, e.g. the ID of the table on the target to replicate to |
-| <a name="table_sampler" class="prop"></a> table.sampler | The name of a class that implements org.apache.accumulo.core.Sampler.  Setting this option enables storing a sample of data which can be scanned.  Always having a current sample can useful for query optimization and data comprehension.   After enabling sampling for an existing table, a compaction is needed to compute the sample for existing data.  The compact command in the shell has an option to only compact files without sampl [...]
+| <a name="table_sampler" class="prop"></a> table.sampler | The name of a class that implements org.apache.accumulo.core.Sampler.  Setting this option enables storing a sample of data which can be scanned.  Always having a current sample can be useful for query optimization and data comprehension.   After enabling sampling for an existing table, a compaction is needed to compute the sample for existing data.  The compact command in the shell has an option to only compact RFiles without samp [...]
 | <a name="table_sampler_opt_prefix" class="prop"></a> **table.sampler.opt.*** | This property is used to set options for a sampler.  If a sampler had two options like hasher and modulous, then the two properties table.sampler.opt.hasher=${hash algorithm} and table.sampler.opt.modulous=${mod} would be set. |
 | <a name="table_scan_max_memory" class="prop"></a> table.scan.max.memory | The maximum amount of memory that will be used to cache results of a client query/scan. Once this limit is reached, the buffered data is sent to the client.<br>**type:** BYTES, **zk mutable:** yes, **default value:** `512K` |
 | <a name="table_security_scan_visibility_default" class="prop"></a> table.security.scan.visibility.default | The security label that will be assumed at scan time if an entry does not have a visibility set.<br>Note: An empty security label is displayed as []. The scan results will show an empty visibility even if the visibility from this setting is applied to the entry.<br>CAUTION: If a particular key has an empty security label AND its table's default visibility is also empty, access wi [...]
 | <a name="table_split_endrow_size_max" class="prop"></a> table.split.endrow.size.max | Maximum size of end row<br>**type:** BYTES, **zk mutable:** yes, **default value:** `10K` |
-| <a name="table_split_threshold" class="prop"></a> table.split.threshold | When combined size of files exceeds this amount a tablet is split.<br>**type:** BYTES, **zk mutable:** yes, **default value:** `1G` |
+| <a name="table_split_threshold" class="prop"></a> table.split.threshold | A tablet is split when the combined size of RFiles exceeds this amount.<br>**type:** BYTES, **zk mutable:** yes, **default value:** `1G` |
 | <a name="table_summarizer_prefix" class="prop"></a> **table.summarizer.*** | Prefix for configuring summarizers for a table.  Using this prefix multiple summarizers can be configured with options for each one. Each summarizer configured should have a unique id, this id can be anything. To add a summarizer set table.summarizer.<unique id>=<summarizer class name>.  If the summarizer has options, then for each option set table.summarizer.<unique id>.opt.<key>=<value>. |
 | <a name="table_suspend_duration" class="prop"></a> table.suspend.duration | For tablets belonging to this table: When a tablet server dies, allow the tablet server this duration to revive before reassigning its tablets to other tablet servers.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `0s` |
 | <a name="table_walog_enabled" class="prop"></a> table.walog.enabled | **Deprecated.** ~~This setting is deprecated.  Use table.durability=none instead.~~<br>~~**type:** BOOLEAN~~, ~~**zk mutable:** yes~~, ~~**default value:** `true`~~ |
@@ -187,18 +189,18 @@ order: 3
 | <a name="tserver_assignment_concurrent_max" class="prop"></a> tserver.assignment.concurrent.max | The number of threads available to load tablets. Recoveries are still performed serially.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `2` |
 | <a name="tserver_assignment_duration_warning" class="prop"></a> tserver.assignment.duration.warning | The amount of time an assignment can run  before the server will print a warning along with the current stack trace. Meant to help debug stuck assignments<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `10m` |
 | <a name="tserver_bloom_load_concurrent_max" class="prop"></a> tserver.bloom.load.concurrent.max | The number of concurrent threads that will load bloom filters in the background. Setting this to zero will make bloom filters load in the foreground.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `4` |
-| <a name="tserver_bulk_assign_threads" class="prop"></a> tserver.bulk.assign.threads | The master delegates bulk file processing and assignment to tablet servers. After the bulk file has been processed, the tablet server will assign the file to the appropriate tablets on all servers. This property controls the number of threads used to communicate to the other servers.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `1` |
-| <a name="tserver_bulk_process_threads" class="prop"></a> tserver.bulk.process.threads | The master will task a tablet server with pre-processing a bulk file prior to assigning it to the appropriate tablet servers. This configuration value controls the number of threads used to process the files.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `1` |
-| <a name="tserver_bulk_retry_max" class="prop"></a> tserver.bulk.retry.max | The number of times the tablet server will attempt to assign a file to a tablet as it migrates and splits.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `5` |
+| <a name="tserver_bulk_assign_threads" class="prop"></a> tserver.bulk.assign.threads | The master delegates bulk import RFile processing and assignment to tablet servers. After the file has been processed, the tablet server will assign the file to the appropriate tablets on all servers. This property controls the number of threads used to communicate to the other servers.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `1` |
+| <a name="tserver_bulk_process_threads" class="prop"></a> tserver.bulk.process.threads | The master will task a tablet server with pre-processing a bulk import RFile prior to assigning it to the appropriate tablet servers. This configuration value controls the number of threads used to process the files.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `1` |
+| <a name="tserver_bulk_retry_max" class="prop"></a> tserver.bulk.retry.max | The number of times the tablet server will attempt to assign an RFile to a tablet as it migrates and splits.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `5` |
 | <a name="tserver_bulk_timeout" class="prop"></a> tserver.bulk.timeout | The time to wait for a tablet server to process a bulk import request.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `5m` |
-| <a name="tserver_cache_data_size" class="prop"></a> tserver.cache.data.size | Specifies the size of the cache for file data blocks.<br>**type:** MEMORY, **zk mutable:** yes, **default value:** `10%` |
-| <a name="tserver_cache_index_size" class="prop"></a> tserver.cache.index.size | Specifies the size of the cache for file indices.<br>**type:** MEMORY, **zk mutable:** yes, **default value:** `25%` |
+| <a name="tserver_cache_data_size" class="prop"></a> tserver.cache.data.size | Specifies the size of the cache for RFile data blocks.<br>**type:** MEMORY, **zk mutable:** yes, **default value:** `10%` |
+| <a name="tserver_cache_index_size" class="prop"></a> tserver.cache.index.size | Specifies the size of the cache for RFile index blocks.<br>**type:** MEMORY, **zk mutable:** yes, **default value:** `25%` |
 | <a name="tserver_cache_manager_class" class="prop"></a> tserver.cache.manager.class | Specifies the class name of the block cache factory implementation. Alternative implementation is org.apache.accumulo.core.file.blockfile.cache.tinylfu.TinyLfuBlockCacheManager<br>**type:** STRING, **zk mutable:** yes, **default value:** `org.apache.accumulo.core.file.blockfile.cache.lru.LruBlockCacheManager` |
 | <a name="tserver_cache_summary_size" class="prop"></a> tserver.cache.summary.size | Specifies the size of the cache for summary data on each tablet server.<br>**type:** MEMORY, **zk mutable:** yes, **default value:** `10%` |
 | <a name="tserver_client_timeout" class="prop"></a> tserver.client.timeout | Time to wait for clients to continue scans before closing a session.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `3s` |
 | <a name="tserver_compaction_major_concurrent_max" class="prop"></a> tserver.compaction.major.concurrent.max | The maximum number of concurrent major compactions for a tablet server<br>**type:** COUNT, **zk mutable:** yes, **default value:** `3` |
 | <a name="tserver_compaction_major_delay" class="prop"></a> tserver.compaction.major.delay | Time a tablet server will sleep between checking which tablets need compaction.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `30s` |
-| <a name="tserver_compaction_major_thread_files_open_max" class="prop"></a> tserver.compaction.major.thread.files.open.max | Max number of files a major compaction thread can open at once. <br>**type:** COUNT, **zk mutable:** yes, **default value:** `10` |
+| <a name="tserver_compaction_major_thread_files_open_max" class="prop"></a> tserver.compaction.major.thread.files.open.max | Max number of RFiles a major compaction thread can open at once. <br>**type:** COUNT, **zk mutable:** yes, **default value:** `10` |
 | <a name="tserver_compaction_major_throughput" class="prop"></a> tserver.compaction.major.throughput | Maximum number of bytes to read or write per second over all major compactions on a TabletServer, or 0B for unlimited.<br>**type:** BYTES, **zk mutable:** yes, **default value:** `0B` |
 | <a name="tserver_compaction_major_trace_percent" class="prop"></a> tserver.compaction.major.trace.percent | The percent of major compactions to trace<br>**type:** FRACTION, **zk mutable:** yes, **default value:** `0.1` |
 | <a name="tserver_compaction_minor_concurrent_max" class="prop"></a> tserver.compaction.minor.concurrent.max | The maximum number of concurrent minor compactions for a tablet server<br>**type:** COUNT, **zk mutable:** yes, **default value:** `4` |
@@ -206,7 +208,7 @@ order: 3
 | <a name="tserver_compaction_warn_time" class="prop"></a> tserver.compaction.warn.time | When a compaction has not made progress for this time period, a warning will be logged<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `10m` |
 | <a name="tserver_default_blocksize" class="prop"></a> tserver.default.blocksize | Specifies a default blocksize for the tserver caches<br>**type:** BYTES, **zk mutable:** yes, **default value:** `1M` |
 | <a name="tserver_dir_memdump" class="prop"></a> tserver.dir.memdump | A long running scan could possibly hold memory that has been minor compacted. To prevent this, the in-memory map is dumped to a local file and the scan is switched to that local file. We cannot switch to the minor compacted file because it may have been modified by iterators. The file dumped to the local dir is an exact copy of what was in memory.<br>**type:** PATH, **zk mutable:** yes, **default value:** `/tmp` |
-| <a name="tserver_files_open_idle" class="prop"></a> tserver.files.open.idle | Tablet servers leave previously used files open for future queries. This setting determines how much time an unused file should be kept open until it is closed.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `1m` |
+| <a name="tserver_files_open_idle" class="prop"></a> tserver.files.open.idle | Tablet servers leave previously used RFiles open for future queries. This setting determines how much time an unused RFile should be kept open until it is closed.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `1m` |
 | <a name="tserver_hold_time_max" class="prop"></a> tserver.hold.time.max | The maximum time for a tablet server to be in the "memory full" state. If the tablet server cannot write out memory in this much time, it will assume there is some failure local to its node, and quit. A value of zero is equivalent to forever.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `5m` |
 | <a name="tserver_memory_manager" class="prop"></a> tserver.memory.manager | An implementation of MemoryManager that Accumulo will use.<br>**type:** CLASSNAME, **zk mutable:** yes, **default value:** `org.apache.accumulo.server.tabletserver.LargestFirstMemoryManager` |
 | <a name="tserver_memory_maps_max" class="prop"></a> tserver.memory.maps.max | Maximum amount of memory that can be used to buffer data written to a tablet server. There are two other properties that can effectively limit memory usage table.compaction.minor.logs.threshold and tserver.walog.max.size. Ensure that table.compaction.minor.logs.threshold * tserver.walog.max.size >= this property.<br>**type:** MEMORY, **zk mutable:** yes, **default value:** `33%` |
@@ -222,7 +224,7 @@ order: 3
 | <a name="tserver_replication_batchwriter_replayer_memory" class="prop"></a> tserver.replication.batchwriter.replayer.memory | Memory to provide to batchwriter to replay mutations for replication<br>**type:** BYTES, **zk mutable:** yes, **default value:** `50M` |
 | <a name="tserver_replication_default_replayer" class="prop"></a> tserver.replication.default.replayer | Default AccumuloReplicationReplayer implementation<br>**type:** CLASSNAME, **zk mutable:** yes, **default value:** `org.apache.accumulo.tserver.replication.BatchWriterReplicationReplayer` |
 | <a name="tserver_replication_replayer_prefix" class="prop"></a> **tserver.replication.replayer.*** | Allows configuration of implementation used to apply replicated data |
-| <a name="tserver_scan_files_open_max" class="prop"></a> tserver.scan.files.open.max | Maximum total files that all tablets in a tablet server can open for scans. <br>**type:** COUNT, **zk mutable:** yes but requires restart of the tserver, **default value:** `100` |
+| <a name="tserver_scan_files_open_max" class="prop"></a> tserver.scan.files.open.max | Maximum total RFiles that all tablets in a tablet server can open for scans. <br>**type:** COUNT, **zk mutable:** yes but requires restart of the tserver, **default value:** `100` |
 | <a name="tserver_server_message_size_max" class="prop"></a> tserver.server.message.size.max | The maximum size of a message that can be sent to a tablet server.<br>**type:** BYTES, **zk mutable:** yes, **default value:** `1G` |
 | <a name="tserver_server_threadcheck_time" class="prop"></a> tserver.server.threadcheck.time | The time between adjustments of the server thread pool.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `1s` |
 | <a name="tserver_server_threads_minimum" class="prop"></a> tserver.server.threads.minimum | The minimum number of threads to use to handle incoming requests.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `20` |
@@ -230,10 +232,10 @@ order: 3
 | <a name="tserver_session_update_idle_max" class="prop"></a> tserver.session.update.idle.max | When a tablet server's SimpleTimer thread triggers to check idle sessions, this configurable option will be used to evaluate update sessions to determine if they can be closed due to inactivity<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `1m` |
 | <a name="tserver_slow_flush_time" class="prop"></a> tserver.slow.flush.time | If a flush to the write-ahead log takes longer than this period of time, debugging information will be written, and may result in a log rollover.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `100ms` |
 | <a name="tserver_sort_buffer_size" class="prop"></a> tserver.sort.buffer.size | The amount of memory to use when sorting logs during recovery.<br>**type:** MEMORY, **zk mutable:** yes, **default value:** `10%` |
-| <a name="tserver_summary_partition_threads" class="prop"></a> tserver.summary.partition.threads | Summary data must be retrieved from files.  For a large number of files, the files are broken into partitions of 100K files.  This setting determines how many of these groups of 100K files will be processed concurrently.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `10` |
-| <a name="tserver_summary_remote_threads" class="prop"></a> tserver.summary.remote.threads | For a partitioned group of 100K files, those files are grouped by tablet server.  Then a remote tablet server is asked to gather summary data.  This setting determines how many concurrent request are made per partition.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `128` |
+| <a name="tserver_summary_partition_threads" class="prop"></a> tserver.summary.partition.threads | Summary data must be retrieved from RFiles.  For a large number of RFiles, the files are broken into partitions of 100K files.  This setting determines how many of these groups of 100K RFiles will be processed concurrently.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `10` |
+| <a name="tserver_summary_remote_threads" class="prop"></a> tserver.summary.remote.threads | For a partitioned group of 100K RFiles, those files are grouped by tablet server.  Then a remote tablet server is asked to gather summary data.  This setting determines how many concurrent requests are made per partition.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `128` |
 | <a name="tserver_summary_retrieval_threads" class="prop"></a> tserver.summary.retrieval.threads | The number of threads on each tablet server available to retrieve summary data, that is not currently in cache, from RFiles.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `10` |
-| <a name="tserver_tablet_split_midpoint_files_max" class="prop"></a> tserver.tablet.split.midpoint.files.max | To find a tablets split points, all index files are opened. This setting determines how many index files can be opened at once. When there are more index files than this setting multiple passes must be made, which is slower. However opening too many files at once can cause problems.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `300` |
+| <a name="tserver_tablet_split_midpoint_files_max" class="prop"></a> tserver.tablet.split.midpoint.files.max | To find a tablet's split points, all RFiles are opened and their indexes are read. This setting determines how many RFiles can be opened at once. When there are more RFiles than this setting, multiple passes must be made, which is slower. However, opening too many RFiles at once can cause problems.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `300` |
 | <a name="tserver_total_mutation_queue_max" class="prop"></a> tserver.total.mutation.queue.max | The amount of memory used to store write-ahead-log mutations before flushing them.<br>**type:** MEMORY, **zk mutable:** yes, **default value:** `5%` |
 | <a name="tserver_wal_blocksize" class="prop"></a> tserver.wal.blocksize | The size of the HDFS blocks used to write to the Write-Ahead log. If zero, it will be 110% of tserver.walog.max.size (that is, try to use just one block)<br>**type:** BYTES, **zk mutable:** yes, **default value:** `0` |
 | <a name="tserver_wal_replication" class="prop"></a> tserver.wal.replication | The replication to use when writing the Write-Ahead log to HDFS. If zero, it will use the HDFS default replication setting.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `0` |
@@ -241,10 +243,10 @@ order: 3
 | <a name="tserver_wal_sync_method" class="prop"></a> tserver.wal.sync.method | **Deprecated.** ~~This property is deprecated. Use table.durability instead.~~<br>~~**type:** STRING~~, ~~**zk mutable:** yes~~, ~~**default value:** `hsync`~~ |
 | <a name="tserver_walog_max_age" class="prop"></a> tserver.walog.max.age | The maximum age for each write-ahead log.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `24h` |
 | <a name="tserver_walog_max_size" class="prop"></a> tserver.walog.max.size | The maximum size for each write-ahead log. See comment for property tserver.memory.maps.max<br>**type:** BYTES, **zk mutable:** yes, **default value:** `1g` |
-| <a name="tserver_walog_maximum_wait_duration" class="prop"></a> tserver.walog.maximum.wait.duration | The maximum amount of time to wait after a failure to create a WAL file.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `5m` |
-| <a name="tserver_walog_tolerated_creation_failures" class="prop"></a> tserver.walog.tolerated.creation.failures | The maximum number of failures tolerated when creating a new WAL file within the period specified by tserver.walog.failures.period. Exceeding this number of failures in the period causes the TabletServer to exit.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `50` |
+| <a name="tserver_walog_maximum_wait_duration" class="prop"></a> tserver.walog.maximum.wait.duration | The maximum amount of time to wait after a failure to create a write-ahead log.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `5m` |
+| <a name="tserver_walog_tolerated_creation_failures" class="prop"></a> tserver.walog.tolerated.creation.failures | The maximum number of failures tolerated when creating a new write-ahead log within the period specified by tserver.walog.failures.period. Exceeding this number of failures in the period causes the TabletServer to exit.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `50` |
 | <a name="tserver_walog_tolerated_wait_increment" class="prop"></a> tserver.walog.tolerated.wait.increment | The amount of time to wait between failures to create a WALog.<br>**type:** TIMEDURATION, **zk mutable:** yes, **default value:** `1000ms` |
-| <a name="tserver_workq_threads" class="prop"></a> tserver.workq.threads | The number of threads for the distributed work queue. These threads are used for copying failed bulk files.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `2` |
+| <a name="tserver_workq_threads" class="prop"></a> tserver.workq.threads | The number of threads for the distributed work queue. These threads are used for copying failed bulk import RFiles.<br>**type:** COUNT, **zk mutable:** yes, **default value:** `2` |
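
The zk-mutable properties above (e.g. `table.split.threshold`) can be changed at runtime with the standard shell `config` command; a brief sketch, where the table name is hypothetical:

    config -t mytable -s table.split.threshold=512M
    config -t mytable -f table.split.threshold   # -f filters the listing to one property

Because these values live in ZooKeeper, the change takes effect without restarting servers (except where the table notes a required restart, as for tserver.scan.files.open.max).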
 
 ### Property Types
 
diff --git a/_docs-2-0/administration/ssl.md b/_docs-2-0/administration/ssl.md
index 3cb10cf..b2a3e5d 100644
--- a/_docs-2-0/administration/ssl.md
+++ b/_docs-2-0/administration/ssl.md
@@ -47,21 +47,19 @@ their own certificate.
 ## Client configuration
 
 To establish a connection to Accumulo servers, each client must also have
-special configuration. This is typically accomplished through the use of
-the client configuration file whose default location is `~/.accumulo/config`.
+special configuration. This is typically accomplished by [creating Accumulo
+clients][clients] using `accumulo-client.properties` and setting the following
+properties to connect to an Accumulo instance using SSL:
 
-The following properties must be set to connect to an Accumulo instance using SSL:
+* [ssl.enabled] to `true`
+* [ssl.truststore.path]
+* [ssl.truststore.password]
 
-* [rpc.javax.net.ssl.trustStore] = _The path on the local filesystem to the keystore containing the certificate authority's public key_
-* [rpc.javax.net.ssl.trustStorePassword] = _The password for the keystore containing the certificate authority's public key_
-* [instance.rpc.ssl.enabled] = _true_
+If two-way SSL is enabled for the Accumulo instance (by setting [instance.rpc.ssl.clientAuth] to `true` in `accumulo-site.xml`),
+Accumulo clients must also define their own certificate by setting the following properties:
 
-If two-way SSL if enabled (`instance.rpc.ssl.clientAuth=true`) for the instance, the client must also define
-their own certificate and enable client authenticate as well.
-
-* [rpc.javax.net.ssl.keyStore] =_The path on the local filesystem to the keystore containing the server's certificate_
-* [rpc.javax.net.ssl.keyStorePassword] = _The password for the keystore containing the server's certificate_
-* [instance.rpc.ssl.clientAuth] = _true_
+* [ssl.keystore.path]
+* [ssl.keystore.password]
 
 ## Generating SSL material using OpenSSL
 
@@ -123,6 +121,12 @@ keytool -import -trustcacerts -alias server-crt -file server.crt -keystore serve
 The `server.jks` file is the Java keystore containing the certificate for a given host. The above
 methods are equivalent whether the certificate is generated for an Accumulo server or a client.
 
+[clients]: {{ page.docs_baseurl }}/getting-started/clients#connecting
+[ssl.enabled]: {{ page.docs_baseurl }}/development/client-properties#ssl_enabled
+[ssl.truststore.path]: {{ page.docs_baseurl }}/development/client-properties#ssl_truststore_path
+[ssl.truststore.password]: {{ page.docs_baseurl }}/development/client-properties#ssl_truststore_password
+[ssl.keystore.path]: {{ page.docs_baseurl }}/development/client-properties#ssl_keystore_path
+[ssl.keystore.password]: {{ page.docs_baseurl }}/development/client-properties#ssl_keystore_password
 [instance.secret]: {{ page.docs_baseurl }}/administration/properties#instance_secret
 [rpc.javax.net.ssl.trustStore]: {{ page.docs_baseurl }}/administration/properties#rpc_javax_net_ssl_trustStore
 [rpc.javax.net.ssl.trustStorePassword]: {{ page.docs_baseurl }}/administration/properties#rpc_javax_net_ssl_trustStorePassword
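
Taken together, the client-side SSL settings described above amount to a small `accumulo-client.properties` fragment; a sketch, with hypothetical keystore paths and passwords:

    # One-way SSL: trust the CA that signed the server certificates
    ssl.enabled=true
    ssl.truststore.path=/path/to/truststore.jks
    ssl.truststore.password=truststore-secret

    # Only when servers set instance.rpc.ssl.clientAuth=true (two-way SSL):
    # the client must present its own certificate as well
    ssl.keystore.path=/path/to/client-keystore.jks
    ssl.keystore.password=keystore-secret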
diff --git a/_docs-2-0/administration/tracing.md b/_docs-2-0/administration/tracing.md
index 1b78aa5..5d698d5 100644
--- a/_docs-2-0/administration/tracing.md
+++ b/_docs-2-0/administration/tracing.md
@@ -67,10 +67,9 @@ trace.span.receiver. when set in the Accumulo configuration.
     tracer.queue.size - max queue size (default 5000)
     tracer.span.min.ms - minimum span length to store (in ms, default 1)
 
-Note that to configure an Accumulo client for tracing, including
-the Accumulo shell, the client configuration must be given the same
-[trace.span.receivers], [trace.span.receiver.*], and [trace.zookeeper.path]
-properties as the servers have.
+To configure an Accumulo client for tracing, set [trace.span.receivers][receivers-client] and [trace.zookeeper.path][zk-path-client]
+in `accumulo-client.properties`. Also, any [trace.span.receiver.*] properties set in `accumulo-site.xml` should be set in
+`accumulo-client.properties`.
 
 Hadoop can also be configured to send traces to Accumulo, as of
 Hadoop 2.6.0, by setting properties in Hadoop's core-site.xml
@@ -138,14 +137,14 @@ this is easily done by adding to your client's pom.xml (taking care to specify a
           <scope>runtime</scope>
         </dependency>
 
-2. Add the following to your client configuration.
+2. Add the following to your `accumulo-client.properties`.
 
         trace.span.receivers=org.apache.accumulo.tracer.ZooTraceClient,org.apache.htrace.impl.ZipkinSpanReceiver
 
 3. Instrument your client as in the next section.
 
 Your SpanReceiver may require additional properties, and if so these should likewise
-be placed in the ClientConfiguration (if applicable) and Accumulo's `accumulo-site.xml`.
+be placed in `accumulo-client.properties` (if applicable) and Accumulo's `accumulo-site.xml`.
 Two such properties for ZipkinSpanReceiver, listed with their default values, are
 
 ```xml
@@ -351,3 +350,5 @@ Time  Start  Service@Location       Name
 [trace.zookeeper.path]: {{ page.docs_baseurl }}/administration/properties#trace_zookeeper_path
 [trace.span.receivers]: {{ page.docs_baseurl }}/administration/properties#trace_span_receivers
 [trace.span.receiver.*]: {{ page.docs_baseurl }}/administration/properties#trace_span_receiver_prefix
+[zk-path-client]: {{ page.docs_baseurl }}/development/client-properties#trace_zookeeper_path
+[receivers-client]: {{ page.docs_baseurl }}/development/client-properties#trace_span_receivers
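
The client-side tracing setup described above reduces to a couple of lines in `accumulo-client.properties`; a sketch, where `/tracers` is assumed to match the ZooKeeper path configured on the servers:

    trace.span.receivers=org.apache.accumulo.tracer.ZooTraceClient
    trace.zookeeper.path=/tracers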
diff --git a/_docs-2-0/development/client-properties.md b/_docs-2-0/development/client-properties.md
new file mode 100644
index 0000000..498590d
--- /dev/null
+++ b/_docs-2-0/development/client-properties.md
@@ -0,0 +1,40 @@
+---
+title: Client Properties
+category: development
+order: 9
+---
+
+<!-- WARNING: Do not edit this file. It is a generated file that is copied from Accumulo build (from core/target/generated-docs) -->
+<!-- Generated by : org.apache.accumulo.core.conf.ClientConfigGenerate$Markdown -->
+
+Below are properties set in `accumulo-client.properties` that configure [Accumulo clients]({{ page.docs_baseurl }}/getting-started/clients#connecting). All properties have been part of the API since 2.0.0 (unless otherwise specified):
+
+| Property | Default value | Since | Description |
+|----------|---------------|-------|-------------|
+| <a name="instance_name" class="prop"></a> instance.name | *empty* |  | Name of Accumulo instance to connect to |
+| <a name="instance_zookeepers" class="prop"></a> instance.zookeepers | localhost:2181 |  | Zookeeper connection information for Accumulo instance |
+| <a name="instance_zookeepers_timeout_sec" class="prop"></a> instance.zookeepers.timeout.sec | 30 |  | Zookeeper session timeout (in seconds) |
+| <a name="auth_method" class="prop"></a> auth.method | password |  | Authentication method (i.e. password, kerberos, provider). Set more properties for the chosen method below. |
+| <a name="auth_username" class="prop"></a> auth.username | *empty* |  | Accumulo username/principal for chosen authentication method |
+| <a name="auth_kerberos_keytab_path" class="prop"></a> auth.kerberos.keytab.path | *empty* |  | Path to Kerberos keytab |
+| <a name="auth_password" class="prop"></a> auth.password | *empty* |  | Accumulo user password |
+| <a name="auth_provider_name" class="prop"></a> auth.provider.name | *empty* |  | Alias used to extract Accumulo user password from CredentialProvider |
+| <a name="auth_provider_urls" class="prop"></a> auth.provider.urls | *empty* |  | Comma separated list of URLs defining CredentialProvider(s) |
+| <a name="batch_writer_durability" class="prop"></a> batch.writer.durability | default |  | Change the durability for the BatchWriter session. Use "default" to inherit the table's durability setting. |
+| <a name="batch_writer_max_latency_sec" class="prop"></a> batch.writer.max.latency.sec | 120 |  | Max amount of time (in seconds) to hold data in memory before flushing it |
+| <a name="batch_writer_max_memory_bytes" class="prop"></a> batch.writer.max.memory.bytes | 52428800 |  | Max memory (in bytes) to batch before writing |
+| <a name="batch_writer_max_timeout_sec" class="prop"></a> batch.writer.max.timeout.sec | 0 |  | Max amount of time (in seconds) an unresponsive server will be re-tried. An exception is thrown when this timeout is exceeded. Set to zero for no timeout. |
+| <a name="batch_writer_max_write_threads" class="prop"></a> batch.writer.max.write.threads | 3 |  | Maximum number of threads to use for writing data to tablet servers. |
+| <a name="ssl_enabled" class="prop"></a> ssl.enabled | false |  | Enable SSL for client RPC |
+| <a name="ssl_keystore_password" class="prop"></a> ssl.keystore.password | *empty* |  | Password used to encrypt keystore |
+| <a name="ssl_keystore_path" class="prop"></a> ssl.keystore.path | *empty* |  | Path to SSL keystore file |
+| <a name="ssl_keystore_type" class="prop"></a> ssl.keystore.type | jks |  | Type of SSL keystore |
+| <a name="ssl_truststore_password" class="prop"></a> ssl.truststore.password | *empty* |  | Password used to encrypt truststore |
+| <a name="ssl_truststore_path" class="prop"></a> ssl.truststore.path | *empty* |  | Path to SSL truststore file |
+| <a name="ssl_truststore_type" class="prop"></a> ssl.truststore.type | jks |  | Type of SSL truststore |
+| <a name="ssl_use_jsse" class="prop"></a> ssl.use.jsse | false |  | Use JSSE system properties to configure SSL |
+| <a name="sasl_enabled" class="prop"></a> sasl.enabled | false |  | Enable SASL for client RPC |
+| <a name="sasl_kerberos_server_primary" class="prop"></a> sasl.kerberos.server.primary | accumulo |  | Kerberos principal/primary that Accumulo servers use to login |
+| <a name="sasl_qop" class="prop"></a> sasl.qop | auth |  | SASL quality of protection. Valid values are 'auth', 'auth-int', and 'auth-conf' |
+| <a name="trace_span_receivers" class="prop"></a> trace.span.receivers | org.apache.accumulo.tracer.ZooTraceClient |  | A list of span receiver classes to send trace spans |
+| <a name="trace_zookeeper_path" class="prop"></a> trace.zookeeper.path | /tracers |  | The zookeeper node where tracers are registered |
diff --git a/_docs-2-0/development/development_tools.md b/_docs-2-0/development/development_tools.md
index 01bceb9..6dc3548 100644
--- a/_docs-2-0/development/development_tools.md
+++ b/_docs-2-0/development/development_tools.md
@@ -30,21 +30,21 @@ To start it up, you will need to supply an empty directory and a root password a
 
 ```java
 File tempDirectory = // JUnit and Guava supply mechanisms for creating temp directories
-MiniAccumuloCluster accumulo = new MiniAccumuloCluster(tempDirectory, "password");
-accumulo.start();
+MiniAccumuloCluster mac = new MiniAccumuloCluster(tempDirectory, "password");
+mac.start();
 ```
 
 Once we have our mini cluster running, we will want to interact with the Accumulo client API:
 
 ```java
-Instance instance = new ZooKeeperInstance(accumulo.getInstanceName(), accumulo.getZooKeepers());
-Connector conn = instance.getConnector("root", new PasswordToken("password"));
+Connector conn = Connector.builder().forInstance(mac.getInstanceName(), mac.getZooKeepers())
+                    .usingPassword("root", "password").build();
 ```
 
 Upon completion of our development code, we will want to shutdown our MiniAccumuloCluster:
 
 ```java
-accumulo.stop();
+mac.stop();
 // delete your temporary folder
 ```
 
diff --git a/_docs-2-0/development/mapreduce.md b/_docs-2-0/development/mapreduce.md
index b3465ad..ee372fb 100644
--- a/_docs-2-0/development/mapreduce.md
+++ b/_docs-2-0/development/mapreduce.md
@@ -50,16 +50,15 @@ options.
 
 ## AccumuloInputFormat options
 
+The following code shows how to configure AccumuloInputFormat for a job:
+
 ```java
 Job job = new Job(getConf());
-AccumuloInputFormat.setInputInfo(job,
-        "user",
-        "passwd".getBytes(),
-        "table",
-        new Authorizations());
-
-AccumuloInputFormat.setZooKeeperInstance(job, "myinstance",
-        "zooserver-one,zooserver-two");
+ConnectionInfo info = Connector.builder().forInstance("myinstance", "zoo1,zoo2")
+                        .usingPassword("user", "passwd").info();
+AccumuloInputFormat.setConnectionInfo(job, info);
+AccumuloInputFormat.setInputTableName(job, "table");
+AccumuloInputFormat.setScanAuthorizations(job, new Authorizations());
 ```
 
 **Optional Settings:**
@@ -155,17 +154,10 @@ class MyMapper extends Mapper<Key,Value,WritableComparable,Writable> {
 ## AccumuloOutputFormat options
 
 ```java
-boolean createTables = true;
-String defaultTable = "mytable";
-
-AccumuloOutputFormat.setOutputInfo(job,
-        "user",
-        "passwd".getBytes(),
-        createTables,
-        defaultTable);
-
-AccumuloOutputFormat.setZooKeeperInstance(job, "myinstance",
-        "zooserver-one,zooserver-two");
+ConnectionInfo info = Connector.builder().forInstance("myinstance", "zoo1,zoo2")
+                        .usingPassword("user", "passwd").info();
+AccumuloOutputFormat.setConnectionInfo(job, info);
+AccumuloOutputFormat.setDefaultTableName(job, "mytable");
 ```
 
 **Optional Settings:**
diff --git a/_docs-2-0/development/proxy.md b/_docs-2-0/development/proxy.md
index 8a33b8f..2fd7b4d 100644
--- a/_docs-2-0/development/proxy.md
+++ b/_docs-2-0/development/proxy.md
@@ -6,7 +6,7 @@ order: 3
 
 The proxy API allows the interaction with Accumulo with languages other than Java.
 A proxy server is provided in the codebase and a client can further be generated.
-The proxy API can also be used instead of the traditional [ZooKeeperInstance] class to
+The proxy API can also be used instead of the traditional [Connector] class to
 provide a single TCP port in which clients can be securely routed through a firewall,
 without requiring access to all tablet servers in the cluster.
 
@@ -380,5 +380,5 @@ if __name__ == "__main__":
     main()
 ```
 
-[ZookeeperInstance]: {{ page.javadoc_core }}/org/apache/accumulo/core/client/ZooKeeperInstance.html
+[Connector]: {{ page.javadoc_core }}/org/apache/accumulo/core/client/Connector.html
 [tutorial]: https://thrift.apache.org/tutorial/
diff --git a/_docs-2-0/development/security.md b/_docs-2-0/development/security.md
index 951d213..cc42537 100644
--- a/_docs-2-0/development/security.md
+++ b/_docs-2-0/development/security.md
@@ -115,14 +115,14 @@ Accumulo has a pluggable security mechanism. It can be broken into three actions
 authorization, and permission handling.
 
 Authentication verifies the identity of a user. In Accumulo, authentication occurs when
-the `getConnector` method of [Instance] is called with a principal (i.e username)
+the `usingToken` method of the [Connector] builder is called with a principal (i.e. username)
 and an [AuthenticationToken] which is an interface with multiple implementations. The most
 common implementation is [PasswordToken] which is the default authentication method for Accumulo
 out of the box.
 
 ```java
-Instance instance = new ZooKeeperInstance("myinstance", "zookeeper1,zookeeper2");
-Connector conn = instance.getConnector("user", new PasswordToken("passwd"));
+Connector conn = Connector.builder().forInstance("myinstance", "zookeeper1,zookeeper2")
+                    .usingToken("user", new PasswordToken("passwd")).build();
 ```
 
 Once a user is authenticated by the Authenticator, the user has access to the other actions within
diff --git a/_docs-2-0/getting-started/clients.md b/_docs-2-0/getting-started/clients.md
index 20a57b5..3c5526a 100644
--- a/_docs-2-0/getting-started/clients.md
+++ b/_docs-2-0/getting-started/clients.md
@@ -16,92 +16,79 @@ If you are using Maven to create Accumulo client code, add the following to your
 </dependency>
 ```
 
-## Running Client Code
-
-There are multiple ways to run Java code that use Accumulo. Below is a list
-of the different ways to execute client code.
-
-* build and execute an uber jar
-* add `accumulo classpath` to your Java classpath
-* use the `accumulo` command
-* use the `accumulo-util hadoop-jar` command
-
-### Build and execute an uber jar
-
-If you have included `accumulo-core` as dependency in your pom, you can build an uber jar
-using the Maven assembly or shade plugin and use it to run Accumulo client code. When building
-an uber jar, you should set the versions of any Hadoop dependencies in your pom to match the
-version running on your cluster.
-
-### Add 'accumulo classpath' to your Java classpath
-
-To run Accumulo client code using the `java` command, use the `accumulo classpath` command 
-to include all of Accumulo's dependencies on your classpath:
-
-    java -classpath /path/to/my.jar:/path/to/dep.jar:$(accumulo classpath) com.my.Main arg1 arg2
-
-If you would like to review which jars are included, the `accumulo classpath` command can
-output a more human readable format using the `-d` option which enables debugging:
-
-    accumulo classpath -d
-
-### Use the accumulo command
-
-Another option for running your code is to use the Accumulo script which can execute a
-main class (if it exists on its classpath):
-
-    accumulo com.foo.Client arg1 arg2
-
-While the Accumulo script will add all of Accumulo's dependencies to the classpath, you
-will need to add any jars that your create or depend on beyond what Accumulo already
-depends on. This can be accomplished by either adding the jars to the `lib/ext` directory
-of your Accumulo installation or by adding jars to the CLASSPATH variable before calling
-the accumulo command.
-
-    export CLASSPATH=/path/to/my.jar:/path/to/dep.jar; accumulo com.foo.Client arg1 arg2
-
-### Use the 'accumulo-util hadoop-jar' command
-
-If you are writing map reduce job that accesses Accumulo, then you can use
-`accumulo-util hadoop-jar` to run those jobs. See the [MapReduce example][mapred-example]
-for more information.
-
 ## Connecting
 
-All clients must first identify the Accumulo instance to which they will be
-communicating. Code to do this is as follows:
+Before writing Accumulo client code, you will need the following information:
+
+ * Accumulo instance name
+ * Zookeeper connection string
+ * Accumulo username & password
+
+The [Connector] object is the main entry point for Accumulo clients. It can be created using one
+of the following methods:
+
+1. Using the `accumulo-client.properties` file (a template can be found in the `conf/` directory
+   of the tarball distribution):
+    ```java
+    Connector conn = Connector.builder()
+                        .usingProperties("/path/to/accumulo-client.properties").build();
+    ```
+1. Using the builder methods of [Connector]:
+    ```java
+    Connector conn = Connector.builder().forInstance("myinstance", "zookeeper1,zookeeper2")
+                        .usingPassword("myuser", "mypassword").build();
+    ```
+1. Using a Java Properties object:
+    ```java
+    Properties props = new Properties();
+    props.put("instance.name", "myinstance");
+    props.put("instance.zookeepers", "zookeeper1,zookeeper2");
+    props.put("auth.method", "password");
+    props.put("auth.username", "myuser");
+    props.put("auth.password", "mypassword");
+    Connector conn = Connector.builder().usingProperties(props).build();
+    ```
+
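For the first approach, a minimal `accumulo-client.properties` might look like the following sketch (the instance name, ZooKeeper hosts, and credentials are hypothetical placeholders):

```
instance.name=myinstance
instance.zookeepers=zookeeper1:2181,zookeeper2:2181
auth.method=password
auth.username=myuser
auth.password=mypassword
```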
+If an `accumulo-client.properties` file or a Java Properties object is used to create a [Connector], the following
+[client properties][client-props] must be set:
+
+* [instance.name]
+* [instance.zookeepers]
+* [auth.method]
+* [auth.username]
+* [auth.password]
+
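As an illustrative sketch (using only plain `java.util.Properties`, not the Accumulo API), the required keys listed above can be checked before attempting to build a [Connector]; the class and property values below are hypothetical:

```java
import java.io.StringReader;
import java.util.Properties;

public class RequiredClientProps {
    // The five client properties that must be set, per the list above
    static final String[] REQUIRED = {
        "instance.name", "instance.zookeepers",
        "auth.method", "auth.username", "auth.password"
    };

    // Loads properties from text and fails fast if a required key is missing
    static Properties loadAndCheck(String text) throws Exception {
        Properties props = new Properties();
        props.load(new StringReader(text));
        for (String key : REQUIRED) {
            if (props.getProperty(key) == null) {
                throw new IllegalStateException("missing required client property: " + key);
            }
        }
        return props;
    }

    public static void main(String[] args) throws Exception {
        String conf = String.join("\n",
            "instance.name=myinstance",
            "instance.zookeepers=zookeeper1:2181,zookeeper2:2181",
            "auth.method=password",
            "auth.username=myuser",
            "auth.password=mypassword");
        System.out.println(loadAndCheck(conf).getProperty("instance.name"));
    }
}
```

Failing fast like this produces a clearer error than a timeout against the wrong ZooKeeper quorum.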
+## Authentication
+
+When creating a [Connector], the user must be authenticated using one of the following
+implementations of [AuthenticationToken]:
+
+1. [PasswordToken] is the most commonly used implementation.
+1. [CredentialProviderToken] leverages the Hadoop CredentialProviders (new in Hadoop 2.6).
+   For example, the [CredentialProviderToken] can be used in conjunction with a Java KeyStore to
+   alleviate passwords stored in cleartext. When stored in HDFS, a single KeyStore can be used across
+   an entire instance. Be aware that KeyStores stored on the local filesystem must be made available
+   to all nodes in the Accumulo cluster.
+1. [KerberosToken] can be provided to use the authentication provided by Kerberos. Using Kerberos
+   requires external setup and additional configuration, but provides a single point of authentication
+   through HDFS, YARN and ZooKeeper, allowing for password-less authentication with Accumulo.
+
+    ```java
+    KerberosToken token = new KerberosToken();
+    Connector conn = Connector.builder().forInstance("myinstance", "zookeeper1,zookeeper2")
+                        .usingToken(token.getPrincipal(), token).build();
+    ```
 
-```java
-String instanceName = "myinstance";
-String zooServers = "zooserver-one,zooserver-two"
-Instance inst = new ZooKeeperInstance(instanceName, zooServers);
-
-Connector conn = inst.getConnector("user", new PasswordToken("passwd"));
-```
-
-The [PasswordToken] is the most common implementation of an [AuthenticationToken].
-This general interface allow authentication as an Accumulo user to come from
-a variety of sources or means. The [CredentialProviderToken] leverages the Hadoop
-CredentialProviders (new in Hadoop 2.6).
+## Writing Data
 
-For example, the [CredentialProviderToken] can be used in conjunction with a Java
-KeyStore to alleviate passwords stored in cleartext. When stored in HDFS, a single
-KeyStore can be used across an entire instance. Be aware that KeyStores stored on
-the local filesystem must be made available to all nodes in the Accumulo cluster.
+Once a [Connector] is created, it can be used to create objects (like the [BatchWriter]) for
+reading from and writing to Accumulo:
 
 ```java
-KerberosToken token = new KerberosToken();
-Connector conn = inst.getConnector(token.getPrincipal(), token);
+BatchWriter writer = conn.createBatchWriter("table");
 ```
 
-The [KerberosToken] can be provided to use the authentication provided by Kerberos.
-Using Kerberos requires external setup and additional configuration, but provides
-a single point of authentication through HDFS, YARN and ZooKeeper and allowing
-for password-less authentication with Accumulo.
-
-## Writing Data
-
-Data are written to Accumulo by creating [Mutation] objects that represent all the
+Data is written to Accumulo by creating [Mutation] objects that represent all the
 changes to the columns of a single row. The changes are made atomically in the
 TabletServer. Clients then add Mutations to a [BatchWriter] which submits them to
 the appropriate TabletServers.
@@ -179,25 +166,28 @@ replicas, and waiting for a permanent sync to disk can significantly write speed
 Accumulo allows users to use less tolerant forms of durability when writing.
 These levels are:
 
-* none: no durability guarantees are made, the WAL is not used
-* log: the WAL is used, but not flushed; loss of the server probably means recent writes are lost
-* flush: updates are written to the WAL, and flushed out to replicas; loss of a single server is unlikely to result in data loss.
-* sync: updates are written to the WAL, and synced to disk on all replicas before the write is acknowledge. Data will not be lost even if the entire cluster suddenly loses power.
+* `none` - no durability guarantees are made, the WAL is not used
+* `log` - the WAL is used, but not flushed; loss of the server probably means recent writes are lost
+* `flush` - updates are written to the WAL, and flushed out to replicas; loss of a single server is unlikely to result in data loss.
+* `sync` - updates are written to the WAL, and synced to disk on all replicas before the write is acknowledged. Data will not be lost even if the entire cluster suddenly loses power.
 
-The user can set the default durability of a table in the shell.  When
-writing, the user can configure the BatchWriter or ConditionalWriter to use
-a different level of durability for the session. This will override the
-default durability setting.
+Durability can be set in multiple ways:
 
-```java
-BatchWriterConfig cfg = new BatchWriterConfig();
-// We don't care about data loss with these writes:
-// This is DANGEROUS:
-cfg.setDurability(Durability.NONE);
+1. The default durability of a table can be set in the Accumulo shell
+2. When creating a [Connector], the default durability can be overridden using `withBatchWriterConfig()`
+   or by setting [batch.writer.durability] in `accumulo-client.properties`.
+3. When a BatchWriter or ConditionalWriter is created, the durability settings above will be overridden
+   by the `BatchWriterConfig` that is passed in.
 
-Connection conn = ... ;
-BatchWriter bw = conn.createBatchWriter(table, cfg);
-```
+    ```java
+    BatchWriterConfig cfg = new BatchWriterConfig();
+    // We don't care about data loss with these writes:
+    // This is DANGEROUS:
+    cfg.setDurability(Durability.NONE);
+
+    Connector conn = ... ;
+    BatchWriter bw = conn.createBatchWriter(table, cfg);
+    ```
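For the first option, a table's default durability can be changed from the Accumulo shell by setting the `table.durability` property (the table name here is hypothetical):

    root@myinstance> config -t mytable -s table.durability=flush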
 
 ## Reading Data
 
@@ -279,6 +269,56 @@ You may consider using the [WholeRowIterator] with the BatchScanner to achieve
 isolation. The drawback of this approach is that entire rows are read into
 memory on the server side. If a row is too big, it may crash a tablet server.
 
+## Running Client Code
+
+There are multiple ways to run Java code that uses Accumulo. Below is a list
+of the different ways to execute client code.
+
+* build and execute an uber jar
+* add `accumulo classpath` to your Java classpath
+* use the `accumulo` command
+* use the `accumulo-util hadoop-jar` command
+
+### Build and execute an uber jar
+
+If you have included `accumulo-core` as a dependency in your pom, you can build an uber jar
+using the Maven assembly or shade plugin and use it to run Accumulo client code. When building
+an uber jar, you should set the versions of any Hadoop dependencies in your pom to match the
+version running on your cluster.
+
+### Add 'accumulo classpath' to your Java classpath
+
+To run Accumulo client code using the `java` command, use the `accumulo classpath` command
+to include all of Accumulo's dependencies on your classpath:
+
+    java -classpath /path/to/my.jar:/path/to/dep.jar:$(accumulo classpath) com.my.Main arg1 arg2
+
+If you would like to review which jars are included, the `accumulo classpath` command can
+output a more human-readable format using the `-d` option which enables debugging:
+
+    accumulo classpath -d
+
+### Use the accumulo command
+
+Another option for running your code is to use the Accumulo script which can execute a
+main class (if it exists on its classpath):
+
+    accumulo com.foo.Client arg1 arg2
+
+While the Accumulo script will add all of Accumulo's dependencies to the classpath, you
+will need to add any jars that you create or depend on beyond what Accumulo already
+depends on. This can be accomplished by either adding the jars to the `lib/ext` directory
+of your Accumulo installation or by adding jars to the CLASSPATH variable before calling
+the accumulo command.
+
+    export CLASSPATH=/path/to/my.jar:/path/to/dep.jar; accumulo com.foo.Client arg1 arg2
+
+### Use the 'accumulo-util hadoop-jar' command
+
+If you are writing a MapReduce job that accesses Accumulo, then you can use
+`accumulo-util hadoop-jar` to run those jobs. See the [MapReduce example][mapred-example]
+for more information.
+
 ## Additional Documentation
 
 This page covers Accumulo client basics.  Below are links to additional documentation that may be useful when creating Accumulo clients:
@@ -287,6 +327,14 @@ This page covers Accumulo client basics.  Below are links to additional document
 * [Proxy] - Documentation for interacting with Accumulo using non-Java languages through a proxy server
 * [MapReduce] - Documentation for reading and writing to Accumulo using MapReduce.
 
+[Connector]: {{ page.javadoc_core }}/org/apache/accumulo/core/client/Connector.html
+[client-props]: {{ page.docs_baseurl }}/development/client-properties
+[auth.method]: {{ page.docs_baseurl }}/development/client-properties#auth_method
+[auth.username]: {{ page.docs_baseurl }}/development/client-properties#auth_username
+[auth.password]: {{ page.docs_baseurl }}/development/client-properties#auth_password
+[instance.name]: {{ page.docs_baseurl }}/development/client-properties#instance_name
+[instance.zookeepers]: {{ page.docs_baseurl }}/development/client-properties#instance_zookeepers
+[batch.writer.durability]: {{ page.docs_baseurl }}/development/client-properties#batch_writer_durability
 [PasswordToken]: {{ page.javadoc_core }}/org/apache/accumulo/core/client/security/tokens/PasswordToken.html
 [AuthenticationToken]: {{ page.javadoc_core }}/org/apache/accumulo/core/client/security/tokens/AuthenticationToken.html
 [CredentialProviderToken]: {{ page.javadoc_core }}/org/apache/accumulo/core/client/security/tokens/CredentialProviderToken.html

