accumulo-commits mailing list archives

Subject [accumulo-website] branch master updated: Blog post for upgrading to 2.0 (#188)
Date Mon, 12 Aug 2019 19:46:04 GMT
This is an automated email from the ASF dual-hosted git repository.

mmiller pushed a commit to branch master
in repository

The following commit(s) were added to refs/heads/master by this push:
     new 072984c  Blog post for upgrading to 2.0 (#188)
072984c is described below

commit 072984cb692f6c09eb51a79f4ee327a4b3a2d975
Author: Mike Miller <>
AuthorDate: Mon Aug 12 15:45:58 2019 -0400

    Blog post for upgrading to 2.0 (#188)
 _posts/blog/ | 138 ++++++++++++++++++++++++++++++++++
 1 file changed, 138 insertions(+)

diff --git a/_posts/blog/ b/_posts/blog/
new file mode 100644
index 0000000..12c30f1
--- /dev/null
+++ b/_posts/blog/
@@ -0,0 +1,138 @@
+title: "Top 10 Reasons to Upgrade"
+author: Mike Miller
+reviewers: Keith Turner, Christopher Tubbs
+Accumulo 2.0 has been in development for quite some time now and is packed with new features,
+fixes, performance improvements and redesigned components.  All of these changes bring challenges
+when upgrading your production cluster so you may be wondering... why should I upgrade?
+Here are my top 10 reasons to upgrade. For all changes, see the [release notes][rel].
+* [Summaries](#summaries)
+* [New Bulk Import](#new-bulk-import)
+* [Simplified Scripts and Config](#simplified-scripts-and-config)
+* [New Monitor](#new-monitor)
+* [New APIs](#new-apis)
+* [Offline creation](#offline-creation)
+* [Search Documentation](#search-documentation)
+* [On disk encryption](#new-crypto)
+* [ZStandard Compression](#zstandard-compression)
+* [New Scan Executors](#new-scan-executors)
+### Summaries
+This feature allows detailed stats about Tables to be written directly into Accumulo files.
+Summaries can be used to make precise decisions about your data. Once configured, summaries
+become a part of your Tables, so they won't impact ingest or query performance of your cluster.
+Here are some example use cases:
+* A compaction could automatically run if deletes compose more than 25% of the data
+* An admin could optimize compactions by configuring specific age off of data
+* An admin could analyze R-File summaries for better performance tuning of a cluster
+For more info, check out the [summary docs for 2.0][summary].
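The first use case above (compacting once deletes exceed 25% of the data) comes down to a ratio check over two counts. A minimal sketch of that decision in plain Java follows. It does not use Accumulo's summary API: the class name, method name, and the assumption that both counts were already read from summary statistics are hypothetical.

```java
// Conceptual sketch only: plain JDK, no Accumulo classes.
final class DeleteRatioCheck {
    /** Threshold taken from the 25% example in the post. */
    static final double DELETE_THRESHOLD = 0.25;

    /**
     * Returns true when delete entries make up more than the threshold
     * fraction of all entries. Both counts are assumed to come from
     * summary statistics gathered elsewhere.
     */
    static boolean shouldCompact(long totalEntries, long deleteEntries) {
        if (totalEntries <= 0) {
            return false; // nothing to compact, and avoids division by zero
        }
        return (double) deleteEntries / totalEntries > DELETE_THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(shouldCompact(1000, 300)); // 30% deletes
        System.out.println(shouldCompact(1000, 250)); // exactly 25%, not "more than"
    }
}
```

The comparison is strict, so a table sitting at exactly 25% deletes does not trigger a compaction; whether a real policy should be strict or inclusive is a design choice.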
+### New Bulk Import
+Bulk Ingest was completely redone for 2.0.  Previously, Bulk Ingest relied on expensive inspections
+of R-Files across multiple Tablet Servers. With enough data, an old Bulk Ingest operation could
+hold up simpler Table operations and critical compactions of files.
+The new Bulk Ingest gives the user control over the R-File inspection, allows for offline
+ingesting and provides performance [improvements][new-bulk].
+## Simplified Scripts and Config
+Many improvements were done to the scripts and configuration. See Mike's description of the
+## New Monitor
+The Monitor has been rewritten using REST, JavaScript and more modern web tech.  It is faster,
+cleaner and more maintainable than the previous version. Here is a screenshot:
+<img src="{{ site.baseurl }}/images/accumulo-monitor-1.png" width="50%"/>
+## New APIs
+Connecting to Accumulo is now easier with a single point of entry for clients. It can now be
+done with a fluent API, 2 imports and minimal code:
+import org.apache.accumulo.core.client.Accumulo;
+import org.apache.accumulo.core.client.AccumuloClient;
+try (AccumuloClient client = Accumulo.newClient()
+        .to("instance", "zk")
+        .as("user", "pass").build()) {
+    // use the client
+    client.tableOperations().create("newTable");
+}
+As you can see, the client is also closeable, which gives developers more control over resources.
+See the [Accumulo entry point javadoc][client].
+Key and Mutation have new fluent APIs, which now allow mixing of ```String``` and ```byte[]```.
+Key newKey = Key.builder().row("foo").family("bar").build();
+Mutation m = new Mutation("row0017");
+m.at().family("001").qualifier(new byte[] {0,1}).put("v99");
+m.at().family("002").qualifier(new byte[] {0,1}).delete();
+More examples for [Key] and [Mutation].
+## Offline creation
+Tables can now be created with splits offline.  This frees up online resources to perform
+other critical operations.
+See the [GitHub issue][offline].
+## Search Documentation
+New ability to quickly search documentation on the website. The user manual was completely
+redone for 2.0. Check it out [here][manual]. Users can now quickly [search] the website across
+all 2.x documentation.
+## New Crypto
+On-disk encryption was redone to be more secure and flexible. For an in-depth description of
+how Accumulo does on-disk encryption, see the [user manual][crypto].  NOTE: This is currently
+an experimental feature.
+An experimental feature is considered a work in progress or incomplete and could change.
+## Zstandard compression
+Support for Zstandard compression was added in 2.0.  It has been measured to perform better
+than gzip (better compression ratio and speed) and snappy (better compression ratio). Check out
+Facebook's [GitHub][zstd] for Zstandard and
+the [table.file.compress.type][z-config] property for configuring Accumulo.
+## New Scan Executors
+Users now have more control over scans with the new scan executors.  Tables can be configured
+to utilize these powerful new mechanisms using just a few properties, giving users control over
+things like scan prioritization and better cluster resource utilization.
+For example, suppose a cluster has a bunch of long-running scans and one really fast scan.  The
+long-running scans will eat up a majority of the server resources, causing the one really fast
+scan to be delayed.  Scan executors allow an admin to configure the cluster in a way that allows
+the one fast scan to be prioritized and not have to wait.
+Check out some examples in the [user guide][scans].
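The prioritization idea in the example above can be illustrated with a priority-ordered queue from the JDK. This is a conceptual sketch only and is not Accumulo's scan executor API or configuration: the `Scan` class, its `priority` field, and the rule that a lower value runs first are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Conceptual sketch only: plain JDK, not Accumulo's scan executor API.
final class ScanPriorityDemo {
    /** Hypothetical scan request; a lower priority value is served first. */
    static final class Scan {
        final String name;
        final int priority;

        Scan(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    /** Returns scan names in the order a priority-ordered queue hands them out. */
    static List<String> runOrder(List<Scan> submitted) {
        PriorityQueue<Scan> queue =
            new PriorityQueue<>(Comparator.comparingInt((Scan s) -> s.priority));
        queue.addAll(submitted);
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().name);
        }
        return order;
    }

    public static void main(String[] args) {
        // The fast scan is submitted last but carries the best priority.
        List<Scan> submitted = Arrays.asList(
            new Scan("long-1", 5), new Scan("long-2", 5), new Scan("fast", 1));
        System.out.println(runOrder(submitted));
    }
}
```

With a plain first-in, first-out queue the fast scan would wait behind both long scans; ordering by priority lets it go first. The relative order of the two equal-priority long scans is not guaranteed by `PriorityQueue`.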
+[FATE]: {% dlink /administration/fate %}
+[summary]: {% dlink /development/summaries %}
+[client]: {% jurl org.apache.accumulo.core.client.Accumulo %}
+[offline]: {% ghi 573 %}
+[manual]: {% dlink /getting-started/quickstart %}
+[crypto]: {% dlink /security/on-disk-encryption %}
+[z-config]: {% dlink /configuration/server-properties %}
+[scans]: {% dlink /administration/scan-executors %}
