Author: buildbot
Date: Thu Jun 25 22:38:55 2015
New Revision: 955998

Log: Staging update by buildbot for accumulo

Added: websites/staging/accumulo/trunk/content/release_notes/1.5.3.html
Modified: websites/staging/accumulo/trunk/content/ (props changed)

Apache Accumulo 1.7.0 Release Notes

Apache Accumulo 1.7.0 is a significant release that includes many important milestone features which expand the functionality of Accumulo. These include features related to security, availability, and extensibility. Nearly 700 JIRA issues were resolved in this version. Approximately two-thirds were bugs and one-third were improvements.

In the context of Accumulo's Semantic Versioning guidelines, this is a "minor version". This means that new APIs have been created, some deprecations may have been added, but no deprecated APIs have been removed. Code written against 1.6.x should work against 1.7.0; it is likely binary-compatible and definitely source-compatible. As always, the Accumulo developers take API compatibility very seriously and have invested much time to ensure that the promises made to users are kept.

Major Changes

Updated Minimum Requirements

Apache Accumulo 1.7.0 comes with an updated set of minimum requirements.

  • Java 7 is required; Java 6 support is dropped.
  • Hadoop 2.2.0 or greater is required; Hadoop 1.x support is dropped.
  • ZooKeeper 3.4.x or greater is required.

Client Authentication with Kerberos

Kerberos is the de-facto means of providing strong authentication across Hadoop and other related components. Kerberos requires a centralized key distribution center to authenticate users who have credentials provided by an administrator. When Hadoop is configured for use with Kerberos, all users must provide Kerberos credentials to interact with the filesystem, launch YARN jobs, or even view certain web pages.

While Accumulo has long supported operating on Kerberos-enabled HDFS, it still required users to authenticate to Accumulo itself with a password. ACCUMULO-2815 added support for Accumulo clients to authenticate with the same Kerberos credentials they use with other Hadoop components, instead of a separate user name and password just for Accumulo.

This authentication leverages Simple Authentication and Security Layer (SASL) and GSSAPI to support Kerberos authentication over the existing Thrift-based RPC infrastructure that Accumulo employs.
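
For illustration, a minimal sketch of obtaining a Connector with Kerberos credentials follows. This is not taken from the release itself: the instance name, ZooKeeper host, principal, and keytab path are placeholders, and the exact constructor signatures should be checked against the 1.7.0 javadoc.

import java.io.File;

import org.apache.accumulo.core.client.ClientConfiguration;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Instance;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.KerberosToken;

// Enable SASL so the client negotiates Kerberos over Accumulo's Thrift RPC
ClientConfiguration clientConf = ClientConfiguration.loadDefault()
    .withInstance("accumulo").withZkHosts("zkhost:2181").withSasl(true);
Instance instance = new ZooKeeperInstance(clientConf);

// Log in from a keytab; the principal and keytab path are placeholders
KerberosToken token = new KerberosToken("user@EXAMPLE.COM",
    new File("/path/to/user.keytab"), true);

// The Kerberos principal takes the place of an Accumulo username and password
Connector conn = instance.getConnector(token.getPrincipal(), token);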


These additions represent a significant step forward for Accumulo, bringing its client authentication up to speed with the rest of the Hadoop ecosystem. The result is a much more cohesive authentication story for Accumulo, one that complements the battle-tested cell-level security and authorization model already familiar to Accumulo users.

More information on configuration, administration, and application of Kerberos client authentication can be found in the Kerberos chapter of the Accumulo User Manual.

Data-Center Replication

In previous releases, Accumulo only operated within the constraints of a single installation. Because a single instance of Accumulo often consists of many nodes, and Accumulo's design scales (near) linearly across many nodes, it is typical to run one Accumulo instance per physical installation or data-center. ACCUMULO-378 introduces support for automatically copying data from one Accumulo instance to another.

This data-center replication feature is primarily applicable to users wishing to implement a disaster recovery strategy. Data can be automatically copied from a primary instance to one or more other Accumulo instances. In contrast to normal Accumulo operation, in which ingest and query are strongly consistent, data-center replication is a lazy, eventually consistent operation. This is desirable for replication, as it prevents additional latency for ingest operations on the primary instance. Additionally, the implementation of this feature can sustain prolonged outages between the primary instance and replicas without any administrative overhead.

The Accumulo User Manual contains a new chapter on replication which details the design and implementation of the feature, explains how users can configure replication, and describes special cases to consider when choosing to integrate the feature into a user application.
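
As a rough sketch only (peer definitions, credentials, and the remote instance configuration are omitted, and the peer name and table ID below are placeholders), enabling replication for a table involves shell configuration along these lines; consult the replication chapter for the authoritative set of properties:

# Name this instance, then mark a table for replication to a defined peer.
# 'peer1' is a previously configured peer; '2' is the destination table ID.
config -s replication.name=primary
config -t mytable -s table.replication=true
config -t mytable -s table.replication.target.peer1=2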


User-Initiated Compaction Strategies

Per-table compaction strategies were added in 1.6.0 to provide custom logic to decide which files are involved in a major compaction. In 1.7.0, the ability to specify a compaction strategy for a user-initiated compaction was added in ACCUMULO-1798. This allows surgical compactions on a subset of tablet files. Previously, a user-initiated compaction would compact all files in a tablet.

In the Java API, this new feature can be accessed in the following way:

Connector conn = ...; // obtain a Connector as usual
CompactionStrategyConfig csConfig =
    new CompactionStrategyConfig(strategyClassName).setOptions(strategyOpts);
CompactionConfig compactionConfig =
    new CompactionConfig().setCompactionStrategy(csConfig);
conn.tableOperations().compact(tableName, compactionConfig);

In ACCUMULO-3134, the shell's compact command was modified to enable selecting which files to compact based on size, name, and path. Options were also added to the compact command to allow setting RFile options for the compaction output. Setting the output options could be useful for testing; for example, a single tablet could be compacted using snappy compression.

The following is an example shell command that compacts all files less than 10MB, provided the tablet has at least two files that meet this criterion. If a tablet had a 100MB, 50MB, 7MB, and 5MB file, then the 7MB and 5MB files would be compacted. If a tablet had only a 100MB and a 5MB file, then nothing would be done, because there are not at least two files meeting the selection criteria.

compact -t foo --min-files 2 --sf-lt-esize 10M

The following is an example shell command that compacts all bulk imported files in a table.

compact -t foo --sf-ename I.*

These convenience options select files and compact them using a specialized compaction strategy provided with Accumulo. Options were also added to the shell to specify an arbitrary compaction strategy. The option to specify an arbitrary compaction strategy is mutually exclusive with the file selection and file creation options, since those options are specific to the provided specialized compaction strategy. See compact --help in the shell for the available options.

API Clarification

The declared API in 1.6.x was incomplete. Some important classes like ColumnVisibility were not declared as Accumulo API. Significant work was done under ACCUMULO-3657 to correct the API statement and clean up the API to be representative of all classes which users are intended to interact with. The expanded and simplified API statement is in the README.

In some places in the API, non-API types were used. Ideally, public API members would only use public API types. A tool called APILyzer was created to find all API members that used non-API types. Many of the violations found by this tool were deprecated to clearly communicate that a non-API type was used. One example is a public API method that returned a class called KeyExtent. KeyExtent was never intended to be in the public API because it contains code related to Accumulo internals. KeyExtent and the API methods returning it have since been deprecated. These were replaced with a new class for identifying tablets that does not expose internals. Deprecating a type like this from the API makes the API more stable while also making it easier for contributors to change Accumulo internals without impacting the API.

The changes in ACCUMULO-3657 also included an Accumulo API regular expression for use with checkstyle. Starting with 1.7.0, projects building on Accumulo can use this checkstyle rule to ensure they are only using Accumulo's public API. The regular expression can be found in the README.

Performance Improvements

Configurable Threadpool Size for Assignments

During start-up, the Master quickly assigns tablets to Tablet Servers; however, Tablet Servers load those assigned tablets one at a time. In 1.7, the servers are more aggressive and will load tablets in parallel, so long as the tablets do not have mutations that need to be recovered.

ACCUMULO-1085 makes the size of the threadpool that Tablet Servers use for assignment processing configurable.
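
For example, assuming the property introduced is tserver.assignment.concurrent.max (the maximum number of tablets a Tablet Server will process for assignment concurrently; verify the name against the 1.7.0 configuration documentation), it could be raised from its default in the shell:

config -s tserver.assignment.concurrent.max=4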


Group-Commit Threshold as a Factor of Data Size

When ingesting data into Accumulo, the majority of time is spent writing to the write-ahead log. As such, this is a common place for optimizations. One optimization is known as "group-commit": when multiple clients are writing data to the same Accumulo tablet, it is not efficient for each of them to synchronize the WAL, flush their updates to disk for durability, and then release the lock. The idea of group-commit is that multiple writers can queue the writes for their mutations to the WAL and then wait for a sync that satisfies the durability constraints of their batch of updates. This drastically improves performance, since many threads writing batches concurrently can "share" the same fsync.

In previous versions, Accumulo controlled the frequency with which this group-commit sync was performed as a factor of the number of clients writing to Accumulo. This was both confusing to configure correctly and encouraged sub-par performance with few write threads. ACCUMULO-1950 introduced a new configuration property, tserver.total.mutation.queue.max, which defines the amount of data that is queued before a group-commit is performed, independent of the number of writers. This new property is much easier to reason about than the previous (now deprecated) tserver.mutation.queue.max. Users who have set tserver.mutation.queue.max in the past are encouraged to start using the new tserver.total.mutation.queue.max property.
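
For example, the new property can be set in the shell; the 50M value below is illustrative, and if the property turns out not to be changeable at runtime it can be set in accumulo-site.xml instead:

config -s tserver.total.mutation.queue.max=50M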


Other Improvements

Balancing Groups of Tablets

By default, Accumulo evenly spreads each table's tablets across a cluster. In some situations, it is advantageous for query or ingest to evenly spread groups of tablets within a table. For ACCUMULO-3439, a new balancer was added to evenly spread groups of tablets to optimize performance. This blog post provides more details about when and why users may desire to leverage this feature.
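
As a hedged sketch of configuring this, the class and custom property names below come from the RegexGroupBalancer work in ACCUMULO-3439 and should be verified against the release; the pattern is illustrative, grouping tablets by the first two characters of their end rows:

config -t mytable -s table.balancer=org.apache.accumulo.server.master.balancer.RegexGroupBalancer
config -t mytable -s table.custom.balancer.group.regex.pattern=(..).*
config -t mytable -s table.custom.balancer.group.regex.default=none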


User-specified Durability

Accumulo constantly tries to balance durability with performance. Guaranteeing durability of every write to Accumulo is very difficult in a massively-concurrent environment that requires high throughput. One common area of focus is the write-ahead log, since it must eventually call fsync on the local filesystem to guarantee that data written is durable in the face of unexpected power failures. In some cases where durability can be sacrificed, either due to the nature of the data itself or redundant power supplies, ingest performance improvements can be attained.

Prior to 1.7, a user could only configure the level of durability for individual tables. With the implementation of ACCUMULO-1957, the durability can be specified by the user when creating a BatchWriter, giving users control over durability at the level of the individual writes. Every Mutation written using that BatchWriter will be written with the provided durability. This can result in substantially faster ingest rates when the durability can be relaxed.
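
For example, a minimal sketch that trades some durability for ingest speed; the table name is a placeholder and conn is assumed to be an existing Connector:

import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.Durability;
import org.apache.accumulo.core.data.Mutation;

// Durability.FLUSH: mutations are written and flushed to the write-ahead
// log, but not fully synced to disk, which is faster than sync durability.
BatchWriterConfig bwConfig = new BatchWriterConfig().setDurability(Durability.FLUSH);
BatchWriter writer = conn.createBatchWriter("mytable", bwConfig);

Mutation m = new Mutation("row1");
m.put("family", "qualifier", "value");
writer.addMutation(m);
writer.close();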


waitForBalance API

When creating a new Accumulo table, the next step is typically adding splits to that table before starting ingest. This can be extremely important, since a table without any splits will only be hosted on a single tablet server, creating an ingest bottleneck until the table begins to naturally split. Adding many splits before ingesting ensures that a table is distributed across many servers and results in high throughput when ingest first starts.

Adding splits to a table has long been a synchronous operation, but the assignment of those splits was asynchronous. A large number of splits could be processed, but it was not guaranteed that they would be evenly distributed, resulting in the same problem as having an insufficient number of splits. ACCUMULO-2998 adds a new method to InstanceOperations which allows users to wait for all tablets to be balanced. This method lets users wait until tablets are appropriately distributed so that ingest can immediately run at full bore.
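
For example (a sketch; the split points and table name are illustrative, and conn is an existing Connector):

import java.util.SortedSet;
import java.util.TreeSet;

import org.apache.hadoop.io.Text;

// Pre-split the table, then block until the tablets have been balanced
SortedSet<Text> splits = new TreeSet<Text>();
for (char c = 'a'; c <= 'z'; c++) {
  splits.add(new Text(String.valueOf(c)));
}
conn.tableOperations().addSplits("mytable", splits);
conn.instanceOperations().waitForBalance();
// Ingest can now start at full rate across the cluster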


Hadoop Metrics2 Support

Accumulo has long had its own metrics system implemented using Java MBeans. This enabled metrics to be reported by Accumulo services, but consumption by other systems often required use of an additional tool like jmxtrans to read the metrics from the MBeans and send them to some other system.

ACCUMULO-1817 replaces Accumulo's custom metrics system with Hadoop Metrics2. Metrics2 has a number of benefits, the most notable of which is eliminating the need for an additional process to send metrics to common metrics storage and visualization tools. With Metrics2 support, Accumulo can send its metrics to common tools like Ganglia and Graphite.

For more information on enabling Hadoop Metrics2, see the Metrics Chapter in the Accumulo User Manual.
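
As an illustrative sketch, a hadoop-metrics2-accumulo.properties file on the Accumulo classpath could route metrics to Graphite using Hadoop's GraphiteSink; the host, port, and period values below are placeholders:

# hadoop-metrics2-accumulo.properties
accumulo.sink.graphite.class=org.apache.hadoop.metrics2.sink.GraphiteSink
accumulo.sink.graphite.server_host=graphite.example.com
accumulo.sink.graphite.server_port=2003
accumulo.sink.graphite.period=10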


Distributed Tracing with HTrace

HTrace has recently started gaining traction as a standalone project, especially with its adoption in HDFS. Accumulo has long had distributed tracing support via its own "Cloudtrace" library, but this wasn't intended for use outside of Accumulo.

ACCUMULO-898 replaces Accumulo's Cloudtrace code with HTrace. This has the benefit of adding timings (spans) from HDFS into Accumulo spans automatically.

Users who inspect traces via the Accumulo Monitor (or another system) will begin to see timings from HDFS during operations like Major and Minor compactions when running with at least Apache Hadoop 2.6.0.

VERSIONS file present in binary distribution

In the pre-built binary distribution, or in distributions built by users from the official source release, users will now see a VERSIONS file present in the lib/ directory alongside the Accumulo server-side jars. Because the created tarball strips versions from the jar file names, extra work was previously required to determine the version of each dependent jar (typically, inspecting the jar's manifest).

ACCUMULO-2863 adds a VERSIONS file to the lib/ directory which contains the Maven groupId, artifactId, and version (GAV) information for each jar file included in the distribution.

Per-Table Volume Chooser

The VolumeChooser interface is a server-side extension point that allows custom logic to choose where a table's files are written when multiple HDFS instances are available. By default, a randomized volume chooser implementation is used to evenly balance files across all HDFS instances.

Previously, this VolumeChooser logic was instance-wide, which meant that it affected all tables. This is potentially undesirable, as it might unintentionally impact other users in a multi-tenant system. ACCUMULO-3177 introduces a new per-table property which supports configuration of a VolumeChooser. This ensures that a custom strategy for using multiple HDFS instances applies only to the intended subset of tables.
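
As a sketch, assuming the per-table property is table.volume.chooser and using the PreferredVolumeChooser that ships with the server-side code (the HDFS URI is a placeholder; verify both names against the 1.7.0 documentation):

config -t mytable -s table.volume.chooser=org.apache.accumulo.server.fs.PreferredVolumeChooser
config -t mytable -s table.custom.preferredVolumes=hdfs://nn1:8020/accumulo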


Table and namespace custom properties

In order to avoid errors caused by mis-typed configuration properties, Accumulo has historically been strict about which configuration properties could be set. However, this prevented users from setting arbitrary properties that could be used by custom balancers, compaction strategies, volume choosers, and iterators. Under ACCUMULO-2841, the ability to set arbitrary table and namespace properties was added. The properties need to be prefixed with "table.custom.". The changes made in ACCUMULO-3177 and ACCUMULO-3439 leverage this new feature.
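
For example, a property for consumption by a custom compaction strategy or iterator could be set in the shell; the property name here is hypothetical:

config -t mytable -s table.custom.myapp.threshold=100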


Notable Bug Fixes

SourceSwitchingIterator Deadlock

An instance of SourceSwitchingIterator, the Accumulo iterator which transparently manages whether data for a tablet is read from memory (the in-memory map) or from disk (HDFS, after a minor compaction), was found deadlocked in a production system.

This deadlock prevented the scan and the minor compaction from ever successfully completing without restarting the tablet server. ACCUMULO-3745 fixes the inconsistent synchronization inside of the SourceSwitchingIterator to prevent this deadlock from happening in the future.

The only mitigation for this bug was to restart the deadlocked tablet server.

Table flush blocked indefinitely

While running the Accumulo RandomWalk distributed test, it was observed that all activity in Accumulo had stopped and that an Accumulo metadata table tablet was offline. The system first tried to flush a user tablet, but the metadata table was not online (likely due to the agitation process, which stops and starts Accumulo processes during the test). After this, a call to load the metadata tablet was queued but could not run until the earlier flush call completed. Thus, a deadlock occurred.

This deadlock happened because the synchronous flush call could not complete before the load-tablet call completed, while the load-tablet call could not run because of the connection caching performed in Accumulo's RPC layer to reduce the number of sockets that must be created to send data. ACCUMULO-3597 prevents this deadlock by forcing the use of a non-cached connection for the RPC message requesting that a metadata tablet be loaded.

While this fix does result in additional network resources being used, the concern is minimal because the number of metadata tablets is typically very small with respect to the total number of tablets in the system.

The only mitigation for this bug was to restart the hung tablet server.

Testing

Each unit and functional test only runs on a single node, while the RandomWalk and Continuous Ingest tests run on any number of nodes. Agitation refers to randomly restarting Accumulo processes and Hadoop DataNode processes, and, in HDFS High-Availability instances, forcing NameNode fail-over.

During testing, multiple Accumulo developers noticed some stability issues with HDFS using Apache Hadoop 2.6.0 when restarting Accumulo processes and HDFS datanodes. The developers investigated these issues as a part of the normal release testing procedures, but were unable to find a definitive cause of these failures. Users are encouraged to follow ACCUMULO-2388 if they wish to follow any future developments. One possible workaround is to increase the general.rpc.timeout in the Accumulo configuration from 120s to 240s.
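
That workaround can be applied in accumulo-site.xml, for example:

<property>
  <name>general.rpc.timeout</name>
  <value>240s</value>
</property>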

OS           | Hadoop | Nodes          | ZooKeeper | HDFS High-Availability | Tests
-------------+--------+----------------+-----------+------------------------+---------------------------------------------------------------------------
Gentoo       | N/A    | 1              | N/A       | No                     | Unit and Integration Tests
Gentoo       | 2.6.0  | 1 (2 TServers) | 3.4.5     | No                     | 24hr CI w/ agitation and verification, 24hr RW w/o agitation
Centos 6.6   | 2.6.0  | 3              | 3.4.6     | No                     | 24hr RW w/ agitation, 24hr CI w/o agitation, 72hr CI w/ and w/o agitation
Amazon Linux | 2.6.0  | 20 m1.large    | 3.4.6     | No                     | 24hr CI w/o agitation