accumulo-commits mailing list archives

Subject svn commit: r1626335 - /accumulo/site/trunk/content/release_notes/1.5.2.mdtext
Date Fri, 19 Sep 2014 21:10:48 GMT
Author: elserj
Date: Fri Sep 19 21:10:47 2014
New Revision: 1626335

More 1.5.2 release note additions


Modified: accumulo/site/trunk/content/release_notes/1.5.2.mdtext
--- accumulo/site/trunk/content/release_notes/1.5.2.mdtext (original)
+++ accumulo/site/trunk/content/release_notes/1.5.2.mdtext Fri Sep 19 21:10:47 2014
@@ -31,14 +31,12 @@ who cannot or do not want to upgrade to 
 over earlier versions in the 1.5 line.
-## Notable Improvements
+## Performance Improvements
-While new features are typically not added in a bug-fix release as 1.5.2, the
-community does create a variety of improvements that are API compatible. Contained
-here are some of the more notable improvements.
+Apache Accumulo 1.5.2 includes a number of performance-related fixes over previous versions.
-### Performance improvements
+### Write-Ahead Log sync performance
 The Write-Ahead Log (WAL) files are used to ensure durability of updates made to Accumulo.
 A "sync" is called on the file in HDFS to make sure that the changes to the WAL are persisted
@@ -46,6 +44,50 @@ to disk, which allows Accumulo to recove
 an issue where an operation against a WAL would unnecessarily wait for multiple syncs, slowing
 down the ingest on the system.
+### Minor-Compactions not aggressive enough
+On a system with ample memory provided to Accumulo, long hold-times were observed, which
+block the ingest of new updates. Freeing more server-side memory by running minor
+compactions more frequently increased the overall throughput on the node. These changes
+were made in [ACCUMULO-2905][10].
+### HeapIterator optimization
+Iterators, a notable feature of Accumulo, are provided to users as a server-side programming
+construct, but are also used internally for numerous server operations. One of these system
+iterators is the HeapIterator, which implements a PriorityQueue of other Iterators. One way this
+iterator is used is to merge multiple files in HDFS to present a single, sorted stream of Key-Value pairs.
+This release introduces a performance optimization to the HeapIterator which can improve its
+speed in common cases.
+### Write-Ahead log sync implementation
+In Hadoop-2, two implementations of "sync" are provided: hflush and hsync. Both of these
+methods provide a way to request that the datanodes write the data to the underlying
+medium and not just hold it in memory (the 'fsync' syscall). While both of these methods
+inform the Datanodes to sync the relevant block(s), hflush does not wait for acknowledgement
+from the Datanodes that the sync finished, whereas hsync does. To provide the most reliable
+behavior "out of the box", Accumulo defaults to hsync so that your data is as secure as possible in
+a variety of situations (notably, unexpected power outages).
+The downside is that performance tends to suffer because waiting for a sync to disk is an
+expensive operation. [ACCUMULO-2842][12] introduces a new system property, tserver.wal.sync.method,
+that lets users change the HDFS sync implementation from 'hsync' to 'hflush'. Using 'hflush'
+instead of 'hsync' should result in about a 30% increase in ingest performance.
+For users upgrading from Hadoop-1 or Hadoop-0.20 releases, "hflush" is the equivalent of
+how sync was implemented in those releases and should give equivalent performance.
+### Server-side mutation queue size
+When users desire writes to be as durable as possible, using 'hsync', the ingest performance
+of the system can be improved by increasing the tserver.mutation.queue.max property. The cost
+of this change is that it will cause TabletServers to use additional memory per writer. Previously,
+the value of this parameter defaulted to a conservative 256K, which resulted in sub-par ingest
+performance. 1.5.2 and [ACCUMULO-3018][13] increase this buffer to 1M, which has a noticeable impact on
+ingest performance with a minimal increase in TabletServer memory usage.
 ## Notable Bug Fixes
@@ -84,6 +126,13 @@ The Writable interface methods on the Ra
 calls to serialize the IteratorSettings configured for the Job. [ACCUMULO-2962][8]
 fixes the serialization and adds some additional tests.
+### Constraint violation causes hung scans
+A failed bulk import transaction had the ability to create an infinitely retrying
+loop due to a constraint violation. This directly prevented scans from completing
+and also hung compactions. [ACCUMULO-3096][14] fixes the issue so that the
+constraint violation no longer hangs the entire system.
 ## Documentation
 The following documentation updates were made: 
@@ -130,4 +179,9 @@ and, in HDFS High-Availability instances
\ No newline at end of file
\ No newline at end of file
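
The HeapIterator section in the diff above describes a PriorityQueue of other Iterators used to merge several sorted sources into a single sorted stream. A minimal sketch of that general technique follows. It is not Accumulo's HeapIterator and does not reproduce the optimization mentioned in the notes; the class `SortedMergeSketch`, the helper `Cursor`, and the use of plain strings in place of Key-Value pairs are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of a heap-based merge; not Accumulo's HeapIterator. */
public class SortedMergeSketch {

    /** Pairs a source iterator with the element it is currently positioned on. */
    private static final class Cursor implements Comparable<Cursor> {
        final Iterator<String> source;
        String top;

        Cursor(Iterator<String> source) {
            this.source = source;
            this.top = source.next();
        }

        /** Moves to the next element; returns false once the source is exhausted. */
        boolean advance() {
            if (!source.hasNext()) {
                return false;
            }
            top = source.next();
            return true;
        }

        @Override
        public int compareTo(Cursor other) {
            return top.compareTo(other.top);
        }
    }

    /** Merges individually sorted lists into one sorted list. */
    public static List<String> merge(List<List<String>> sortedSources) {
        PriorityQueue<Cursor> heap = new PriorityQueue<>();
        for (List<String> src : sortedSources) {
            Iterator<String> it = src.iterator();
            if (it.hasNext()) {
                heap.add(new Cursor(it)); // empty sources never enter the heap
            }
        }
        List<String> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor smallest = heap.poll(); // source holding the smallest current element
            merged.add(smallest.top);
            if (smallest.advance()) {
                heap.add(smallest);        // re-insert under its new current element
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> merged = merge(List.of(
                List.of("a", "d", "g"),
                List.of("b", "e"),
                List.of("c", "f")));
        System.out.println(merged);
    }
}
```

Each emitted element costs one poll and at most one re-insert, so merging n elements from k sources takes O(n log k) comparisons in this sketch.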
