accumulo-commits mailing list archives

Subject svn commit: r1626327 - /accumulo/site/trunk/content/release_notes/1.5.2.mdtext
Date Fri, 19 Sep 2014 20:44:00 GMT
Author: elserj
Date: Fri Sep 19 20:43:59 2014
New Revision: 1626327

WIP on 1.5.2 release notes


Modified: accumulo/site/trunk/content/release_notes/1.5.2.mdtext
--- accumulo/site/trunk/content/release_notes/1.5.2.mdtext (original)
+++ accumulo/site/trunk/content/release_notes/1.5.2.mdtext Fri Sep 19 20:43:59 2014
@@ -1,4 +1,4 @@
-Title: Apache Accumulo 1.5.1 Release Notes
+Title: Apache Accumulo 1.5.2 Release Notes
 Notice:    Licensed to the Apache Software Foundation (ASF) under one
            or more contributor license agreements.  See the NOTICE file
            distributed with this work for additional information
@@ -16,134 +16,73 @@ Notice:    Licensed to the Apache Softwa
            specific language governing permissions and limitations
            under the License.
-Apache Accumulo 1.5.1 is a maintenance release on the 1.5 version branch.
-This release contains changes from over 200 issues, comprised of bug fixes
+Apache Accumulo 1.5.2 is a maintenance release on the 1.5 version branch.
+This release contains changes from over 100 issues, comprised of bug fixes
 (client side and server side), new test cases, and updated Hadoop support
 contributed by over 30 different contributors and committers.
-As this is a maintenance release, Apache Accumulo 1.5.1 has no client API 
-incompatibilities over Apache Accumulo 1.5.0 and requires no manual upgrade 
-process. Users of 1.5.0 are strongly encouraged to update as soon as possible 
+As this is a maintenance release, Apache Accumulo 1.5.2 has no client API 
+incompatibilities over Apache Accumulo 1.5.0 and 1.5.1 and requires no manual upgrade 
+process. Users of 1.5.0 or 1.5.1 are strongly encouraged to update as soon as possible 
 to benefit from the improvements.
+Users who are new to Accumulo are encouraged to use a 1.6 release as opposed
+to the 1.5 line as development has already shifted towards the 1.6 line. For those
+who cannot or do not want to upgrade to 1.6, 1.5.2 is still an excellent choice
+over earlier versions in the 1.5 line.
 ## Notable Improvements
-While new features are typically not added in a bug-fix release as 1.5.1, the
+While new features are typically not added in a bug-fix release as 1.5.2, the
 community does create a variety of improvements that are API compatible. Contained
 here are some of the more notable improvements.
-### PermGen Leak from Client API
-Accumulo's client code creates background threads that users presently cannot 
-stop through the API. This is quick to cause problems when invoking the Accumulo
-API in application containers such as Apache Tomcat or JBoss and repeatedly 
-redeploying an application. [ACCUMULO-2128][3] introduces a static utility, 
-org.apache.accumulo.core.util.CleanUp, that users can invoke as part of a 
-teardown hook in their container that will stop these threads and avoid 
-the eventual OutOfMemoryError "PermGen space".
-### Prefer IPv4 when starting Accumulo processes
-While Hadoop [does not support IPv6 networks][28], attempting to run on a 
-system that does not have IPv6 completely disabled can cause strange failures.
-[ACCUMULO-2262][4] invokes the JVM-provided configuration parameter at process
-startup to prefer IPv4 over IPv6.
-### Memory units in configuration
-In previous versions, units of memory had to be provided as upper-case (e.g. '2G', not '2g').
-Additionally, a non-intuitive error was printed when a lower-case unit was provided.
-[ACCUMULO-1933][7] allows lower-case memory units in all Accumulo configurations.
-### Thrift maximum frame size
-Apache Thrift is used as the internal RPC service. [ACCUMULO-2360][14] allows 
-users to configure the maximum frame size an Accumulo server will read. This 
-prevents non Accumulo client from connecting and causing memory exhaustion.
-### MultiTableBatchWriter concurrency
-The MultiTableBatchWriter is a class which allows multiple tables to be written to
-from a single object that maintains a single buffer for caching Mutations across all tables. This is desirable
-as it greatly simplifies the JVM heap usage from caching Mutations across
-many tables. Sadly, in Apache Accumulo 1.5.0, concurrent access to a single MultiTableBatchWriter
-heavily suffered from synchronization issues. [ACCUMULO-1833][35] introduces a fix
-which alleviates the blocking and idle-wait that previously occurred when multiple threads
-a single MultiTableBatchWriter instance concurrently.
-### Hadoop Versions
-Since Apache Accumulo 1.5.0 was released, Apache Hadoop 2.2.0 was also released
-as the first generally available (GA) Hadoop 2 release. This was a very exciting release
-for a number of reasons, but this also caused additional effort on Accumulo's part to
-ensure that Apache Accumulo continues to work across multiple Hadoop versions. Apache Accumulo
-should function with any recent Hadoop 1 or Hadoop 2 without any special steps, tricks or
+### Performance improvements
+The Write-Ahead Log (WAL) files are used to ensure durability of updates made to Accumulo.
+A "sync" is called on the file in HDFS to make sure that the changes to the WAL are persisted
+to disk, which allows Accumulo to recover in the case of failure. [ACCUMULO-2766][9] fixed
+an issue where an operation against a WAL would unnecessarily wait for multiple syncs, slowing
+down the ingest on the system.
 ## Notable Bug Fixes
-As with any Apache Accumulo release, we have numerous bug fixes that have been fixed. Most
-are very subtle and won't affect the common user; however, some notable bugs were resolved
-as a part of 1.5.1 that are rather common.
-### Failure of ZooKeeper server in quorum kills connected Accumulo services
-Apache ZooKeeper provides a number of wonderful features that Accumulo uses to accomplish
-a variety of tasks, most notably a distributed locking service. Typically, multiple ZooKeeper
-servers are run to provide resilience against a certain number of node failures. [ACCUMULO-1572][13]
-resolves an issue where Accumulo processes would kill themselves when the ZooKeeper server
-were communicating with died instead of failing over to another ZooKeeper server in the quorum.
-### Monitor table state isn't updated
-The Accumulo Monitor contains a column for the state of each table in the Accumulo instance.
-The previous resolution was to restart the Monitor process when it got in this state.
-[ACCUMULO-1920][25] resolves an issue where the Monitor would not see updates from ZooKeeper.
-### Two locations for the same extent
-The !METADATA table is the brains behind the data storage for each table, tracking information
-like which files comprise a Tablet, and which TabletServers are hosting which Tablets. [ACCUMULO-2057][9]
-fixes an issue where the !METADATA table contained multiple locations (hosting server) for
-a single Tablet.
-### Deadlock on !METADATA tablet unload
-Tablets are unloaded, typically, when a shutdown request is issued. [ACCUMULO-1143][27] resolves
-a potential deadlock issue when a merging-minor compaction is issued to flush in-memory data
-to disk before unloading a Tablet.
-### Other notable fixes
- * [ACCUMULO-1800][5] Fixed deletes made via the Proxy.
- * [ACCUMULO-1994][6] Fixed ranges in the Proxy.
- * [ACCUMULO-2234][8] Fixed offline map reduce over non default HDFS location.
- * [ACCUMULO-1615][15] Fixed `service accumulo-tserver stop`.
- * [ACCUMULO-1876][16] Fixed issues depending on Accumulo using Apache Ivy.
- * [ACCUMULO-2261][10] Duplicate locations for a Tablet.
- * [ACCUMULO-2037][11] Tablets assigned to previous location.
- * [ACCUMULO-1821][12] Avoid recovery on recovering Tablets.
- * [ACCUMULO-2078][20] Incorrectly computed ACCUMULO_LOG_HOST in example configurations.
- * [ACCUMULO-1985][21] Configuration to bind Monitor on all network interfaces.
- * [ACCUMULO-1999][22] Allow '0' to signify random port for the Master.
- * [ACCUMULO-1630][24] Fixed GC to interpret any IP/hostname.
-## Known Issues
-When using Accumulo 1.5 and Hadoop 2, Accumulo will call hsync() on HDFS.
-Calling hsync improves durability by ensuring data is on disk (where other older 
-Hadoop versions might lose data in the face of power failure); however, calling
-hsync frequently does noticably slow writes. A simple work around is to increase 
-the value of the tserver.mutation.queue.max configuration parameter via accumulo-site.xml.
-A value of "4M" is a better recommendation, and memory consumption will increase by
-the number of concurrent writers to that TabletServer. For example, a value of 4M with
-50 concurrent writers would equate to approximately 200M of Java heap being used for
-mutation queues.
+### Fixes MapReduce package name change
+1.5.1 inadvertently included a change to RangeInputSplit which created an incompatibility
+with 1.5.0. The original class has been restored to ensure that users accessing
+the RangeInputSplit class do not have to alter their client code. See [ACCUMULO-2586][1]
+for more information.
+### Add configurable maximum frame size to Thrift proxy
+The Thrift proxy server was subject to memory exhaustion, typically
+due to bad input, where the server would attempt to allocate a very large
+buffer and die in the process. [ACCUMULO-2658][2] introduces a configuration
+parameter, like [ACCUMULO-2360][3], to prevent this error.
+### Offline tables can prevent tablet balancing
-For more information, see [ACCUMULO-1950][2] and [this comment][1].
+A table with many tablets was created, data ingested into it, and then taken
+offline. There were tablet migrations also queued for the table which could not
+happen because the table was offline at that point. The balancer doesn't run
+when there are outstanding migrations; therefore, the system became more and more
+out of balance. [ACCUMULO-2694][4] introduces a fix to ensure that offline tables
+do not block balancing and improves the server-side logging.
+### MiniAccumuloCluster process management
+MiniAccumuloCluster had a few issues which could cause deadlock or a method that
+never returns. Most of these are related to management of the Accumulo processes
+([ACCUMULO-2764][5], [ACCUMULO-2985][6], and [ACCUMULO-3055][7]).
+### IteratorSettings not correctly serialized in RangeInputSplit
+The Writable interface methods on the RangeInputSplit class accidentally omitted
+calls to serialize the IteratorSettings configured for the Job. [ACCUMULO-2962][8]
+fixes the serialization and adds some additional tests.
 ## Documentation
@@ -160,15 +99,6 @@ The following documentation updates were
 ## Testing
-Below is a list of all platforms that 1.5.1 was tested against by developers. Each Apache Accumulo release
-has a set of tests that must be run before the candidate is capable of becoming an official release. That list includes the following:
- 1. Successfully run all unit tests
- 2. Successfully run all functional test (test/system/auto)
- 3. Successfully complete two 24-hour RandomWalk tests (LongClean module), with and without
- 4. Successfully complete two 24-hour Continuous Ingest tests, with and without "agitation", with data verification
- 5. Successfully complete two 72-hour Continuous Ingest tests, with and without "agitation"
 Each unit and functional test only runs on a single node, while the RandomWalk and Continuous Ingest tests run 
 on any number of nodes. *Agitation* refers to randomly restarting Accumulo processes and Hadoop Datanode processes,
 and, in HDFS High-Availability instances, forcing NameNode failover.
@@ -189,71 +119,15 @@ and, in HDFS High-Availability instances
     <td>Yes (QJM)</td>
     <td>All required tests</td>
-  <tr>
-    <td>CentOS 6.4</td>
-    <td>CDH 4.5.0 (2.0.0+cdh4.5.0)</td>
-    <td>7</td>
-    <td>CDH 4.5.0 (3.4.5+cdh4.5.0)</td>
-    <td>Yes (QJM)</td>
-    <td>Unit, functional and 24hr Randomwalk w/ agitation</td>
-  </tr>
-  <tr>
-    <td>CentOS 6.4</td>
-    <td>CDH 4.5.0 (2.0.0+cdh4.5.0)</td>
-    <td>7</td>
-    <td>CDH 4.5.0 (3.4.5+cdh4.5.0)</td>
-    <td>Yes (QJM)</td>
-    <td>2x 24/hr continuous ingest w/ verification</td>
-  </tr>
-  <tr>
-    <td>CentOS 6.3</td>
-    <td>Apache 1.0.4</td>
-    <td>1</td>
-    <td>Apache 3.3.5</td>
-    <td>No</td>
-    <td>Local testing, unit and functional tests</td>
-  </tr>
-  <tr>
-    <td>RHEL 6.4</td>
-    <td>Apache 2.2.0</td>
-    <td>10</td>
-    <td>Apache 3.4.5</td>
-    <td>No</td>
-    <td>Functional tests</td>
-  </tr>
\ No newline at end of file
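
Illustration (not part of the commit): the added note on [ACCUMULO-2962][8] describes Writable methods that omitted the configured iterators, so they were lost when the split was serialized. A minimal sketch of that failure mode and its fix follows, assuming a hypothetical `SplitSketch` class with plain strings standing in for iterator settings. It uses only `java.io` and is not Accumulo's `RangeInputSplit` or Hadoop's `Writable` interface; it only mirrors the `write`/`readFields` pairing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an input split; all names here are assumptions.
class SplitSketch {
    String table = "";
    // Strings stand in for the iterator settings configured on a job.
    List<String> iterators = new ArrayList<>();

    // Serialize every field. Leaving out the iterator loop below would
    // reproduce the kind of omission the release note describes.
    void write(DataOutput out) throws IOException {
        out.writeUTF(table);
        out.writeInt(iterators.size());
        for (String name : iterators) {
            out.writeUTF(name);
        }
    }

    // Read fields back in exactly the order write() emitted them.
    void readFields(DataInput in) throws IOException {
        table = in.readUTF();
        int count = in.readInt();
        iterators = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            iterators.add(in.readUTF());
        }
    }

    // Serialize to a byte array and deserialize into a fresh instance.
    static SplitSketch roundTrip(SplitSketch original) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        original.write(new DataOutputStream(bytes));
        SplitSketch copy = new SplitSketch();
        copy.readFields(new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray())));
        return copy;
    }
}
```

A round-trip check of this kind (serialize, deserialize, compare all fields) is one way a test could catch a field that is silently dropped from serialization.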
