accumulo-commits mailing list archives

Subject svn commit: r1494983 - /accumulo/site/trunk/content/notable_features.mdtext
Date Thu, 20 Jun 2013 13:31:28 GMT
Author: kturner
Date: Thu Jun 20 13:31:28 2013
New Revision: 1494983

updated w/ some info about 1.5 improvements


Modified: accumulo/site/trunk/content/notable_features.mdtext
--- accumulo/site/trunk/content/notable_features.mdtext (original)
+++ accumulo/site/trunk/content/notable_features.mdtext Thu Jun 20 13:31:28 2013
@@ -136,7 +136,8 @@ Scans will not see data inserted into a 
 If consecutive keys have identical portions (row, colf, colq, or colvis), there
 is a flag to indicate that a portion is the same as that of the previous key.
 This is applied when keys are stored on disk and when transferred over the
+network.  Starting with 1.5, prefix erasure is supported.  When it's cost 
+effective, prefixes repeated in subsequent key fields are not repeated.
 ### Native In-Memory Map
@@ -170,6 +171,16 @@ written. When an index block exceeds the
 written out between data blocks. The size of index blocks is configurable on a
 per table basis.
+### Binary search in RFile blocks (1.5)
+RFile uses its index to locate a block of key values.  Once it reaches a block 
+it performs a linear scan to find a key of interest.  Starting with 1.5, Accumulo
+will generate indexes of cached blocks in an adaptive manner.  Accumulo indexes 
+the blocks that are read most frequently.  When a block is read a few times, a 
+small index is generated.  As a block is read more, larger indexes are generated 
+making future seeks faster. This strategy allows Accumulo to dynamically respond 
+to read patterns without precomputing block indexes when RFiles are written.
 ## Testing <a id="testing"></a>
 ### Mock
@@ -177,6 +188,13 @@ per table basis.
 The Accumulo client API has a mock implementation that is useful for writing unit
 tests against Accumulo. Mock Accumulo is in memory and in process.
+### Mini Accumulo Cluster (1.5 & 1.4.4)
+Mini Accumulo cluster is a set of utility code that makes it easy to spin up 
+a local Accumulo instance running against the local filesystem.  Mini Accumulo
+is slower than Mock Accumulo, but its behavior mirrors a real Accumulo 
+instance more closely.  
 ### Functional Test
 Small, system-level tests of basic Accumulo features run in a test harness,
@@ -236,6 +254,13 @@ could be different from the Accumulo nod
 Accumulo can be a source and/or sink for map reduce jobs.
+### Thrift Proxy (1.5 & 1.4.4)
+The Accumulo client code contains a lot of complexity.  For example, the 
+client code locates tablets, retries in the case of failures, and supports 
+concurrent reading and writing.  All of this is written in Java.  The Thrift
+proxy wraps the Accumulo client API with Thrift, making this API easily
+available to other languages like Python, Ruby, C++, etc.
 ## Extensible Behaviors <a id="behaviors"></a>
@@ -327,6 +352,12 @@ was growing.  Without this feature, inge
 constant rate, even as scan performance decreases because tablets have too many
+### Loading jars using VFS (1.5)
+User-written iterators are a useful way to manipulate data in Accumulo.  
+Before 1.5, users had to copy their iterators to each tablet server.  Starting 
+with 1.5, Accumulo can load iterators from HDFS using Apache Commons VFS.
 ## On-demand Data Management <a id="ondemand_dm"></a>
 ### Compactions
@@ -335,7 +366,8 @@ Ability to force tablets to compact to o
 compacted.  This is useful for improving query performance, permanently
 applying iterators, or using a new locality group configuration.  One example
 of using iterators is applying a filtering iterator to remove data from a
+table. As of 1.5, users can initiate a compaction with iterators only applied to 
+that compaction event.
 ### Split points
@@ -356,6 +388,11 @@ mutated independently. Testing was the m
 feature. For example to test a new filtering iterator, clone the table, add the
 filter to the clone, and force a major compaction.
+### Import/Export Table (1.5)
+An offline table's metadata and files can easily be copied to another cluster and 
 ### Compact Range (1.4)
 Compact each tablet that falls within a row range down to a single file.  
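
The first hunk above describes a flag marking which portions of a key (row, colf, colq, colvis) match the previous key. The following is a minimal sketch of that idea only, written against the plain JDK. The class `SameFieldFlagSketch`, its flag constants, and its `Key`/`Encoded` types are hypothetical names made up for illustration and are not Accumulo's actual on-disk format or classes. It also does not attempt the 1.5 prefix erasure mentioned in the same hunk.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of "same as previous key" flags; not Accumulo code. */
final class SameFieldFlagSketch {
    // One bit per key portion (illustrative values).
    static final int ROW_SAME = 1, CF_SAME = 2, CQ_SAME = 4, CV_SAME = 8;

    /** A key reduced to the four portions named in the text. */
    static final class Key {
        final String row, colf, colq, colvis;
        Key(String row, String colf, String colq, String colvis) {
            this.row = row; this.colf = colf; this.colq = colq; this.colvis = colvis;
        }
    }

    /** Flags plus only those portions that differ from the previous key. */
    static final class Encoded {
        final int flags;
        final List<String> fields;
        Encoded(int flags, List<String> fields) { this.flags = flags; this.fields = fields; }
    }

    /** Encodes cur relative to prev; prev may be null for the first key. */
    static Encoded encode(Key prev, Key cur) {
        int flags = 0;
        List<String> fields = new ArrayList<>();
        if (prev != null && prev.row.equals(cur.row)) flags |= ROW_SAME; else fields.add(cur.row);
        if (prev != null && prev.colf.equals(cur.colf)) flags |= CF_SAME; else fields.add(cur.colf);
        if (prev != null && prev.colq.equals(cur.colq)) flags |= CQ_SAME; else fields.add(cur.colq);
        if (prev != null && prev.colvis.equals(cur.colvis)) flags |= CV_SAME; else fields.add(cur.colvis);
        return new Encoded(flags, fields);
    }

    /** Rebuilds the full key from the previous key and the encoded form. */
    static Key decode(Key prev, Encoded e) {
        int i = 0;
        String row = (e.flags & ROW_SAME) != 0 ? prev.row : e.fields.get(i++);
        String colf = (e.flags & CF_SAME) != 0 ? prev.colf : e.fields.get(i++);
        String colq = (e.flags & CQ_SAME) != 0 ? prev.colq : e.fields.get(i++);
        String colvis = (e.flags & CV_SAME) != 0 ? prev.colvis : e.fields.get(i);
        return new Key(row, colf, colq, colvis);
    }

    public static void main(String[] args) {
        Key k1 = new Key("row1", "cf", "a", "public");
        Key k2 = new Key("row1", "cf", "b", "public");
        Encoded e = encode(k1, k2);
        System.out.println("flags=" + e.flags + " stored fields=" + e.fields);
    }
}
```

In this sketch, two consecutive keys that differ only in the column qualifier store a single field for the second key; the decoder needs the previous key to restore the rest.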

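
The "Binary search in RFile blocks (1.5)" section in the diff above describes indexes over cached blocks that grow as a block is read more often. The sketch below shows one way such a scheme could work, using only the JDK. Everything in it is an assumption for illustration: the class name `AdaptiveBlockIndexSketch`, the threshold of four reads before any index is built, and the doubling policy are invented here and are not taken from the Accumulo source.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of an adaptively grown in-block index; not Accumulo code. */
final class AdaptiveBlockIndexSketch {
    private final List<String> keys;                  // sorted keys of one cached block
    private List<Integer> index = new ArrayList<>();  // positions of sampled keys
    private int reads = 0;

    AdaptiveBlockIndexSketch(List<String> sortedKeys) { this.keys = sortedKeys; }

    int indexSize() { return index.size(); }

    /** Returns the position of the first key >= target, or keys.size() if none. */
    int seek(String target) {
        reads++;
        maybeGrowIndex();
        int pos = startFromIndex(target);
        // Short linear scan from the nearest sampled position.
        while (pos < keys.size() && keys.get(pos).compareTo(target) < 0) pos++;
        return pos;
    }

    // Assumed policy: no index for the first few reads, then an index whose
    // size tracks the largest power of two not exceeding the read count.
    private void maybeGrowIndex() {
        int wanted = reads < 4 ? 0 : Math.min(keys.size(), Integer.highestOneBit(reads));
        if (wanted > index.size()) {
            List<Integer> rebuilt = new ArrayList<>(wanted);
            for (int i = 0; i < wanted; i++) {
                rebuilt.add((int) ((long) i * keys.size() / wanted)); // evenly spaced samples
            }
            index = rebuilt;
        }
    }

    // Binary search over sampled positions for the last sampled key < target.
    private int startFromIndex(String target) {
        int lo = 0, hi = index.size() - 1, start = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int p = index.get(mid);
            if (keys.get(p).compareTo(target) < 0) { start = p; lo = mid + 1; }
            else { hi = mid - 1; }
        }
        return start;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 100; i++) keys.add(String.format("k%03d", i));
        AdaptiveBlockIndexSketch block = new AdaptiveBlockIndexSketch(keys);
        for (int i = 0; i < 16; i++) block.seek("k042");
        System.out.println("index entries after 16 reads: " + block.indexSize());
    }
}
```

A rarely read block pays nothing beyond the linear scan, while a hot block gets progressively shorter scans; this mirrors the trade-off the section describes without precomputing anything when the file is written.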