accumulo-commits mailing list archives

From md...@apache.org
Subject svn commit: r1589445 - /accumulo/site/trunk/content/glossary.mdtext
Date Wed, 23 Apr 2014 15:59:03 GMT
Author: mdrob
Date: Wed Apr 23 15:59:02 2014
New Revision: 1589445

URL: http://svn.apache.org/r1589445
Log:
CMS commit to accumulo by mdrob

Modified:
    accumulo/site/trunk/content/glossary.mdtext

Modified: accumulo/site/trunk/content/glossary.mdtext
URL: http://svn.apache.org/viewvc/accumulo/site/trunk/content/glossary.mdtext?rev=1589445&r1=1589444&r2=1589445&view=diff
==============================================================================
--- accumulo/site/trunk/content/glossary.mdtext (original)
+++ accumulo/site/trunk/content/glossary.mdtext Wed Apr 23 15:59:02 2014
@@ -25,7 +25,7 @@ Notice:    Licensed to the Apache Softwa
 - **iterator** - a mechanism for modifying tablet-local portions of the key/value space.
Iterators are used for standard administrative tasks as well as for custom processing.
 - **iterator priority** - an iterator must be configured with a particular scope and priority.
 When a tablet server enters that scope, it will instantiate iterators in priority order starting
from the smallest priority and ending with the largest, and apply each to the data read before
rewriting the data or sending the data to the user.
 - **iterator scopes** - the possible scopes for iterators are where the tablet server is
already reading and/or writing data: minor compaction / flush time (*minc* scope), major compaction
/ file merging time (*majc* scope), and query time (*scan* scope)
-- **gc** - process that identifies temporary files that are no longer needed by any process,
and deletes them.
+- **gc** - process that identifies temporary files in HDFS that are no longer needed by any
process, and deletes them.
 - **key** - the key into the distributed sorted map which is accumulo.  The key is subdivided
into row, column, and timestamp.  The column is further divided into  family, qualifier, and
visibility.
 - **locality group** - a set of column families that will be grouped together on disk.  With
no locality groups configured, data is stored on disk in row order.  If each column family
were configured to be its own locality group, the data for each column would be stored separately,
in row order.  Configuring sets of columns into locality groups is a compromise between the
two approaches and will improve performance when multiple columns are accessed in the same
scan.
 - **log-structured merge-tree** - the sorting / flushing / merging scheme on which BigTable's
design is based.


