accumulo-commits mailing list archives

Subject svn commit: r1329932 - /accumulo/site/trunk/content/glossary.mdtext
Date Tue, 24 Apr 2012 18:48:07 GMT
Author: billie
Date: Tue Apr 24 18:48:06 2012
New Revision: 1329932

added some definitions


Modified: accumulo/site/trunk/content/glossary.mdtext
--- accumulo/site/trunk/content/glossary.mdtext (original)
+++ accumulo/site/trunk/content/glossary.mdtext Tue Apr 24 18:48:06 2012
@@ -16,22 +16,28 @@ Notice:    Licensed to the Apache Softwa
            specific language governing permissions and limitations
            under the License.
-- **authorizations** - 
-- **column** - 
-- **column family** - 
-- **column qualifier** - 
-- **column visibility** -
-- **key** - 
-- **iterator** - 
-- **iterator scopes** - 
-- **log-structured merge-tree** - 
-- **major compaction** - 
+- **authorizations** - a set of strings associated with a user or with a particular scan
that will be used to determine which key/value pairs are visible to the user.
+- **cell** - a set of key/value pairs whose keys differ only in timestamp.
+- **column** - the portion of the key that sorts after the row and is divided into family,
qualifier, and visibility.
+- **column family** - the portion of the key that sorts second and controls locality groups,
the row/column hybrid nature of Accumulo.
+- **column qualifier** - the portion of the key that sorts third and provides additional
key uniqueness.
+- **column visibility** - the portion of the key that sorts fourth and controls user access
to individual key/value pairs. Visibilities are boolean AND (&) and OR (|) combinations
of authorization strings with parentheses required to determine ordering, e.g. (AB&C)|DEF.
+- **iterator** - a mechanism for modifying tablet-local portions of the key/value space.
Iterators are used for standard administrative tasks as well as for custom processing.
+- **iterator priority** - an iterator must be configured with a particular scope and priority.
 When a tablet server enters that scope, it will instantiate iterators in priority order starting
from the smallest priority and ending with the largest, and apply each to the data read before
rewriting the data or sending the data to the user.
+- **iterator scopes** - the possible scopes for iterators are where the tablet server is
already reading and/or writing data: minor compaction / flush time (*minc* scope), major compaction
/ file merging time (*majc* scope), and query time (*scan* scope).
+- **gc** - 
+- **key** - the key into the distributed sorted map that is Accumulo.  The key is subdivided
into row, column, and timestamp.  The column is further divided into family, qualifier, and visibility.
+- **locality group** - a set of column families that will be grouped together on disk.  With
no locality groups configured, data is stored on disk in row order.  If each column family
were configured to be its own locality group, the data for each column would be stored separately,
in row order.  Configuring sets of columns into locality groups is a compromise between the
two approaches and will improve performance when multiple columns are accessed in the same scan.
+- **log-structured merge-tree** - the sorting / flushing / merging scheme on which BigTable's
design is based.
+- **logger** - 
+- **major compaction** - merging multiple files into a single file.  If all of a tablet's
files are merged into a single file, it is called a *full major compaction*.
 - **master** - 
-- **minor compaction** - 
-- **permissions** - 
-- **row** - 
-- **scan** -
-- **tablet** - a contiguous key range; the unit of work for a tablet server
-- **tablet server** - 
-- **timestamp** - 
-- **value** - 
\ No newline at end of file
+- **minor compaction** - flushing data from memory to disk.  Usually this creates a new file
for a tablet, but if the memory flushed is merge-sorted in with data from an existing file
(replacing that file), it is called a *merging minor compaction*.
+- **monitor** -
+- **permissions** - administrative abilities that must be given to a user such as creating
tables or users and changing permissions or configuration parameters.
+- **row** - the portion of the key that controls atomicity.  Keys with the same row are
guaranteed to remain on a single tablet hosted by a single tablet server; therefore, multiple
key/value pairs can be added to or removed from a row at the same time. The row is used for
the primary sorting of the key.
+- **scan** - reading a range of key/value pairs.
+- **tablet** - a contiguous key range; the unit of work for a tablet server.
+- **tablet servers** - a set of servers that hosts reads and writes for tablets.  Each server
hosts a distinct set of tablets at any given time, but the tablets may be hosted by different
servers over time.
+- **timestamp** - the portion of the key that controls versioning.  Otherwise identical keys
with differing timestamps are considered to be versions of a single *cell*.  Accumulo can
be configured to keep the *N* newest versions of each *cell*.  When a deletion entry is inserted,
it deletes all earlier versions for its cell.
+- **value** - immutable bytes associated with a particular key.
\ No newline at end of file
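
The *column visibility* and *authorizations* entries in this revision describe how boolean expressions such as (AB&C)|DEF decide whether a key/value pair is visible. A minimal Python sketch of that check follows. It is not Accumulo's implementation: the function names `tokenize` and `is_visible` are hypothetical, the token pattern (letters, digits, underscore) is an assumption, and an empty expression is simply rejected here.

```python
import re

# Assumed token shape: authorization strings of letters/digits/underscore,
# plus the operators and parentheses named in the glossary entry.
TOKEN = re.compile(r"[A-Za-z0-9_]+|[&|()]")


def tokenize(expr):
    tokens = TOKEN.findall(expr)
    if "".join(tokens) != expr:
        raise ValueError(f"unexpected character in {expr!r}")
    return tokens


def is_visible(expr, auths):
    """Return True if the authorization set `auths` satisfies `expr`."""
    tokens = tokenize(expr)
    pos = 0

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            result = parse_expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if tok in ("&", "|", ")"):
            raise ValueError(f"unexpected token {tok!r}")
        # A bare term is satisfied when the user holds that authorization.
        return tok in auths

    def parse_expr():
        nonlocal pos
        result = parse_term()
        op = None
        while pos < len(tokens) and tokens[pos] in ("&", "|"):
            if op is None:
                op = tokens[pos]
            elif tokens[pos] != op:
                # Per the glossary, parentheses are required to fix ordering.
                raise ValueError("mixing & and | requires parentheses")
            pos += 1
            rhs = parse_term()
            result = (result and rhs) if op == "&" else (result or rhs)
        return result

    result = parse_expr()
    if pos != len(tokens):
        raise ValueError("trailing tokens after expression")
    return result
```

For example, `is_visible("(AB&C)|DEF", {"AB", "C"})` and `is_visible("(AB&C)|DEF", {"DEF"})` both succeed, while a user holding only `AB` is refused, and `A&B|C` is rejected because it mixes operators without parentheses.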

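
The *cell* and *timestamp* entries describe versioning: keys that differ only in timestamp are versions of one cell, the *N* newest can be kept, and a deletion entry removes earlier versions. The Python sketch below illustrates that rule on an in-memory list. The `Entry` tuple, `cell_of`, and `visible_versions` are hypothetical names, and treating "earlier" as "timestamp less than or equal to the deletion's" is an assumption of this sketch rather than something stated in the commit.

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical in-memory stand-in for a key/value pair.
Entry = namedtuple(
    "Entry", "row family qualifier visibility timestamp deleted value"
)


def cell_of(entry):
    # A cell is every key part except the timestamp.
    return (entry.row, entry.family, entry.qualifier, entry.visibility)


def visible_versions(entries, max_versions=1):
    """Keep the newest `max_versions` entries per cell, honoring deletions."""
    # Sort by cell, then newest first; at equal timestamps a deletion sorts
    # ahead of a put so that it suppresses it (assumption of this sketch).
    ordered = sorted(
        entries, key=lambda e: (cell_of(e), -e.timestamp, not e.deleted)
    )
    result = []
    for _, versions in groupby(ordered, key=cell_of):
        kept = 0
        for entry in versions:
            if entry.deleted:
                break  # everything after this point is an earlier version
            if kept < max_versions:
                result.append(entry)
                kept += 1
    return result
```

With three puts to one cell at timestamps 1, 2, and 3 and `max_versions=2`, only the entries at 3 and 2 survive; a deletion at timestamp 7 in another cell hides a put at 5 but not a later put at 9.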