incubator-accumulo-commits mailing list archives

From build...@apache.org
Subject svn commit: r809769 - in /websites/staging/accumulo/trunk/content: ./ accumulo/1.4/examples/index.html accumulo/1.4/user_manual/Accumulo_Design.html
Date Fri, 23 Mar 2012 20:24:01 GMT
Author: buildbot
Date: Fri Mar 23 20:24:00 2012
New Revision: 809769

Log:
Staging update by buildbot for accumulo

Modified:
    websites/staging/accumulo/trunk/content/   (props changed)
    websites/staging/accumulo/trunk/content/accumulo/1.4/examples/index.html
    websites/staging/accumulo/trunk/content/accumulo/1.4/user_manual/Accumulo_Design.html

Propchange: websites/staging/accumulo/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Mar 23 20:24:00 2012
@@ -1 +1 @@
-1304576
+1304582

Modified: websites/staging/accumulo/trunk/content/accumulo/1.4/examples/index.html
==============================================================================
--- websites/staging/accumulo/trunk/content/accumulo/1.4/examples/index.html (original)
+++ websites/staging/accumulo/trunk/content/accumulo/1.4/examples/index.html Fri Mar 23 20:24:00 2012
@@ -101,19 +101,19 @@ This user will need to have the ability 
 <p>In all commands, you will need to replace "instance", "zookeepers", "username",
and "password" with the values you set for your Accumulo instance.</p>
 <p>Commands intended to be run in bash are prefixed by '$'.  These are always assumed
to be run from the $ACCUMULO_HOME directory.</p>
 <p>Commands intended to be run in the Accumulo shell are prefixed by '&gt;'.</p>
-<p><a href="examples/batch.html">batch</a></p>
-<p><a href="examples/bloom.html">bloom</a></p>
-<p><a href="examples/bulkIngest.html">bulkIngest</a></p>
-<p><a href="examples/combiner.html">combiner</a></p>
-<p><a href="examples/constraints.html">constraints</a></p>
-<p><a href="examples/dirlist.html">dirlist</a></p>
-<p><a href="examples/filedata.html">filedata</a></p>
-<p><a href="examples/filter.html">filter</a></p>
-<p><a href="examples/helloworld.html">helloworld</a></p>
-<p><a href="examples/isolation.html">isolation</a></p>
-<p><a href="examples/mapred.html">mapred</a></p>
-<p><a href="examples/shard.html">shard</a></p>
-<p><a href="examples/visibility.html">visibility</a></p>
+<p><a href="batch.html">batch</a></p>
+<p><a href="bloom.html">bloom</a></p>
+<p><a href="bulkIngest.html">bulkIngest</a></p>
+<p><a href="combiner.html">combiner</a></p>
+<p><a href="constraints.html">constraints</a></p>
+<p><a href="dirlist.html">dirlist</a></p>
+<p><a href="filedata.html">filedata</a></p>
+<p><a href="filter.html">filter</a></p>
+<p><a href="helloworld.html">helloworld</a></p>
+<p><a href="isolation.html">isolation</a></p>
+<p><a href="mapred.html">mapred</a></p>
+<p><a href="shard.html">shard</a></p>
+<p><a href="visibility.html">visibility</a></p>
   </div>
 
   <div id="footer">

Modified: websites/staging/accumulo/trunk/content/accumulo/1.4/user_manual/Accumulo_Design.html
==============================================================================
--- websites/staging/accumulo/trunk/content/accumulo/1.4/user_manual/Accumulo_Design.html (original)
+++ websites/staging/accumulo/trunk/content/accumulo/1.4/user_manual/Accumulo_Design.html Fri Mar 23 20:24:00 2012
@@ -107,7 +107,7 @@
 <h2 id="wzxhzdk8wzxhzdk9-components"><a id=Components></a> Components</h2>
 <p>An instance of Accumulo includes many TabletServers, write-ahead Logger servers,
one Garbage Collector process, one Master server and many Clients. </p>
 <h3 id="wzxhzdk10wzxhzdk11-tablet-server"><a id=Tablet_Server></a> Tablet
Server</h3>
-<p>The TabletServer manages some subset of all the tablets (partitions of tables).
This includes receiving writes from clients, persisting writes to a write‐ahead log,
sorting new key‐value pairs in memory, periodically flushing sorted key‐value pairs
to new files in HDFS, and responding to reads from clients, forming a merge‐sorted view
of all keys and values from all the files it has created and the sorted in‐memory store.
</p>
+<p>The TabletServer manages some subset of all the tablets (partitions of tables).
This includes receiving writes from clients, persisting writes to a write-ahead log, sorting
new key-value pairs in memory, periodically flushing sorted key-value pairs to new files in
HDFS, and responding to reads from clients, forming a merge-sorted view of all keys and values
from all the files it has created and the sorted in-memory store. </p>
 <p>TabletServers also perform recovery of a tablet that was previously on a server
that failed, reapplying any writes found in the write-ahead log to the tablet. </p>
 <h3 id="wzxhzdk12wzxhzdk13-loggers"><a id=Loggers></a> Loggers</h3>
 <p>The Loggers accept updates to Tablet servers and write them to local on-disk storage.
Each tablet server will write their updates to multiple loggers to preserve data in case of
hardware failure. </p>
@@ -121,10 +121,10 @@
 <p>Accumulo stores data in tables, which are partitioned into tablets. Tablets are
partitioned on row boundaries so that all of the columns and values for a particular row are
found together within the same tablet. The Master assigns Tablets to one TabletServer at a
time. This enables row-level transactions to take place without using distributed locking
or some other complicated synchronization mechanism. As clients insert and query data, and
as machines are added and removed from the cluster, the Master migrates tablets to ensure
they remain available and that the ingest and query load is balanced across the cluster. </p>
 <p><img alt="Image data_distribution" src="./data_distribution.png" /></p>
 <h2 id="wzxhzdk22wzxhzdk23-tablet-service"><a id=Tablet_Service></a> Tablet
Service</h2>
-<p>When a write arrives at a TabletServer it is written to a Write‐Ahead Log and
then inserted into a sorted data structure in memory called a MemTable. When the MemTable
reaches a certain size the TabletServer writes out the sorted key-value pairs to a file in
HDFS called Indexed Sequential Access Method (ISAM) file. This process is called a minor compaction.
A new MemTable is then created and the fact of the compaction is recorded in the Write‐Ahead
Log. </p>
-<p>When a request to read data arrives at a TabletServer, the TabletServer does a binary
search across the MemTable as well as the in-memory indexes associated with each ISAM file
to find the relevant values. If clients are performing a scan, several key‐value pairs
are returned to the client in order from the MemTable and the set of ISAM files by performing
a merge‐sort as they are read. </p>
+<p>When a write arrives at a TabletServer it is written to a Write-Ahead Log and then
inserted into a sorted data structure in memory called a MemTable. When the MemTable reaches
a certain size the TabletServer writes out the sorted key-value pairs to a file in HDFS called
Indexed Sequential Access Method (ISAM) file. This process is called a minor compaction. A
new MemTable is then created and the fact of the compaction is recorded in the Write-Ahead
Log. </p>
+<p>When a request to read data arrives at a TabletServer, the TabletServer does a binary
search across the MemTable as well as the in-memory indexes associated with each ISAM file
to find the relevant values. If clients are performing a scan, several key-value pairs are
returned to the client in order from the MemTable and the set of ISAM files by performing
a merge-sort as they are read. </p>
 <h2 id="wzxhzdk24wzxhzdk25-compactions"><a id=Compactions></a> Compactions</h2>
-<p>In order to manage the number of files per tablet, periodically the TabletServer
performs Major Compactions of files within a tablet, in which some set of ISAM files are combined
into one file. The previous files will eventually be removed by the Garbage Collector. This
also provides an opportunity to permanently remove deleted key‐value pairs by omitting
key‐value pairs suppressed by a delete entry when the new file is created. </p>
+<p>In order to manage the number of files per tablet, periodically the TabletServer
performs Major Compactions of files within a tablet, in which some set of ISAM files are combined
into one file. The previous files will eventually be removed by the Garbage Collector. This
also provides an opportunity to permanently remove deleted key-value pairs by omitting key-value
pairs suppressed by a delete entry when the new file is created. </p>
 <h2 id="wzxhzdk26wzxhzdk27-fault-tolerance"><a id=Fault-Tolerance></a>
Fault-Tolerance</h2>
 <p>If a TabletServer fails, the Master detects it and automatically reassigns the tablets
assigned from the failed server to other servers. Any key-value pairs that were in memory
at the time the TabletServer are automatically reapplied from the Write-Ahead Log to prevent
any loss of data. </p>
 <p>The Master will coordinate the copying of write-ahead logs to HDFS so the logs are
available to all tablet servers. To make recovery efficient, the updates within a log are
grouped by tablet. The sorting process can be performed by Hadoops MapReduce or the Logger
server. TabletServers can quickly apply the mutations from the sorted logs that are destined
for the tablets they have now been assigned. </p>


