incubator-accumulo-commits mailing list archives

From: bil...@apache.org
Subject: svn commit: r1214467 - in /incubator/accumulo/site/trunk/content/accumulo: user_manual_1.3-incubating/ user_manual_1.3-incubating/examples/ user_manual_1.4-incubating/
Date: Wed, 14 Dec 2011 21:04:00 GMT
Author: billie
Date: Wed Dec 14 21:04:00 2011
New Revision: 1214467

URL: http://svn.apache.org/viewvc?rev=1214467&view=rev
Log:
ACCUMULO-221 fixed a's in generated site docs

Modified:
    incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/Analytics.mdtext
    incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/Table_Configuration.mdtext
    incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/examples/dirlist.mdtext
    incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/examples/shard.mdtext
    incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Analytics.mdtext
    incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Security.mdtext
    incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Table_Configuration.mdtext
    incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Writing_Accumulo_Clients.mdtext

Modified: incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/Analytics.mdtext
URL: http://svn.apache.org/viewvc/incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/Analytics.mdtext?rev=1214467&r1=1214466&r2=1214467&view=diff
==============================================================================
--- incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/Analytics.mdtext
(original)
+++ incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/Analytics.mdtext
Wed Dec 14 21:04:00 2011
@@ -32,7 +32,7 @@ Accumulo supports more advanced data pro
 
 ## <a id=MapReduce></a> MapReduce
 
-Accumulo tables can be used as the source and destination of MapReduce jobs. To use a Accumulo
table with a MapReduce job (specifically with the new Hadoop API as of version 0.20), configure
the job parameters to use the AccumuloInputFormat and AccumuloOutputFormat. Accumulo specific
parameters can be set via these two format classes to do the following: 
+Accumulo tables can be used as the source and destination of MapReduce jobs. To use an Accumulo
table with a MapReduce job (specifically with the new Hadoop API as of version 0.20), configure
the job parameters to use the AccumuloInputFormat and AccumuloOutputFormat. Accumulo specific
parameters can be set via these two format classes to do the following: 
 
 * Authenticate and provide user credentials for the input 
 * Restrict the scan to a range of rows 
@@ -40,7 +40,7 @@ Accumulo tables can be used as the sourc
 
 ### <a id=Mapper_and_Reducer_classes></a> Mapper and Reducer classes
 
-To read from a Accumulo table create a Mapper with the following class parameterization and
be sure to configure the AccumuloInputFormat. 
+To read from an Accumulo table create a Mapper with the following class parameterization
and be sure to configure the AccumuloInputFormat. 
     
     
     class MyMapper extends Mapper<Key,Value,WritableComparable,Writable> {
@@ -50,7 +50,7 @@ To read from a Accumulo table create a M
     }
     
 
-To write to a Accumulo table, create a Reducer with the following class parameterization
and be sure to configure the AccumuloOutputFormat. The key emitted from the Reducer identifies
the table to which the mutation is sent. This allows a single Reducer to write to more than
one table if desired. A default table can be configured using the AccumuloOutputFormat, in
which case the output table name does not have to be passed to the Context object within the
Reducer. 
+To write to an Accumulo table, create a Reducer with the following class parameterization
and be sure to configure the AccumuloOutputFormat. The key emitted from the Reducer identifies
the table to which the mutation is sent. This allows a single Reducer to write to more than
one table if desired. A default table can be configured using the AccumuloOutputFormat, in
which case the output table name does not have to be passed to the Context object within the
Reducer. 
     
     
     class MyReducer extends Reducer<WritableComparable, Writable, Text, Mutation> {
@@ -142,11 +142,11 @@ The only restriction on an aggregating i
 
 ### <a id=Feature_Vectors></a> Feature Vectors
 
-An interesting use of aggregating iterators within a Accumulo table is to store feature vectors
for use in machine learning algorithms. For example, many algorithms such as k-means clustering,
support vector machines, anomaly detection, etc. use the concept of a feature vector and the
calculation of distance metrics to learn a particular model. The columns in a Accumulo table
can be used to efficiently store sparse features and their weights to be incrementally updated
via the use of an aggregating iterator. 
+An interesting use of aggregating iterators within an Accumulo table is to store feature
vectors for use in machine learning algorithms. For example, many algorithms such as k-means
clustering, support vector machines, anomaly detection, etc. use the concept of a feature
vector and the calculation of distance metrics to learn a particular model. The columns in
an Accumulo table can be used to efficiently store sparse features and their weights to be
incrementally updated via the use of an aggregating iterator. 
 
 ## <a id=Statistical_Modeling></a> Statistical Modeling
 
-Statistical models that need to be updated by many machines in parallel could be similarly
stored within a Accumulo table. For example, a MapReduce job that is iteratively updating
a global statistical model could have each map or reduce worker reference the parts of the
model to be read and updated through an embedded Accumulo client. 
+Statistical models that need to be updated by many machines in parallel could be similarly
stored within an Accumulo table. For example, a MapReduce job that is iteratively updating
a global statistical model could have each map or reduce worker reference the parts of the
model to be read and updated through an embedded Accumulo client. 
 
 Using Accumulo this way enables efficient and fast lookups and updates of small pieces of
information in a random access pattern, which is complementary to MapReduce's sequential access
model. 
 

Modified: incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/Table_Configuration.mdtext
URL: http://svn.apache.org/viewvc/incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/Table_Configuration.mdtext?rev=1214467&r1=1214466&r2=1214467&view=diff
==============================================================================
--- incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/Table_Configuration.mdtext
(original)
+++ incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/Table_Configuration.mdtext
Wed Dec 14 21:04:00 2011
@@ -111,7 +111,7 @@ accumulo/src/examples/main/java/accumulo
 
 ## <a id=Bloom_Filters></a> Bloom Filters
 
-As mutations are applied to a Accumulo table, several files are created per tablet. If bloom
filters are enabled, Accumulo will create and load a small data structure into memory to determine
whether a file contains a given key before opening the file. This can speed up lookups considerably.

+As mutations are applied to an Accumulo table, several files are created per tablet. If bloom
filters are enabled, Accumulo will create and load a small data structure into memory to determine
whether a file contains a given key before opening the file. This can speed up lookups considerably.

 
 To enable bloom filters, enter the following command in the Shell: 
     

Modified: incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/examples/dirlist.mdtext
URL: http://svn.apache.org/viewvc/incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/examples/dirlist.mdtext?rev=1214467&r1=1214466&r2=1214467&view=diff
==============================================================================
--- incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/examples/dirlist.mdtext
(original)
+++ incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/examples/dirlist.mdtext
Wed Dec 14 21:04:00 2011
@@ -18,10 +18,10 @@ Notice:    Licensed to the Apache Softwa
 
 This example shows how to use Accumulo to store a file system history.  It has three classes:
 
- * Ingest.java - Recursively lists the files and directories under a given path, ingests
their names and file info (not the file data!) into a Accumulo table, and indexes the file
names in a separate table.
+ * Ingest.java - Recursively lists the files and directories under a given path, ingests
their names and file info (not the file data!) into an Accumulo table, and indexes the file
names in a separate table.
  * QueryUtil.java - Provides utility methods for getting the info for a file, listing the
contents of a directory, and performing single wild card searches on file or directory names.
  * Viewer.java - Provides a GUI for browsing the file system information stored in Accumulo.
- * FileCountMR.java - Runs MR over the file system information and writes out counts to a
Accumulo table.
+ * FileCountMR.java - Runs MR over the file system information and writes out counts to an
Accumulo table.
  * FileCount.java - Accomplishes the same thing as FileCountMR, but in a different way. 
Computes recursive counts and stores them back into table.
  * StringArraySummation.java - Aggregates counts for the FileCountMR reducer.
  

Modified: incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/examples/shard.mdtext
URL: http://svn.apache.org/viewvc/incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/examples/shard.mdtext?rev=1214467&r1=1214466&r2=1214467&view=diff
==============================================================================
--- incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/examples/shard.mdtext
(original)
+++ incubator/accumulo/site/trunk/content/accumulo/user_manual_1.3-incubating/examples/shard.mdtext
Wed Dec 14 21:04:00 2011
@@ -19,7 +19,7 @@ Notice:    Licensed to the Apache Softwa
 Accumulo has in iterator called the intersecting iterator which supports querying a term
index that is partitioned by 
 document, or "sharded". This example shows how to use the intersecting iterator through these
four programs:
 
- * Index.java - Indexes a set of text files into a Accumulo table
+ * Index.java - Indexes a set of text files into an Accumulo table
  * Query.java - Finds documents containing a given set of terms.
  * Reverse.java - Reads the index table and writes a map of documents to terms into another
table.
  * ContinuousQuery.java  Uses the table populated by Reverse.java to select N random terms
per document.  Then it continuously and randomly queries those terms.

Modified: incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Analytics.mdtext
URL: http://svn.apache.org/viewvc/incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Analytics.mdtext?rev=1214467&r1=1214466&r2=1214467&view=diff
==============================================================================
--- incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Analytics.mdtext
(original)
+++ incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Analytics.mdtext
Wed Dec 14 21:04:00 2011
@@ -32,7 +32,7 @@ Accumulo supports more advanced data pro
 
 ## <a id=MapReduce></a> MapReduce
 
-Accumulo tables can be used as the source and destination of MapReduce jobs. To use a Accumulo
table with a MapReduce job (specifically with the new Hadoop API as of version 0.20), configure
the job parameters to use the AccumuloInputFormat and AccumuloOutputFormat. Accumulo specific
parameters can be set via these two format classes to do the following: 
+Accumulo tables can be used as the source and destination of MapReduce jobs. To use an Accumulo
table with a MapReduce job (specifically with the new Hadoop API as of version 0.20), configure
the job parameters to use the AccumuloInputFormat and AccumuloOutputFormat. Accumulo specific
parameters can be set via these two format classes to do the following: 
 
 * Authenticate and provide user credentials for the input 
 * Restrict the scan to a range of rows 
@@ -40,7 +40,7 @@ Accumulo tables can be used as the sourc
 
 ### <a id=Mapper_and_Reducer_classes></a> Mapper and Reducer classes
 
-To read from a Accumulo table create a Mapper with the following class parameterization and
be sure to configure the AccumuloInputFormat. 
+To read from an Accumulo table create a Mapper with the following class parameterization
and be sure to configure the AccumuloInputFormat. 
     
     
     class MyMapper extends Mapper<Key,Value,WritableComparable,Writable> {
@@ -50,7 +50,7 @@ To read from a Accumulo table create a M
     }
     
 
-To write to a Accumulo table, create a Reducer with the following class parameterization
and be sure to configure the AccumuloOutputFormat. The key emitted from the Reducer identifies
the table to which the mutation is sent. This allows a single Reducer to write to more than
one table if desired. A default table can be configured using the AccumuloOutputFormat, in
which case the output table name does not have to be passed to the Context object within the
Reducer. 
+To write to an Accumulo table, create a Reducer with the following class parameterization
and be sure to configure the AccumuloOutputFormat. The key emitted from the Reducer identifies
the table to which the mutation is sent. This allows a single Reducer to write to more than
one table if desired. A default table can be configured using the AccumuloOutputFormat, in
which case the output table name does not have to be passed to the Context object within the
Reducer. 
     
     
     class MyReducer extends Reducer<WritableComparable, Writable, Text, Mutation> {
@@ -142,11 +142,11 @@ The only restriction on an combining ite
 
 ### <a id=Feature_Vectors></a> Feature Vectors
 
-An interesting use of combining iterators within a Accumulo table is to store feature vectors
for use in machine learning algorithms. For example, many algorithms such as k-means clustering,
support vector machines, anomaly detection, etc. use the concept of a feature vector and the
calculation of distance metrics to learn a particular model. The columns in a Accumulo table
can be used to efficiently store sparse features and their weights to be incrementally updated
via the use of an combining iterator. 
+An interesting use of combining iterators within an Accumulo table is to store feature vectors
for use in machine learning algorithms. For example, many algorithms such as k-means clustering,
support vector machines, anomaly detection, etc. use the concept of a feature vector and the
calculation of distance metrics to learn a particular model. The columns in an Accumulo table
can be used to efficiently store sparse features and their weights to be incrementally updated
via the use of an combining iterator. 
 
 ## <a id=Statistical_Modeling></a> Statistical Modeling
 
-Statistical models that need to be updated by many machines in parallel could be similarly
stored within a Accumulo table. For example, a MapReduce job that is iteratively updating
a global statistical model could have each map or reduce worker reference the parts of the
model to be read and updated through an embedded Accumulo client. 
+Statistical models that need to be updated by many machines in parallel could be similarly
stored within an Accumulo table. For example, a MapReduce job that is iteratively updating
a global statistical model could have each map or reduce worker reference the parts of the
model to be read and updated through an embedded Accumulo client. 
 
 Using Accumulo this way enables efficient and fast lookups and updates of small pieces of
information in a random access pattern, which is complementary to MapReduce's sequential access
model. 
 

Modified: incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Security.mdtext
URL: http://svn.apache.org/viewvc/incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Security.mdtext?rev=1214467&r1=1214466&r2=1214467&view=diff
==============================================================================
--- incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Security.mdtext
(original)
+++ incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Security.mdtext
Wed Dec 14 21:04:00 2011
@@ -109,7 +109,7 @@ Any user with the alter table permission
 
 ## <a id=Secure_Authorizations_Handling></a> Secure Authorizations Handling
 
-For applications serving many users, it is not expected that a accumulo user will be created
for each application user. In this case a accumulo user with all authorizations needed by
any of the applications users must be created. To service queries, the application should
create a scanner with the application users authorizations. These authorizations could be
obtained from a trusted 3rd party. 
+For applications serving many users, it is not expected that an accumulo user will be created
for each application user. In this case an accumulo user with all authorizations needed by
any of the applications users must be created. To service queries, the application should
create a scanner with the application users authorizations. These authorizations could be
obtained from a trusted 3rd party. 
 
 Often production systems will integrate with Public-Key Infrastructure (PKI) and designate
client code within the query layer to negotiate with PKI servers in order to authenticate
users and retrieve their authorization tokens (credentials). This requires users to specify
only the information necessary to authenticate themselves to the system. Once user identity
is established, their credentials can be accessed by the client code and passed to Accumulo
outside of the reach of the user. 
 

Modified: incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Table_Configuration.mdtext
URL: http://svn.apache.org/viewvc/incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Table_Configuration.mdtext?rev=1214467&r1=1214466&r2=1214467&view=diff
==============================================================================
--- incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Table_Configuration.mdtext
(original)
+++ incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Table_Configuration.mdtext
Wed Dec 14 21:04:00 2011
@@ -115,7 +115,7 @@ accumulo/src/examples/main/java/accumulo
 
 ## <a id=Bloom_Filters></a> Bloom Filters
 
-As mutations are applied to a Accumulo table, several files are created per tablet. If bloom
filters are enabled, Accumulo will create and load a small data structure into memory to determine
whether a file contains a given key before opening the file. This can speed up lookups considerably.

+As mutations are applied to an Accumulo table, several files are created per tablet. If bloom
filters are enabled, Accumulo will create and load a small data structure into memory to determine
whether a file contains a given key before opening the file. This can speed up lookups considerably.

 
 To enable bloom filters, enter the following command in the Shell: 
     

Modified: incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Writing_Accumulo_Clients.mdtext
URL: http://svn.apache.org/viewvc/incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Writing_Accumulo_Clients.mdtext?rev=1214467&r1=1214466&r2=1214467&view=diff
==============================================================================
--- incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Writing_Accumulo_Clients.mdtext
(original)
+++ incubator/accumulo/site/trunk/content/accumulo/user_manual_1.4-incubating/Writing_Accumulo_Clients.mdtext
Wed Dec 14 21:04:00 2011
@@ -110,7 +110,7 @@ Accumulo supports the ability to present
 * iterators executed as part of a minor or major compaction 
 * bulk import of new files 
 
-Isolation guarantees that either all or none of the changes made by these operations on a
row are seen. Use the IsolatedScanner to obtain an isolated view of a accumulo table. When
using the regular scanner it is possible to see a non isolated view of a row. For example
if a mutation modifies three columns, it is possible that you will only see two of those modifications.
With the isolated scanner either all three of the changes are seen or none. 
+Isolation guarantees that either all or none of the changes made by these operations on a
row are seen. Use the IsolatedScanner to obtain an isolated view of an accumulo table. When
using the regular scanner it is possible to see a non isolated view of a row. For example
if a mutation modifies three columns, it is possible that you will only see two of those modifications.
With the isolated scanner either all three of the changes are seen or none. 
 
 The IsolatedScanner buffers rows on the client side so a large row will not crash a tablet
server. By default rows are buffered in memory, but the user can easily supply their own buffer
if they wish to buffer to disk when rows are large. 
 


