# accumulo-commits mailing list archives

From bil...@apache.org
Subject [1/5] ACCUMULO-1327 converted latex manual to asciidoc
Date Thu, 08 May 2014 02:16:18 GMT
Repository: accumulo
Updated Branches:

http://git-wip-us.apache.org/repos/asf/accumulo/blob/900d6abb/docs/src/main/latex/accumulo_user_manual/chapters/table_design.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/table_design.tex b/docs/src/main/latex/accumulo_user_manual/chapters/table_design.tex
deleted file mode 100644
index ff1cebd..0000000
--- a/docs/src/main/latex/accumulo_user_manual/chapters/table_design.tex
+++ /dev/null
@@ -1,343 +0,0 @@
-
-% Licensed to the Apache Software Foundation (ASF) under one or more
-% contributor license agreements.  See the NOTICE file distributed with
-% this work for additional information regarding copyright ownership.
-% The ASF licenses this file to You under the Apache License, Version 2.0
-% (the "License"); you may not use this file except in compliance with
-% the License.  You may obtain a copy of the License at
-%
-%     http://www.apache.org/licenses/LICENSE-2.0
-%
-% Unless required by applicable law or agreed to in writing, software
-% distributed under the License is distributed on an "AS IS" BASIS,
-% WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-% See the License for the specific language governing permissions and
-% limitations under the License.
-
-\chapter{Table Design}
-
-\section{Basic Table}
-
-Since Accumulo tables are sorted by row ID, each table can be thought of as being
-indexed by the row ID. Lookups performed by row ID can be executed quickly, by doing
-a binary search, first across the tablets, and then within a tablet. Clients should
-choose a row ID carefully in order to support their desired application. A simple rule
-is to select a unique identifier as the row ID for each entity to be stored, and to
-assign all the other attributes to be tracked as columns under this row ID. For example,
-if we have the following data in a comma-separated file:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-\end{verbatim}\endgroup
-
-We might choose to store this data using the userid as the rowID, the column
-name in the column family, and a blank column qualifier:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-Mutation m = new Mutation(userid);
-final String column_qualifier = "";
-m.put("age", column_qualifier, age);
-m.put("balance", column_qualifier, account_balance);
-
-\end{verbatim}\endgroup
-
-We could then retrieve any of the columns for a specific userid by specifying the
-userid as the range of a scanner and fetching specific columns:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-Range r = new Range(userid, userid); // single row
-Scanner s = conn.createScanner("userdata", auths);
-s.setRange(r);
-s.fetchColumnFamily(new Text("age"));
-
-for(Entry<Key,Value> entry : s)
-    System.out.println(entry.getValue().toString());
-\end{verbatim}\endgroup
-
-\section{RowID Design}
-
-Often it is necessary to transform the rowID in order to have rows ordered in a way
-that is optimal for anticipated access patterns. A good example of this is reversing
-the order of components of internet domain names in order to group rows of the
-same parent domain together:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-com.yahoo.mail
-com.yahoo.research
-\end{verbatim}\endgroup
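The component reversal above can be sketched with a small helper. This is a hypothetical utility for illustration, not part of the Accumulo API:

```java
// Hypothetical helper, not part of the Accumulo API: reverses the
// components of a domain name so rows in the same parent domain
// sort together (mail.yahoo.com -> com.yahoo.mail).
class DomainRowId {
    static String reverseDomain(String domain) {
        String[] parts = domain.split("\\.");
        StringBuilder sb = new StringBuilder();
        for (int i = parts.length - 1; i >= 0; i--) {
            sb.append(parts[i]);
            if (i > 0) sb.append('.');
        }
        return sb.toString();
    }
}
```

The reversed string would then be used as the row ID of the mutation.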
-
-Some data may result in the creation of very large rows: rows with many columns.
-In this case the table designer may wish to split up these rows for better load
-balancing while keeping them sorted together for scanning purposes. This can be
-done by appending a random substring to the end of the row:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-\end{verbatim}\endgroup
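One way to sketch the random-suffix approach. The helper, the bucket count, and the fixed-width formatting are illustrative assumptions, not taken from the manual:

```java
import java.util.Random;

// Hypothetical helper: appends a fixed-width random bucket id so a large
// logical row is split across several physical rows that still sort
// adjacently (e.g. "com.example_07").
class SplitRowId {
    static String withRandomSuffix(String row, Random rnd, int buckets) {
        return String.format("%s_%02d", row, rnd.nextInt(buckets));
    }
}
```

A scan over the logical row then uses a range covering all suffixes of the base row ID.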
-
-It could also be done by adding a string representation of some period of time, such as the
-date rounded to the week or month:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-\end{verbatim}\endgroup
-
-Appending dates provides the additional capability of restricting a scan to a given
-date range.
-
-\section{Lexicoders}
-Since Keys in Accumulo are sorted lexicographically by default, it's often useful to encode
-common data types into a byte format in which their sort order corresponds to the sort order
-in their native form. An example of this is encoding dates and numerical data so that ranges
-of them can be seeked and searched efficiently.
-
-The lexicoders are a standard and extensible way of encoding Java types. Here's an example
-of a lexicoder that encodes a Java Date object so that it sorts lexicographically:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-// create new date lexicoder
-DateLexicoder dateEncoder = new DateLexicoder();
-
-// truncate time to hours
-long epoch = System.currentTimeMillis();
-Date hour = new Date(epoch - (epoch % 3600000));
-
-// encode the rowId so that it is sorted lexicographically
-Mutation mutation = new Mutation(dateEncoder.encode(hour));
-mutation.put(new Text("colf"), new Text("colq"), new Value(new byte[]{}));
-\end{verbatim}\endgroup
-
-If we want to return the most recent date first, we can reverse the sort order
-with the reverse lexicoder:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-// create new date lexicoder and reverse lexicoder
-DateLexicoder dateEncoder = new DateLexicoder();
-ReverseLexicoder reverseEncoder = new ReverseLexicoder(dateEncoder);
-
-// truncate date to hours
-long epoch = System.currentTimeMillis();
-Date hour = new Date(epoch - (epoch % 3600000));
-
-// encode the rowId so that it sorts in reverse lexicographic order
-Mutation mutation = new Mutation(reverseEncoder.encode(hour));
-mutation.put(new Text("colf"), new Text("colq"), new Value(new byte[]{}));
-\end{verbatim}\endgroup
-
-
-\section{Indexing}
-In order to support lookups via more than one attribute of an entity, additional
-indexes can be built. However, because Accumulo tables can support any number of
-columns without specifying them beforehand, a single additional index will often
-suffice for supporting lookups of records in the main table. Here, the index has, as
-the rowID, the Value or Term from the main table, the column families are the same,
-and the column qualifier of the index table contains the rowID from the main table.
-
-\begin{center}
-$\begin{array}{|c|c|c|c|c|c|} \hline
-\multicolumn{5}{|c|}{\mbox{Key}} & \multirow{3}{*}{\mbox{Value}}\\ \cline{1-5}
-\multirow{2}{*}{\mbox{Row ID}}& \multicolumn{3}{|c|}{\mbox{Column}} & \multirow{2}{*}{\mbox{Timestamp}} & \\ \cline{2-4}
-& \mbox{Family} & \mbox{Qualifier} & \mbox{Visibility} & & \\ \hline \hline
-\mbox{Term} & \mbox{Field Name} & \mbox{MainRowID} & & &\\ \hline
-\end{array}$
-\end{center}
-
-Note: We store rowIDs in the column qualifier rather than the Value so that we can
-have more than one rowID associated with a particular term within the index. If we
-stored this in the Value we would only see one of the rows in which the value
-appears since Accumulo is configured by default to return the one most recent
-value associated with a key.
-
-Lookups can then be done by scanning the Index Table first for occurrences of the
-desired values in the columns specified, which returns a list of row IDs from the main
-table. These can then be used to retrieve each matching record, in its entirety, or a
-subset of its columns, from the Main Table.
-
-To support efficient lookups of multiple rowIDs from the same table, the Accumulo
-client library provides a BatchScanner. Users specify a set of Ranges to the
-BatchScanner, which performs the lookups in multiple threads to multiple servers
-and returns an Iterator over all the rows retrieved. The rows returned are NOT in
-sorted order, as is the case with the basic Scanner interface.
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-// first we scan the index for IDs of rows matching our query
-
-Text term = new Text("mySearchTerm");
-
-HashSet<Range> matchingRows = new HashSet<Range>();
-
-Scanner indexScanner = conn.createScanner("index", auths);
-indexScanner.setRange(new Range(term, term));
-
-// we retrieve the matching rowIDs and create a set of ranges
-for(Entry<Key,Value> entry : indexScanner)
-    matchingRows.add(new Range(entry.getKey().getColumnQualifier()));
-
-// now we pass the set of rowIDs to the batch scanner to retrieve them
-BatchScanner bscan = conn.createBatchScanner("table", auths, 10);
-
-bscan.setRanges(matchingRows);
-bscan.fetchColumnFamily(new Text("attributes"));
-
-for(Entry<Key,Value> entry : bscan)
-    System.out.println(entry.getValue());
-\end{verbatim}\endgroup
-
-One advantage of the dynamic schema capabilities of Accumulo is that different
-fields may be indexed into the same physical table. However, it may be necessary to
-create different index tables if the terms must be formatted differently in order to
-maintain proper sort order. For example, real numbers must be formatted
-differently than their usual notation in order to be sorted correctly. In these cases,
-usually one index per unique data type will suffice.
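To see why the usual notation breaks sort order, and one common fix, here is a sketch using zero-padding. The `NumericRowId` helper is purely illustrative, not an Accumulo class; Accumulo's lexicoders provide a more complete solution:

```java
// Illustrative helper (not an Accumulo class): zero-pads non-negative
// longs to a fixed width so that lexicographic order matches numeric
// order. Plain decimal strings sort "10" before "9".
class NumericRowId {
    static String encode(long v) {
        if (v < 0) throw new IllegalArgumentException("non-negative only");
        return String.format("%019d", v);
    }
}
```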
-
-\section{Entity-Attribute and Graph Tables}
-
-Accumulo is ideal for storing entities and their attributes, especially if the
-attributes are sparse. It is often useful to join several datasets together on common
-entities within the same table. This can allow for the representation of graphs,
-including nodes, their attributes, and connections to other nodes.
-
-Rather than storing individual events, Entity-Attribute or Graph tables store
-aggregate information about the entities involved in the events and the
-relationships between entities. This is often preferable when single events aren't
-very useful and when a continuously updated summarization is desired.
-
-The physical schema for an entity-attribute or graph table is as follows:
-
-\begin{center}
-$\begin{array}{|c|c|c|c|c|c|} \hline
-\multicolumn{5}{|c|}{\mbox{Key}} & \multirow{3}{*}{\mbox{Value}}\\ \cline{1-5}
-\multirow{2}{*}{\mbox{Row ID}}& \multicolumn{3}{|c|}{\mbox{Column}} & \multirow{2}{*}{\mbox{Timestamp}} & \\ \cline{2-4}
-& \mbox{Family} & \mbox{Qualifier} & \mbox{Visibility} & & \\ \hline \hline
-\mbox{EntityID} & \mbox{Attribute Name} & \mbox{Attribute Value} & & & \mbox{Weight} \\ \hline
-\mbox{EntityID} & \mbox{Edge Type} & \mbox{Related EntityID} & & & \mbox{Weight} \\ \hline
-\end{array}$
-\end{center}
-
-For example, to keep track of employees, managers and products the following
-entity-attribute table could be used. Note that the weights are not always necessary
-and are set to 0 when not used.
-
-$\begin{array}{llll}
-\bf{RowID} & \bf{Column Family} & \bf{Column Qualifier} & \bf{Value} \\
-\\
-E001 & name & bob & 0 \\
-E001 & department & sales & 0 \\
-E001 & hire\_date & 20030102 & 0 \\
-E001 & units\_sold & P001 & 780 \\
-\\
-E002 & name & george & 0 \\
-E002 & department & sales & 0 \\
-E002 & manager\_of & E001 & 0 \\
-E002 & manager\_of & E003 & 0 \\
-\\
-E003 & name & harry & 0 \\
-E003 & department & accounts\_recv & 0 \\
-E003 & hire\_date & 20000405 & 0 \\
-E003 & units\_sold & P002 & 566 \\
-E003 & units\_sold & P001 & 232 \\
-\\
-P001 & product\_name & nike\_airs & 0 \\
-P001 & product\_type & shoe & 0 \\
-P001 & in\_stock & germany & 900 \\
-P001 & in\_stock & brazil & 200 \\
-\\
-P002 & product\_name & basic\_jacket & 0 \\
-P002 & product\_type & clothing & 0 \\
-P002 & in\_stock & usa & 3454 \\
-P002 & in\_stock & germany & 700 \\
-\end{array}$
-\vspace{5mm}
-
-To allow efficient updating of edge weights, an aggregating iterator can be
-configured to add the value of all mutations applied with the same key. These types
-of tables can easily be created from raw events by simply extracting the entities,
-attributes, and relationships from individual events and inserting the keys into
-Accumulo each with a count of 1. The aggregating iterator will take care of
-maintaining the edge weights.
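A rough client-side model of what such a summing iterator does server-side. The `EdgeWeights` class is purely illustrative; in a real table the combiner is configured on the table and the addition happens inside Accumulo:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model (not Accumulo code) of a summing combiner: every
// observed relationship is inserted with a count of 1, and values with
// the same key are added together.
class EdgeWeights {
    private final Map<String, Long> weights = new HashMap<>();

    void insert(String entity, String edgeType, String related) {
        weights.merge(entity + "/" + edgeType + "/" + related, 1L, Long::sum);
    }

    long weight(String entity, String edgeType, String related) {
        return weights.getOrDefault(entity + "/" + edgeType + "/" + related, 0L);
    }
}
```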
-
-\section{Document-Partitioned Indexing}
-
-Using a simple index as described above works well when looking for records that
-match one of a set of given criteria. When looking for records that match more than
-one criterion simultaneously, such as when looking for documents that contain all of
-the words ``the'', ``white'', and ``house'', there are several issues.
-
-First is that the set of all records matching any one of the search terms must be sent
-to the client, which incurs a lot of network traffic. The second problem is that the
-client is responsible for performing set intersection on the sets of records returned
-to eliminate all but the records matching all search terms. The memory of the client
-may easily be overwhelmed during this operation.
-
-For these reasons Accumulo includes support for a scheme known as sharded
-indexing, in which these set operations can be performed at the TabletServers and
-decisions about which records to include in the result set can be made without
-incurring network traffic.
-
-This is accomplished via partitioning records into bins that each reside on at most
-one TabletServer, and then creating an index of terms per record within each bin as
-follows:
-
-\begin{center}
-$\begin{array}{|c|c|c|c|c|c|} \hline
-\multicolumn{5}{|c|}{\mbox{Key}} & \multirow{3}{*}{\mbox{Value}}\\ \cline{1-5}
-\multirow{2}{*}{\mbox{Row ID}}& \multicolumn{3}{|c|}{\mbox{Column}} & \multirow{2}{*}{\mbox{Timestamp}} & \\ \cline{2-4}
-& \mbox{Family} & \mbox{Qualifier} & \mbox{Visibility} & & \\ \hline \hline
-\mbox{BinID} & \mbox{Term} & \mbox{DocID} & & & \mbox{Weight} \\ \hline
-\end{array}$
-\end{center}
-
-Documents or records are mapped into bins by a user-defined ingest application. By
-storing the BinID as the RowID we ensure that all the information for a particular
-bin is contained in a single tablet and hosted on a single TabletServer since
-Accumulo never splits rows across tablets. Storing the Terms as column families
-serves to enable fast lookups of all the documents within this bin that contain the
-given term.
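The bin assignment itself is left to the ingest application. A minimal sketch, assuming a simple hash-based scheme with a fixed bin count, might look like:

```java
// Illustrative ingest-side helper: maps a document id to one of a fixed
// number of bins. Math.floorMod keeps the result non-negative even when
// hashCode() is negative.
class Binner {
    static int binFor(String docId, int numBins) {
        return Math.floorMod(docId.hashCode(), numBins);
    }
}
```

The bin count is a design trade-off: more bins spread documents across more tablets, but every query must consult every bin.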
-
-Finally, we perform set intersection operations on the TabletServer via a special
-iterator called the Intersecting Iterator. Since documents are partitioned into many
-bins, a search of all documents must search every bin. We can use the BatchScanner
-to scan all bins in parallel. The Intersecting Iterator should be enabled on a
-BatchScanner within user query code as follows:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-Text[] terms = {new Text("the"), new Text("white"), new Text("house")};
-
-BatchScanner bs = conn.createBatchScanner(table, auths, 20);
-IteratorSetting iter = new IteratorSetting(20, "ii", IntersectingIterator.class);
-IntersectingIterator.setColumnFamilies(iter, terms);
-bs.addScanIterator(iter);
-bs.setRanges(Collections.singleton(new Range()));
-
-for(Entry<Key,Value> entry : bs) {
-    System.out.println(" " + entry.getKey().getColumnQualifier());
-}
-\end{verbatim}\endgroup
-
-This code effectively has the BatchScanner scan all tablets of a table, looking for
-documents that match all the given terms. Because all tablets are being scanned for
-every query, each query is more expensive than other Accumulo scans, which
-typically involve a small number of TabletServers. This reduces the number of
-concurrent queries supported, and queries are subject to what is known as the
-``straggler'' problem, in which every query runs as slow as the slowest server participating.
-
-Of course, fast servers will return their results to the client, which can display them
-to the user immediately while waiting for the rest of the results to arrive. If the
-results are unordered, this is quite effective, as the first results to arrive are as good
-as any others to the user.
-

http://git-wip-us.apache.org/repos/asf/accumulo/blob/900d6abb/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex b/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex
deleted file mode 100644
index 203fe0c..0000000
--- a/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex
+++ /dev/null
@@ -1,794 +0,0 @@
-
-% Licensed to the Apache Software Foundation (ASF) under one or more
-% contributor license agreements.  See the NOTICE file distributed with
-% this work for additional information regarding copyright ownership.
-% The ASF licenses this file to You under the Apache License, Version 2.0
-% (the "License"); you may not use this file except in compliance with
-% the License.  You may obtain a copy of the License at
-%
-%     http://www.apache.org/licenses/LICENSE-2.0
-%
-% Unless required by applicable law or agreed to in writing, software
-% distributed under the License is distributed on an "AS IS" BASIS,
-% WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-% See the License for the specific language governing permissions and
-% limitations under the License.
-
-\chapter{Troubleshooting}
-
-\section{Logs}
-
-Q. The tablet server does not seem to be running!? What happened?
-
-Accumulo is a distributed system.  It is supposed to run on remote
-equipment, across hundreds of computers.  Each program that runs on
-these remote computers writes down events as they occur, into a local
-file. By default, this is defined in
-\texttt{\$ACCUMULO\_HOME}/conf/accumulo-env.sh as ACCUMULO\_LOG\_DIR.
-
-A. Look in the \texttt{\$ACCUMULO\_LOG\_DIR}/tserver*.log file.  Specifically, check the end of the file.
-
-Q. The tablet server did not start and the debug log does not exist!  What happened?
-
-When the individual programs are started, the stdout and stderr output
-of these programs are stored in ``.out'' and ``.err'' files in
-\texttt{\$ACCUMULO\_LOG\_DIR}. Often, when there are missing configuration
-options, files or permissions, messages will be left in these files.
-
-A. Probably a start-up problem. Look in \texttt{\$ACCUMULO\_LOG\_DIR}/tserver*.err
-
-\section{Monitor}
-
-Q. Accumulo is not working, what's wrong?
-
-There's a small web server that collects information about all the
-components that make up a running Accumulo instance. It will highlight
-unusual or unexpected conditions.
-
-A. Point your browser to the monitor (typically the master host, on port 50095).  Is anything red or yellow?
-
-Q. My browser is reporting connection refused, and I cannot get to the monitor
-
-The monitor program's output is also written to .err and .out files in
-the \texttt{\$ACCUMULO\_LOG\_DIR}. Look for problems in this file if the
-\texttt{\$ACCUMULO\_LOG\_DIR/monitor*.log} file does not exist.
-
-A. The monitor program is probably not running.  Check the log files for errors.
-
-Q. My browser hangs trying to talk to the monitor.
-
-Your browser needs to be able to reach the monitor program.  Often
-large clusters are firewalled, or use a VPN for internal
-communications. You can use SSH to proxy your browser to the cluster.
-
-It is sometimes helpful to use a text-only browser to sanity-check the
-monitor while on the machine running the monitor:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-  $ links http://localhost:50095
-\end{verbatim}\endgroup
-
-A. Verify that you are not firewalled from the monitor if it is running on a remote host.
-
-Q. The monitor responds, but there are no numbers for tservers and tables.  The summary page says the master is down.
-
-The monitor program gathers all the details about the master and the
-tablet servers through the master. It will be mostly blank if the
-master is down.
-
-A. Check for a running master.
-
-\section{HDFS}
-
-Accumulo reads and writes to the Hadoop Distributed File System.
-Accumulo needs this file system available at all times for normal operations.
-
-Q. Accumulo is having problems ``getting a block blk\_1234567890123.''  How do I fix it?
-
-This troubleshooting guide does not cover HDFS, but in general, you
-want to make sure that all the datanodes are running and an fsck check
-finds the file system clean:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ hadoop fsck /accumulo
-\end{verbatim}\endgroup
-
-You can use:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-  $ hadoop fsck /accumulo/path/to/corrupt/file -locations -blocks -files
-\end{verbatim}\endgroup
-
-to locate the block references of individual corrupt files and use those
-references to search the name node and individual data node logs to determine which
-servers those blocks have been assigned and then try to fix any underlying file
-system issues on those nodes.
-
-On a larger cluster, you may need to increase the number of Xceivers:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-  <property>
-    <name>dfs.datanode.max.xcievers</name>
-    <value>4096</value>
-  </property>
-\end{verbatim}\endgroup
-
-A. Verify HDFS is healthy, check the datanode logs.
-
-\section{Zookeeper}
-
-Q. \texttt{accumulo init} is hanging.  It says something about talking to zookeeper.
-
-Zookeeper is also a distributed service.  You will need to ensure that
-it is up.  You can run the zookeeper command line tool to connect to
-any one of the zookeeper servers:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ zkCli.sh -server zoohost
-...
-[zk: zoohost:2181(CONNECTED) 0]
-\end{verbatim}\endgroup
-
-It is important to see the word \texttt{CONNECTED}!  If you only see
-\texttt{CONNECTING} you will need to diagnose zookeeper errors.
-
-A. Check to make sure that zookeeper is up, and that
-\texttt{\$ACCUMULO\_HOME/conf/accumulo-site.xml} has been pointed to
-your zookeeper server(s).
-
-Q. Zookeeper is running, but it does not say \texttt{CONNECTED}
-
-Zookeeper processes talk to each other to elect a leader.  All updates
-go through the leader and propagate to a majority of all the other
-nodes.  If a majority of the nodes cannot be reached, zookeeper will
-not allow updates.  Zookeeper also limits the number of connections to a
-server from any other single host.  By default, this limit can be as small as 10
-and can be reached in some everything-on-one-machine test configurations.
-
-You can check the election status and connection status of clients by
-asking the zookeeper nodes for their status.  You connect to zookeeper
-and ask it with the four-letter ``stat'' command:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ nc zoohost 2181
-stat
-Zookeeper version: 3.4.5-1392090, built on 09/30/2012 17:52 GMT
-Clients:
- /127.0.0.1:58289[0](queued=0,recved=1,sent=0)
- /127.0.0.1:60231[1](queued=0,recved=53910,sent=53915)
-
-Latency min/avg/max: 0/5/3008
-Sent: 1561592
-Connections: 2
-Outstanding: 0
-Zxid: 0x621a3b
-Mode: standalone
-Node count: 22524
-$
-\end{verbatim}\endgroup
-
-A. Check zookeeper status, verify that it has a quorum, and has not exceeded maxClientCnxns.
-
-Q. My tablet server crashed!  The logs say that it lost its zookeeper lock.
-
-Tablet servers reserve a lock in zookeeper to maintain their ownership
-over the tablets that have been assigned to them.  Part of their
-responsibility for keeping the lock is to send zookeeper a keep-alive
-message periodically.  If the tablet server fails to send a message in
-a timely fashion, zookeeper will remove the lock and notify the tablet
-server.  If the tablet server does not receive a message from
-zookeeper, it will assume its lock has been lost, too.  If a tablet
-server loses its lock, it kills itself: everything assumes it is dead
-already.
-
-A. Investigate why the tablet server did not send a timely message to
-zookeeper.
-
-\subsection{Keeping the tablet server lock}
-
-Q. My tablet server lost its lock.  Why?
-
-The primary reason a tablet server loses its lock is that it has been pushed into swap.
-
-A large java program (like the tablet server) may have a large portion
-of its memory image unused.  The operating system will favor pushing
-this allocated, but unused, memory into swap so that the memory can be
-re-used as a disk buffer.  When the java virtual machine decides to
-access this memory, the OS will begin flushing disk buffers to return that
-memory to the VM.  This can cause the entire process to block long
-enough for the zookeeper lock to be lost.
-
-A. Configure your system to reduce the kernel parameter ``swappiness'' from the default (60) to zero.
-
-Q. My tablet server lost its lock, and I have already set swappiness to
-zero.  Why?
-
-Be careful not to over-subscribe memory.  This can be easy to do if
-your accumulo processes run on the same nodes as hadoop's map-reduce
-framework.
-Remember to add up:
-
-\begin{itemize}
-\item{size of the JVM for the tablet server}
-\item{size of the in-memory map, if using the native map implementation}
-\item{size of the JVM for the data node}
-\item{size of the JVM for the task tracker}
-\item{size of the JVM times the maximum number of mappers and reducers}
-\item{size of the kernel and any support processes}
-\end{itemize}
-
-If a 16G node can run 2 mappers and 2 reducers, and each can be 2G,
-then there is only 8G for the data node, tserver, task tracker and OS.
-
-A. Reduce the memory footprint of each component until it fits comfortably.
-
-Q. My tablet server lost its lock, swappiness is zero, and my node has lots of unused memory!
-
-The JVM memory garbage collector may fall behind and cause a
-``stop-the-world'' garbage collection.  On a large memory virtual
-machine, this collection can take a long time.  This happens more
-frequently when the JVM is getting low on free memory.  Check the logs
-of the tablet server.  You will see lines like this:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-2013-06-20 13:43:20,607 [tabletserver.TabletServer] DEBUG: gc ParNew=0.00(+0.00) secs
-    ConcurrentMarkSweep=0.00(+0.00) secs freemem=1,868,325,952(+1,868,325,952) totalmem=2,040,135,680
-\end{verbatim}\endgroup
-
-When ``freemem'' becomes small relative to the amount of memory
-needed, the JVM will spend more time finding free memory than
-performing work.  This can cause long delays in sending keep-alive
-messages to zookeeper.
-
-A. Ensure the tablet server JVM is not running low on memory.
-
-\section{Tools}
-
-The accumulo script can be used to run classes from the command line.
-This section shows how a few of the utilities work, but there are many
-more.
-
-There's a class that will examine an accumulo storage file and print
-out basic metadata.
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo /accumulo/tables/1/default_tablet/A000000n.rf
-Locality group         : <DEFAULT>
-        Start block          : 0
-        Num   blocks         : 1
-        Index level 0        : 62 bytes  1 blocks
-        First key            : 288be9ab4052fe9e span:34078a86a723e5d3:3da450f02108ced5 [] 1373373521623 false
-        Last key             : start:13fc375709e id:615f5ee2dd822d7a [] 1373373821660 false
-        Num entries          : 466
-        Column families      : [waitForCommits, start, md major compactor 1, md major compactor 2, md major compactor 3,
-                                 bringOnline, prep, md major compactor 4, md major compactor 5, md root major compactor 3,
-                                 minorCompaction, wal, compactFiles, md root major compactor 4, md root major compactor 1,
-                                 md root major compactor 2, compact, id, client:update, span, update, commit, write,
-                                 majorCompaction]
-
-Meta block     : BCFile.index
-      Raw size             : 4 bytes
-      Compressed size      : 12 bytes
-      Compression type     : gz
-
-Meta block     : RFile.index
-      Raw size             : 780 bytes
-      Compressed size      : 344 bytes
-      Compression type     : gz
-\end{verbatim}\endgroup
-
-When trying to diagnose problems related to key size, the PrintInfo tool can provide a histogram of the individual key sizes:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo --histogram /accumulo/tables/1/default_tablet/A000000n.rf
-...
-Up to size      count      %-age
-         10 :     222      28.23%
-        100 :     244      71.77%
-       1000 :       0       0.00%
-      10000 :       0       0.00%
-     100000 :       0       0.00%
-    1000000 :       0       0.00%
-   10000000 :       0       0.00%
-  100000000 :       0       0.00%
- 1000000000 :       0       0.00%
-10000000000 :       0       0.00%
-\end{verbatim}\endgroup
-
-Likewise, PrintInfo will dump the key-value pairs and show you the contents of the RFile:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.core.file.rfile.PrintInfo --dump /accumulo/tables/1/default_tablet/A000000n.rf
-row columnFamily:columnQualifier [visibility] timestamp deleteFlag -> Value
-...
-\end{verbatim}\endgroup
-
-Q. Accumulo is not showing me any data!
-
-A. Do you have your auths set so that it matches your visibilities?
-
-Q. What are my visibilities?
-
-A. Use ``PrintInfo'' on a representative file to get some idea of the visibilities in the underlying data.
-
-Note that PrintInfo is an administrative tool and can only
-be used by someone who can access the underlying Accumulo data. It
-does not provide the normal access controls in Accumulo.
-
-If you would like to back up, or otherwise examine, the contents of Zookeeper, there are commands to dump and load to/from XML:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.server.util.DumpZookeeper --root /accumulo > dump.xml
-$ ./bin/accumulo org.apache.accumulo.server.util.RestoreZookeeper --overwrite < dump.xml
-\end{verbatim}\endgroup
-
-Q. How can I get the information in the monitor page for my cluster monitoring system?
-
-A. Use GetMasterStats:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.test.GetMasterStats | grep Load
- OS Load Average: 0.27
-\end{verbatim}\endgroup
-
-Q. The monitor page is showing an offline tablet.  How can I find out which tablet it is?
-
-A. Use FindOfflineTablets:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.server.util.FindOfflineTablets
-2<<@(null,null,localhost:9997) is UNASSIGNED  #walogs:2
-\end{verbatim}\endgroup
-
-Here's what the output means:
-
-\begin{enumerate}
-\item{\texttt{2<<} This is the tablet from (-inf, +inf) for the
-  table with id 2.  ``tables -l'' in the shell will show table ids for
-  tables.}
-\item{\texttt{@(null, null, localhost:9997)} Location information.  The
-  format is \texttt{@(assigned, hosted, last)}.  In this case, the
-  tablet has not been assigned, is not hosted anywhere, and was once
-  hosted on localhost.}
-\item{\texttt{\#walogs:2} The number of write-ahead logs that this tablet requires for recovery.}
-\end{enumerate}
-
-An unassigned tablet with write-ahead logs is probably waiting for
-logs to be sorted for efficient recovery.
-
-Q. How can I be sure that the metadata tables are up and consistent?
-
-A. \texttt{CheckForMetadataProblems} will verify the start/end of
-every tablet matches, and the start and stop for the table is empty:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.server.util.CheckForMetadataProblems -u root --password
-Enter the connection password:
-All is well for table !0
-All is well for table 1
-\end{verbatim}\endgroup
-
-Q. My hadoop cluster has lost a file due to a NameNode failure.  How can I remove the file?
-
-A. There's a utility that will check every file reference and ensure
-that the file exists in HDFS.  Optionally, it will remove the
-reference:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.server.util.RemoveEntriesForMissingFiles -u root --password
-2013-07-16 13:10:57,293 [util.RemoveEntriesForMissingFiles] INFO : File /accumulo/tables/2/default_tablet/F0000005.rf
- is missing
-2013-07-16 13:10:57,296 [util.RemoveEntriesForMissingFiles] INFO : 1 files of 3 missing
-\end{verbatim}\endgroup
-
-Q. I have many entries in zookeeper for old instances I no longer need.  How can I remove them?
-
-A. Use CleanZookeeper:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.server.util.CleanZookeeper
-\end{verbatim}\endgroup
-
-This command will not delete the instance pointed to by the local \texttt{conf/accumulo-site.xml} file.
-
-Q. I need to decommission a node.  How do I stop the tablet server on it?
-
-A. Use the admin command:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo admin stop hostname:9997
-2013-07-16 13:15:38,403 [util.Admin] INFO : Stopping server 12.34.56.78:9997
-\end{verbatim}\endgroup
-
-Q. I cannot login to a tablet server host, and the tablet server will not shut down.  How can I kill the server?
-
-A. Sometimes you can kill a ``stuck'' tablet server by deleting its lock in zookeeper:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.server.util.TabletServerLocks --list
-                  127.0.0.1:9997 TSERV_CLIENT=127.0.0.1:9997
-$ ./bin/accumulo org.apache.accumulo.server.util.TabletServerLocks -delete 127.0.0.1:9997
-$ ./bin/accumulo org.apache.accumulo.server.util.TabletServerLocks -list
-                  127.0.0.1:9997          null
-\end{verbatim}\endgroup
-
-You can find the master and instance id for any accumulo instances using the same zookeeper instance:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ./bin/accumulo org.apache.accumulo.server.util.ListInstances
-INFO : Using ZooKeepers localhost:2181
-
- Instance Name       | Instance ID                          | Master
----------------------+--------------------------------------+-------------------------------
-              "test" | 6140b72e-edd8-4126-b2f5-e74a8bbe323b |                127.0.0.1:9999
-\end{verbatim}\endgroup
-
-
-most tables is contained within the metadata table in the accumulo namespace,
-while metadata for that table is contained in the root table in the accumulo
-namespace. The root table is composed of a single tablet, which does not
-split, so it is also called the root tablet. Information about the root
-table, such as its location and write-ahead logs, is stored in ZooKeeper.
-
-Let's create a table and put some data into it:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-shell> createtable test
-shell> tables -l
-accumulo.root        =>        +r
-test                 =>         2
-trace                =>         1
-shell> insert a b c d
-shell> flush -w
-\end{verbatim}\endgroup
-
-Now let's take a look at the metadata for this table:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-shell> scan -b 3; -e 3<
-3< file:/default_tablet/F000009y.rf []    186,1
-3< last:13fe86cd27101e5 []    127.0.0.1:9997
-3< loc:13fe86cd27101e5 []    127.0.0.1:9997
-3< log:127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995 []    127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995|6
-3< srv:dir []    /default_tablet
-3< srv:flush []    1
-3< srv:lock []    tservers/127.0.0.1:9997/zlock-0000000001$13fe86cd27101e5
-3< srv:time []    M1373998392323
-3< ~tab:~pr []    \x00
-\end{verbatim}\endgroup
-
-Let's decode this little session:
-
-\begin{enumerate}
-\item{\texttt{scan -b 3; -e 3<}\\
-  Every tablet gets its own row. Every row starts with the table id followed by
-  ``;'' or ``<'', and followed by the end row split point for that tablet.}
-\item{\texttt{file:/default\_tablet/F000009y.rf [] 186,1}\\
-  File entry for this tablet. This tablet contains a single file reference. The
-  file is ``/accumulo/tables/3/default\_tablet/F000009y.rf''. It contains 1
-  key/value pair, and is 186 bytes long.}
-\item{\texttt{last:13fe86cd27101e5 [] 127.0.0.1:9997}\\
-  Last location for this tablet. It was last held on 127.0.0.1:9997, and the
-  unique tablet server lock data was ``13fe86cd27101e5''. The default balancer
-  will tend to put tablets back on their last location.}
-\item{\texttt{loc:13fe86cd27101e5 [] 127.0.0.1:9997}\\
-  The current location of this tablet.}
-\item{\texttt{log:127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995 [] 127.0. ...}\\
-  This tablet has a reference to a single write-ahead log. This file can be found in\\
-  /accumulo/wal/127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995. The value
-  of this entry could refer to multiple files. This tablet's data is encoded as
-  ``6'' within the log.}
-\item{\texttt{srv:dir [] /default\_tablet}\\
-  Files written for this tablet will be placed into
-  /accumulo/tables/3/default\_tablet.}
-\item{\texttt{srv:flush [] 1}\\
-  Flush id. This table has successfully completed the flush with the id of
-  ``1''.}
-\item{\texttt{srv:lock [] tservers/127.0.0.1:9997/zlock-0000000001\$13fe86cd27101e5}\\
-  This is the lock information for the tablet holding the present lock.  This
-  information is checked against zookeeper whenever this is updated, which
-  prevents a metadata update from a tablet server that no longer holds its
-  lock.}
-\item{\texttt{srv:time []    M1373998392323}\\
-  The time recorded for this tablet.  The leading ``M'' indicates time in
-  milliseconds; ``L'' would indicate logical time.}
-\item{\texttt{\textasciitilde{}tab:\textasciitilde{}pr []    \textbackslash{}x00}\\
-  The end-row marker for the previous tablet (prev-row).  The first byte
-  indicates the presence of a prev-row.  This tablet has the range (-inf, +inf),
-  so it has no prev-row (or end row).}
-\end{enumerate}
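The row-key convention described in item 1 above can be sketched in a few lines of Python. This is an illustration of the encoding only; the helper name \texttt{metadata\_row} is hypothetical, not an Accumulo API:

```python
def metadata_row(table_id, end_row=None):
    """Row key for a tablet's metadata entries: the table id followed by
    ';' and the tablet's end-row split point, or by '<' for the last
    (or only) tablet, which has no end row."""
    if end_row is None:
        return table_id + "<"
    return table_id + ";" + end_row
```

So the single-tablet table with id ``3'' from the session above stores its metadata under row ``3<''.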
-
-Besides these columns, you may see:
-
-\begin{enumerate}
-\item{\texttt{rowId future:zooKeeperID location} The tablet has been assigned to a tablet server, but not yet loaded.}
-\item{\texttt{\textasciitilde{}del:filename} When a tablet server is done using a file, it will create a delete marker in the appropriate metadata table, unassociated with any tablet.  The garbage collector will remove the marker, and the file, when no other reference to the file exists.}
-\item{\texttt{rowId loaded:filename} A file has been bulk-loaded into this tablet; however, the bulk load has not yet completed on other tablets, so this marker prevents the file from being loaded multiple times.}
-\item{\texttt{rowId !cloned} A marker that indicates that this tablet has been successfully cloned.}
-\item{\texttt{rowId splitRatio:ratio} A marker that indicates a split is in progress, and the files are being split at the given ratio.}
-\item{\texttt{rowId chopped} A marker that indicates that the files in the tablet do not contain keys outside the range of the tablet.}
-\item{\texttt{rowId scan} A marker that prevents a file from being removed while there are still active scans using it.}
-
-\end{enumerate}
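When scripting a scan of the metadata table, the marker columns above can be mapped to their meanings with a simple lookup. A minimal sketch (the helper name \texttt{classify\_marker} is hypothetical):

```python
# Marker column families seen in the metadata table, summarizing the list above.
MARKERS = {
    "future": "assigned to a tablet server, but not yet loaded",
    "~del": "file delete marker awaiting the garbage collector",
    "loaded": "file bulk-loaded; guards against loading it twice",
    "!cloned": "tablet successfully cloned",
    "splitRatio": "split in progress at the given ratio",
    "chopped": "files contain no keys outside the tablet's range",
    "scan": "file pinned by an active scan",
}

def classify_marker(column_family):
    """Return the meaning of a metadata marker column family, or 'unknown'."""
    return MARKERS.get(column_family, "unknown")
```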
-
-\section{Simple System Recovery}
-
-Q. One of my Accumulo processes died. How do I bring it back?
-
-The easiest way to bring all services online for an Accumulo instance is to run the start-all.sh script.
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-  $ bin/start-all.sh
-\end{verbatim}\endgroup
-
-This script will check the process listing, using jps, on each host before attempting to restart a service on the given host.
-Typically, this check is sufficient except in the face of a hung/zombie process. For large clusters, it may be
-undesirable to ssh to every node in the cluster to ensure that all hosts are running the appropriate processes, and start-here.sh may be of use.
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ ssh host_with_dead_process
-  $ bin/start-here.sh
-\end{verbatim}\endgroup
-
-start-here.sh should be invoked on the host which is missing a given process. Like start-all.sh, it will start all
-necessary processes that are not currently running, but only on the current host and not cluster-wide. Tools such as pssh or
-pdsh can be used to automate this process.
-
-start-server.sh can also be used to start a process on a given host; however, it is not generally recommended that
-users issue this directly, as the start-all.sh and start-here.sh scripts provide the same functionality with
-more automation and are less prone to user error.
-
-A. Use start-all.sh or start-here.sh.
-
-Q. My process died again. Should I restart it via cron or tools like supervisord?
-
-A. A repeatedly dying Accumulo process is a sign of a larger problem. Typically these problems are due to a
-misconfiguration of Accumulo or over-saturation of resources. Blindly automating service restarts inside of Accumulo
-is generally undesirable, as it masks and ignores the underlying problem. Accumulo
-processes should be stable on the order of months and not require frequent restarts.
-
-
-\section{Advanced System Recovery}
-
-\subsection{HDFS Failure}
-Q. I had a disastrous HDFS failure. After bringing everything back up, several tablets refuse to go online.
-
-Data written to tablets is written into memory before being written into indexed files. In case the server
-is lost before the data is saved into an indexed file, all data stored in memory is first written into a
-write-ahead log (WAL). When a tablet is re-assigned to a new tablet server, the write-ahead logs are read to
-recover any mutations that were in memory when the tablet was last hosted.
-
-If a write-ahead log cannot be read, then the tablet is not re-assigned. All it takes is for one of
-the blocks in the write-ahead log to be missing. This is unlikely unless multiple data nodes in HDFS have been
-lost.
-
-A. Get the WAL files online and healthy. Restore any data nodes that may be down.
-
-Q. How do I find out which tablets are offline?
-
-A. Use ``accumulo admin checkTablets''
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ bin/accumulo admin checkTablets
-\end{verbatim}\endgroup
-
-Q. I lost three data nodes, and I'm missing blocks in a WAL.  I don't care about data loss, how
-can I get those tablets online?
-
-See the discussion in section~\ref{sec:metadata}, which shows a typical metadata table listing.
-The entries with a column family of ``log'' are references to the WAL for that tablet.
-If you know which WAL is bad, you can find all the references with a grep in the shell:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-shell> grep 0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995
-3< log:127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995 []    127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995|6
-\end{verbatim}\endgroup
-
-A. You can remove the WAL references in the metadata table.
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-shell> grant -u root Table.WRITE -t accumulo.metadata
-shell> delete 3< log 127.0.0.1+9997/0cb7ce52-ac46-4bf7-ae1d-acdcfaa97995
-\end{verbatim}\endgroup
-
-Note: the colon (``:'') is omitted when specifying the ``row cf cq'' for the delete command.
-
-The master will automatically discover the tablet no longer has a bad WAL reference and will
-assign the tablet.  You will need to remove the reference from all the tablets to get them
-online.
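Turning a grep'd ``log'' entry into the matching delete command is mechanical: take the row, then split the column on its colon. A minimal sketch (the helper name \texttt{wal\_delete\_command} is hypothetical, and it assumes the shell's ``row colfam:colqual [] value'' output format shown above):

```python
def wal_delete_command(metadata_line):
    """Build a shell 'delete' command from a metadata 'log' entry as
    printed by the shell. The ':' between column family and qualifier
    is replaced with a space, per the delete command's syntax."""
    row, rest = metadata_line.split(" ", 1)       # row ends at first space
    column = rest.split(" []")[0]                 # drop visibility and value
    family, qualifier = column.split(":", 1)      # 'log' : WAL reference
    return f"delete {row} {family} {qualifier}"
```

Feeding it the grep output above yields the same delete command shown in the example.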
-
-
-Q. The metadata (or root) table has references to a corrupt WAL.
-
-This is a much more serious state, since losing updates to the metadata table will result
-in references to old files which may not exist, or lost references to new files, resulting
-in tablets that cannot be read, or large amounts of data loss.
-
-The best hope is to restore the WAL by fixing HDFS data nodes and bringing the data back online.
-If this is not possible, the best approach is to re-create the instance and bulk import all files from
-the old instance into new tables.
-
-A complete set of instructions for doing this is outside the scope of this guide,
-but the basic approach is:
-
-\begin{itemize}
- \item Use ``tables -l'' in the shell to discover the table name to table id mapping
- \item Stop all accumulo processes on all nodes
- \item Move the accumulo directory in HDFS out of the way:
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
- $ hadoop fs -mv /accumulo /corrupt
-\end{verbatim}\endgroup
- \item Re-initialize accumulo
- \item Recreate tables, users and permissions
- \item Import the directories under \texttt{/corrupt/tables/<id>} into the new instance
-\end{itemize}
-
-Q. One or more HDFS files under /accumulo/tables are corrupt
-
-Accumulo maintains multiple references to the tablet files in the METADATA
-table and within the tablet server hosting the file; this makes it difficult to
-reliably just remove those references.
-
-The directory structure in HDFS for tables will follow the general structure:
-
-\small
-\begin{verbatim}
-    /accumulo
-    /accumulo/tables/
-    /accumulo/tables/!0
-    /accumulo/tables/!0/default_tablet/A000001.rf
-    /accumulo/tables/!0/t-00001/A000002.rf
-    /accumulo/tables/1
-    /accumulo/tables/1/default_tablet/A000003.rf
-    /accumulo/tables/1/t-00001/A000004.rf
-    /accumulo/tables/1/t-00001/A000005.rf
-    /accumulo/tables/2/default_tablet/A000006.rf
-    /accumulo/tables/2/t-00001/A000007.rf
-\end{verbatim}
-\normalsize
-
-If files under /accumulo/tables are corrupt, the best course of action is to
-recover those files in HDFS; see the section on HDFS. Once these recovery efforts
-have been exhausted, the next step depends on where the missing file(s) are
-located. Different actions are required when the bad files are in Accumulo data
-table files or if they are metadata table files.
-
-{\bf Data File Corruption}
-
-When an Accumulo data file is corrupt, the most reliable way to restore Accumulo
-operations is to replace the missing file with an ``empty'' file so that
-references to the file in the METADATA table and within the tablet server
-hosting the file can be resolved by Accumulo. An empty file can be created using
-the CreateEmpty utility:
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-$ accumulo org.apache.accumulo.core.file.rfile.CreateEmpty /path/to/empty/file/empty.rf
-\end{verbatim}\endgroup
-
-The process is to delete the corrupt file and then move the empty file into its
-place. (The generated empty file can be copied and used multiple times if necessary; it does not
-need to be regenerated each time.)
-
-\begingroup\fontsize{8pt}{8pt}\selectfont\begin{verbatim}
-  $ hadoop fs -rm /accumulo/tables/corrupt/file/thename.rf; \
-  hadoop fs -mv /path/to/empty/file/empty.rf /accumulo/tables/corrupt/file/thename.rf
-\end{verbatim}\endgroup
-
-{\bf Metadata File Corruption}
-
-If the corrupt files are metadata table files (see \ref{sec:metadata}; these
-live under the path \texttt{/accumulo/tables/!0}), then you will need to rebuild
-the metadata table by initializing a new instance of Accumulo and then importing
-all of the existing data into the new instance. This is the same procedure as
-recovering from a zookeeper failure (see \ref{ZooKeeper Failure}), except that
-you will have the benefit of having the existing user and table authorizations
-that are maintained in zookeeper.
-
-You can use the DumpZookeeper utility to save this information for reference
-before creating the new instance. You will not be able to use RestoreZookeeper
-because the table names and references are likely to be different between the
-original and the new instances, but it can serve as a reference.
-
-A. If the files cannot be recovered, replace corrupt data files with empty
-RFiles to allow references in the metadata table and in the tablet servers to be
-resolved. Rebuild the metadata table if the corrupt files are metadata files.
-
-\subsection{ZooKeeper Failure}
-Q. I lost my ZooKeeper quorum (hardware failure), but HDFS is still intact. How can I recover my Accumulo instance?
-
-ZooKeeper, in addition to its lock-service capabilities, also serves to bootstrap an Accumulo
-instance from some location in HDFS. It contains the pointers to the root tablet in HDFS, which
-is then used to load the Accumulo metadata tablets, which then load all user tables. ZooKeeper
-also stores all namespace and table configuration, the user database, the mapping of table IDs to
-table names, and more across Accumulo restarts.
-Presently, the only way to recover such an instance is to initialize a new instance and import all
-of the old data into the new instance. The easiest way to tackle this problem is to first recreate
-the mapping of table ID to table name and then recreate each of those tables in the new instance.
-Set any necessary configuration on the new tables, and add split points to the new tables to close
-the gap between the number of splits the old tables had and the new tables having none.
-
-The directory structure in HDFS for tables will follow the general structure:
-
-\small
-\begin{verbatim}
-    /accumulo
-    /accumulo/tables/
-    /accumulo/tables/1
-    /accumulo/tables/1/default_tablet/A000001.rf
-    /accumulo/tables/1/t-00001/A000002.rf
-    /accumulo/tables/1/t-00001/A000003.rf
-    /accumulo/tables/2/default_tablet/A000004.rf
-    /accumulo/tables/2/t-00001/A000005.rf
-\end{verbatim}
-\normalsize
-
-For each table, make a new directory into which you can move (or copy, if you have the HDFS space to do so)
-all of the rfiles for a given table. For example, to process the table with an ID of 1, make a new directory,
-say /new-table-1, and then copy all files from /accumulo/tables/1/*/*.rf into that directory. Additionally,
-make a directory, /new-table-1-failures, for any failures during the import process. Then, issue the import
-command using the Accumulo shell into the new table, telling Accumulo to not re-set the timestamp:
-
-\small
-\begin{verbatim}
-user@instance new_table> importdirectory /new-table-1 /new-table-1-failures false
-\end{verbatim}
-\normalsize
-
-Any RFiles which failed to load will be placed in /new-table-1-failures. RFiles that were successfully
-imported will no longer exist in /new-table-1. For failures, move them back to the import directory and retry
-the importdirectory command.
-
-It is \textbf{extremely} important to note that this approach may introduce stale data back into
-the tables. For a few reasons, RFiles may exist in the table directory which are candidates for deletion but have
-not yet been deleted. Additionally, deleted data which was not compacted away, but still exists in write-ahead logs (if
-the original instance was somehow recoverable), will be re-introduced in the new instance. Table splits and merges
-(which also include the deleteRows API call on TableOperations) are also vulnerable to this problem. This process should
-\textbf{not} be used if these are unacceptable risks. It is possible to try to re-create a view of the accumulo.metadata
-table to prune out files that are candidates for deletion, but this is a difficult task that also may not be entirely accurate.
-
-Likewise, it is also possible that data loss may occur from write-ahead log (WAL) files which existed on the old table but
-were not minor-compacted into an RFile. Again, it may be possible to reconstruct the state of these WAL files to
-replay data not yet in an RFile; however, this is a difficult task and is not implemented in any automated fashion.
-
-A. The importdirectory shell command can be used to import RFiles from the old instance into a newly created instance,
-but extreme care should go into the decision to do this as it may result in the reintroduction of stale data or the
-omission of new data.
-
-\section{File Naming Conventions}
-
-Q. Why are files named like they are? Why do some start with ``C'' and others with ``F''?
-
-A. The file names give you a basic idea of the source of the file.
-
-The base of the filename is a unique base-36 number. All filenames in accumulo are coordinated
-with a counter in zookeeper, so they are always unique, which is useful for debugging.
-The leading letter gives you an idea of how the file was created:
-
-\begin{itemize}
- \item F - Flush: entries in memory were written to a file (Minor Compaction)
- \item M - Merging compaction: entries in memory were combined with the smallest file to create one new file
- \item C - Several files, but not all files, were combined to produce this file (Major Compaction)
- \item A - All files were compacted, delete entries were dropped
- \item I - Bulk import, complete, sorted index files. Always in a directory starting with "b-"
-\end{itemize}
-
-This simple file naming convention allows you to see the basic structure of the files from just
-their filenames, and reason about what should be happening to them next, just
-by scanning their entries in the metadata tables.
-
-For example, if you see multiple files with ``M'' prefixes, the tablet is, or was, up against its
-maximum file limit, so it began merging memory updates with files to keep the file count reasonable. This
-slows down ingest performance, so many files like this tell you that ingest is outpacing the
-compaction strategy that would otherwise reduce the number of files.
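The naming convention above can be decoded mechanically: strip the ``.rf'' suffix, read the first letter, and interpret the rest as a base-36 counter. A minimal sketch (the helper name \texttt{describe\_rfile} is hypothetical, not an Accumulo utility):

```python
# Leading-letter meanings, summarizing the list above.
FILE_PREFIXES = {
    "F": "flush (minor compaction)",
    "M": "merging compaction",
    "C": "major compaction (some files)",
    "A": "full compaction (all files)",
    "I": "bulk import",
}

def describe_rfile(name):
    """Split an rfile name like 'F000009y.rf' into its origin description
    and the base-36 counter value coordinated through zookeeper."""
    stem = name[:-3] if name.endswith(".rf") else name
    prefix, counter = stem[0], stem[1:]
    return FILE_PREFIXES.get(prefix, "unknown"), int(counter, 36)
```

For instance, the file ``F000009y.rf'' from the metadata example earlier decodes as a flush file with counter value 358.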

http://git-wip-us.apache.org/repos/asf/accumulo/blob/900d6abb/docs/src/main/latex/accumulo_user_manual/images/data_distribution.png
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/images/data_distribution.png b/docs/src/main/latex/accumulo_user_manual/images/data_distribution.png
deleted file mode 100644
index 7f18d3f..0000000
Binary files a/docs/src/main/latex/accumulo_user_manual/images/data_distribution.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/accumulo/blob/900d6abb/docs/src/main/latex/accumulo_user_manual/images/failure_handling.png
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/images/failure_handling.png b/docs/src/main/latex/accumulo_user_manual/images/failure_handling.png
deleted file mode 100644
index c131de6..0000000
Binary files a/docs/src/main/latex/accumulo_user_manual/images/failure_handling.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/accumulo/blob/900d6abb/pom.xml
----------------------------------------------------------------------
diff --git a/pom.xml b/pom.xml
index 78af9c5..2b92402 100644
--- a/pom.xml
+++ b/pom.xml
@@ -230,7 +230,7 @@
         <artifactId>accumulo-docs</artifactId>
         <version>${project.version}</version>
<classifier>user-manual</classifier>
-        <type>pdf</type>
+        <type>html</type>
</dependency>
<dependency>
<groupId>org.apache.accumulo</groupId>
@@ -594,6 +594,11 @@
</configuration>
</plugin>
<plugin>
+          <groupId>org.asciidoctor</groupId>
+          <artifactId>asciidoctor-maven-plugin</artifactId>
+          <version>0.1.4</version>
+        </plugin>
+        <plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>build-helper-maven-plugin</artifactId>
<version>1.8</version>
@@ -621,11 +626,6 @@
<version>1.2.1</version>
</plugin>
<plugin>
-          <groupId>org.codehaus.mojo</groupId>
-          <artifactId>latex-maven-plugin</artifactId>
-          <version>1.1</version>
-        </plugin>
-        <plugin>
<groupId>org.eclipse.m2e</groupId>
<artifactId>lifecycle-mapping</artifactId>
<version>1.0.0</version>

