# accumulo-commits mailing list archives

From els...@apache.org
Subject [05/23] git commit: ACCUMULO-2441 outline the file prefix conventions
Date Tue, 11 Mar 2014 18:25:14 GMT
ACCUMULO-2441 outline the file prefix conventions

Project: http://git-wip-us.apache.org/repos/asf/accumulo/repo
Commit: http://git-wip-us.apache.org/repos/asf/accumulo/commit/0297276e
Tree: http://git-wip-us.apache.org/repos/asf/accumulo/tree/0297276e
Diff: http://git-wip-us.apache.org/repos/asf/accumulo/diff/0297276e

Commit: 0297276e692d117cd515ec31d1ca1412570e4785
Parents: 4fabfba
Author: Eric Newton <eric.newton@gmail.com>
Authored: Fri Mar 7 19:05:56 2014 -0500
Committer: Eric Newton <eric.newton@gmail.com>
Committed: Fri Mar 7 19:05:56 2014 -0500

----------------------------------------------------------------------
.../chapters/troubleshooting.tex                | 29 ++++++++++++++++++++
1 file changed, 29 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/accumulo/blob/0297276e/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex
----------------------------------------------------------------------
diff --git a/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex b/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex
index 8ba7176..18d472f 100644
--- a/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex
+++ b/docs/src/main/latex/accumulo_user_manual/chapters/troubleshooting.tex
@@ -599,3 +599,32 @@ but the basic approach is:
\item Recreate tables, users and permissions
\item Import the directories under \texttt{/corrupt/tables/<id>} into the new instance
\end{itemize}
+
+\section{File Naming Conventions}
+
+Q. Why are files named like they are? Why do some start with ``C'' and others with ``F''?
+
+A. The file names give you a basic idea for the source of the file.
+
+The base of the filename is a base-36 unique number. All filenames in Accumulo are coordinated
+with a counter in ZooKeeper, so they are always unique, which is useful for debugging.
+
+The leading letter gives you an idea of how the file was created:
+
+\begin{itemize}
+ \item F - Flush: entries in memory were written to a file (Minor Compaction)
+ \item M - Merging compaction: entries in memory were combined with the smallest file to create one new file
+ \item C - Several files, but not all files, were combined to produce this file (Major Compaction)
+ \item A - All files were compacted, delete entries were dropped
+ \item I - Bulk import, complete, sorted index files. Always in a directory starting with "b-"
+\end{itemize}
+
+This simple file naming convention allows you to see the basic structure of the files from
+their filenames alone, and to reason about what should be happening to them next, simply
+by scanning their entries in the metadata tables.
+
+For example, if you see multiple files with ``M'' prefixes, the tablet is, or was, up against its
+maximum file limit, so it began merging memory updates with files to keep the file count reasonable. This
+slows down ingest performance, so knowing there are many files like this tells you that the system
+is struggling to keep up with ingest relative to the compaction strategy that reduces the number of files.
+
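
As a rough illustration of the convention described in the patch above (not part of the commit, and not an Accumulo API), the sketch below splits a file name into its leading letter and base-36 counter, then tallies prefixes across a list of paths. The `.rf` extension, the sample paths, and the helper names `parse_file_name` and `prefix_histogram` are assumptions made for this example only.

```python
import re
from collections import Counter

# Prefix meanings as listed in the patch above.
PREFIX_MEANINGS = {
    "F": "flush (minor compaction)",
    "M": "merging compaction",
    "C": "major compaction of some files",
    "A": "compaction of all files",
    "I": "bulk import",
}

# Assumed shape: one prefix letter, a base-36 counter, then an extension.
FILE_NAME = re.compile(r"^([FMCAI])([0-9a-zA-Z]+)\.\w+$")


def parse_file_name(path):
    """Return (prefix, counter) for a path, or None if it does not match."""
    name = path.rsplit("/", 1)[-1]
    match = FILE_NAME.match(name)
    if match is None:
        return None
    prefix, counter = match.groups()
    return prefix, int(counter, 36)  # decode the base-36 counter


def prefix_histogram(paths):
    """Count how many of the given paths carry each known prefix."""
    counts = Counter()
    for path in paths:
        parsed = parse_file_name(path)
        if parsed is not None:
            counts[parsed[0]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical paths, shaped like entries one might read from the metadata table.
    sample = ["/t-0001/M0000001.rf", "/t-0001/M0000002.rf", "/b-0001/I0000003.rf"]
    for prefix, count in sorted(prefix_histogram(sample).items()):
        print(prefix, count, PREFIX_MEANINGS[prefix])
```

A tablet whose histogram shows several ``M'' files would match the situation the patch describes: it is, or was, at its maximum file limit.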

