directory-commits mailing list archives

From build...@apache.org
Subject svn commit: r873181 - in /websites/staging/directory/trunk/content: ./ mavibot/index.html
Date Tue, 06 Aug 2013 12:45:55 GMT
Author: buildbot
Date: Tue Aug  6 12:45:54 2013
New Revision: 873181

Log:
Staging update by buildbot for directory

Modified:
    websites/staging/directory/trunk/content/   (props changed)
    websites/staging/directory/trunk/content/mavibot/index.html

Propchange: websites/staging/directory/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Tue Aug  6 12:45:54 2013
@@ -1 +1 @@
-1510937
+1510938

Modified: websites/staging/directory/trunk/content/mavibot/index.html
==============================================================================
--- websites/staging/directory/trunk/content/mavibot/index.html (original)
+++ websites/staging/directory/trunk/content/mavibot/index.html Tue Aug  6 12:45:54 2013
@@ -123,7 +123,30 @@
 
 
 
-    <p>test</p>
+    <h1 id="apache-mavibotwzxhzdk0">Apache Mavibot&trade;</h1>
+<p>Mavibot is an MVCC B+Tree implementation in Java.</p>
+<h2 id="btree-basics">Btree basics</h2>
+<p>A <em>Btree</em> is a data structure that stores <em><Key,
Value></em> tuples in a tree, with the guarantee that the tree is ordered, and
that the depth of the tree is the same for all the leaves. A <em>Btree</em> has
nodes and leaves (the only exception being a <em>Btree</em> that consists of a single root
page). The nodes have children and are used to route to the underlying values. Leaves don't
have children.</p>
+<p>Nodes and leaves can store a maximum number of elements, and when they
are full, they are split. When a leaf is split, we have to reorganize the tree:
either we can move some elements up and keep the tree at the current height, or we
have to reorganize the full tree so that all the leaves stay at the same level, which will
then be one deeper (if we added some value) than in the tree before the split.</p>
+<h2 id="btree-vs-btree">Btree vs B+Tree</h2>
+<p>The difference between those two data structures is that a <em>Btree</em>
stores values in the nodes, whereas a <em>B+Tree</em> stores all the values in the leaves.</p>
+<p>At first glance, we can say that finding a value in a <em>Btree</em>
will be faster, as we may not have to go down to the leaves to find it. On the other hand, a <em>B+Tree</em>
has many advantages, the two major ones being:</p>
+<ul>
+<li>We don't need to go back up in the tree when searching for more than
one value: we can just read the leaves, as they are chained.</li>
+<li>We will have smaller nodes, so we can cache more of the tree pages than if the
values were stored in the nodes.</li>
+</ul>
+<p>Those two big advantages make the <em>B+Tree</em> more interesting to
use than the simpler <em>Btree</em>.</p>
+<p>(See <a href="http://en.wikipedia.org/wiki/B%2B_tree">Wikipedia page on B+tree</a>
and <a href="http://en.wikipedia.org/wiki/B-tree">Wikipedia page on Btree</a>
)</p>
+<h2 id="mvcc">MVCC</h2>
+<p><a href="http://en.wikipedia.org/wiki/Multiversion_concurrency_control">MVCC</a>
(Multi Version Concurrency Control) is a way to provide concurrent access to the <em>Btree</em>
(it's extensively used in many other areas, like programming languages and transactional memory).
The main idea is to create a new version of the tree each time we do a modification. It also
allows the reorganization of the data on the fly, but this is an extra benefit.</p>
+<p>The way it works is that when you do a search on the tree, you first acquire the
current revision. Even if the search takes a while because it fetches many values, the
tree will remain unchanged for the selected revision.</p>
+<p>Any modification done on the tree will first create a new revision, and the modified
pages will first be copied, so that the previous versions will still be available for any
search operation being executed at the same time.</p>
+<p>This has three direct consequences:</p>
+<ul>
+<li>First, a search may return 'outdated' values, in the sense that data added after the search started
won't be returned, as it will be stored in a newer version.</li>
+<li>Second, and more important, we don't need any lock to access the data when doing
a search, as there is no possible modification on a versioned tree.</li>
+<li>Third, concurrent modifications are limited, as we want to be sure that we
don't overwrite a modification done by another thread. There are ways to mitigate this constraint,
but in most cases it's acceptable.</li>
+</ul>
     
         <div class="news"><h1 id="news">News</h1>
 <h2 id="apache-mavibot-has-been-moved-from-apache-labs-to-pache-directory-project-posted-on-august-6th-2013">Apache
Mavibot has been moved from Apache Labs to the Apache Directory project <em>posted on August
6th, 2013</em></h2></div>
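
Note on the "Btree vs B+Tree" text in the diff above: the first advantage listed there (reading several values by following the chained leaves instead of going back up through the nodes) can be sketched roughly as below. This is only an illustration under stated assumptions. ChainedLeaf, LeafScan and range are hypothetical names made up for this note, not Mavibot classes, and the sketch assumes a root-to-leaf descent has already located the starting leaf.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; these names are not part of Mavibot's API.
final class ChainedLeaf {
    final int[] keys;       // sorted keys held by this leaf
    final String[] values;  // values[i] belongs to keys[i]
    ChainedLeaf next;       // right sibling, or null for the last leaf

    ChainedLeaf(int[] keys, String[] values) {
        this.keys = keys;
        this.values = values;
    }
}

final class LeafScan {
    // Collects the values whose keys fall in [from, to], starting at the leaf
    // where a descent for 'from' is assumed to have landed. Only sibling links
    // are followed; no parent node is visited again.
    static List<String> range(ChainedLeaf start, int from, int to) {
        List<String> out = new ArrayList<>();
        for (ChainedLeaf leaf = start; leaf != null; leaf = leaf.next) {
            for (int i = 0; i < leaf.keys.length; i++) {
                if (leaf.keys[i] > to) {
                    return out;  // keys are ordered, so nothing further can match
                }
                if (leaf.keys[i] >= from) {
                    out.add(leaf.values[i]);
                }
            }
        }
        return out;
    }
}
```

Because the values live only in the leaves, the scan touches one leaf after another through the next links, which is what makes multi-value searches cheap in this layout.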

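Note on the "MVCC" text in the diff above: the idea that each modification creates a new revision and copies the modified pages, while readers keep using the revision they acquired, can be sketched as follows. This is a minimal sketch and not Mavibot's implementation. It uses a plain unbalanced binary search tree instead of a B+Tree to stay short, and MvccNode, TreeSnapshot and VersionedTree are hypothetical names. Writers are serialized with a simple lock, which is one possible reading of "concurrent modifications are limited".

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; a binary search tree stands in for the B+Tree.
final class MvccNode {
    final int key;
    final String value;
    final MvccNode left, right;  // never modified after construction

    MvccNode(int key, String value, MvccNode left, MvccNode right) {
        this.key = key;
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

// One immutable version of the tree.
final class TreeSnapshot {
    final long revision;
    final MvccNode root;

    TreeSnapshot(long revision, MvccNode root) {
        this.revision = revision;
        this.root = root;
    }
}

final class VersionedTree {
    private final AtomicReference<TreeSnapshot> current =
            new AtomicReference<>(new TreeSnapshot(0, null));

    // Readers acquire the current revision without taking any lock.
    TreeSnapshot snapshot() {
        return current.get();
    }

    // Writers are serialized; each modification publishes a new revision.
    synchronized void put(int key, String value) {
        TreeSnapshot old = current.get();
        current.set(new TreeSnapshot(old.revision + 1, insert(old.root, key, value)));
    }

    // Copies only the nodes on the path to the modified key; untouched
    // subtrees are shared between the old and the new revision.
    private static MvccNode insert(MvccNode n, int key, String value) {
        if (n == null) {
            return new MvccNode(key, value, null, null);
        }
        if (key < n.key) {
            return new MvccNode(n.key, n.value, insert(n.left, key, value), n.right);
        }
        if (key > n.key) {
            return new MvccNode(n.key, n.value, n.left, insert(n.right, key, value));
        }
        return new MvccNode(key, value, n.left, n.right);
    }

    // Searches a given revision; later modifications are not visible in it.
    static String get(TreeSnapshot s, int key) {
        MvccNode n = s.root;
        while (n != null) {
            if (key == n.key) {
                return n.value;
            }
            n = key < n.key ? n.left : n.right;
        }
        return null;
    }
}
```

A reader that called snapshot() before a put() keeps seeing the older values, which corresponds to the 'outdated' results mentioned in the first consequence, and it never needs a lock because the nodes of its revision are never changed.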

