directory-commits mailing list archives

From build...@apache.org
Subject svn commit: r887819 - in /websites/staging/directory/trunk/content: ./ mavibot/ mavibot/user-guide/
Date Sat, 23 Nov 2013 18:08:11 GMT
Author: buildbot
Date: Sat Nov 23 18:08:10 2013
New Revision: 887819

Log:
Staging update by buildbot for directory

Added:
    websites/staging/directory/trunk/content/mavibot/user-guide/7-internals.html
    websites/staging/directory/trunk/content/mavibot/user-guide/7.1-logical-structure.html
    websites/staging/directory/trunk/content/mavibot/user-guide/7.2-physical-storage.html
Removed:
    websites/staging/directory/trunk/content/mavibot/user-guide/2-internals.html
    websites/staging/directory/trunk/content/mavibot/user-guide/2.1-logical-structure.html
    websites/staging/directory/trunk/content/mavibot/user-guide/2.2-physical-storage.html
Modified:
    websites/staging/directory/trunk/content/   (props changed)
    websites/staging/directory/trunk/content/mavibot/index.html
    websites/staging/directory/trunk/content/mavibot/user-guide/1-introduction.html

Propchange: websites/staging/directory/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Sat Nov 23 18:08:10 2013
@@ -1 +1 @@
-1542558
+1544851

Modified: websites/staging/directory/trunk/content/mavibot/index.html
==============================================================================
--- websites/staging/directory/trunk/content/mavibot/index.html (original)
+++ websites/staging/directory/trunk/content/mavibot/index.html Sat Nov 23 18:08:10 2013
@@ -149,6 +149,23 @@
 
     
         <div class="news"><h1 id="news">News</h1>
+<h2 class="news">Apache Mavibot 1.0.0-M2 released <em>posted on November 6th, 2013</em></h2>
+
+<p>The Apache Directory team is pleased to announce the release of Apache Mavibot 1.0.0-M2, the second milestone towards a 1.0 version.</p>
+<p><strong>Mavibot</strong> is a Multi Version Concurrency Control (MVCC) BTree in Java. It is expected to replace JDBM (the current backend for the <strong>Apache Directory Server</strong>), but could be a good fit for any other project in need of a Java MVCC BTree implementation.
+This milestone contains two different BTrees :</p>
+<ul><li>one for in-memory BTrees</li>
+<li>one for managed BTrees</li></ul>
+<p>The rationale for this big modification is that we can't easily gather all the characteristics of both the in-memory and the managed BTrees in one single class.</p>
+<p>We have also rewritten the way we handle added elements when we reach the end of the memory : we now use a cache instead of depending on WeakReferences, which proved to be just way too slow.</p>
+<p>The next milestones will add the missing features :</p>
+<ul>
+<li>bulk load support</li>
+<li>multi-version support with free pages management</li>
+<li>transaction support</li>
+</ul>
+<p><strong>ApacheDS</strong> has already been tested with <strong>Mavibot 1.0.0-M2-SNAPSHOT</strong>, and its performance is twice as good as with JDBM.</p>
+<p>Downloads are available <a href="downloads.html">here</a>.</p>
 <h2 class="news">Apache Mavibot 1.0.0-M1 released <em>posted on August 6th, 2013</em></h2>
 
 <p>The Apache Directory team is pleased to announce the release of Apache Mavibot 1.0.0-M1, the first milestone towards a 1.0 version.</p>

Modified: websites/staging/directory/trunk/content/mavibot/user-guide/1-introduction.html
==============================================================================
--- websites/staging/directory/trunk/content/mavibot/user-guide/1-introduction.html (original)
+++ websites/staging/directory/trunk/content/mavibot/user-guide/1-introduction.html Sat Nov 23 18:08:10 2013
@@ -152,7 +152,71 @@
 <p>We hope it will be enough for you to quickly get started, but in any case, if you feel like improving this document, feel free to post your suggestion to the Apache Directory mailing list : any contribution is welcomed !</p>
 <h2 id="contents">Contents</h2>
 <ul>
-<li>[1.1 - .html)</li>
+<li><a href="1.1-ug-btree-basics.html">1.1 - BTree basics</a></li>
+<li><a href="2-btree-types.html">2 - BTree types</a></li>
+<li>In-Memory</li>
+<li>Persistent</li>
+<li>
+<p>Managed</p>
+</li>
+<li>
+<p><a href="3-btree-management.html">3 - BTree management</a></p>
+</li>
+<li>creation</li>
+<li>close</li>
+<li>flush</li>
+<li>
+<p>load</p>
+</li>
+<li>
+<p><a href="4-btree-operations.html">4 - BTree operations</a></p>
+</li>
+<li>browse</li>
+<li>contains</li>
+<li>delete</li>
+<li>get</li>
+<li>getValues</li>
+<li>hasKey</li>
+<li>insert</li>
+<li>
+<p>getRevision</p>
+</li>
+<li>
+<p><a href="5-btree-informations.html">5 - BTree information</a></p>
+</li>
+<li>getComparator</li>
+<li>getFile</li>
+<li>getJournal</li>
+<li>getNbElems</li>
+<li>isAllowDuplicates</li>
+<li>isInMemory</li>
+<li>
+<p>isPersistent</p>
+</li>
+<li>
+<p><a href="6-btree-configuration.html">6 - BTree configuration</a></p>
+</li>
+<li>getKeySerializer</li>
+<li>getKeySerializerFQCN</li>
+<li>setKeySerializer</li>
+<li>getName</li>
+<li>setName</li>
+<li>getPageSize</li>
+<li>setPageSize</li>
+<li>getReadTimeOut</li>
+<li>setReadTimeOut</li>
+<li>getValueSerializer</li>
+<li>getValueSerializerFQCN</li>
+<li>setValueSerializer</li>
+<li>getWriteBufferSize</li>
+<li>
+<p>setWriteBufferSize</p>
+</li>
+<li>
+<p><a href="7-internals.html">7 - BTree internals</a></p>
+</li>
+<li><a href="7.1-logical-structure.html">7.1 - Logical Structure</a></li>
+<li><a href="7.2-physical-storage.html">7.2 - Physical storage</a></li>
 </ul>
 
 

Added: websites/staging/directory/trunk/content/mavibot/user-guide/7-internals.html
==============================================================================
--- websites/staging/directory/trunk/content/mavibot/user-guide/7-internals.html (added)
+++ websites/staging/directory/trunk/content/mavibot/user-guide/7-internals.html Sat Nov 23 18:08:10 2013
@@ -0,0 +1,181 @@
+<!DOCTYPE html>
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+<html>
+	<head>
+		<title>7 - Mavibot Internals &mdash; Apache Directory</title>
+		
+	    <link href="./../../css/common.css" rel="stylesheet" type="text/css">
+	    <link href="./../../css/turquoise.css" rel="stylesheet" type="text/css">
+    
+        
+        <link rel="shortcut icon" href="./../../images/mavibot-icon_16x16.png">
+    
+        <!-- Google Analytics -->
+        <script src="http://www.google-analytics.com/urchin.js" type="text/javascript"></script>
+        <script type="text/javascript">
+            _uacct = "UA-1358462-1";
+            urchinTracker();
+        </script>
+	</head>
+	<body>
+	    <div id="container">
+            <div id="header">
+                <div id="subProjectsNavBar">
+                    <a href="./../../">
+                        
+                        Apache Directory Project
+                        
+                    </a>
+                    &nbsp;|&nbsp;
+                    <a href="./../../apacheds">
+                        
+                        ApacheDS
+                        
+                    </a>
+                    &nbsp;|&nbsp;
+                    <a href="./../../studio">
+                        
+                        Apache Directory Studio
+                        
+                    </a>
+                    &nbsp;|&nbsp;
+                    <a href="./../../api">
+                        
+                        Apache LDAP API
+                        
+                    </a>
+                    &nbsp;|&nbsp;
+                    <a href="./../../mavibot">
+                        
+                        <STRONG>Mavibot</STRONG>
+                        
+                    </a>
+                </div><!-- subProjectsNavBar -->
+            </div><!-- header -->
+            <div id="content">
+                <div id="leftColumn">
+                    
+<div id="navigation">
+    
+    <h5>Mavibot 1.0</h5>
+    <ul>
+        <li><a href="./../../mavibot/">Home</a></li>
+        <li><a href="./../../mavibot/news.html">News</a></li>
+    </ul>
+    <h5>Downloads</h5>
+    <ul>
+	    <li><a href="./../../mavibot/downloads.html">Version 1.0.0-M2</a>&nbsp;&nbsp;<IMG src="./../../images/new_badge.gif" alt="" style="margin-bottom:-3px;" border="0"></li>
+        <li><a href="./../../mavibot/download-old-versions.html">Older versions</a></li>
+    </ul>
+    <h5>Getting Started</h5>
+    <ul>
+        <li><a href="./../../mavibot/vision.html">Vision</a></li>
+    </ul>
+    <h5>Documentation</h5>
+    <ul>
+        <li><a href="./../../mavibot/five-minutes-tutorial.html">Five minutes tutorial</a></li>
+	<li><a href="./../../mavibot/user-guide.html">User Guide</a></li>
+        <li><a href="./../../mavibot/gen-docs/latest/apidocs/">JavaDocs</a></li>
+        <!--li><a href="./../../mavibot/gen-docs/latest/">Generated Reports</a></li-->
+        <li><a href="./../../mavibot/developer-guide.html">Developer Guide</a></li>
+    </ul>
+    
+    
+    <h5>Support</h5>
+    <ul>
+        <li><a href="./../../mailing-lists-and-irc.html">Mailing Lists &amp; IRC</a></li>
+        <li><a href="./../../sources.html">Sources</a></li>
+        <li><a href="./../../issue-tracking.html">Issue Tracking</a></li>
+        <li><a href="./../../commercial-support.html">Commercial Support</a></li>
+    </ul>
+    <h5>Community</h5>
+    <ul>
+        <li><a href="./../../contribute.html">How to Contribute</a></li>
+        <li><a href="./../../team.html">Team</a></li>
+        <li><a href="./../../original-project-proposal.html">Original Project Proposal</a></li>
+        <li><a href="./../../special-thanks.html" class="external-link" rel="nofollow">Special Thanks</a></li>
+    </ul>
+    <h5>About Apache</h5>
+    <ul>
+        <li><a href="http://www.apache.org/">Apache</a></li>
+        <li><a href="http://www.apache.org/licenses/">License</a></li>
+        <li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>
+        <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+        <li><a href="http://www.apache.org/security/">Security</a></li>
+    </ul>
+    <a href="http://ldapcon.org" target="_blank" rel="nofollow"><img src="./../../images/banner-ldapcon-2013.png" alt="LDAPCon 2013" width="167" height="231"></a>
+    
+</div><!-- navigation -->
+
+                </div><!-- leftColumn -->
+                <div id="rightColumn">
+
+
+    <div class="nav">
+        <div class="nav_prev">
+        
+			&nbsp;
+        
+        </div>
+        <div class="nav_up">
+        
+            <a href="../user-guide.html">User Guide</a>
+		
+        </div>
+        <div class="nav_next">
+        
+            <a href="7.1-logical-structure.html">7.1 - Logical Structure</a>
+		
+        </div>
+        <div class="clearfix"></div>
+    </div>
+
+
+<h1 id="7-mavibot-internals">7 - Mavibot Internals</h1>
+<p>TODO</p>
+
+
+    <div class="nav">
+        <div class="nav_prev">
+        
+			&nbsp;
+        
+        </div>
+        <div class="nav_up">
+        
+            <a href="../user-guide.html">User Guide</a>
+		
+        </div>
+        <div class="nav_next">
+        
+            <a href="7.1-logical-structure.html">7.1 - Logical Structure</a>
+		
+        </div>
+        <div class="clearfix"></div>
+    </div>
+
+
+                </div><!-- rightColumn -->
+                <div id="endContent"></div>
+            </div><!-- content -->
+            <div id="footer">&copy; 2003-2012, <a href="http://www.apache.org">The Apache Software Foundation</a> - <a href="./../../privacy-policy.html">Privacy Policy</a><br />
+                Apache Directory, ApacheDS, Apache Directory Server, Apache Directory Studio, Apache LDAP API, Apache Triplesec, Triplesec, Apache Mavibot, Mavibot, Apache, the Apache feather logo, and the Apache Directory project logos are trademarks of The Apache Software Foundation.
+            </div>
+        </div><!-- container -->
+    </body>
+</html>

Added: websites/staging/directory/trunk/content/mavibot/user-guide/7.1-logical-structure.html
==============================================================================
--- websites/staging/directory/trunk/content/mavibot/user-guide/7.1-logical-structure.html (added)
+++ websites/staging/directory/trunk/content/mavibot/user-guide/7.1-logical-structure.html Sat Nov 23 18:08:10 2013
@@ -0,0 +1,227 @@
+<!DOCTYPE html>
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+<html>
+	<head>
+		<title>7.1 - Logical Structure &mdash; Apache Directory</title>
+		
+	    <link href="./../../css/common.css" rel="stylesheet" type="text/css">
+	    <link href="./../../css/turquoise.css" rel="stylesheet" type="text/css">
+    
+        
+        <link rel="shortcut icon" href="./../../images/mavibot-icon_16x16.png">
+    
+        <!-- Google Analytics -->
+        <script src="http://www.google-analytics.com/urchin.js" type="text/javascript"></script>
+        <script type="text/javascript">
+            _uacct = "UA-1358462-1";
+            urchinTracker();
+        </script>
+	</head>
+	<body>
+	    <div id="container">
+            <div id="header">
+                <div id="subProjectsNavBar">
+                    <a href="./../../">
+                        
+                        Apache Directory Project
+                        
+                    </a>
+                    &nbsp;|&nbsp;
+                    <a href="./../../apacheds">
+                        
+                        ApacheDS
+                        
+                    </a>
+                    &nbsp;|&nbsp;
+                    <a href="./../../studio">
+                        
+                        Apache Directory Studio
+                        
+                    </a>
+                    &nbsp;|&nbsp;
+                    <a href="./../../api">
+                        
+                        Apache LDAP API
+                        
+                    </a>
+                    &nbsp;|&nbsp;
+                    <a href="./../../mavibot">
+                        
+                        <STRONG>Mavibot</STRONG>
+                        
+                    </a>
+                </div><!-- subProjectsNavBar -->
+            </div><!-- header -->
+            <div id="content">
+                <div id="leftColumn">
+                    
+<div id="navigation">
+    
+    <h5>Mavibot 1.0</h5>
+    <ul>
+        <li><a href="./../../mavibot/">Home</a></li>
+        <li><a href="./../../mavibot/news.html">News</a></li>
+    </ul>
+    <h5>Downloads</h5>
+    <ul>
+	    <li><a href="./../../mavibot/downloads.html">Version 1.0.0-M2</a>&nbsp;&nbsp;<IMG src="./../../images/new_badge.gif" alt="" style="margin-bottom:-3px;" border="0"></li>
+        <li><a href="./../../mavibot/download-old-versions.html">Older versions</a></li>
+    </ul>
+    <h5>Getting Started</h5>
+    <ul>
+        <li><a href="./../../mavibot/vision.html">Vision</a></li>
+    </ul>
+    <h5>Documentation</h5>
+    <ul>
+        <li><a href="./../../mavibot/five-minutes-tutorial.html">Five minutes tutorial</a></li>
+	<li><a href="./../../mavibot/user-guide.html">User Guide</a></li>
+        <li><a href="./../../mavibot/gen-docs/latest/apidocs/">JavaDocs</a></li>
+        <!--li><a href="./../../mavibot/gen-docs/latest/">Generated Reports</a></li-->
+        <li><a href="./../../mavibot/developer-guide.html">Developer Guide</a></li>
+    </ul>
+    
+    
+    <h5>Support</h5>
+    <ul>
+        <li><a href="./../../mailing-lists-and-irc.html">Mailing Lists &amp; IRC</a></li>
+        <li><a href="./../../sources.html">Sources</a></li>
+        <li><a href="./../../issue-tracking.html">Issue Tracking</a></li>
+        <li><a href="./../../commercial-support.html">Commercial Support</a></li>
+    </ul>
+    <h5>Community</h5>
+    <ul>
+        <li><a href="./../../contribute.html">How to Contribute</a></li>
+        <li><a href="./../../team.html">Team</a></li>
+        <li><a href="./../../original-project-proposal.html">Original Project Proposal</a></li>
+        <li><a href="./../../special-thanks.html" class="external-link" rel="nofollow">Special Thanks</a></li>
+    </ul>
+    <h5>About Apache</h5>
+    <ul>
+        <li><a href="http://www.apache.org/">Apache</a></li>
+        <li><a href="http://www.apache.org/licenses/">License</a></li>
+        <li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>
+        <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+        <li><a href="http://www.apache.org/security/">Security</a></li>
+    </ul>
+    <a href="http://ldapcon.org" target="_blank" rel="nofollow"><img src="./../../images/banner-ldapcon-2013.png" alt="LDAPCon 2013" width="167" height="231"></a>
+    
+</div><!-- navigation -->
+
+                </div><!-- leftColumn -->
+                <div id="rightColumn">
+
+
+    <div class="nav">
+        <div class="nav_prev">
+        
+            <a href="7-internals.html">7 - Mavibot Internals</a>
+		
+        </div>
+        <div class="nav_up">
+        
+            <a href="7-internals.html">7 - Mavibot Internals</a>
+		
+        </div>
+        <div class="nav_next">
+        
+            <a href="7.2-physical-storage.html">7.2 - Physical storage</a>
+		
+        </div>
+        <div class="clearfix"></div>
+    </div>
+
+
+<h1 id="71-logical-structure">7.1 - Logical Structure</h1>
+<p><strong>Mavibot</strong> stores data in <em>BTree</em>s, and since it may manage many <em>BTree</em>s, we have to define the right data structure to handle that data.</p>
+<p>There are three different ways to use <strong>Mavibot</strong> :</p>
+<ul><li>using in-memory <em>BTree</em>s (IN-MEMORY)</li>
+<li>using in-memory <em>BTree</em>s stored on disk (PERSISTED)</li>
+<li>storing the <em>BTree</em>s on disk, the so-called managed <em>BTree</em>s (MANAGED)</li></ul>
+<h2 id="in-memory-btrees">In Memory BTrees</h2>
+<p>These are <em>BTree</em>s stored in memory : as soon as you quit your program, all the stored data will be lost. Their biggest advantage is speed.</p>
+<p>As <em>Mavibot</em> handles <strong>MVCC</strong> <em>BTree</em>s, you have to keep in mind that for each modification, we copy pages and values, so the <em>BTree</em>s will quickly grow and consume memory. On the other hand, copied data that is no longer in use is discarded automatically. The beauty of having a garbage collector is that we don't have to take care of this copied data : once it is no longer referenced by any object using the <em>BTree</em>, it will be reclaimed by the GC.</p>
+<p>The following schema shows the logical data structure when using an in-memory <em>BTree</em> :</p>
+<p><img alt="In-Memory BTree" src="images/InMemoryBTree.png" /></p>
+<h2 id="persistent-btrees">Persistent BTrees</h2>
+<p>A persistent <em>BTree</em> is a <em>BTree</em> which can be flushed to disk on demand. It is an in-memory <em>BTree</em>, but when you close it, the latest revision of its content is serialized to disk. You can read it back when needed.</p>
+<p>Otherwise, it does not differ from an in-memory <em>BTree</em>.</p>
+<h2 id="managed-btrees">Managed BTrees</h2>
+<p>Managed <em>BTree</em>s are very different : we keep an updated version of the <em>BTree</em> on disk after each modification. Even if the program crashes, you have the guarantee that the disk will contain everything needed to recover the <em>BTree</em> as it was just before the crash.</p>
+<p>It is important to understand that we don't keep the whole <em>BTree</em> in memory when it's managed; instead, we try to limit the elements we load in memory. In other words, there is no guarantee whatsoever that any part of the <em>BTree</em> is in memory, except the root page, which means <strong>Mavibot</strong> may have to fetch missing data from disk at any moment.</p>
+<p>Obviously, this approach has big pros and cons :</p>
+<p>Pros :</p>
+<ul><li>the number of elements you can store in your <em>BTree</em> is limited only by the available disk space</li>
+<li>your <em>BTree</em> will always be consistent, even after a crash</li>
+<li>you can stop your application and restart it, and your data is still there</li></ul>
+<p>Cons :</p>
+<ul><li>as your data may not be present in memory, fetching it from disk is costly</li>
+<li>as we have to handle missing data, accessing it requires an extra layer of accessors to deal with the fact that it may be on disk, which costs some extra memory</li></ul>
+<p>This is just a question of tradeoff : depending on your memory size and the level of robustness you want, you may decide to go for an in-memory <em>BTree</em>, a persistent <em>BTree</em> or a managed one. Most of the time, though, a managed <em>BTree</em> is what you want to use.</p>
+<p>Also note that we use an internal cache to speed up data access. This cache and its size can be configured.</p>
+<p>The following sections describe how we manage <em>BTree</em>s internally.</p>
+<h3 id="users-btrees">User's BTrees</h3>
+<p>Managed user's <em>BTree</em>s are stored using <em>Nodes</em> and <em>Leaves</em>. A <em>Node</em> contains only keys and references to underlying nodes or leaves. A <em>Leaf</em> contains keys and values. As we don't want to consume too much memory, the references to nodes, leaves, keys and values are stored as offsets, then read and translated to Java objects on demand. For instance, we keep an offset to a key until someone needs to access the key, then we deserialize this key and store it in memory. The same applies to references to nodes, leaves or values.</p>
+<p>Here is a schema describing this mechanism :</p>
+<p><img alt="Managed references" src="images/managedReferences.png" /></p>
+<p>In this schema, we have only loaded two pages in memory : the node and one leaf. In these pages, the keys aren't yet objects : we are pointing to the page's raw data, except for the <strong>D</strong> key, which is already a Java object (it has been deserialized). The same holds for the references to the leaves : we have only loaded and deserialized one single leaf, the one containing the value <strong>D</strong>. In this leaf, the keys aren't deserialized except the <strong>D</strong> key, and the only value which is a Java instance is the deserialized <strong>vD</strong> value.</p>
+<p>So each element is an instance of an encapsulating object which contains the offset of the serialized element in a byte[], and the deserialized value if the value has already been accessed.</p>
+<h3 id="special-btrees">Special BTrees</h3>
+<p>We use two special <em>BTree</em>s to manage the revisions and the copied pages. The following sections explain what they are used for.</p>
+<h4 id="revision-tree">Revision tree</h4>
+<p>We need to keep track of each active revision, so that a search can work with a specific revision. The idea is that when a search starts, it uses the latest revision, but as modifications can occur while the search is being processed, new revisions will be added. In some cases, we may want to keep a revision active for quite a long time.</p>
+<p>So we store the active revisions in a dedicated <em>BTree</em>.</p>
+<p>As we may have many <em>BTree</em>s, we have to use a key which is a combination of the <em>BTree</em> name and its revision. So the revision <em>BTree</em> manages the revisions of all the managed <em>BTree</em>s.</p>
+<p>When a revision is no longer used, we can remove it from the revision <em>BTree</em>.</p>
+<p>Unlike the other ones, this <em>BTree</em> is not an <strong>MVCC</strong> <em>BTree</em>. In other words, we only keep its latest revision (i.e., all the modified pages are immediately freed).</p>
+<h4 id="copied-pages-btree">Copied pages BTree</h4>
+<p>Once we create a new revision, the pages we copied are no longer in use unless the revisions they are associated with are still in use. The problem is that we can't discard those pages and move them to the free list until the associated revision is released.</p>
+<p>We use a dedicated <em>BTree</em> to keep track of the copied pages, which will be reclaimed and moved to the free pages list once the associated revision is released.</p>
+<h3 id="managing-the-free-pages">Managing the free pages</h3>
+<p>We have a mechanism to manage the <em>PageIO</em>s that are no longer in use : a linked list to which the free pages are added. If we need some pages, we first look into this list and take as many <em>PageIO</em>s as we need, until we reach the end of the list. If we free some pages, we add them at the end of the free list.</p>
+<p>We always free a logical page, which may be stored in many <em>PageIO</em>s. The good thing is that those <em>PageIO</em>s are already linked, so we just need to make the last free <em>PageIO</em> point to the first freed <em>PageIO</em>, and move the pointer to the last free page to the last <em>PageIO</em> used to store the logical page.</p>
+
+
+    <div class="nav">
+        <div class="nav_prev">
+        
+            <a href="7-internals.html">7 - Mavibot Internals</a>
+		
+        </div>
+        <div class="nav_up">
+        
+            <a href="7-internals.html">7 - Mavibot Internals</a>
+		
+        </div>
+        <div class="nav_next">
+        
+            <a href="7.2-physical-storage.html">7.2 - Physical storage</a>
+		
+        </div>
+        <div class="clearfix"></div>
+    </div>
+
+
+                </div><!-- rightColumn -->
+                <div id="endContent"></div>
+            </div><!-- content -->
+            <div id="footer">&copy; 2003-2012, <a href="http://www.apache.org">The Apache Software Foundation</a> - <a href="./../../privacy-policy.html">Privacy Policy</a><br />
+                Apache Directory, ApacheDS, Apache Directory Server, Apache Directory Studio, Apache LDAP API, Apache Triplesec, Triplesec, Apache Mavibot, Mavibot, Apache, the Apache feather logo, and the Apache Directory project logos are trademarks of The Apache Software Foundation.
+            </div>
+        </div><!-- container -->
+    </body>
+</html>
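
Note on the "User's BTrees" section of 7.1-logical-structure.html above: the encapsulating object it describes (an offset into a byte[] plus the deserialized value once it has been accessed) could look roughly like the sketch below. This is only an illustration under assumptions, not Mavibot's actual code: the class name LazyElement, the explicit length field, the Function-based deserializer and the readUtf8 helper are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

/**
 * Hypothetical holder for one key, value or reference of a managed page.
 * It keeps only the position of the serialized bytes until the element
 * is accessed for the first time.
 */
final class LazyElement<T> {
    private final byte[] page;      // raw page data, shared by all elements of the page
    private final int offset;       // start of the serialized element in the page
    private final int length;       // size of the serialized element (assumed to be known)
    private final Function<ByteBuffer, T> deserializer;
    private T value;                // stays null until the first call to get()
    private boolean loaded;

    LazyElement(byte[] page, int offset, int length, Function<ByteBuffer, T> deserializer) {
        this.page = page;
        this.offset = offset;
        this.length = length;
        this.deserializer = deserializer;
    }

    /** Tells whether the element has already been turned into a Java object. */
    boolean isLoaded() {
        return loaded;
    }

    /** Deserializes the element on first access, then returns the cached object. */
    T get() {
        if (!loaded) {
            value = deserializer.apply(ByteBuffer.wrap(page, offset, length));
            loaded = true;
        }
        return value;
    }

    /** Illustrative deserializer: reads the slice as a UTF-8 string. */
    static String readUtf8(ByteBuffer slice) {
        return StandardCharsets.UTF_8.decode(slice).toString();
    }
}
```

With a page holding the bytes of "ABCD", new LazyElement<>(page, 3, 1, LazyElement::readUtf8) stays a bare offset until get() is called, which matches the D key in the schema being the only deserialized key.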

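Note on the "Managing the free pages" section of 7.1-logical-structure.html above: the free list it describes can be sketched as a singly linked list with a pointer to its first and last element. Again, this is a minimal sketch under assumptions and not the real implementation: PageIo and FreePageList are hypothetical in-memory stand-ins, and returning null when the list is empty is a choice made here (the text only says that we read the list until we reach its end).

```java
/** Hypothetical in-memory model of a PageIO: an offset plus a link to the next PageIO. */
final class PageIo {
    final long offset;
    PageIo next;

    PageIo(long offset) {
        this.offset = offset;
    }
}

/** Hypothetical free list: pages are taken from the head and freed pages are appended at the tail. */
final class FreePageList {
    private PageIo first;   // first free PageIO, null when the list is empty
    private PageIo last;    // last free PageIO, null when the list is empty

    /**
     * Frees a logical page whose PageIOs are already chained from
     * firstFreed to lastFreed: only two pointers have to change.
     */
    void free(PageIo firstFreed, PageIo lastFreed) {
        lastFreed.next = null;
        if (last == null) {
            first = firstFreed;         // the list was empty
        } else {
            last.next = firstFreed;     // the last free PageIO now points to the first freed one
        }
        last = lastFreed;               // the tail moves to the end of the freed chain
    }

    /** Takes one PageIO from the head of the list, or returns null if the list is empty. */
    PageIo allocate() {
        if (first == null) {
            return null;
        }
        PageIo page = first;
        first = page.next;
        if (first == null) {
            last = null;
        }
        page.next = null;
        return page;
    }
}
```

Freeing a logical page costs the same whether it spans one PageIO or many, since the chain is attached as a whole.
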
Added: websites/staging/directory/trunk/content/mavibot/user-guide/7.2-physical-storage.html
==============================================================================
--- websites/staging/directory/trunk/content/mavibot/user-guide/7.2-physical-storage.html (added)
+++ websites/staging/directory/trunk/content/mavibot/user-guide/7.2-physical-storage.html Sat Nov 23 18:08:10 2013
@@ -0,0 +1,261 @@
+<!DOCTYPE html>
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one or more
+    contributor license agreements.  See the NOTICE file distributed with
+    this work for additional information regarding copyright ownership.
+    The ASF licenses this file to You under the Apache License, Version 2.0
+    (the "License"); you may not use this file except in compliance with
+    the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing, software
+    distributed under the License is distributed on an "AS IS" BASIS,
+    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+    See the License for the specific language governing permissions and
+    limitations under the License.
+-->
+<html>
+	<head>
+		<title>7.2 - Physical storage &mdash; Apache Directory</title>
+		
+	    <link href="./../../css/common.css" rel="stylesheet" type="text/css">
+	    <link href="./../../css/turquoise.css" rel="stylesheet" type="text/css">
+    
+        
+        <link rel="shortcut icon" href="./../../images/mavibot-icon_16x16.png">
+    
+        <!-- Google Analytics -->
+        <script src="http://www.google-analytics.com/urchin.js" type="text/javascript"></script>
+        <script type="text/javascript">
+            _uacct = "UA-1358462-1";
+            urchinTracker();
+        </script>
+	</head>
+	<body>
+	    <div id="container">
+            <div id="header">
+                <div id="subProjectsNavBar">
+                    <a href="./../../">
+                        
+                        Apache Directory Project
+                        
+                    </a>
+                    &nbsp;|&nbsp;
+                    <a href="./../../apacheds">
+                        
+                        ApacheDS
+                        
+                    </a>
+                    &nbsp;|&nbsp;
+                    <a href="./../../studio">
+                        
+                        Apache Directory Studio
+                        
+                    </a>
+                    &nbsp;|&nbsp;
+                    <a href="./../../api">
+                        
+                        Apache LDAP API
+                        
+                    </a>
+                    &nbsp;|&nbsp;
+                    <a href="./../../mavibot">
+                        
+                        <STRONG>Mavibot</STRONG>
+                        
+                    </a>
+                </div><!-- subProjectsNavBar -->
+            </div><!-- header -->
+            <div id="content">
+                <div id="leftColumn">
+                    
+<div id="navigation">
+    
+    <h5>Mavibot 1.0</h5>
+    <ul>
+        <li><a href="./../../mavibot/">Home</a></li>
+        <li><a href="./../../mavibot/news.html">News</a></li>
+    </ul>
+    <h5>Downloads</h5>
+    <ul>
+	    <li><a href="./../../mavibot/downloads.html">Version 1.0.0-M2</a>&nbsp;&nbsp;<IMG src="./../../images/new_badge.gif" alt="" style="margin-bottom:-3px;" border="0"></li>
+        <li><a href="./../../mavibot/download-old-versions.html">Older versions</a></li>
+    </ul>
+    <h5>Getting Started</h5>
+    <ul>
+        <li><a href="./../../mavibot/vision.html">Vision</a></li>
+    </ul>
+    <h5>Documentation</h5>
+    <ul>
+        <li><a href="./../../mavibot/five-minutes-tutorial.html">Five minutes tutorial</a></li>
+	<li><a href="./../../mavibot/user-guide.html">User Guide</a></li>
+        <li><a href="./../../mavibot/gen-docs/latest/apidocs/">JavaDocs</a></li>
+        <!--li><a href="./../../mavibot/gen-docs/latest/">Generated Reports</a></li-->
+        <li><a href="./../../mavibot/developer-guide.html">Developer Guide</a></li>
+    </ul>
+    
+    
+    <h5>Support</h5>
+    <ul>
+        <li><a href="./../../mailing-lists-and-irc.html">Mailing Lists &amp; IRC</a></li>
+        <li><a href="./../../sources.html">Sources</a></li>
+        <li><a href="./../../issue-tracking.html">Issue Tracking</a></li>
+        <li><a href="./../../commercial-support.html">Commercial Support</a></li>
+    </ul>
+    <h5>Community</h5>
+    <ul>
+        <li><a href="./../../contribute.html">How to Contribute</a></li>
+        <li><a href="./../../team.html">Team</a></li>
+        <li><a href="./../../original-project-proposal.html">Original Project Proposal</a></li>
+        <li><a href="./../../special-thanks.html" class="external-link" rel="nofollow">Special Thanks</a></li>
+    </ul>
+    <h5>About Apache</h5>
+    <ul>
+        <li><a href="http://www.apache.org/">Apache</a></li>
+        <li><a href="http://www.apache.org/licenses/">License</a></li>
+        <li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>
+        <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+        <li><a href="http://www.apache.org/security/">Security</a></li>
+    </ul>
+    <a href="http://ldapcon.org" target="_blank" rel="nofollow"><img src="./../../images/banner-ldapcon-2013.png" alt="LDAPCon 2013" width="167" height="231"></a>
+    
+</div><!-- navigation -->
+
+                </div><!-- leftColumn -->
+                <div id="rightColumn">
+
+
+    <div class="nav">
+        <div class="nav_prev">
+        
+            <a href="7.1-logical-structure.html">7.1 - Logical Structure</a>
+		
+        </div>
+        <div class="nav_up">
+        
+            <a href="../user-guide.html">User Guide</a>
+		
+        </div>
+        <div class="nav_next">
+        
+			&nbsp;
+        
+        </div>
+        <div class="clearfix"></div>
+    </div>
+
+
+<h1 id="72-physical-storage">7.2 - Physical storage</h1>
+<p>When associated with a RecordManager, Mavibot stores all the BTrees in a single file, which is split into many physical pages of the same size.</p>
+<blockquote>
+<p><strong>Note</strong>
+Currently, a single size is used for all the pages, regardless of the data we store in them. The rationale is to
+get close to the OS page size (frequently 512 bytes or 4096 bytes). This is not necessarily the best choice, though, and
+it is something we might want to change later.</p>
+</blockquote>
+<h2 id="general-file-structure">General file structure</h2>
+<p>The data is stored in a plain binary file, and a single file can hold many BTrees.</p>
+<p>This file is treated as a file system with fixed-size 'pages' (a page is an array of bytes). The page size is chosen arbitrarily and fixed when the RecordManager is created. Every piece of logical data is stored in those physical pages, which in most cases means spreading it across several pages.</p>
+<h3 id="pageio">PageIO</h3>
+<p>Let's first introduce the <em>PageIO</em>, which is used to store the data on disk.</p>
+<p>A <em>PageIO</em> contains raw data. As the logical data we have to map may be larger than a fixed-size physical <em>PageIO</em>, we may need more than one <em>PageIO</em> to store it, and we link those <em>PageIO</em>s together.</p>
+<p>Each <em>PageIO</em> starts with an eight-byte pointer to the next <em>PageIO</em> (or to nothing, if there is no further <em>PageIO</em> in the chain). The first <em>PageIO</em> of a chain carries an extra 4 bytes that define the number of bytes stored in the whole chain. Here is the mapping between a logical page and some <em>PageIO</em>s:</p>
+<p><img alt="PageIO mapping" src="images/PageIOLogical.png" /></p>
+<p>All the <em>PageIO</em>s are laid out contiguously on disk, but the <em>PageIO</em>s used to store a given logical page may be located anywhere in the file; they do not have to be adjacent.</p>
+<p>Here is the structure of a <em>PageIO</em> on disk:</p>
+<ul>
+<li>next page offset (8 bytes): the offset of the next <em>PageIO</em>, or -1L if no more <em>PageIO</em> is needed</li>
+<li>data size (4 bytes): present in the first <em>PageIO</em> only, the size of the data stored across all the <em>PageIO</em>s used to store a page</li>
+<li>data (N bytes): a block of data, whose size is min(PageSize - 8 - 4, data size) for the first <em>PageIO</em>, where 8 and 4 are the sizes of the next page offset and data size fields, or min(PageSize - 8, remaining data size) for any other <em>PageIO</em></li>
+</ul>
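+<p>The layout above determines how many <em>PageIO</em>s a logical page needs. The following is a minimal sketch of that arithmetic, not Mavibot's actual code: the class and method names are hypothetical, and only the 8-byte link and the 4-byte data size come from the description above.</p>

```java
// Hypothetical helper, not part of the Mavibot API.
final class PageIoChainSketch {
    static final int LINK_SIZE = 8;   // next-PageIO offset, present in every PageIO
    static final int LENGTH_SIZE = 4; // data size, present in the first PageIO only

    // Number of PageIOs needed to hold dataSize bytes with the given page size.
    static int pageIosNeeded(int pageSize, int dataSize) {
        int firstCapacity = pageSize - LINK_SIZE - LENGTH_SIZE;
        if (dataSize <= firstCapacity) {
            return 1; // even an empty logical page occupies one PageIO
        }
        int otherCapacity = pageSize - LINK_SIZE;
        int remaining = dataSize - firstCapacity;
        // Ceiling division over the capacity of the following PageIOs.
        return 1 + (remaining + otherCapacity - 1) / otherCapacity;
    }
}
```

+<p>With 512-byte pages, the first <em>PageIO</em> of a chain holds up to 500 bytes of data and each following one up to 504 bytes.</p>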
+<h2 id="logical-structure-mapping-on-disk">Logical structure mapping on disk</h2>
+<p>We will now describe how each logical structure is serialized on disk.</p>
+<h3 id="recordmanager-header">RecordManager header</h3>
+<p>We keep a few bytes at the beginning of the file to store some critical information about the RecordManager. Here is the list of stored information:</p>
+<ul>
+<li>The <em>PageIO</em> size (in bytes)</li>
+<li>The number of managed BTrees</li>
+<li>The offset of the first free page</li>
+<li>The offset of the last free page</li>
+</ul>
+<p>Here is a picture that shows the header content:</p>
+<p><img alt="RecordManager header" src="images/RMHEader.png" /></p>
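+<p>As an illustration, the four header fields can be written and read back with a <code>java.nio.ByteBuffer</code>. This is only a sketch under assumptions: the class name is hypothetical, and the field widths (4-byte ints for the page size and the number of BTrees, 8-byte longs for the two offsets) and their order are assumed rather than taken from the Mavibot sources.</p>

```java
import java.nio.ByteBuffer;

// Hypothetical model of the RecordManager header; field widths are assumptions.
final class RmHeaderSketch {
    static final long NO_PAGE = -1L; // marks "no free page"

    final int pageSize;
    final int nbBtrees;
    final long firstFreePage;
    final long lastFreePage;

    RmHeaderSketch(int pageSize, int nbBtrees, long firstFreePage, long lastFreePage) {
        this.pageSize = pageSize;
        this.nbBtrees = nbBtrees;
        this.firstFreePage = firstFreePage;
        this.lastFreePage = lastFreePage;
    }

    // Serializes the four fields in the order they are listed above.
    ByteBuffer write() {
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + 8 + 8);
        buf.putInt(pageSize).putInt(nbBtrees).putLong(firstFreePage).putLong(lastFreePage);
        buf.flip();
        return buf;
    }

    // Reads the fields back in the same order.
    static RmHeaderSketch read(ByteBuffer buf) {
        return new RmHeaderSketch(buf.getInt(), buf.getInt(), buf.getLong(), buf.getLong());
    }
}
```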
+<p>We keep track of the free pages (a free page is a PageIO that is no longer used, for instance because its data has been deleted). This is done by linking the free PageIOs to one another, and by pointing to the first free PageIO and to the last free PageIO of this list.</p>
+<blockquote>
+<p><strong>Note</strong> We might get rid of the last free page offset.</p>
+</blockquote>
+<p>At startup, of course, we have no free pages, and those pointers contain the -1 offset.</p>
+<p>This header is stored in a <em>PageIO</em>, at the very beginning of the file.</p>
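+<p>The free-page list described above can be modelled in memory as follows. This is a hedged sketch: the class is hypothetical, a map stands in for the link stored in each free PageIO, and the policy of appending released pages at the tail and reusing them from the head is an assumption, not something this page specifies.</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the free-page list; not Mavibot code.
final class FreePageListSketch {
    static final long NO_PAGE = -1L;

    // Stands in for the 8-byte "next" link stored inside each free PageIO.
    private final Map<Long, Long> nextLink = new HashMap<>();
    long firstFree = NO_PAGE; // at startup there is no free page
    long lastFree = NO_PAGE;

    // Appends a released PageIO at the end of the list (assumed policy).
    void release(long offset) {
        nextLink.put(offset, NO_PAGE);
        if (firstFree == NO_PAGE) {
            firstFree = offset;
        } else {
            nextLink.put(lastFree, offset);
        }
        lastFree = offset;
    }

    // Takes a PageIO from the head of the list, or returns NO_PAGE if it is empty.
    long reuse() {
        if (firstFree == NO_PAGE) {
            return NO_PAGE;
        }
        long offset = firstFree;
        firstFree = nextLink.remove(offset);
        if (firstFree == NO_PAGE) {
            lastFree = NO_PAGE;
        }
        return offset;
    }
}
```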
+<h3 id="the-recordmanager-structure">The RecordManager structure</h3>
+<p>The <em>RecordManager</em> manages <em>BTree</em>s, and we have to store them into <em>PageIO</em>s. How do we do that?</p>
+<p>Every <em>BTree</em> has a header that contains information about it and points to a <em>rootPage</em>, which is the current root (that is, the root for the latest revision). As a <em>RecordManager</em> can manage more than one <em>BTree</em>, we have to find a way to retrieve all the <em>BTree</em>s at startup: we use an internal link, so that each <em>BTree</em> points to the next one. At startup, we read the first <em>BTree</em>, which is stored in the second <em>PageIO</em> in the file (so just after the RecordManager header), then we read the next <em>BTree</em> pointed to by the first <em>BTree</em>, and so on.</p>
+<h4 id="the-btree-header">The BTree header</h4>
+<p>Each <em>BTree</em> has to keep a lot of information so that it can be used:</p>
+<ul>
+<li>revision (8 bytes): the current revision for this <em>BTree</em>. This value is updated after each modification in the <em>BTree</em>.</li>
+<li>nbElems (8 bytes): the total number of elements we have in the <em>BTree</em>. This is also updated after each modification.</li>
+<li>rootPage offset (8 bytes): the position in the file where the <em>rootPage</em> is stored</li>
+<li>nextBtree offset (8 bytes): the position of the next <em>BTree</em> header in the file (or -1 if we don't have any other <em>BTree</em>)</li>
+<li>pageSize (4 bytes): the number of elements we can store in a <em>Node</em> or a <em>Leaf</em>. It is unrelated to the <em>PageIO</em> size.</li>
+<li>nameSize (4 bytes): the <em>BTree</em> name size</li>
+<li>name (nameSize bytes): the <em>BTree</em> name</li>
+<li>keySerializerSize (4 bytes): the size of the Java <em>FQCN</em> for the key serializer</li>
+<li>keySerializer (keySerializerSize bytes): the Java <em>FQCN</em> for the key serializer</li>
+<li>valueSerializerSize (4 bytes): the size of the Java <em>FQCN</em> for the value serializer</li>
+<li>valueSerializer (valueSerializerSize bytes): the Java <em>FQCN</em> for the value serializer</li>
+<li>dupsAllowed (1 byte): tells if the <em>BTree</em> can have duplicate values</li>
+</ul>
+<p>As we can see, this header has a variable length, and if one of the names is long, we may need more than one <em>PageIO</em> to store it.</p>
+<p>Here is a diagram which presents this data structure on disk:</p>
+<p><img alt="BTreeHeader header" src="images/btreeHeader.png" /></p>
+<p>Note that a <em>BTree</em> header can be stored on one or many <em>PageIO</em>s, depending on its size.</p>
+<p>All in all, when we have more than one <em>BTree</em> stored in the file, the part of the file which stores the <em>BTree</em> headers will look like this:</p>
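+<p>To make the variable length of the <em>BTree</em> header concrete, here is a sketch that serializes the fields in the order listed above. The class and method names are hypothetical, and encoding the name and the two <em>FQCN</em>s as UTF-8 is an assumption; only the field order and widths come from the list.</p>

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical serializer for the BTree header fields; not Mavibot code.
final class BTreeHeaderSketch {
    static ByteBuffer serialize(long revision, long nbElems, long rootPageOffset,
            long nextBtreeOffset, int pageSize, String name, String keySerializer,
            String valueSerializer, boolean dupsAllowed) {
        // UTF-8 is an assumption; the text only gives the sizes in bytes.
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        byte[] keyBytes = keySerializer.getBytes(StandardCharsets.UTF_8);
        byte[] valueBytes = valueSerializer.getBytes(StandardCharsets.UTF_8);

        // Four longs, pageSize, three length-prefixed strings, one flag byte.
        int size = 4 * 8 + 4
                + 4 + nameBytes.length
                + 4 + keyBytes.length
                + 4 + valueBytes.length
                + 1;

        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putLong(revision).putLong(nbElems).putLong(rootPageOffset).putLong(nextBtreeOffset);
        buf.putInt(pageSize);
        buf.putInt(nameBytes.length).put(nameBytes);
        buf.putInt(keyBytes.length).put(keyBytes);
        buf.putInt(valueBytes.length).put(valueBytes);
        buf.put((byte) (dupsAllowed ? 1 : 0));
        buf.flip();
        return buf;
    }
}
```

+<p>The fixed part takes 49 bytes (four longs, the pageSize, three length prefixes and the flag); everything beyond that depends on the three names.</p>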
+<p><img alt="BTrees" src="images/BTree.png" /></p>
+<p>Note that each <em>BTreeHeader</em> has at least one root page, even if it contains no data. In this schema, we show the root page just after the <em>BTree</em> it is associated with, but after a few updates, the root page may well be stored elsewhere on the disk.</p>
+<h4 id="the-nodes-and-leaves">The Nodes and Leaves</h4>
+<p>Nodes and Leaves are logical <em>BTree</em> pages which are serialized on disk into one or more <em>PageIO</em>s. They have slightly different data structures, as <em>Node</em>s contain pointers to child pages and no data, while <em>Leaves</em> contain data. In any case, both contain the keys. The <em>Node</em> also has one more value than the <em>Leaf</em>.</p>
+<p>On disk, each <em>Node</em> and <em>Leaf</em> is stored in <em>PageIO</em>s, as we said. A <em>Node</em> has pointers to other logical pages, and on disk, each of those pointers is the offset of the first <em>PageIO</em> used to store the logical page it points to.</p>
+<p>Here are the <em>Node</em> and <em>Leaf</em> data structures once serialized:</p>
+<p><img alt="Node and Leaf" src="images/nodeLeaf.png" /></p>
+<p>Note that we store the size of the serialized data : this is necessary as we have to know how many <em>PageIO</em>s will be needed to store the logical page.</p>
+<p>The <em>rootPage</em> is just a <em>Node</em> or a <em>Leaf</em>.</p>
+<h4 id="potential-improvement">Potential improvement</h4>
+<p>We could get better performance by serializing the data differently. Instead of storing keys and values as byte arrays prefixed by their length, we could store an array of the keys' and values' offsets before the associated byte[]. Here is the resulting data structure, once serialized:</p>
+<p><img alt="Node and Leaf, improved" src="images/nodeLeaf2.png" /></p>
+<p>(The <em>Node</em> is not described, as it's basically the same data structure, but with one extra value).</p>
+<p>It does not need more space to serialize the data this way, as the offsets are ints, and in the previous version, those ints are used to store the length of the keys and values anyway.</p>
+<p>The gain is that we can have access to a given key and value without having to read all the previous keys and values. Also we can now read a leaf or a node without having to deserialize all the keys and values they contain.</p>
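+<p>The direct access this layout allows can be sketched as follows. This is an illustration under assumptions, not the format Mavibot uses: the class is hypothetical, the buffer starts with an element count, and the offsets are relative to the start of the data area.</p>

```java
import java.nio.ByteBuffer;
import java.util.List;

// Hypothetical offset-table layout: [count][offset 0..n-1][item bytes...].
final class OffsetTableSketch {
    static ByteBuffer encode(List<byte[]> items) {
        int dataLength = 0;
        for (byte[] item : items) {
            dataLength += item.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 * items.size() + dataLength);
        buf.putInt(items.size());
        int offset = 0;
        for (byte[] item : items) {
            buf.putInt(offset); // one int per item, as with length prefixes
            offset += item.length;
        }
        for (byte[] item : items) {
            buf.put(item);
        }
        buf.flip();
        return buf;
    }

    // Reads one item without touching the bytes of the items before it.
    static byte[] read(ByteBuffer buf, int index) {
        int count = buf.getInt(0);
        int dataStart = 4 + 4 * count;
        int start = buf.getInt(4 + 4 * index);
        // The last item ends where the data area ends.
        int end = (index + 1 < count)
                ? buf.getInt(4 + 4 * (index + 1))
                : buf.limit() - dataStart;
        byte[] out = new byte[end - start];
        ByteBuffer view = buf.duplicate();
        view.position(dataStart + start);
        view.get(out);
        return out;
    }
}
```

+<p>With length prefixes, reaching the same item would mean reading every preceding length in turn; here it takes two reads in the offset table.</p>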
+<h2 id="page-serialization">Page serialization</h2>
+<p>We serialize <em>Node</em> and <em>Leaf</em> differently on disk, as seen in a previous paragraph.</p>
+
+
+    <div class="nav">
+        <div class="nav_prev">
+        
+            <a href="7.1-logical-structure.html">7.1 - Logical Structure</a>
+		
+        </div>
+        <div class="nav_up">
+        
+            <a href="../user-guide.html">User Guide</a>
+		
+        </div>
+        <div class="nav_next">
+        
+			&nbsp;
+        
+        </div>
+        <div class="clearfix"></div>
+    </div>
+
+
+                </div><!-- rightColumn -->
+                <div id="endContent"></div>
+            </div><!-- content -->
+            <div id="footer">&copy; 2003-2012, <a href="http://www.apache.org">The Apache Software Foundation</a> - <a href="./../../privacy-policy.html">Privacy Policy</a><br />
+                Apache Directory, ApacheDS, Apache Directory Server, Apache Directory Studio, Apache LDAP API, Apache Triplesec, Triplesec, Apache Mavibot, Mavibot, Apache, the Apache feather logo, and the Apache Directory project logos are trademarks of The Apache Software Foundation.
+            </div>
+        </div><!-- container -->
+    </body>
+</html>


