directory-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
Subject svn commit: r1544851 - in /directory/site/trunk/content/mavibot/user-guide: 1-introduction.mdtext 2-internals.mdtext 2.1-logical-structure.mdtext 2.2-physical-storage.mdtext 7-internals.mdtext 7.1-logical-structure.mdtext 7.2-physical-storage.mdtext
Date Sat, 23 Nov 2013 18:07:55 GMT
Author: elecharny
Date: Sat Nov 23 18:07:54 2013
New Revision: 1544851

Added some pages 

      - copied, changed from r1539289, directory/site/trunk/content/mavibot/user-guide/2-internals.mdtext
      - copied, changed from r1539289, directory/site/trunk/content/mavibot/user-guide/2.2-physical-storage.mdtext

Modified: directory/site/trunk/content/mavibot/user-guide/1-introduction.mdtext
--- directory/site/trunk/content/mavibot/user-guide/1-introduction.mdtext (original)
+++ directory/site/trunk/content/mavibot/user-guide/1-introduction.mdtext Sat Nov 23 18:07:54
@@ -30,5 +30,53 @@ We hope it will be enough for you to qui
 ## Contents
-*  [1.1 - .html)
+*  [1.1 - BTree basics](1.1-ug-btree-basics.html)
+*  [2 - BTree types](2-btree-types.html)
+  * In-Memory
+  * Persistent
+  * Managed
+*  [3 - BTree management](3-btree-management.html)
+  * creation
+  * close
+  * flush
+  * load
+*  [4 - BTree operations](4-btree-operations.html)
+  * browse
+  * contains
+  * delete
+  * get
+  * getValues
+  * hasKey
+  * insert
+  * getRevision
+*  [5 - BTree information](5-btree-informations.html)
+  * getComparator
+  * getFile
+  * getJournal
+  * getNbElems
+  * isAllowDuplicates
+  * isInMemory
+  * isPersistent
+*  [6 - BTree configuration](6-btree-configuration.html)
+  * getKeySerializer
+  * getKeySerializerFQCN
+  * setKeySerializer
+  * getName
+  * setName
+  * getPageSize
+  * setPageSize
+  * getReadTimeOut
+  * setReadTimeOut
+  * getValueSerializer
+  * getValueSerializerFQCN
+  * setValueSerializer
+  * getWriteBufferSize
+  * setWriteBufferSize
+* [7 - BTree internals](7-btree-internals.html)
+  * [7.1 - Logical Structure](7.1-logical-structure.html)
+  * [7.2 - Physical Structure](7.1-physical-structure.html)

Copied: directory/site/trunk/content/mavibot/user-guide/7-internals.mdtext (from r1539289,
--- directory/site/trunk/content/mavibot/user-guide/2-internals.mdtext (original)
+++ directory/site/trunk/content/mavibot/user-guide/7-internals.mdtext Sat Nov 23 18:07:54
@@ -1,8 +1,8 @@
-Title: 1 - Introduction
+Title: 7 - Mavibot Internals
 NavUp: ../user-guide.html
 NavUpText: User Guide
-NavNext: 2-.html
-NavNextText: 2 - 
+NavNext: 7.1-logical-structure.html
+NavNextText: 7.1 - Logical Structure
 Notice: Licensed to the Apache Software Foundation (ASF) under one
     or more contributor license agreements.  See the NOTICE file
     distributed with this work for additional information
@@ -20,15 +20,6 @@ Notice: Licensed to the Apache Software 
     specific language governing permissions and limitations
     under the License.
-# 1 - Introduction
-This Mavibot Guide goal is to provide some clue for any developer wanting to use Mavibot.

-It also gives some insights about how Mavibot is built and how it works.
-We hope it will be enough for you to quickly get started, but in any case, if you feel like
improving this document, feel free to post your suggestion to the Apache Directory mailing
list : any contribution is welcomed !
-## Contents
-*  [1.1 - .html)
+# 7 - Mavibot Internals
\ No newline at end of file

Added: directory/site/trunk/content/mavibot/user-guide/7.1-logical-structure.mdtext
--- directory/site/trunk/content/mavibot/user-guide/7.1-logical-structure.mdtext (added)
+++ directory/site/trunk/content/mavibot/user-guide/7.1-logical-structure.mdtext Sat Nov 23
18:07:54 2013
@@ -0,0 +1,115 @@
+Title: 7.1 - Logical Structure
+NavUp: 7-internals.html
+NavUpText: 7 - Mavibot Internals
+NavNext: 7.2-physical-storage.html
+NavNextText: 7.2 - Physical storage
+NavPrev: 7-internals.html
+NavPrevText: 7 - Mavibot Internals
+Notice: Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+    .
+    .
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+# 7.1 - Logical Structure
+**Mavibot** stores data into *BTree*s, and we may manage many *BTree*s, so we have to define
the right data structure to handle those data.
+We can have three different ways to use **Mavibot** :
+* using in-memory *BTree*s (IN-MEMORY)
+* using in-memory *BTree*s stored on disk (PERSISTED)
+* storing the *BTree*s on disk (so called managed *BTree*s) (MANAGED)
+## In Memory BTrees
+They are *BTree*s stored in memory : as soon as you quit your program, all the stored data
will bo lost. The biggest advantage is that it's fast.
+As *Mavibot* is handling **MVCC** *BTree*s, you have to keep in maind that for each modification,
we copy pages and values, so the *BTree*s will quickly grow and eat the memory. On the other
hand, copied data which are not anymore in use will be discarded automatically. The beauty
of having a garbage collector is that we don't have to take care of those copied data : if
they are not any more referenced by any objects using the *BTree*, they will be reclaimed
by the GC.
+The following schema shows what is the logical data structure whe using a in memory *BTree*
+![In-Memory BTree](images/InMemoryBTree.png)
+## Persistent BTrees
+A persistent *BTree* is a *BTree* which can be flushed on disk on demand. The *BTree* is
a in-Memory *BTree*, but when you close it, all of its content latest revision is serialized
on disk. You can re-read it when needed.
+Otherwise, there is no difference with an in-memory *BTree*
+## Managed BTrees
+Managed *BTree*s are very different : we will keep a updated version of the *BTree* on disk
after each modifciation. even if the program crashes, you have the guarantee that the disk
will contain everything needed to recover the *BTree* as it was just before the crash.
+This is important to understand that we don't keep all the *BTree* in memory when it's managed,
but instead we try to limit the elements we load in memory. In other words, there is no guarantee
whatsoever that you will have any pat of the *BTree* in memory, except the root page, so that
means **Mavibot** may have to fetch some missing data from disk at any moment.
+Obviously this approach have big pros and cons :
+Pros :
+* there is no limit but the available disk you have to the number of elements you can store
in your *BTree*
+* your *BTree* will always be consistent, even if you have a crash
+* you can stop your application and restart it, your data are still around
+Cons :
+* as your data may not be present in memory, it cost a lot to fetch them from disk
+* as we have to take care of missing data, accessing them requires an extra layer of accessor
to deal wth the fact they may be on disk, costing some extra memory
+Here, this is just a question of tradeoff : depending on your memory size, and the level
of robustness you want, you may decide to go for a in-memory *BTree*, a persistent *BTree*
or a managed one. Most of the time, though, managed *BTree* is what you want to use.
+Also note that we use internal cache to speed up the data access. This cache and its size
can be configured.
+We will see how we manage *BTree*s internally.
+### User's BTrees
+Managed user's *BTree*s are stored using *Nodes* and *Leaves*. A *Node* contains only keys
are references to underlaying nodes or leaves. A *Leaf* contans keys and values. As we don't
want to eat too much memory, the references to nodes, meaves, keys and values are stored as
offset, read and translated to java objects on demand. For instance, we keep an offset to
a key until someone needs to access the key, then we deserialize this key and store it in
memory. This is the very same for references to nodes, leaves or values.
+Here is a schema describing this mechanism :
+![Managed references](images/managedReferences.png)
+In this schema, we have only loaded two pages in memory : the node and one leaf. In these
pages, the keys aren't yet objects, we are pointing to the page's raw data, except for the
**D** key which is already a Java Object (it has been deserialized). The very same for the
references to the leaves : we have only loaded and deserialized one single leaf, the one containing
the value **D**. In this leaf, the keys aren't deserialized except the **D** key, and the
only value which is a Java instance is the deserialized **vD** value.
+So each elements is an instance of an encapsulating object which contains the offset of the
serialized element in a byte[], and the deserialized value if the value has already been accessed.
+### Special BTrees
+We have two special *BTree*s we use to manage the revisions and the copied pages. We will
explain what they are good for
+#### Revision tree
+We need to keep a track of each active revision, so that a search can work with a specific
revision. The idea is that when a search starts, it uses the latest revision, but as some
modification can occur while the search is bieng processed, some new revisions will be added.
In some case, we may want to keep a revision active for quite a long time.
+So we store the active revisions in a dedicated *BTree*.
+As we may have many *BTree*s, we have to use a key which is a combinaison of the *BTree*
name and its revision. So the revision *BTree* manage the revisions of all the managed *BTree*s.
+When a revision is not anymore used, we can remove it from the revision *BTree*.
+This *BTree* is not a **MVCC** *BTree*, like the other ones. In other words, we only keep
the latest revision of this *BTree* (ie, all the modified pages are immediately freed)
+#### Copied pages BTree
+Once we create a new revision, the pages we copied are not anymore in use except if the revisions
they are associated with are still in use. The problem is that we can't discard those pages
and move them to the free list until the associated revision is free.
+We use a dedicated *BTree* to keep a track of the copied pages, which will be reclaimed and
moved to the free pages list once the associated revision will be released.
+### Managing the free pages
+We have a mechanism to manage the *PageIO* that are not anymore in use. This is a linked
list in which the free pages are added. If we need some page, we first look into this list,
and get back as many *PageIO*s as we need - until we reach the end of this list. If we free
some page, we add them at the end of the free list.
+We always free a logical page, which may be stored into many *PageIO*s. The good thing is
that those *PageIO*s are already linked, so we just need to make the last free *PageIO* to
point on the first freed *PageIO*, and to move the pointer to the last free page to the last
*PageIO* used to store the logical page.

Copied: directory/site/trunk/content/mavibot/user-guide/7.2-physical-storage.mdtext (from
r1539289, directory/site/trunk/content/mavibot/user-guide/2.2-physical-storage.mdtext)
--- directory/site/trunk/content/mavibot/user-guide/2.2-physical-storage.mdtext (original)
+++ directory/site/trunk/content/mavibot/user-guide/7.2-physical-storage.mdtext Sat Nov 23
18:07:54 2013
@@ -1,10 +1,8 @@
-Title: 2.2 - Physical storage
+Title: 7.2 - Physical storage
 NavUp: ../user-guide.html
 NavUpText: User Guide
-NavPrev: 2.1-logical-structure.html
-NavPrevText: 2.1 - Logical Structure
-NavNext: 2-.html
-NavNextText: 2 - 
+NavPrev: 7.1-logical-structure.html
+NavPrevText: 7.1 - Logical Structure
 Notice: Licensed to the Apache Software Foundation (ASF) under one
     or more contributor license agreements.  See the NOTICE file
     distributed with this work for additional information
@@ -22,7 +20,7 @@ Notice: Licensed to the Apache Software 
     specific language governing permissions and limitations
     under the License.
-# 2.2 - Physical storage
+# 7.2 - Physical storage
 When associated with a RecordManager, Mavibot stores all the Btrees in one single file, which
is split in many physical pages, all having the same size. 
@@ -146,4 +144,6 @@ It does not need more space to serialize
 The gain is that we can have access to a given key and value without having to read all the
previous keys and values. Also we can now read a leaf or a node without having to deserialize
all the keys and values they contain.
+## Page serialization
+We serialize *Node* and *Leaf* differently on disk, as seen in a previois paragraph.

View raw message