directory-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From kayyag...@apache.org
Subject svn commit: r1592385 - /directory/site/trunk/content/mavibot/user-guide/7.1-logical-structure.mdtext
Date Sun, 04 May 2014 17:10:44 GMT
Author: kayyagari
Date: Sun May  4 17:10:44 2014
New Revision: 1592385

URL: http://svn.apache.org/r1592385
Log:
re-wrote parts of the text

Modified:
    directory/site/trunk/content/mavibot/user-guide/7.1-logical-structure.mdtext

Modified: directory/site/trunk/content/mavibot/user-guide/7.1-logical-structure.mdtext
URL: http://svn.apache.org/viewvc/directory/site/trunk/content/mavibot/user-guide/7.1-logical-structure.mdtext?rev=1592385&r1=1592384&r2=1592385&view=diff
==============================================================================
--- directory/site/trunk/content/mavibot/user-guide/7.1-logical-structure.mdtext (original)
+++ directory/site/trunk/content/mavibot/user-guide/7.1-logical-structure.mdtext Sun May 
4 17:10:44 2014
@@ -24,7 +24,7 @@ Notice: Licensed to the Apache Software 
 
 # 7.1 - Logical Structure
 
-**Mavibot** stores data into *BTree*s, and we may manage many *BTree*s, so we have to define
the right data structure to handle those data.
+**Mavibot** stores data in one or more *BTree*s, and defines a few more internal data structures
to handle the data and *BTree*s.
 
 We can have three different ways to use **Mavibot** :
 * using in-memory *BTree*s (IN-MEMORY)
@@ -34,82 +34,79 @@ We can have three different ways to use 
 
 ## In Memory BTrees
 
-They are *BTree*s stored in memory : as soon as you quit your program, all the stored data
will bo lost. The biggest advantage is that it's fast.
+They are *BTree*s stored in memory : as soon as you quit your program, all the stored data
will be lost. The biggest advantage is that it is fast.
 
-As *Mavibot* is handling **MVCC** *BTree*s, you have to keep in maind that for each modification,
we copy pages and values, so the *BTree*s will quickly grow and eat the memory. On the other
hand, copied data which are not anymore in use will be discarded automatically. The beauty
of having a garbage collector is that we don't have to take care of those copied data : if
they are not any more referenced by any objects using the *BTree*, they will be reclaimed
by the GC.
+As *Mavibot* is handling **MVCC** *BTree*s, you have to keep in mind that for each modification,
we copy pages and values, hence the *BTree*s will quickly grow and use a lot of memory. On
the other hand, copied data which are not anymore in use will be discarded automatically.
The beauty of having a garbage collector is that we don't have to take care of this copied
data, i.e., if they are not any more referenced by any objects using the *BTree*, they will
be reclaimed by the GC.
 
-The following schema shows what is the logical data structure whe using a in memory *BTree*
:
+The below diagram shows the logical representation of an in-memory *BTree* :
 
 ![In-Memory BTree](images/InMemoryBTree.png)
 
 ## Persistent BTrees
 
-A persistent *BTree* is a *BTree* which can be flushed on disk on demand. The *BTree* is
a in-Memory *BTree*, but when you close it, all of its content latest revision is serialized
on disk. You can re-read it when needed.
+A persistent *BTree* is a *BTree* which can be flushed to disk on demand. The *BTree* is
a in-Memory *BTree*, but while closing it, then content of the latest revision is serialized
on disk. The data can be loaded while opening a persistent BTree.
 
-Otherwise, there is no difference with an in-memory *BTree*
+Othe than that, there is no difference between an in-memory *BTree* and a persistent *BTree*.
 
 ## Managed BTrees
 
-Managed *BTree*s are very different : we will keep a updated version of the *BTree* on disk
after each modifciation. even if the program crashes, you have the guarantee that the disk
will contain everything needed to recover the *BTree* as it was just before the crash.
+Managed *BTree*s are very different : data is guaranteed to be preserved on disk after each
modifciation, even when the program crashes, it is guaranteed that the disk will contain everything
needed to recover the *BTree* to the state it was in just before the crash.
 
-This is important to understand that we don't keep all the *BTree* in memory when it's managed,
but instead we try to limit the elements we load in memory. In other words, there is no guarantee
whatsoever that you will have any pat of the *BTree* in memory, except the root page, so that
means **Mavibot** may have to fetch some missing data from disk at any moment.
+This is important to understand that in managed mode, not all *BTree*s (of a mavibot database)
are kept in memory. In other words, all nodes, except the *root* node, of a BTree may or may
not be present at the time of accessing. **Mavibot** will fetch these nodes from disk when
needed.
 
-Obviously this approach have big pros and cons :
+Obviously this approach has both pros and cons :
 
 Pros :
-* there is no limit but the available disk you have to the number of elements you can store
in your *BTree*
-* your *BTree* will always be consistent, even if you have a crash
-* you can stop your application and restart it, your data are still around
+* there is no limit on the number of elements one can store in a BTree, except on the available
disk space 
+* A *BTree* will always be consistent, even if there was a crash
+* data durability is gauranteed
 
 Cons :
-* as your data may not be present in memory, it cost a lot to fetch them from disk
-* as we have to take care of missing data, accessing them requires an extra layer of accessor
to deal wth the fact they may be on disk, costing some extra memory
+* reads might be costly when the data is not present in memory, due to fetching data from
disk 
+* accessing the data from disk requires an extra layer of accessor code, this costs some
extra memory
 
-Here, this is just a question of tradeoff : depending on your memory size, and the level
of robustness you want, you may decide to go for a in-memory *BTree*, a persistent *BTree*
or a managed one. Most of the time, though, managed *BTree* is what you want to use.
+Here, this is just a question of tradeoff : depending on the existing memory size, and the
level of robustness needed, one may decide to go for an in-memory *BTree*, a persistent *BTree*
or a managed one. Most of the time, though, managed *BTree* is what you want to use.
 
 Also note that we use internal cache to speed up the data access. This cache and its size
can be configured.
 
-We will see how we manage *BTree*s internally.
+Managed *BTree*s are stored using *Nodes* and *Leaves*. A *Node* contains only keys or references
to underlaying nodes or leaves. A *Leaf* contains keys and values. As we don't want to eat
too much memory, the references to nodes, leaves, keys and values are stored as offset, read
and translated to java objects on demand. For instance, we keep an offset to a key until someone
needs to access the key, then we deserialize this key and store it in memory. This is the
very same for references to nodes, leaves or values.
 
-### User's BTrees
+Here is a schematic describing this structure :
 
-Managed user's *BTree*s are stored using *Nodes* and *Leaves*. A *Node* contains only keys
are references to underlaying nodes or leaves. A *Leaf* contans keys and values. As we don't
want to eat too much memory, the references to nodes, meaves, keys and values are stored as
offset, read and translated to java objects on demand. For instance, we keep an offset to
a key until someone needs to access the key, then we deserialize this key and store it in
memory. This is the very same for references to nodes, leaves or values.
+![Managed references](images/managedReferences.png)
 
-Here is a schema describing this mechanism :
+In this BTree, only two pages are present in memory : one node and one leaf. In these pages,
the keys aren't yet objects, they are pointing to the page's raw data, except for the **D**
key and it's value, they were loaded and deserialized.
 
-![Managed references](images/managedReferences.png)
+Here each element, contains an offset and the byte[] of the serialized value or the deserialized
value if the value has already been accessed.
 
-In this schema, we have only loaded two pages in memory : the node and one leaf. In these
pages, the keys aren't yet objects, we are pointing to the page's raw data, except for the
**D** key which is already a Java Object (it has been deserialized). The very same for the
references to the leaves : we have only loaded and deserialized one single leaf, the one containing
the value **D**. In this leaf, the keys aren't deserialized except the **D** key, and the
only value which is a Java instance is the deserialized **vD** value.
+### User's BTrees
 
-So each elements is an instance of an encapsulating object which contains the offset of the
serialized element in a byte[], and the deserialized value if the value has already been accessed.
+These are the BTrees that are created by the user and these trees hold the data.
 
 ### Special BTrees
 
-We have two special *BTree*s we use to manage the revisions and the copied pages. We will
explain what they are good for
-
+These are the two special *BTree*s used internally to manage the revisions and the copied
pages.
 
 #### Revision tree
 
-We need to keep a track of each active revision, so that a search can work with a specific
revision. The idea is that when a search starts, it uses the latest revision, but as some
modification can occur while the search is bieng processed, some new revisions will be added.
In some case, we may want to keep a revision active for quite a long time.
-
-So we store the active revisions in a dedicated *BTree*.
+Mavibot uses this tree to keep track of each active revision, so that a search can work with
a specific revision. The idea is that when a search starts, it uses the latest revision, but
while the search is being processed a new modification can occur which creates a new revision.
And also sometimes, we may want to keep a revision active for quite a long time.
 
-As we may have many *BTree*s, we have to use a key which is a combinaison of the *BTree*
name and its revision. So the revision *BTree* manage the revisions of all the managed *BTree*s.
+This revision *BTree* manages the revisions of all the managed *BTree*s.
+The key of the revision btree is a combination of the *BTree* name and its revision. 
 
-When a revision is not anymore used, we can remove it from the revision *BTree*.
-
-This *BTree* is not a **MVCC** *BTree*, like the other ones. In other words, we only keep
the latest revision of this *BTree* (ie, all the modified pages are immediately freed)
+When a revision is not anymore used, it can be removed from the revision *BTree*.
 
+Unlike all other user btrees the revision *BTree* is not a **MVCC** *BTree*. In other words,
only the latest revision of the revision btree is preserved(i.e, all the modified pages are
immediately freed)
 
 #### Copied pages BTree
 
-Once we create a new revision, the pages we copied are not anymore in use except if the revisions
they are associated with are still in use. The problem is that we can't discard those pages
and move them to the free list until the associated revision is free.
-
-We use a dedicated *BTree* to keep a track of the copied pages, which will be reclaimed and
moved to the free pages list once the associated revision will be released.
+Once a new revision is created, the pages that were copied are not anymore in use except
if the revisions they are associated with are still in use. These pages cannot be discarded
and moved 
+to the free list until the associated revision is free.
 
+A dedicated *BTree* is used to keep track of the copied pages, which will be reclaimed and
moved to the free pages list once the associated revision gets released.
 
 ### Managing the free pages
 
-We have a mechanism to manage the *PageIO* that are not anymore in use. This is a linked
list in which the free pages are added. If we need some page, we first look into this list,
and get back as many *PageIO*s as we need - until we reach the end of this list. If we free
some page, we add them at the end of the free list.
+There is a mechanism to manage the *PageIO* that are not anymore in use. This is a linked
list in which the free pages are added. Whenever a new page(s) is needed this list is searched
first and reclaim as many *PageIO*s as needed - until the end of this list is reached. When
a page gets freed that will be added at the end of the free page list.
 
-We always free a logical page, which may be stored into many *PageIO*s. The good thing is
that those *PageIO*s are already linked, so we just need to make the last free *PageIO* to
point on the first freed *PageIO*, and to move the pointer to the last free page to the last
*PageIO* used to store the logical page.
+Note that only logical pages are released, which may be stored in many *PageIO*s. These *PageIO*s
are already linked, hence while adding this logical page to the free page list, the last existing
free *PageIO* will be modified to point to the first freed *PageIO* of this logical page,
and update the pointer of the last free page to the last *PageIO* of this logical page.



Mime
View raw message