directory-commits mailing list archives

From elecha...@apache.org
Subject svn commit: r1546410 - in /directory/site/trunk/content/mavibot: user-guide.mdtext user-guide/7.3-serializations.mdtext
Date Thu, 28 Nov 2013 17:53:54 GMT
Author: elecharny
Date: Thu Nov 28 17:53:54 2013
New Revision: 1546410

URL: http://svn.apache.org/r1546410
Log:
Updated a page

Modified:
    directory/site/trunk/content/mavibot/user-guide.mdtext
    directory/site/trunk/content/mavibot/user-guide/7.3-serializations.mdtext

Modified: directory/site/trunk/content/mavibot/user-guide.mdtext
URL: http://svn.apache.org/viewvc/directory/site/trunk/content/mavibot/user-guide.mdtext?rev=1546410&r1=1546409&r2=1546410&view=diff
==============================================================================
--- directory/site/trunk/content/mavibot/user-guide.mdtext (original)
+++ directory/site/trunk/content/mavibot/user-guide.mdtext Thu Nov 28 17:53:54 2013
@@ -88,3 +88,4 @@ We are quite interested to improve the c
 * [7 - BTree internals](user-guide/7-btree-internals.html)
     * [7.1 - Logical Structure](user-guide/7.1-logical-structure.html)
     * [7.2 - Physical Structure](user-guide/7.1-physical-structure.html)
+    * [7.3 - Serializations](user-guide/7.3-serializations.html)

Modified: directory/site/trunk/content/mavibot/user-guide/7.3-serializations.mdtext
URL: http://svn.apache.org/viewvc/directory/site/trunk/content/mavibot/user-guide/7.3-serializations.mdtext?rev=1546410&r1=1546409&r2=1546410&view=diff
==============================================================================
--- directory/site/trunk/content/mavibot/user-guide/7.3-serializations.mdtext (original)
+++ directory/site/trunk/content/mavibot/user-guide/7.3-serializations.mdtext Thu Nov 28 17:53:54 2013
@@ -127,7 +127,7 @@ Returns the _raw_ field. This method is 
 
 ### ValueHolder
 
-The _ValueHolder_ data structure will store the list of values associated with a key. As we may have more than one value, we use an internal structure for that purpose.
+The _ValueHolder_ data structure stores the list of values associated with a key. As we may have more than one value, we use an internal structure for that purpose. It is a more complex data structure than the _KeyHolder_ one.
 
 In some cases, the number of values to store is really big, so we need an internal data structure that allows quick retrieval of a value, and we need to be able to copy a page containing such values efficiently. For these reasons, we use two different internal data structures :
 * an array up to a threshold
@@ -145,6 +145,53 @@ When we reach the threshold, the array i
 
 It's important to know that the sub-BTree will hold only keys, and no values. The sub-BTree keys will be the values we have to store.
 
+Here is the description of this class :
+
+<pre>
+public class ValueHolder<V> implements Cloneable
+{
+    /** The deserialized value */
+    private V[] valueArray;
+
+    /** The BTree storing multiple values, if we have more than a threshold number of values */
+    private BTree<V, V> valueBtree;
+
+    /** The serialized value */
+    private byte[] raw;
+
+    /** A flag set to true if the values are stored in a BTree */
+    private boolean isSubBtree = false;
+
+    /** The RecordManager */
+    private BTree<?, V> btree;
+
+    /** The Value serializer */
+    private ElementSerializer<V> valueSerializer;
+
+    /** An internal flag used when the values are not yet deserialized */
+    private boolean isRaw = true;
+}
+</pre>
+
+As we can see, we use two different fields to store the data: either the _valueArray_ field or the _valueBtree_ field. The _raw_ field contains the serialized values when the values are stored in an array; it is null if the values are stored in a BTree, or if we have already deserialized the values.
+
+#### Raw/deserialized values
+
+One key to good performance is to avoid any unnecessary deserialization. This is easy to implement for the _KeyHolder_, as we only store one single key. For values, it's slightly more complex, as we may have more than one value. The following rules should be followed :
+
+* don't deserialize until necessary (i.e., when one needs to get a value)
+* don't serialize when not needed (i.e., not until the values must be written back to disk)
+
+In fact, we may be in three different states :
+
+* all the values are serialized (when we just read the page from disk)
+* none of the values are serialized (when we just created a ValueHolder with new values)
+* somewhere in the middle, when we are modifying a ValueHolder which has been read from the disk
+
+The third case is the complex one. We should consider two different cases though :
+* the values are stored in a sub-BTree : we don't have to deal with this problem, it's up to the sub-BTree to deal with it
+* the values are stored in an array : we don't want to store half of the values as byte[] and half of the values as Java instances. We must pick one form or the other. In this case, as soon as we have to manipulate values in Java, we need to deserialize all the values.
+
 #### ValueHolder operations
 The possible operations on a ValueHolder are the following :
 
@@ -157,3 +204,4 @@ The _add_ algorithm will thus be :
 <pre>
   if the values are not yet deserialized
     then deserialize all the values
+</pre>
\ No newline at end of file
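
The raw/deserialized rules that the diff above describes for the array case can be sketched roughly as follows. This is only an illustration, not Mavibot's actual _ValueHolder_: the class name `LazyValueHolder`, its methods, and the choice of `int` values encoded as 4 bytes each are all assumptions made to keep the example self-contained. It shows the two rules (deserialize only when a value is needed, serialize only when asked) and the "all or nothing" choice between the serialized and deserialized forms.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only: this is not Mavibot's ValueHolder. */
class LazyValueHolder
{
    /** The serialized values, or null once they have been deserialized */
    private byte[] raw;

    /** The deserialized values, or null while they are still serialized */
    private List<Integer> values;

    /** State 1: the holder was just read from disk, everything is serialized */
    LazyValueHolder( byte[] raw )
    {
        this.raw = raw;
    }

    /** State 2: the holder was just created with new values, nothing is serialized */
    LazyValueHolder( List<Integer> values )
    {
        this.values = new ArrayList<Integer>( values );
    }

    /** Tells if the values are still in their serialized form */
    boolean isRaw()
    {
        return raw != null;
    }

    /** Deserializes all the values at once: we never keep a mix of both forms */
    private void checkAndDeserialize()
    {
        if ( raw == null )
        {
            return;
        }

        ByteBuffer buffer = ByteBuffer.wrap( raw );
        values = new ArrayList<Integer>();

        while ( buffer.remaining() >= 4 )
        {
            values.add( buffer.getInt() );
        }

        raw = null;
    }

    /** Manipulating values in Java forces the deserialization of all of them */
    void add( int value )
    {
        checkAndDeserialize();
        values.add( value );
    }

    boolean contains( int value )
    {
        checkAndDeserialize();

        return values.contains( value );
    }

    /** Serializes only on demand; untouched raw bytes are returned as is */
    byte[] serialize()
    {
        if ( raw != null )
        {
            return raw;
        }

        ByteBuffer buffer = ByteBuffer.allocate( values.size() * 4 );

        for ( int value : values )
        {
            buffer.putInt( value );
        }

        return buffer.array();
    }

    public static void main( String[] args )
    {
        byte[] page = ByteBuffer.allocate( 8 ).putInt( 1 ).putInt( 2 ).array();
        LazyValueHolder holder = new LazyValueHolder( page );

        System.out.println( holder.isRaw() );            // true
        holder.add( 3 );
        System.out.println( holder.isRaw() );            // false
        System.out.println( holder.serialize().length ); // 12
    }
}
```

In this sketch a holder that is read from disk and written back without being touched never pays for a deserialization, which is the point of the first rule. The sub-BTree case is deliberately left out, since the text says the sub-BTree deals with that problem itself.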


