Return-Path: X-Original-To: apmail-directory-commits-archive@www.apache.org Delivered-To: apmail-directory-commits-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 4988510920 for ; Sat, 21 Sep 2013 14:55:35 +0000 (UTC) Received: (qmail 17989 invoked by uid 500); 21 Sep 2013 14:55:33 -0000 Delivered-To: apmail-directory-commits-archive@directory.apache.org Received: (qmail 17943 invoked by uid 500); 21 Sep 2013 14:55:32 -0000 Mailing-List: contact commits-help@directory.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@directory.apache.org Delivered-To: mailing list commits@directory.apache.org Received: (qmail 17933 invoked by uid 99); 21 Sep 2013 14:55:31 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Sat, 21 Sep 2013 14:55:31 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=5.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.4] (HELO eris.apache.org) (140.211.11.4) by apache.org (qpsmtpd/0.29) with ESMTP; Sat, 21 Sep 2013 14:55:28 +0000 Received: from eris.apache.org (localhost [127.0.0.1]) by eris.apache.org (Postfix) with ESMTP id F1D682388B3A for ; Sat, 21 Sep 2013 14:55:06 +0000 (UTC) Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Subject: svn commit: r879335 - in /websites/staging/directory/trunk/content: ./ mavibot/user-guide/ mavibot/user-guide/images/ Date: Sat, 21 Sep 2013 14:55:06 -0000 To: commits@directory.apache.org From: buildbot@apache.org X-Mailer: svnmailer-1.0.9 Message-Id: <20130921145506.F1D682388B3A@eris.apache.org> X-Virus-Checked: Checked by ClamAV on apache.org Author: buildbot Date: Sat Sep 21 14:55:06 2013 New Revision: 879335 Log: Staging update by buildbot for directory Added: websites/staging/directory/trunk/content/mavibot/user-guide/images/nodeLeaf2.graphml (with props) websites/staging/directory/trunk/content/mavibot/user-guide/images/nodeLeaf2.png (with props) Modified: websites/staging/directory/trunk/content/ (props changed) websites/staging/directory/trunk/content/mavibot/user-guide/2.1-file-format.html websites/staging/directory/trunk/content/mavibot/user-guide/images/nodeLeaf.graphml websites/staging/directory/trunk/content/mavibot/user-guide/images/nodeLeaf.png Propchange: websites/staging/directory/trunk/content/ ------------------------------------------------------------------------------ --- cms:source-revision (original) +++ cms:source-revision Sat Sep 21 14:55:06 2013 @@ -1 +1 @@ -1525161 +1525247 Modified: websites/staging/directory/trunk/content/mavibot/user-guide/2.1-file-format.html ============================================================================== --- websites/staging/directory/trunk/content/mavibot/user-guide/2.1-file-format.html (original) +++ websites/staging/directory/trunk/content/mavibot/user-guide/2.1-file-format.html Sat Sep 21 14:55:06 2013 @@ -214,10 +214,18 @@ it's something we might want to change l

BTrees

Note that each BTreeHeader has at least one root page, even if it contains no data. In this schema, we show the root page just after the BTree it is associated to, but after a few updates, the root page may perfectly well be stored elswhere on the disk.

The Nodes and Leaves

-

Nodes and Leaves are logical BTree pages which are serialized on disk into one to many PageIOs. They have slightly different data structures, as Nodes contains pointers to Leaves, and no data, while Leaves contains data. In any case, both contain the keys.

+

Nodes and Leaves are logical BTree pages which are serialized on disk into one to many PageIOs. They have slightly different data structures, as Nodes contains pointers to Leaves, and no data, while Leaves contains data. In any case, both contain the keys. The Node has one ore value than the Leaf, too.

On disk, each Node and Leaf are stored in PageIOs, as we said. A Node will have pointers to some other logical pages, and on disk, those pointers will be offset of the first PageIO used to store the logical page it points to.

Here is the Node and Leaf data structures once serialized :

Node and Leaf

+

Note that we store the size of the serialized data : this is necessary as we have to know how many PageIOs will be needed to store the logical page.

+

The rootPage is just a Node or a Leaf.

+

Potential improvement

+

We can get better performance by serializing the data differently. Instead of storing keys and values as byte arrays prefixed by their length, we could store an array of keys and values' offsets before the associated byte[]. Here is the resulting data structure, once serialized :

+

Node and Leaf, improved

+

(The Node is not described, as it's basically the same data structure, but with one extra value).

+

It does not need more space to serialize the data this way, as the offsets are ints, and in the previous version, those ints are used to store the length of the keys and values anyway.

+

The gain is that we can have access to a given key and value without having to read all the previous keys and values. Also we can now read a leaf or a node without having to deserialize all the keys and values they contain.