Return-Path: X-Original-To: apmail-directory-commits-archive@www.apache.org Delivered-To: apmail-directory-commits-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 77D9B1011F for ; Sun, 22 Dec 2013 22:30:06 +0000 (UTC) Received: (qmail 68614 invoked by uid 500); 22 Dec 2013 22:30:06 -0000 Delivered-To: apmail-directory-commits-archive@directory.apache.org Received: (qmail 68570 invoked by uid 500); 22 Dec 2013 22:30:06 -0000 Mailing-List: contact commits-help@directory.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@directory.apache.org Delivered-To: mailing list commits@directory.apache.org Received: (qmail 68563 invoked by uid 99); 22 Dec 2013 22:30:06 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 22 Dec 2013 22:30:06 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=5.0 tests=ALL_TRUSTED X-Spam-Check-By: apache.org Received: from [140.211.11.4] (HELO eris.apache.org) (140.211.11.4) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 22 Dec 2013 22:30:03 +0000 Received: from eris.apache.org (localhost [127.0.0.1]) by eris.apache.org (Postfix) with ESMTP id A0F8223888FE for ; Sun, 22 Dec 2013 22:29:41 +0000 (UTC) Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Subject: svn commit: r891500 - in /websites/staging/directory/trunk/content: ./ mavibot/user-guide/1.1-btree-basics.html mavibot/user-guide/2-btree-types.html Date: Sun, 22 Dec 2013 22:29:41 -0000 To: commits@directory.apache.org From: buildbot@apache.org X-Mailer: svnmailer-1.0.9 Message-Id: <20131222222941.A0F8223888FE@eris.apache.org> X-Virus-Checked: Checked by ClamAV on apache.org Author: buildbot Date: Sun Dec 22 22:29:41 2013 New Revision: 891500 Log: Staging update by buildbot for directory Modified: websites/staging/directory/trunk/content/ (props changed) websites/staging/directory/trunk/content/mavibot/user-guide/1.1-btree-basics.html websites/staging/directory/trunk/content/mavibot/user-guide/2-btree-types.html Propchange: websites/staging/directory/trunk/content/ ------------------------------------------------------------------------------ --- cms:source-revision (original) +++ cms:source-revision Sun Dec 22 22:29:41 2013 @@ -1 +1 @@ -1552893 +1553042 Modified: websites/staging/directory/trunk/content/mavibot/user-guide/1.1-btree-basics.html ============================================================================== --- websites/staging/directory/trunk/content/mavibot/user-guide/1.1-btree-basics.html (original) +++ websites/staging/directory/trunk/content/mavibot/user-guide/1.1-btree-basics.html Sun Dec 22 22:29:41 2013 @@ -152,16 +152,19 @@

1.1 - B-tree Basics

-

A B-tree "tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time." (see Wikipedia)

-

The important point here is the last one : it guarantees O(logn) operations, compared to any other data structures (a hashed data structure offers O(n) average operations, but can degenerate to O(n2), and ordering is not kept.

+

B-tree was invented back in 1971 by Bayer and McCreight (the B does not main anything know, speculations are that it comes either form the B of Bayer or from Boing they were working for back then). It was an extension to binary search tree, which was just able to store 2 elements per page.

+

A B-tree is a "tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time." (see Wikipedia)

+

The important point here is the last one : it guarantees O(logn) operations, compared to any other data structures (a hashed data structure offers O(n) average operations, but can degenerate to O(n2), and ordering is not kept. But there is more : by gathering N elements in a page, which leads to more comparisons than necessary, but still save a lot of disk accesses, as those comparison will be done in memory when a page is fully loaded. That's the reason for B-trees to have been invented.

B-trees are everywhere : databases, OS, etc. It's a critical data structure when you are to deal with a huge number of data.

1.1.1 - Inside a B-tree

-

A B-Tree contains Nodes and Leaves. A Node points to other Nodes or Leaves. Leaves contains Values. Both Nodes and Leaves have Keys that are associated with Values.

+

A B-Tree contains Nodes and Leaves. A Node points to other Nodes or Leaves. Leaves contains Values. Both Nodes and Leaves have Keys that are associated with Values. Keys are not copied in many pages, they just appear once in the whole B-tree.

Pretty simple !

One last thing : Keys are ordered, and this is the condition for the easy and fast retrieval of Values.

A few more rules are enforced : A Node and a Leaf contains up to N values (N being generally a power of 2, so 2, 4, 8, 16...). You can't have less than N/2 Values or keys in a Leaf or a Node, except for the root Node.

+

1.1.2 - Concurrent access

+

The real problem with B-trees is that we need to use some locks to allow concurrent access to the data, especially when some updates occur. The rational is that when browsing a B-tree, changing it will just break the browse operation. There are some technics to avoid having such an issue, up to a point no read locks are needed, assuming some extra information is added to the pages : a pointer to the next page at the same level. Sadly, it comes with drawbacks, like it's difficult to remove empty pages from the B-tree.