From Thu Jun 16 13:31:22 2011 Return-Path: X-Original-To: Delivered-To: Received: from ( []) by (Postfix) with SMTP id B55DF65ED for ; Thu, 16 Jun 2011 13:31:22 +0000 (UTC) Received: (qmail 76713 invoked by uid 500); 16 Jun 2011 13:31:22 -0000 Delivered-To: Received: (qmail 76666 invoked by uid 500); 16 Jun 2011 13:31:22 -0000 Mailing-List: contact; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: Delivered-To: mailing list Received: (qmail 76659 invoked by uid 99); 16 Jun 2011 13:31:22 -0000 Received: from (HELO ( by (qpsmtpd/0.29) with ESMTP; Thu, 16 Jun 2011 13:31:22 +0000 X-ASF-Spam-Status: No, hits=-1994.3 required=5.0 tests=ALL_TRUSTED,HTML_MESSAGE,MIME_HTML_ONLY X-Spam-Check-By: Received: from [] (HELO ( by (qpsmtpd/0.29) with ESMTP; Thu, 16 Jun 2011 13:31:20 +0000 Received: from thor (localhost []) by (8.13.8+Sun/8.13.8) with ESMTP id p5GDV0cE022447 for ; Thu, 16 Jun 2011 13:31:00 GMT Date: Thu, 16 Jun 2011 09:31:00 -0400 (EDT) From: To: Message-ID: <28603412.3374.1308231060028.JavaMail.confluence@thor> Subject: [CONF] Apache Directory Server v1.5 > Table and Cursor Implementations MIME-Version: 1.0 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Auto-Submitted: auto-generated

Table and Cursor Implementations

Page edited by Emmanuel L=C3=A9charny

Changes (1)

=20 =20
=20 =20
JdbmTable has several helper = classes which include a few Cursor implementations. These Cursors abstract= the JDBM specific browsing mechanism while also adding support for travers= al over duplicate keys if they have been enabled in the Table. There are s= ix main Cursor implementations.

h2. Introduction to Cursors

A *Cursor* is a data structure t= hat allows to access inner data eiher forward or backward. The following sc= hema expose the different kind of operations on elements that can be applie= d to a cursor.


We also have some = other operations that give a status :
* isFirst()
* isBeforeFirst()=
* isLast()
* isAfterLast()
* isClosed()
* available()
and a few more operations :
* before( Element )
* after( Eleme= nt )
* close() and close( Exception )
* setClosureMonitor()

h2. No Duplicates Cursor

Full Content

Table Interface Objectives

A Table is the most atomic element in a (disk or memory) based partition= backed by data structures acting as 2 column lookup tables where the keys = are sorted and can be browsed. Underlying data structures may be disk or me= mory based. They may support duplicate keys where more than one val= ue for a key can be stored. With duplicate keys values are also sorted.

The primary objective of the Table interface is to abstract away and ada= pt data structures to provide a standard syntactic and semantic representat= ion which is used by higher layers. This abstraction enables the reuse of = different data structures in implementations while utilizing the same table= /index structure to store and search entries.

JDBM = Characteristics

JDBM is primarily a disk based B+Tree implementation. JDBM, unlike for = example JE, is limited by the fact that it cannot store more than one value= for a key. Hence JDBM does not support duplicate keys.

JDBM uses a record manager to manage BTrees in a file. Many BTrees may = reside within a JDBM db file. JDBM caches blocks of records in these files= (these are not deserialized representations of entries) as raw bytes. Key= s and values are represented internally as byte arrays.

JDBM sorts keys and utilizes Comparators to compare them while inserting= keys into the B+Tree. It also provides a Serializer interface which can be= used to [de]serialize keys and values to use in place of default Java Seri= alization.

JDBM allows table entry browsing using a simple Browser interface which = returns key value tuples. Both forward and reverse browsing is possible wi= th browser positioning at the front, end, and on specific keys. Note that = the Tuple objects returned by the browser are reused (a new Tuple object is= not created for each step forward or reverse).

J= DBM Table Implementation

The JdbmTable class implements the Table interface. This implementation= abstracts JDBM specific B+Tree implementation details. The majority of th= e code in this class and it's helpers deals with adding duplicate key suppo= rt to JDBM B+Trees.

A Table can be created with or without support for duplicate keys. When= the JdbmTable is created it is designated as either supporting or not supp= orting duplicates. The constructor used determines if support for duplicat= e keys are enabled or disabled.

Since JDBM natively does not support duplicates we mimic support for dup= licate keys using two value storage techniques described below.

Value Storage Technique: Bulk Value Serialization

This first technique uses an in memory BTree implementation to store the= values of the key. Eventually we want the in memory B+Tree implementation= to be pluggable. Presently a linked (a.k.a. threaded) AVL tree optimized = for bulk serialization and traversal is used.

The set of values for a key in this in-memory data structure are seriali= zed and stored into the value of the JDBM B+Tree tuple. When the key is lo= oked up, the data structure is deserialized. Custom serialization is used = here to override default Java Serialization thereby avoid performance issue= s. Note that the in-memory data structure storing the set of values delega= tes value object serialization to a value Serializer set on the Table. Thi= s technique works well up to a point where the number of values for a key a= re small. After some threshold the cost of bulk serialization outweighs th= e disk access costs of the second mechanism below.

Value Storage Technique: BTree Redirection

The second mechanism uses another JDBM B+Tree in the same file to store = the values of duplicate keys. In this secondary BTree all tuple values are = empty. Only the key field of the secondary BTree is used and it stores the= values of the key in the primary table.

The entry in the primary BTree uses a reference in it's value field. So= when the key is looked up in the primary table, a serialized representatio= n of a BTreeRedirect object is found in the value field. This redirect obj= ect contains the record identifier for the secondary BTree within the same = file, managed by the same JDBM record manager.

Just like the in memory representation of multiple values for a key, thi= s JDBM BTree can be browsed.

The JdbmTable uses a threshold parameter to switch between these two mec= hanisms on a per key basis. Meaning both mechanisms may be in use each for= a different key. The JdbmTable detects the mechanism used using a magic n= umber embedded in the value at the first byte. This magic number let's the = implementation know if the value represents an AvlTree or BTreeRedirect obj= ect. Based on the value different marshaling algorithms are used.

Cur= sor Implementations

JdbmTable has several helper classes which include a few Cursor implemen= tations. These Cursors abstract the JDBM specific browsing mechanism while= also adding support for traversal over duplicate keys if they have been en= abled in the Table. There are six main Cursor implementations.

Int= roduction to Cursors

A Cursor is a data structure that allows to access inner data eih= er forward or backward. The following schema expose the different kind of o= perations on elements that can be applied to a cursor.

We also have some other operations that give a status :

  • isFirst()
  • =09
  • isBeforeFirst()
  • =09
  • isLast()
  • =09
  • isAfterLast()
  • =09
  • isClosed()
  • =09
  • available()

and a few more operations :

  • before( Element )
  • =09
  • after( Element )
  • =09
  • close() and close( Exception )
  • =09
  • setClosureMonitor()

No Dup= licates Cursor

The no duplicates Cursor is intended for use only with JdbmTables that h= ave duplicate keys disabled. This Cursor browses Table Tuples containing t= he key and the value. If duplicates are not enabled, calls to cursor() wit= h no arguments return this Cursor implementation.

= Duplicate Container Cursor

The duplicate container cursor is very similar to a no duplicates Cursor= however it is only used by JdbmTables that have duplicate keys enabled. I= nstead of returning Tuples with key value pairs, it returns a Tuple with a = key and container for the values of that key. This container is a wrapper = around either a BTree redirect object or a AVL tree. This Cursor is used i= nternally as an underlying Cursor by the duplicates Cursor which is describ= ed below. It is never directly returned to callers when a Cursor is reques= ted.

Duplicat= es Cursor

The duplicates Cursor is only used by JdbmTables that have duplicate key= s enabled. When duplicate keys are enabled calls to cursor() with no argum= ents returns this kind of Cursor.

The duplicates Cursor returns Tuple objects on browsing operations with = a key and a value object respectively as designated by the generic types us= ed by the Table. This Cursor will transparently browse keys with and witho= ut multiple values returning Tuples with the same key, iff that key contain= s multiple values. Once all the values of a key are traversed it moves on t= o the next key. For example consider the sample table composition below:

(-3, 10)
(52, 25)
(52, 38)
(52, 40)
(97, 58)

When positioned before the first element, the call to next() will positi= on this Cursor over the Tuple (-3, 100). Another next() call will position= it over (52, 25) and yet another will position it over (52, 38) and so on = with (52, 40), and finally (97, 58). This Cursor makes it seem as though d= uplicate key support is provided by the browsing capabilities of the JdbmTa= ble.

S= ame-Key BTree Value Cursor

The same-key BTree value Cursor is intend for browsing only the value or= values of a specific key. It returns only value objects instead of return= ing Tuples on browsing operations. This is because the key is already know= n and does not change.

This Cursor is not critical but rather facilitates an optimization. Use= rs wanting to browse the values of a key need not have to resort to a dupli= cates Cursor which does not start and stop on a specific key automatically.= If a duplicates Cursor is used each browsing operation much check to see = if the Cursor is still on the same intended key. This overhead along with = key value copies into a Tuple can be avoided when using this Cursor.

S= ame-Key BTree Tuple Cursor

This cursor like the same-key BTree value cursor above is bound to trave= rse only Tuples of the same key. The difference however is that this Cursor= returns a key/value Tuple object with the same key and instead of just ret= urning the value object.

AvlTre= e Tuple Cursor

This is similar to the same-key BTree Tuple Cursor above except it trave= rses an in memory AvlTree which only contains the values of the same key an= yway. Instead of returning values this Cursor returns key/value Tuples for= each value in the AvlTree as it traverses it using the same key across all= values traversed.