directory-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Karasulu <>
Subject Re: Optimizing DB access with improving key usage
Date Fri, 21 Jul 2006 13:22:34 GMT
Ersin Er wrote:
> Hi all,
> The optimization branch is busy with extensive refactorings and things
> are going well. We had a few convos recently for further improving the
> performance and one of the points of improvement was thought as
> replacing the BigInteger construct which we use as the key of Attributes
> objects in the database.

It will certainly save us memory and processing time as we discussed
because it would only take one lookup to retrieve the master entry.

> What we do when retrieving and Attributes object is first retrieving a
> BigInteger stored in the db for the normalized DN and then doing one
> more lookup for the actual Attributes object using the BigInteger as the
> key.

This is done for two reasons. I chose a integer value as the primary key 
over the DN itself because

(1) [Big]Integers need no normalization,
(2) [Big]Integers take mostly 4 bytes to encode for storage as keys and 
values so matching them up is easy to do without loosing sorting order. 
  So imagine having to store several copies of DN's whose prefixes 
overlap to a large degree.  That means potentially millions more byte 
comparisons to do to match a key across indices.

> The first improvement to be done is to replace BigInteger with int or
> long which cost much less. 

This will not have as much of an impact as you think.  JDBM expects 
objects at it's interfaces so we'll still have to create at least a 
byte[] from the int or long.

But as Alex just (yes, at the time of writing
> this :) ) warned me about that jdbm keys should be objects not
> primitives. 

:( oops you already got that.

So my second suggestion is again replacing the BigInteger
> with more leightweight class like Integer and generating that Integer
> for each LdapDN at the time of construction. 

Well BigInteger is not that heavy.  In most cases for us we generate 4 
or less bytes for a BigInteger.  I have to see just how heavy the 
operations are for generating this byte[] representation but I think it 
would be the same to do for an int at the low capacities we're dealing 
with and expecting.

I think we do not need to
> do one more lookup to retrieve the (Big)Integer key of the Attributes
> object. We can just generate a hash value at LdapDN constructor and
> provide a method for accessing it when we want to store or retrieve
> Attributes objects. 

This is a problem.  Because for #2 above indices return integer ids. 
While searching we may do a lookup on an index and not have the DN 


> WDYT ?

I think this is a good try.  Really I do so please don't get 
discouraged.  But most LDAP server backends based on BTree 
implementations use this scheme for the PK.  We're not inventing 
anything new by the way we do things in the jdbm backend.

Now recently you had a great idea for making the master table a hash 
instead of a btree.  This is a great idea that should be done IMO.


View raw message