directory-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ersin Er <ersin...@gmail.com>
Subject Re: Optimizing DB access with improving key usage
Date Fri, 21 Jul 2006 13:24:52 GMT
Alex Karasulu wrote:
> Ersin Er wrote:
>> [RESENDING]
>>
>> Hi all,
>>
>> The optimization branch is busy with extensive refactorings and things
>> are going well. We had a few convos recently for further improving the
>> performance and one of the points of improvement was thought as
>> replacing the BigInteger construct which we use as the key of Attributes
>> objects in the database.
>
> It will certainly save us memory and processing time as we discussed
> because it would only take one lookup to retrieve the master entry.
>
>> What we do when retrieving and Attributes object is first retrieving a
>> BigInteger stored in the db for the normalized DN and then doing one
>> more lookup for the actual Attributes object using the BigInteger as the
>> key.
>
> This is done for two reasons. I chose a integer value as the primary 
> key over the DN itself because
>
> (1) [Big]Integers need no normalization,
> (2) [Big]Integers take mostly 4 bytes to encode for storage as keys 
> and values so matching them up is easy to do without loosing sorting 
> order.  So imagine having to store several copies of DN's whose 
> prefixes overlap to a large degree.  That means potentially millions 
> more byte comparisons to do to match a key across indices.
>
>> The first improvement to be done is to replace BigInteger with int or
>> long which cost much less. 
>
> This will not have as much of an impact as you think.  JDBM expects 
> objects at it's interfaces so we'll still have to create at least a 
> byte[] from the int or long.
>
> But as Alex just (yes, at the time of writing
>> this :) ) warned me about that jdbm keys should be objects not
>> primitives. 
>
> :( oops you already got that.
>
> So my second suggestion is again replacing the BigInteger
>> with more leightweight class like Integer and generating that Integer
>> for each LdapDN at the time of construction. 
>
> Well BigInteger is not that heavy.  In most cases for us we generate 4 
> or less bytes for a BigInteger.  I have to see just how heavy the 
> operations are for generating this byte[] representation but I think 
> it would be the same to do for an int at the low capacities we're 
> dealing with and expecting.
>
> I think we do not need to
>> do one more lookup to retrieve the (Big)Integer key of the Attributes
>> object. We can just generate a hash value at LdapDN constructor and
>> provide a method for accessing it when we want to store or retrieve
>> Attributes objects. 
>
> This is a problem.  Because for #2 above indices return integer ids. 
> While searching we may do a lookup on an index and not have the DN 
> available.
>
> ...
>
>> WDYT ?
>
> I think this is a good try.  Really I do so please don't get 
> discouraged.  But most LDAP server backends based on BTree 
> implementations use this scheme for the PK.  We're not inventing 
> anything new by the way we do things in the jdbm backend.
>
> Now recently you had a great idea for making the master table a hash 
> instead of a btree.  This is a great idea that should be done IMO.
This was mostly Emmnuel's idea. Thanks tho :-)
> Alex
-- 
Ersin

Mime
View raw message