directory-dev mailing list archives

From Ole Ersoy <ole.er...@gmail.com>
Subject Re: [ApacheDS][Partition] Using surrogate keys for attributeType aliases and objectClass aliases (was Re: [SCHEMA] Can two different LDAP AttributeType's have the same name?)
Date Fri, 06 Apr 2007 17:20:08 GMT
Alex,

So...um...Does it do the thing?

:-) Just kidding.

Wow - That's what I call an answer.

I think we need a performance design guide,
that's a sub book of a global design guide
for this type of "Smoookkkin" material.

A little later I need to break it down further
so that I understand the whole process from
a sequence-diagram viewpoint.

Let me see if I can re-answer my question now
that I'm more enlightened.  I need to read
your material a few more times though for it to sink in
properly, so this is a "Trial" attempt.

I want to store 200M entries, using the same
set of object classes to construct
the set of entry attributes.

One of the entry attributes I'm storing has the OID
name alias org.apache.tuscany.DASConfig.baseDN.
The OID for this AttributeType is 1.24l2.3.4.2.4 (just made it up).

My goal is to keep the 200M entries in memory,
so I want them to be as compact as possible.
When I write the entries using JNDI I'm using

org.apache.tuscany.DASConfig.baseDN as the attribute
key for one of the entry values.

However I would much rather store something shorter
than this in memory, like "1".

I think you are saying the OID name alias,
org.apache.tuscany.DASConfig.baseDN, gets switched
out with the OID by the server.

So instead of storing

[org.apache.tuscany.DASConfig.baseDN, myValue]

in memory, it stores:

[1.24l2.3.4.2.4, myValue]

Am I getting warmer?
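
To make the guess concrete, here is a minimal sketch of the alias-to-OID swap as I understand it. The AliasNormalizer class and its methods are hypothetical names made up for illustration; this is not the ApacheDS schema registry API, and the real server may do this quite differently. The lookup ignores case because LDAP attribute names are case-insensitive.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not the actual ApacheDS registry.
class AliasNormalizer {
    // Maps lower-cased name aliases to their OID.
    private final Map<String, String> aliasToOid = new HashMap<>();

    void register(String alias, String oid) {
        aliasToOid.put(alias.toLowerCase(Locale.ROOT), oid);
    }

    // Returns the OID for a known alias; anything unknown is returned
    // unchanged, on the assumption that it is already an OID.
    String normalize(String attributeId) {
        return aliasToOid.getOrDefault(attributeId.toLowerCase(Locale.ROOT), attributeId);
    }

    public static void main(String[] args) {
        AliasNormalizer normalizer = new AliasNormalizer();
        // The OID below is the made-up one from this message.
        normalizer.register("org.apache.tuscany.DASConfig.baseDN", "1.24l2.3.4.2.4");
        System.out.println(normalizer.normalize("org.apache.tuscany.DASConfig.baseDN"));
    }
}
```

With something like this in front of the store, the key kept in memory would be the OID no matter which alias the JNDI client used.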

Then the other thing I was thinking was
that 1.24l2.3.4.2.4 is still pretty long.

If I had an in-memory partition for all the entries
and the entire set of entries only used, say, 500
unique AttributeTypes, I think using
surrogate keys numbered from 1 to 500 would
result in a lot of memory savings and a performance increase.

That's because we'd only be storing some number X, ranging
from 1 to 500, 200 million times, rather than a longer string like
1.24l2.3.4.2.4 200 million times.
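
A rough back-of-the-envelope version of that comparison is sketched below. It rests on loud simplifying assumptions that are mine, not measurements: one such key per entry, each OID string stored inline at one byte per character, each surrogate key stored as a 2-byte short (enough for 1 to 500), and no object or map overhead counted. The MemoryEstimate class is a hypothetical helper, and a real JVM layout would differ.

```java
// Hypothetical estimate only; ignores JVM object headers, references and map overhead.
class MemoryEstimate {
    // Bytes needed if the key string is stored inline, one byte per character, once per entry.
    static long inlineStringBytes(String key, long entryCount) {
        return (long) key.length() * entryCount;
    }

    // Bytes needed if each entry stores a 2-byte surrogate key instead.
    static long shortKeyBytes(long entryCount) {
        return (long) Short.BYTES * entryCount;
    }

    public static void main(String[] args) {
        long entries = 200_000_000L;
        String oid = "1.24l2.3.4.2.4"; // the made-up OID from this message, 14 characters
        System.out.println("OID strings:    " + inlineStringBytes(oid, entries) + " bytes");
        System.out.println("Surrogate keys: " + shortKeyBytes(entries) + " bytes");
    }
}
```

Under those assumptions the OID strings come to 2,800,000,000 bytes against 400,000,000 bytes for the surrogate keys, so the saving is about 2.4 GB per attribute stored on every entry.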

Then whenever a query takes place for
1.24l2.3.4.2.4, it's converted into "1",
and then "1" is used to look up entry attributes.

Does that make any sense?
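
Here is a minimal sketch of the surrogate-key idea, so there is something concrete to shoot at. SurrogateKeyRegistry and its method names are hypothetical, invented for this message; nothing here is taken from the ApacheDS partition code. Keys are handed out in order starting at 1, and the reverse lookup lets the partition translate back to the OID when returning results.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of ApacheDS.
class SurrogateKeyRegistry {
    private final Map<String, Integer> oidToKey = new HashMap<>();
    private final List<String> keyToOid = new ArrayList<>();

    // Returns the existing key for this OID, or assigns the next one (starting at 1).
    int keyFor(String oid) {
        Integer existing = oidToKey.get(oid);
        if (existing != null) {
            return existing;
        }
        keyToOid.add(oid);
        int key = keyToOid.size();
        oidToKey.put(oid, key);
        return key;
    }

    // Reverse lookup: surrogate key back to the OID.
    String oidFor(int key) {
        return keyToOid.get(key - 1);
    }

    public static void main(String[] args) {
        SurrogateKeyRegistry registry = new SurrogateKeyRegistry();

        // Writing an entry: store the value under the surrogate key, not the OID.
        Map<Integer, String> entry = new HashMap<>();
        entry.put(registry.keyFor("1.24l2.3.4.2.4"), "myValue");

        // Querying: the OID in the query is converted to the key before the lookup.
        int key = registry.keyFor("1.24l2.3.4.2.4");
        System.out.println(key + " -> " + entry.get(key));
        System.out.println(registry.oidFor(key));
    }
}
```

The design choice here is that the mapping lives in one small table of at most a few hundred rows, while the 200M entries only ever carry the small integer.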

Thanks,
- Ole

SNIP
