Hi Emmanuel,

On Sun, May 9, 2010 at 5:40 PM, Emmanuel Lecharny <elecharny@gmail.com> wrote:

While doing some perf tests on the Add operation, I'm getting blocked after having added around 3800 entries. After investigation, I found that some indexes are very expensive to update:
- the ObjectClass index quickly gets saturated, and adding a new entry into it costs a hell of a lot of time.
- same problem with the OneLevel index
- same problem with the SubLevel index
- same problem with the RDN index
- from time to time, the UUID index takes 300 ms to sync, but that's random and it may be due to some page split
- all the other indexes behave perfectly well, presumably because they are not impacted, as we don't add any entry into them.

Ok, now, a blind guess is that those indexes, except the RDN index, will all contain a reference to the newly created entry, assuming I'm adding N entries with a DN of cn=test<N>,ou=system. They will all be at the same level (thus the problem with the OneLevel and SubLevel indexes), and all have the same OC (top and person), thus the problem with the OC index. I'm a bit more surprised by the RDN index problem, except if we consider that the ou=system RDN points to all its children; then it makes sense that we have the same problem.

Basically, it seems that all the references are added in the same page, which is deserialized and serialized for each addition, growing and growing. Do we have a size limit for a page after which it is moved to use a sub-tree?

No, there is no size limit once a key's values are in a sub-tree. The switch happens when a key has more than 512 values (I think), at which point the in-memory values for that key are moved over to a B+tree.
What's the strategy when we have millions of entries under the 'person' ObjectClass, or millions of entries in a flat directory?

Logic was added to the index implementation to switch data structures for a key having more than a certain threshold of values. If I remember correctly, the default for the threshold was 512 values for the same key. So if we have an ou index and the 'Engineering' key has more than 512 values, the data structure switches from a Collection object to a BTree which only stores the values of this key.
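
To make the intended behaviour concrete, here is a minimal sketch of that switch. It is not the actual index code: the class name KeyValues, its methods, and the use of a TreeSet as a stand-in for the per-key BTree are all assumptions for illustration, and 512 is only the default as I remember it.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.TreeSet;

/** Hypothetical sketch of the values stored under one index key (not the server's code). */
class KeyValues {
    /** Assumed default threshold; the real value should be checked in the index code. */
    static final int THRESHOLD = 512;

    /** Starts as a plain in-memory collection that is rewritten as a whole. */
    private Collection<Long> values = new ArrayList<>();
    private boolean treeBacked = false;

    /** Adds an entry id under this key, switching structures once past the threshold. */
    void add(long entryId) {
        if (values.contains(entryId)) {
            return; // a value is stored at most once per key
        }
        values.add(entryId);
        if (!treeBacked && values.size() > THRESHOLD) {
            // TreeSet stands in for a dedicated BTree holding only this key's values.
            values = new TreeSet<>(values);
            treeBacked = true;
        }
    }

    boolean isTreeBacked() {
        return treeBacked;
    }

    int size() {
        return values.size();
    }

    public static void main(String[] args) {
        KeyValues engineering = new KeyValues();
        for (long id = 1; id <= 3800; id++) {
            engineering.add(id);
        }
        System.out.println("values=" + engineering.size()
                + " treeBacked=" + engineering.isTreeBacked());
    }
}
```

With this shape, the first 512 values live in the small collection and the 513th add triggers the one-time move to the tree. If the server never reaches the tree-backed state for 'person' or ou=system, that would match the slowdown you describe.
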
I thought this issue was fixed a while ago, and I'm a bit surprised that it is still present in the server, making it totally unusable.

Yeah, this should have been solved a long time ago, you are right. It might have failed, or the problem may have crept back into the picture. Let's just make sure the data structure switch is in fact taking place after passing the threshold.
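
As a rough illustration of why a missing switch would hurt this much, here is a back-of-the-envelope cost model. It assumes, as you guessed, that every add rewrites all the values stored under the key, and it counts one unit per value written. The class and method names are made up for the example, and the model ignores B+tree node overhead and the one-time copy at the switch.

```java
/** Hypothetical cost model: how many values get serialized while adding n values under one key. */
class RewriteCost {
    /** Every add rewrites the whole value set: 1 + 2 + ... + n values. */
    static long fullRewrite(long n) {
        return n * (n + 1) / 2;
    }

    /**
     * Whole-set rewrites only up to the threshold; each later add is counted
     * as a single value written (a simplification of a B+tree insert).
     */
    static long withSwitch(long n, long threshold) {
        long small = Math.min(n, threshold);
        return fullRewrite(small) + Math.max(0, n - threshold);
    }

    public static void main(String[] args) {
        long n = 3800;        // entries added in the perf test
        long threshold = 512; // assumed default threshold
        System.out.println("no switch:   " + fullRewrite(n));
        System.out.println("with switch: " + withSwitch(n, threshold));
    }
}
```

Under these assumptions, 3800 adds without the switch write 3800 * 3801 / 2 = 7,221,900 values per affected key, against 131,328 + 3,288 = 134,616 when the switch kicks in at 512. The absolute numbers mean little, but the quadratic growth is the kind of thing that would stall a run at a few thousand entries.
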
Or I'm missing a configuration parameter (like the max number of elements in a page before a sub-tree is created).

I don't think so. It defaults to some threshold value. Perhaps the threshold needs to be lowered too, but it should not be this bad.

Alex Karasulu
My Blog :: http://www.jroller.com/akarasulu/
Apache Directory Server :: http://directory.apache.org
Apache MINA :: http://mina.apache.org
To set up a meeting with me: http://tungle.me/AlexKarasulu