directory-dev mailing list archives

From Howard Chu <...@symas.com>
Subject Re: Add perf issues
Date Sun, 09 May 2010 14:51:29 GMT
Emmanuel Lecharny wrote:
> Hi,
>
> while doing some perf tests on the Add operation, I'm getting blocked
> after having added around 3800 entries. After investigation, I found
> that some indexes are very expensive to update:
> - the ObjectClass index quickly gets saturated, and adding a new entry
> into it costs a hell of a lot of time.
> - same problem with the OneLevel index
> - Same problem with the SubLevel index
> - same problem with the RDN index
> - from time to time, the UUID index takes 300 ms to sync, but that's
> random and it may be due to some page split
> - all the other indexes behave perfectly well, assuming that they are not
> impacted, as we don't add any entry into them.
>
> Ok, now, a blind guess is that those indexes, except the RDN index, will
> all contain a reference to the newly created entry, assuming I'm adding N
> entries with a DN of cn=test<N>,ou=system. They will all be in the same
> level (thus the problem with the OneLevel and SubLevel indexes), with all
> having the same OC (top and person), thus the problem with the OC index.
> I'm a bit more surprised by the RDN index problem, except if we consider
> that the ou=system RDN will point to all its children, then it makes
> sense that we have the same problem.
>
> Basically, it seems that all the references are added in the same page
> which is deserialized and serialized for each addition, growing and
> growing. Do we have a size limit for a page after which it is moved to
> use a sub-tree ?
>
> What's the strategy when we will have millions of entries under the
> 'person' ObjectClass, or millions of entries in a flat directory? I
> thought this issue was fixed a while ago, and I'm a bit surprised that it
> is still present in the server, making it totally unusable. Or I'm missing
> a configuration parameter (like the max number of elements in a page
> before a sub-tree is created).
>
> Any help would be great!
>
Welcome to my nightmare. ;)

I'm still looking for a fast/cheap index data structure that can be sparsely 
populated, is lossless, and scales to billions of entries. (Currently our 
index is fast/cheap but lossy. If you fill a slot with a million entries and 
then remove every other one, it still records the slot as containing a million 
entries. I.e. we really don't handle sparse indices at those scales.)
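
To make the lossiness concrete, here is a minimal sketch in Python. It assumes a slot that keeps an explicit sorted ID list up to some threshold and then collapses to a bare (lo, hi) range. The class name LossySlot, the max_ids threshold, and the range representation are assumptions made up for illustration; this is not the actual OpenLDAP or ApacheDS code.

```python
from bisect import bisect_left


class LossySlot:
    """Hypothetical index slot: exact ID list while small, a bare range once large."""

    def __init__(self, max_ids=4):
        self.max_ids = max_ids  # assumed threshold before collapsing to a range
        self.ids = []           # sorted explicit IDs while the slot is small
        self.range = None       # (lo, hi) once collapsed; individual IDs are forgotten

    def add(self, entry_id):
        if self.range is not None:
            # Collapsed: only the bounds can be widened.
            lo, hi = self.range
            self.range = (min(lo, entry_id), max(hi, entry_id))
            return
        i = bisect_left(self.ids, entry_id)
        if i < len(self.ids) and self.ids[i] == entry_id:
            return  # already present
        self.ids.insert(i, entry_id)
        if len(self.ids) > self.max_ids:
            # Too many IDs: keep only the bounds (cheap, but lossy).
            self.range = (self.ids[0], self.ids[-1])
            self.ids = []

    def remove(self, entry_id):
        if self.range is not None:
            # Lossy: the slot no longer knows which IDs remain, so the range stays.
            return
        i = bisect_left(self.ids, entry_id)
        if i < len(self.ids) and self.ids[i] == entry_id:
            del self.ids[i]

    def candidate_count(self):
        """Number of entry IDs a search over this slot would have to consider."""
        if self.range is not None:
            lo, hi = self.range
            return hi - lo + 1
        return len(self.ids)


slot = LossySlot(max_ids=1000)
for entry_id in range(1, 1_000_001):
    slot.add(entry_id)
for entry_id in range(2, 1_000_001, 2):  # remove every other entry
    slot.remove(entry_id)
print(slot.candidate_count())  # still 1000000, although only 500000 IDs remain
```

Adds and removes stay cheap in this sketch because a collapsed slot is just two integers, but a search still has to treat the whole range as candidates. A lossless structure would need to track the gaps without giving up that cost profile.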
-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/
