directory-dev mailing list archives

From "Alex Karasulu" <akaras...@apache.org>
Subject Re: [ApacheDS] [JDBM Partition] Why it's a BAD idea to store the Entry + DN in the master table
Date Thu, 07 Aug 2008 02:06:54 GMT
Just an additional note.  We're best off consolidating the Updn and Ndn
indices into a single index that stores a composite object.  This way a
double access is not required to get at the values and a single record
manager can be used with one cache.  The index would be structured like so
for the forward and reverse directions:

forward:   ndn   =>   id
reverse:   id    =>   ndn \0 updn

The forward direction only contains the ndn value because only the ndn is
used for name-based lookups of entry IDs.  The updn is just there so we can
return the entry DN to the client the way we got it from them.  That is why
accessing the index by id should give back both.
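
As a rough illustration of the structure above, here is a minimal in-memory
sketch.  The class and method names (DnIndexSketch, add, idOf, ndnOf, updnOf)
are hypothetical and are not the ApacheDS or JDBM API; plain HashMaps stand in
for the two B-tree directions.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a consolidated DN index; not the ApacheDS API. */
class DnIndexSketch {
    /** Separator between the two parts of the composite value. */
    static final char SEP = '\0';

    private final Map<String, Long> forward = new HashMap<>(); // ndn -> id
    private final Map<Long, String> reverse = new HashMap<>(); // id -> ndn \0 updn

    /** Registers an entry under its normalized DN and keeps the user-provided DN. */
    public void add(long id, String ndn, String updn) {
        forward.put(ndn, id);
        reverse.put(id, ndn + SEP + updn);
    }

    /** Name-based lookup: only the ndn is needed. */
    public Long idOf(String ndn) {
        return forward.get(ndn);
    }

    /** Normalized DN part of the composite value, or null if the id is unknown. */
    public String ndnOf(long id) {
        String v = reverse.get(id);
        return v == null ? null : v.substring(0, v.indexOf(SEP));
    }

    /** User-provided DN part of the composite value, or null if the id is unknown. */
    public String updnOf(long id) {
        String v = reverse.get(id);
        return v == null ? null : v.substring(v.indexOf(SEP) + 1);
    }
}
```

One reverse lookup returns both DNs, so no second index access is needed.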

Now there's just one issue with using the reverse index.  When checking the
index to see if it contains a key with a specific value, the matching must
make sure only the ndn value is used for the comparison.  That's the only
penalty, and it's a small one.
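
A sketch of that comparison, assuming the composite value is stored as a
single string with a \0 separator.  ReverseMatchSketch, ndnPart and hasReverse
are made-up names for illustration, not existing ApacheDS methods.

```java
import java.util.Map;

/** Hypothetical helper showing an ndn-only match against composite values. */
class ReverseMatchSketch {
    static final char SEP = '\0';

    /** Extracts the ndn part from a composite "ndn \0 updn" value. */
    public static String ndnPart(String composite) {
        int i = composite.indexOf(SEP);
        return i < 0 ? composite : composite.substring(0, i);
    }

    /** True if the reverse index maps id to a value whose ndn part equals ndn. */
    public static boolean hasReverse(Map<Long, String> reverse, long id, String ndn) {
        String composite = reverse.get(id);
        // The updn half is ignored on purpose: it only exists for display.
        return composite != null && ndnPart(composite).equals(ndn);
    }
}
```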

I think with this single consolidated index and the removal of DNs from the
entry in the master table we'll see a considerable improvement in RDN
changes on containers with large numbers of children.
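
To make the saving concrete, here is a rough sketch of a rename that only
rewrites index records for the container and its descendants.  Everything in
it is an assumption for illustration: the RenameSketch class, the linear scan
over the forward map (a real B-tree index would not be walked this way), and
the simplification that each descendant's updn ends with the container's updn
verbatim.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: renaming a container touches only the DN index. */
class RenameSketch {
    static final char SEP = '\0';
    final Map<String, Long> forward = new HashMap<>(); // ndn -> id
    final Map<Long, String> reverse = new HashMap<>(); // id -> ndn \0 updn

    public void add(long id, String ndn, String updn) {
        forward.put(ndn, id);
        reverse.put(id, ndn + SEP + updn);
    }

    /** Rewrites the DN suffix of the container and its descendants; returns records changed. */
    public int rename(String oldNdn, String newNdn, String oldUpdn, String newUpdn) {
        // Collect affected ids first so the maps are not modified while iterating.
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<String, Long> e : forward.entrySet()) {
            String ndn = e.getKey();
            if (ndn.equals(oldNdn) || ndn.endsWith("," + oldNdn)) {
                ids.add(e.getValue());
            }
        }
        for (long id : ids) {
            String v = reverse.get(id);
            int i = v.indexOf(SEP);
            String ndn = v.substring(0, i);
            String updn = v.substring(i + 1);
            // Assumption: both DN forms end with the old container DN verbatim.
            String ndn2 = ndn.substring(0, ndn.length() - oldNdn.length()) + newNdn;
            String updn2 = updn.substring(0, updn.length() - oldUpdn.length()) + newUpdn;
            forward.remove(ndn);
            add(id, ndn2, updn2);
        }
        return ids.size();
    }
}
```

No entry is read from or written to a master table here; in this sketch the
children's entries never need to be deserialized at all.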

Alex

On Wed, Aug 6, 2008 at 9:50 PM, Alex Karasulu <akarasulu@apache.org> wrote:

> Hi all,
>
> The ServerEntry stores the DN of the entry.  I think this is good for
> better code organization.  However, storing the entry together with its DN
> into the master table is a very bad idea.  The DN should instead be managed
> in the NDN and DN indices.
>
> The reason why I'm suggesting this is because modifyDN operations will be
> extremely cumbersome when performed on a DN with many children.  It will
> require each child and the target entry to be read from and written back
> to the master table on disk just to change its DN.  Plus we still have the
> updn and ndn indices, which also get updated, so this is wasteful and causes
> a lot of unnecessary access operations.  Also note that we can store a lot
> more DNs in a cached JDBM page than we can entries.  So storing the DN in the
> master table will increase memory consumption along with cache turnover.
>
> If the modifyDN operation changes the RDN of the target, a master table
> access is unavoidable because the target's RDN attribute in the entry must
> change. However the children of the target can avoid a master table
> read-write operation since their RDN attributes do not change.  This is
> again only avoidable if we do not store the DN in the master.  Ideally you
> just want to update the indices when entries are moved around.
>
> I've been against this drive to push the DN into the master table combined
> with the entry from day one, along with the drive to remove the NDN and UPDN
> indices.  These issues are the obvious reason.  I just did not have
> the time to clarify exactly why until I started looking into this bug which
> was recently introduced:
>
>
>  *DIRSERVER-1224 <https://issues.apache.org/jira/browse/DIRSERVER-1224>*
> As I reviewed the code it was clear that this will cost much more on all
> the flavors of ModifyDN operations.  Just imagine a ModifyDN to rename
> ou=People to ou=Users if it contains 100M users in it.  I'd recommend we
> agree to fix this as recommended then I can push a JIRA on it so this can be
> fixed in the future (but before 2.0 since the correction will cause db
> incompatibilities).
>
> Alex
>
> --
> Microsoft gives you Windows, Linux gives you the whole house ...
>



-- 
Microsoft gives you Windows, Linux gives you the whole house ...
