directory-commits mailing list archives

From conflue...@apache.org
Subject [CONF] Apache Directory Server v1.5 > Schema Subsystem Redesign
Date Tue, 06 Oct 2009 04:39:00 GMT
<html>
<head>
    <base href="http://cwiki.apache.org/confluence">
            <link rel="stylesheet" href="/confluence/s/1519/1/1/_/styles/combined.css?spaceKey=DIRxSRVx11&amp;forWysiwyg=true"
type="text/css">
    </head>
<body style="background-color: white" bgcolor="white">
<div id="pageContent">
<div id="notificationFormat">
<div class="wiki-content">
<div class="email">
     <h2><a href="http://cwiki.apache.org/confluence/display/DIRxSRVx11/Schema+Subsystem+Redesign">Schema
Subsystem Redesign</a></h2>
     <h4>Page <b>edited</b> by             <a href="http://cwiki.apache.org/confluence/display/~elecharny">Emmanuel
Lécharny</a>
    </h4>
     
          <br/>
     <div class="notificationGreySide">
         <div class='panelMacro'><table class='noteMacro'><colgroup><col
width='24'><col></colgroup><tr><td valign='top'><img src="/confluence/images/icons/emoticons/warning.gif"
width="16" height="16" align="absmiddle" alt="" border="0"></td><td><b>Work
in progress</b><br /><p>This site is in the process of being reviewed and
updated.</p></td></tr></table></div>
<table class="sectionMacro" border="0" cellpadding="5" cellspacing="0" width="100%"><tbody><tr>
<td class="confluenceTd" valign="top" width="70%">

<h1><a name="SchemaSubsystemRedesign-Purpose"></a>Purpose</h1>

<p>This document is a functional specification for a new schema subsystem design.  The
new design will enable dynamic yet persistent updates to schema elements within the server.
 Furthermore, entire collections of schema elements, each referred to as "<b>a schema</b>",
will be [un]loadable on the fly.  This new mechanism will also expose a persistent partition
attached to the namespace at the ou=schema naming context.  It will contain a well-structured,
entry-based view of the schema objects managed by the server, with better search and administration
capabilities.  This is all in addition to the subschemaSubentry demanded by the LDAP protocol,
with its attribute-based descriptions of all schema elements.</p>

<p>Such a new schema subsystem will make it easier to use the server and manage the
schema maintained within it.  Furthermore it will greatly facilitate replication since schema
elements will simply be entries within the server.</p></td>
<td class="confluenceTd" valign="top" width="30%">

<h1><a name="SchemaSubsystemRedesign-DocumentTODO"></a>Document TODO</h1>

<ul>
	<li>Provide in-server execution flow diagrams for schema loading, schema discovery,
schema information reading, etc.</li>
	<li>Consider warnings, information attached by Ersin and merge them into the document.</li>
</ul>
</td></tr></tbody></table>

<h1><a name="SchemaSubsystemRedesign-PresentDaySchemaSubsystem"></a>Present
Day Schema Subsystem</h1>

<p>The schema subsystem today is extremely primitive and only allows for a read-only
schema.  The schema subsystem uses a system of schema object producers to generate schema
objects from Java class files.  These class files are generated from schema files written in
OpenLDAP syntax.  For a change to take effect, recompilation and a restart are required.</p>
<div class='panelMacro'><table class='noteMacro'><colgroup><col width='24'><col></colgroup><tr><td
valign='top'><img src="/confluence/images/icons/emoticons/warning.gif" width="16" height="16"
align="absmiddle" alt="" border="0"></td><td>
<p>The grammar currently implemented to parse schema files is based on the OpenLDAP syntax,
but the one we should implement must be based on RFC 4512. It is summarized in <a href="http://cwiki.apache.org/confluence/display/DIRxSRVx11/Schema+loading"
rel="nofollow">Schema Loading</a>.</p></td></tr></table></div>

<p><b>Why such a primitive, inflexible solution?</b> This design was originally
intended as a simple bootstrapping mechanism to get a small set of schema elements into memory
so they could be used to initialize a partition.  We hoped that this partition would later store
schema information while allowing for persistent updates to schema objects.  However, we never
found enough time to implement this subsystem in its entirety, and we fell back to using this bootstrap
mechanism to store all schema elements for the server.</p>

<h1><a name="SchemaSubsystemRedesign-TwoViewsofSchemaInformation"></a>Two
Views of Schema Information</h1>

<p>In the new design, two separate views will be provided for describing and interacting
with schema information stored within the server.  One view is virtual and uses a single entry,
the subschemaSubentry (<b>SSSE</b>), as required by the protocol to describe schema
elements within attribute values.  The other view is non-virtual and not specified by the
LDAP protocol.</p>

<div class='panelMacro'><table class='infoMacro'><colgroup><col width='24'><col></colgroup><tr><td
valign='top'><img src="/confluence/images/icons/emoticons/information.gif" width="16"
height="16" align="absmiddle" alt="" border="0"></td><td><b>RFC4512,
4.2.  Subschema Subentries</b><br /><div class="preformatted panel" style="border-width:
1px;"><div class="preformattedContent panelContent">
<pre>Subschema (sub)entries are used for administering information about
   the directory schema.  A single subschema (sub)entry contains all
   schema definitions (see Section 4.1) used by entries in a particular
   part of the directory tree.

   Servers that follow X.500(93) models SHOULD implement subschema using
   the X.500 subschema mechanisms (as detailed in Section 12 of
   [X.501]), so these are not ordinary object entries but subentries
   (see Section 3.2).  LDAP clients SHOULD NOT assume that servers
   implement any of the other aspects of X.500 subschema.

   Servers MAY allow subschema modification.  Procedures for subschema
   modification are discussed in Section 14.5 of [X.501].

   A server that masters entries and permits clients to modify these
   entries SHALL implement and provide access to these subschema
   (sub)entries including providing a 'subschemaSubentry' attribute in
   each modifiable entry.  This is so clients may discover the
   attributes and object classes that are permitted to be present.  It
   is strongly RECOMMENDED that all other servers implement this as
   well.

   The value of the 'subschemaSubentry' attribute is the name of the
   subschema (sub)entry holding the subschema controlling the entry.

      ( 2.5.18.10 NAME 'subschemaSubentry'
        EQUALITY distinguishedNameMatch
        SYNTAX 1.3.6.1.4.1.1466.115.121.1.12
        SINGLE-VALUE NO-USER-MODIFICATION
        USAGE directoryOperation )

   The 'distinguishedNameMatch' matching rule and the DistinguishedName
   (1.3.6.1.4.1.1466.115.121.1.12) syntax are defined in [RFC4517].

   Subschema is held in (sub)entries belonging to the subschema
   auxiliary object class.

      ( 2.5.20.1 NAME 'subschema' AUXILIARY
        MAY ( dITStructureRules $ nameForms $ ditContentRules $
          objectClasses $ attributeTypes $ matchingRules $
          matchingRuleUse ) )

   The 'ldapSyntaxes' operational attribute may also be present in
   subschema entries.

   Servers MAY provide additional attributes (described in other
   documents) in subschema (sub)entries.

   Servers SHOULD provide the attributes 'createTimestamp' and
   'modifyTimestamp' in subschema (sub)entries, in order to allow
   clients to maintain their caches of schema information.

   The following subsections provide attribute type definitions for each
   of schema definition attribute types.
</pre>
</div></div></td></tr></table></div>

<h2><a name="SchemaSubsystemRedesign-SchemaEntriesintheou%3DschemaPartition"></a>Schema
Entries in the ou=schema Partition</h2>

<p>A special partition will be designed using an LDIFPartition implementation. </p>
<table class="sectionMacro" border="0" cellpadding="5" cellspacing="0" width="100%"><tbody><tr>
<td class="confluenceTd" valign="top">
<p>This partition will contain entries describing individual schema elements and groups
of these elements, which we call schemas.  The structure will be rather simple.  The image to the
right gives a quick look at what it might look like.</p>

<h3><a name="SchemaSubsystemRedesign-NewobjectClassesforSchemaEntities"></a>New
objectClasses for Schema Entities</h3>

<p>Special objectClasses will need to be defined for schema specific entities to be
modeled as entries instead of as attribute values in the schemaSubentry.  Some objectClasses
which will need to be defined are:</p>
<ul>
	<li>attributeType</li>
	<li>objectClass</li>
	<li>ditStructure</li>
	<li>ditContent</li>
	<li>nameForm</li>
	<li>syntax</li>
	<li>matchingRule</li>
	<li>matchingRuleUse</li>
	<li>syntaxChecker</li>
	<li>normalizer</li>
	<li>comparator</li>
</ul>


<p>These objectClasses will model all of the LDAP schema elements and more.  The last
three elements in the list above are ApacheDS specific: syntaxCheckers,
normalizers, and comparators.  ApacheDS uses these low-level constructs to build the syntaxes and
matchingRules that are used throughout the system.</p>
<div class='panelMacro'><table class='noteMacro'><colgroup><col width='24'><col></colgroup><tr><td
valign='top'><img src="/confluence/images/icons/emoticons/warning.gif" width="16" height="16"
align="absmiddle" alt="" border="0"></td><td>
<p>A SyntaxChecker verifies that an attribute value respects the syntax it is declared with.</p>

<p>Comparators are implementations used by MatchingRules to compare values.</p>

<p>Normalizers are a little bit different: they transform an attribute
value according to the rules given in various RFCs for each AttributeType (for
instance, names should not be case sensitive, multiple consecutive white spaces should be
replaced by a single white space, etc.). There is no formal description of a normalizer with
an associated OID.</p></td></tr></table></div>

<p>Normalizers are used by indices and other processes to generate canonical representations
of attributeType values so they can be compared.  Normalizers are used to normalize values
in entries as well as in filter expressions.</p>

<p>Comparators compare values, and these constructs implement the java.util.Comparator
interface.  They are used to sort values for insertion into indices and to evaluate greater-than
and less-than expressions where attributeType values must be compared.</p>
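<p>As a rough illustration of how a normalizer and a comparator cooperate, here is a minimal sketch in Python (it is not ApacheDS code). The function names <tt>normalize_directory_string</tt> and <tt>compare_values</tt> are invented for this example, and the normalization shown (trim, collapse white space, lower-case) is only the case-insensitive string rule mentioned above.</p>

```python
import functools
import re


def normalize_directory_string(value):
    """Hypothetical normalizer: trim, collapse runs of white space, lower-case."""
    return re.sub(r"\s+", " ", value.strip()).lower()


def compare_values(left, right):
    """Hypothetical comparator: order two values by their normalized form.

    Returns a negative number, zero, or a positive number.
    """
    a = normalize_directory_string(left)
    b = normalize_directory_string(right)
    return (a > b) - (a < b)


# An index can keep raw values sorted by using the comparator.
values = ["Smith", "  john   DOE ", "adams"]
ordered = sorted(values, key=functools.cmp_to_key(compare_values))
```

<p>The same normalizer would be applied to the assertion value of a filter before comparison, so that <tt>(cn=John  Doe)</tt> and a stored value of <tt>john doe</tt> are treated as equal.</p>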

<p>SyntaxCheckers are used to constrain values so the correct values are used by entries
for attributeTypes of that syntax.</p></td>
<td class="confluenceTd" valign="top" width="30%">
<p><img src="/confluence/download/attachments/31021/schema-partition.png" align="absmiddle"
border="0" /></p>
<div class='panelMacro'><table class='noteMacro'><colgroup><col width='24'><col></colgroup><tr><td
valign='top'><img src="/confluence/images/icons/emoticons/warning.gif" width="16" height="16"
align="absmiddle" alt="" border="0"></td><td><p>The ObjectClass entry should
also contain the description, and the Obsolete flag if it is set to true.</p></td></tr></table></div>
</td></tr></tbody></table>



<p>All those ObjectClasses are defined in the <a href="/confluence/display/DIRxSRVx11/MetaSchema"
title="MetaSchema">MetaSchema</a> page</p>

<h3><a name="SchemaSubsystemRedesign-NewattributeTypesforSchemaAttributes"></a>New
attributeTypes for Schema Attributes</h3>

<p>Several new attributeTypes will need to be defined to make all this work.  We can
easily contrive a list to do this by transposing the current attributes used in various schema
descriptions.  We've already done this in the image above.  For example we would need the
following attributeType descriptions to properly describe an objectClassDescription:</p>
<ul>
	<li>oid</li>
	<li>desc</li>
	<li>name (MV stands for Multi Valued)</li>
	<li>sup (MV)</li>
	<li>must (MV)</li>
	<li>may (MV)</li>
	<li>obsolete</li>
</ul>


<p>It is clear from the way these attributes are used that, for example, the MUST list,
the MAY list, the SUP list and the NAME are all multivalued.  The others are single valued.
 This derivation needs to be continued for all the various attributes used to describe all
the schema entities in LDAP.</p>
<div class='panelMacro'><table class='warningMacro'><colgroup><col width='24'><col></colgroup><tr><td
valign='top'><img src="/confluence/images/icons/emoticons/forbidden.gif" width="16"
height="16" align="absmiddle" alt="" border="0"></td><td><p>So far, and
according to RFC 4512, <b>SUP</b> and <b>NAME</b> are multivalued.</p></td></tr></table></div>

<p>All those AttributeTypes are defined in the <a href="/confluence/display/DIRxSRVx11/MetaSchema"
title="MetaSchema">MetaSchema</a> page</p>


<h3><a name="SchemaSubsystemRedesign-SchemaSubsystemStartup"></a>Schema
Subsystem Startup</h3>

<p>The schema subsystem startup is pretty simple. If we already have an existing schemaPartition,
we read all the schema entities stored in this partition and initialize the Registries
accordingly. At this point, the schema <b>must</b> be valid, otherwise the server
won't start. If we don't yet have a schemaPartition on disk, then we extract all
the base schema entities from a jar, boot an LdifPartition instance, and load all the schema entities into
the Registries.</p>

<p>The schema subsystem is now ready.</p>
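<p>The startup decision described above can be sketched as follows. This is a simplified model, not the ApacheDS implementation: a plain directory copy stands in for extracting the base schema from a jar, a dictionary stands in for the Registries, the "validity" check is reduced to a trivial test, and the function names and the sample DN are hypothetical.</p>

```python
import shutil
import tempfile
from pathlib import Path


def prepare_schema_partition(partition_dir, bundled_dir):
    """Return the partition directory, creating it from bundled defaults if absent."""
    partition_dir = Path(partition_dir)
    if not partition_dir.exists():
        # First start: copy the base schema entities shipped with the server.
        shutil.copytree(bundled_dir, partition_dir)
    return partition_dir


def load_registries(partition_dir):
    """Read every *.ldif file into a dict keyed by file stem (stand-in for Registries)."""
    registries = {}
    for ldif_file in sorted(Path(partition_dir).rglob("*.ldif")):
        text = ldif_file.read_text(encoding="utf-8")
        if not text.startswith("dn:"):
            # An invalid schema must stop the server from starting.
            raise ValueError("invalid schema entry: " + ldif_file.name)
        registries[ldif_file.stem] = text
    return registries


# Demonstration with throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    bundled = Path(tmp) / "bundled"
    bundled.mkdir()
    (bundled / "person.ldif").write_text(
        "dn: oid=2.5.6.6,ou=schema\n", encoding="utf-8"  # hypothetical DN
    )
    partition = prepare_schema_partition(Path(tmp) / "partition", bundled)
    registries = load_registries(partition)
```

<p>On a second start the partition directory already exists, so the copy is skipped and any changes made to the schema since the first start are preserved.</p>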

<h3><a name="SchemaSubsystemRedesign-UsingtheX.500AdministrativeModel"></a>Using
the X.500 Administrative Model</h3>

<p>X.500 provides a powerful model for administering schema information directly within a DIT,
so that different authoritative areas can exist with different schemas in effect.  Doing
so is simple.  At the apex of an SAA (Schema Authoritative Area) a subentry is inserted. 
This subentry contains a simple subtree specification, <b>{}</b>,
which selects the whole area underneath the apex (which is called the administration point or
AP).</p>
<div class='panelMacro'><table class='warningMacro'><colgroup><col width='24'><col></colgroup><tr><td
valign='top'><img src="/confluence/images/icons/emoticons/forbidden.gif" width="16"
height="16" align="absmiddle" alt="" border="0"></td><td><p>What is called
"Schema Authoritative Area" above is called "Subschema Administrative Area" in the standard
(X.501-2005, 11.5.2).<br/>
What is called "administration point" above is called "Administrative Point" in the standard
(X.501-2005, 11.5.4).</p></td></tr></table></div>
<p>This subentry will have a cn attribute for its name, which we will always default to "schema".
It will be used as the RDN of the subentry as well.  Besides the cn and subtreeSpecification
attributes, this entry usually contains what we are familiar with in the LDAP world, namely
attributes such as:</p>
<ul>
	<li>attributeTypes</li>
	<li>syntaxes</li>
	<li>objectClasses</li>
	<li>matchingRules</li>
	<li>ditContentRules</li>
	<li>ditStructureRules</li>
	<li>nameForms</li>
	<li>matchingRuleUses</li>
</ul>


<p>These are the attributes that go into the subschemaSubentry.  The problem with LDAP
is that it never realized that there can be more than one schema subentry in the portion of
the DIT served by a DSA.  Every entry within a DSA contains a subschemaSubentry attribute
pointing to the schema subentry containing schema information governing that entry.  This
includes the Root DSE.</p>
<div class='panelMacro'><table class='noteMacro'><colgroup><col width='24'><col></colgroup><tr><td
valign='top'><img src="/confluence/images/icons/emoticons/warning.gif" width="16" height="16"
align="absmiddle" alt="" border="0"></td><td>
<p>The big problem here is the subschemaSubentry attribute is single valued and that
makes sense.  Only one set of schema rules can govern the structure of an entry at one time.
 The problem however is the fact that most LDAP browsers read the subschemaSubentry in the
RootDSE to find the schema in effect for the whole DIT mastered by the DSA.  This presumes
there is one schema in effect for the entire DIT and there are no SAAs.  Browsers will just
presume this.</p></td></tr></table></div>

<div class='panelMacro'><table class='infoMacro'><colgroup><col width='24'><col></colgroup><tr><td
valign='top'><img src="/confluence/images/icons/emoticons/information.gif" width="16"
height="16" align="absmiddle" alt="" border="0"></td><td><b>RFC4512,
4.4.  Subschema Discovery</b><br /><div class="preformatted panel" style="border-width:
1px;"><div class="preformattedContent panelContent">
<pre>To discover the DN of the subschema (sub)entry holding the subschema
   controlling a particular entry, a client reads that entry's
   'subschemaSubentry' operational attribute.  To read schema attributes
   from the subschema (sub)entry, clients MUST issue a Search operation
   [RFC4511] where baseObject is the DN of the subschema (sub)entry,
   scope is baseObject, filter is "(objectClass=subschema)" [RFC4515],
   and the attributes field lists the names of the desired schema
   attributes (as they are operational).  Note: the
   "(objectClass=subschema)" filter allows LDAP servers that gateway to
   X.500 to detect that subentry information is being requested.

   Clients SHOULD NOT assume that a published subschema is complete,
   that the server supports all of the schema elements it publishes, or
   that the server does not support an unpublished element.
</pre>
</div></div></td></tr></table></div>
<p>In X.500 multiple SAAs can exist because of a powerful administrative model.  We
can find a way to merge these worlds together.  Essentially the subentry referenced by the
RootDSE will point to the global schema knowledge that has been enabled within the server.
 That includes every schema under ou=schema which has a schemaEnabled (to be defined) attribute
set to TRUE.  All schema objects under these enabled schemas are loaded into the global registries.
 The subentry referred to by the RootDSE will hence expose all the schema elements within
the global registries as one entry.</p>
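<p>A minimal sketch of how the global subentry could be assembled from the enabled schemas is shown below. The in-memory layout of <tt>ou=schema</tt>, the function name and the placeholder OID <tt>1.1.1</tt> are assumptions made for the example; only the <tt>schemaEnabled</tt> flag and the idea of merging all enabled schemas into one entry come from the text above.</p>

```python
# Hypothetical in-memory picture of ou=schema: one dict per schema.
schemas = {
    "core": {
        "schemaEnabled": True,
        "attributeTypes": ["( 2.5.4.3 NAME 'cn' SUP name )"],
        "objectClasses": [
            "( 2.5.6.6 NAME 'person' SUP top STRUCTURAL MUST ( sn $ cn ) )"
        ],
    },
    "samba": {
        "schemaEnabled": False,
        # 1.1.1 is a placeholder OID, not a real Samba attribute.
        "attributeTypes": ["( 1.1.1 NAME 'exampleAttr' )"],
        "objectClasses": [],
    },
}


def build_global_subentry(all_schemas):
    """Merge the elements of every enabled schema into one virtual subentry."""
    subentry = {"cn": ["schema"], "objectClass": ["top", "subentry", "subschema"]}
    for name in sorted(all_schemas):
        schema = all_schemas[name]
        if not schema.get("schemaEnabled"):
            continue  # disabled schemas are invisible in the global view
        for attribute in ("attributeTypes", "objectClasses"):
            subentry.setdefault(attribute, []).extend(schema.get(attribute, []))
    return subentry


global_subentry = build_global_subentry(schemas)
```

<p>Because the subentry is rebuilt from the registries on each read, toggling <tt>schemaEnabled</tt> on a schema changes what clients see without any entry being rewritten.</p>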

<p>Note that this subentry will be completely virtual.  Browsers will hence see all
schema elements enabled in the server, yet, as we'll see, different schemas will be enforced
in different areas of the DIT served by the DSA.</p>

<p>Hence this is the other view of the schema information which we were referring to.
 This view is the one that LDAP clients are used to.  It is also the one that has been the
most cumbersome.  This view will be constructed by the schema interceptor whenever there are
requests to read this global schema subentry.</p>

<p>If no SAAs are defined then the global schema takes effect throughout the entire
DIT served by the DSA.  If however an SAA is defined then a new administrative point is specified
with a schema subentry.  As an example let's use dc=example,dc=com as the AP of the SAA.
 This AP will have a schema subentry which contains a cn used as its RDN.  Its value will
always be "schema".  This subentry will contain a subtreeSpecification attribute which will
always be set to the value "{}".  It can then contain the following attributes:</p>
<ul>
	<li>schema</li>
	<li>destinationSchema (not readable)</li>
	<li>defaultDestinationSchema</li>
</ul>


<p>The first attribute specifies the schemas that are in effect for this SAA.  By referencing
the commonName of the schemas defined under ou=schema, the schema interceptor will inject all
the attributes needed from the registries into that schema subentry.  An example makes this
clearer.  Say you have a <b>schema</b> attribute value set to 'samba' in
the subentry cn=schema,dc=example,dc=com.  When the schema subentry is read by a client, that
client sees all the attributeTypes, objectClasses, syntaxes, etc. that are defined in the
samba schema under cn=samba,ou=schema.  The interceptor injects these additional synthetic
attributes into the subentry when it is returned from the server.  Furthermore, when schema
checks are enforced on entries in that SAA, the schema values referenced in that subentry
are used to determine the effective schema to use.</p>

<p>So we see we can use the X.500 administrative model and define different SAAs to
handle schema differently in different regions of the DIT while maintaining a global schema.</p>
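<p>One way to picture how the effective schema for an entry might be chosen is sketched below. It assumes that the nearest enclosing administrative point wins and that DNs are already normalized and contain no escaped commas; both are simplifications, and the function name is hypothetical.</p>

```python
def effective_schemas(entry_dn, saa_subentries, global_schemas):
    """Pick the schema names governing entry_dn.

    saa_subentries maps an administrative point DN to the values of the
    'schema' attribute in its cn=schema subentry. If no SAA encloses the
    entry, the globally enabled schemas apply.
    """
    rdns = [rdn.strip().lower() for rdn in entry_dn.split(",")]
    # Walk from the entry itself up toward the root, nearest AP first.
    for start in range(len(rdns)):
        candidate = ",".join(rdns[start:])
        if candidate in saa_subentries:
            return list(saa_subentries[candidate])
    return list(global_schemas)


saas = {"dc=example,dc=com": ["samba"]}
inside = effective_schemas("uid=jdoe,ou=people,dc=example,dc=com", saas, ["core", "samba"])
outside = effective_schemas("uid=admin,ou=system", saas, ["core", "samba"])
```

<p>Here the entry under <tt>dc=example,dc=com</tt> is checked against the samba schema only, while the entry under <tt>ou=system</tt> falls back to the global schema.</p>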

<h3><a name="SchemaSubsystemRedesign-HandlingSchemaAddModifications"></a>Handling
Schema Add Modifications</h3>

<table class="sectionMacro" border="0" cellpadding="5" cellspacing="0" width="100%"><tbody><tr>
<td class="confluenceTd" valign="top" width="50%">
<p>The destinationSchema and defaultDestinationSchema attributes factor in when new schema
objects are added using modify operations via SAA subentries.  On such operations we have
to add the new schema elements somewhere under ou=schema, but under which schema becomes the question.
 Administrators can specify which schema to add the new schema entity to using the destinationSchema
attribute, which is SINGLE-VALUED. The destinationSchema attribute cannot be read and can be used
only for schema updates. If its value is not set during a schema update, the new schema elements
will be added to the schema specified with the defaultDestinationSchema attribute, which
is also SINGLE-VALUED.</p>

<p>So if destinationSchema is set to the value 'samba' (or it has not been set but
defaultDestinationSchema is set to 'samba') the new entry is created under the samba area (cn=samba,ou=schema)
in the respective position for the type of schema element created.  The interesting thing
is that adding this new entity in this SAA automatically adds the entity to the global schema,
and hence to the global schema subentry referenced by the RootDSE. Another side effect of this
is that the entity also appears in the subentry of any SAA that references the samba
schema using the <b>schema</b> attribute in its schema subentry.  If a completely
different schema private to the SAA is desired, a novel name can be given and ApacheDS should
create the new schema entry under ou=schema to contain those new elements.</p>

<p>Now schema changes can also be performed on the schema subentry referenced by the
RootDSE.  Let's call this the global schema subentry.  When add mods are performed here we
don't have a destinationSchema available unlike subentries in SAAs.  In this case ApacheDS
can use a schema called 'other' which includes all objects that have not been classified yet.</p></td>
<td class="confluenceTd" valign="top">
<div class='panelMacro'><table class='infoMacro'><colgroup><col width='24'><col></colgroup><tr><td
valign='top'><img src="/confluence/images/icons/emoticons/information.gif" width="16"
height="16" align="absmiddle" alt="" border="0"></td><td><b>An alternative
way to using destinationSchema attribute: Standard Schema Extensions</b><br />
<p>LDAP servers of the Netscape family support a schema element description extension called
'X-ORIGIN'. For example, the following objectClassDescription, as read from the objectClasses attribute,
says that the origin of the 'person' object class is RFC 2252:</p>

<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent
panelContent">
<pre>objectclasses: ( 2.5.6.6 NAME 'person' DESC 'Standard Person 
Object  Class' SUP top MUST (objectclass $ sn $ cn) MAY 
(description $  seealso $ telephoneNumber $ userPassword) 
X-ORIGIN 'RFC 2252' )
</pre>
</div></div>

<p>We want to use this mechanism in a better way to handle destination/source schema
information for schema elements. We propose the following extension:</p>

<ul>
	<li>X-SCHEMA-NAME</li>
</ul>


<p>This extension is useful when you need source schema information on read operations
over <tt>cn=schema</tt>. However, for bulk updates, the destinationSchema attribute
is more useful. We can still support only destinationSchema for schema modifications and
provide X-SCHEMA-NAME as a virtual attribute on schema reads. We can even support both
mechanisms. So we have the following implementation possibilities:</p>

<ul>
	<li>Provide support for destinationSchema on schema modify operations and do not support
X-SCHEMA-NAME extension.</li>
	<li>Provide support for X-SCHEMA-NAME extension on both schema read and modify operations
and do not support destinationSchema.</li>
	<li>Provide support for destinationSchema on schema modify operations and provide support
for virtual X-SCHEMA-NAME extension on schema read operations.</li>
	<li>Provide support for destinationSchema on schema modify operations and provide support
for the X-SCHEMA-NAME extension on both schema read and modify operations. (This may cause some
conflicts.)</li>
</ul>
</td></tr></table></div></td></tr></tbody></table>
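<p>The destination rules described in this section can be sketched as a small resolution function. The function name is hypothetical, and raising an error when neither attribute is set on an SAA subentry is an assumption, since the text does not say what should happen in that case.</p>

```python
def resolve_destination(subentry=None):
    """Return the DN under ou=schema that receives a newly added schema element.

    subentry is the SAA schema subentry as a dict, or None when the change
    is made through the global schema subentry referenced by the RootDSE.
    """
    if subentry is None:
        # Global subentry: unclassified elements go to the 'other' schema.
        name = "other"
    else:
        name = subentry.get("destinationSchema") or subentry.get(
            "defaultDestinationSchema"
        )
        if name is None:
            # Assumed behaviour: refuse the update rather than guess.
            raise ValueError("no destinationSchema or defaultDestinationSchema set")
    return "cn=" + name + ",ou=schema"


explicit = resolve_destination(
    {"destinationSchema": "samba", "defaultDestinationSchema": "core"}
)
fallback = resolve_destination({"defaultDestinationSchema": "samba"})
from_global = resolve_destination()
```

<p>The element itself would then be created below the returned DN, at the position that matches its type.</p>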

<h3><a name="SchemaSubsystemRedesign-HandlingSchemaDeleteModifications"></a>Handling
Schema Delete Modifications</h3>

<p>When schema elements are deleted we know which schemas they belong to and can appropriately
remove them from under the ou=schema area.  Again this has side effects: the global schema
subentry shows the delete, and some SAAs may also show the delete if they reference
the schema from which the schema entity was deleted.  These are natural implications.</p>

<p>Replace modifications don't even deserve a section here, since a replace is just a matter
of performing several adds and several deletes.</p>
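<p>As a small sketch of that decomposition, a replace of an attribute's values can be split into the deletes and adds it implies. The function name is made up for this example, and values are compared as plain strings, whereas a real server would compare normalized schema descriptions.</p>

```python
def replace_as_deletes_and_adds(current_values, new_values):
    """Split a replace of an attribute's values into (deletes, adds)."""
    deletes = [v for v in current_values if v not in new_values]
    adds = [v for v in new_values if v not in current_values]
    return deletes, adds


deletes, adds = replace_as_deletes_and_adds(
    ["( 1.1.1 NAME 'a' )", "( 1.1.2 NAME 'b' )"],
    ["( 1.1.2 NAME 'b' )", "( 1.1.3 NAME 'c' )"],
)
```

<p>Each resulting delete and add can then be handled by the delete and add logic described in the two sections above.</p>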

<h2><a name="SchemaSubsystemRedesign-ReviewofServerStartup"></a>Review of
Server Startup</h2>

<ol>
	<li>the schema partition starts up</li>
	<li>the schema subsystem is initialized and loads all entries in the schema partition as registry
objects</li>
	<li>the server starts up the nexus and the other partitions, and the schema partition
is added to the nexus</li>
	<li>the interceptors are assembled into a chain that includes the new schema interceptor</li>
</ol>


<p>At this point the solid state is reached. The server is ready to service requests
including updates to schema.</p>

<h2><a name="SchemaSubsystemRedesign-PrepackagingSchemaData"></a>Pre-packaging
Schema Data</h2>

<p>We still have a bit of a chicken-and-egg problem remaining.  We presume the schema
partition under ou=schema is pre-populated with all the schemas we desire to toggle as enabled,
etc.  This however presumes we ship with this pre-populated data.  This is not too far-fetched
an idea to use.</p>
<div class='panelMacro'><table class='noteMacro'><colgroup><col width='24'><col></colgroup><tr><td
valign='top'><img src="/confluence/images/icons/emoticons/warning.gif" width="16" height="16"
align="absmiddle" alt="" border="0"></td><td>
<p>Remember the schemaEnabled boolean flag.  If a schema object under ou=schema has
this flag enabled then the schema is visible in the global registry and referencable within
the schema subentry of any SAA.  If this flag is toggled off then the schema elements associated
with that schema immediately vanish from the global registry as well as any schema subentry
of SAAs that reference that schema.</p></td></tr></table></div>

<p>The main problem we must watch out for is handling both embedded and standalone
configurations of the server.  To cope with both situations the best option is to create a
special apacheds-schema maven module.  This module would use a special plugin to fire up the
schema partition and load entries into it from OpenLDAP schema files.  After loading the db
files with schemas it would assemble them into a schema jar along with some classes that could
be used to unpack them into some directory structure.  This jar is also a good place to put
some hard-coded schema elements needed to start up the schema partition.  The schema
partition itself could probably also be packaged into this jar.  It would be nice to package it all together if
possible.</p>

<p>The schema partition on start up would then check if the right files were created
on disk. If so then those files would be used otherwise the new partition files would be unpackaged
from the jar and placed into position on disk.  This would then be used to fire up the schema
partition and begin the initialization process.</p>

<p>This solves the problem for both embedded and standalone versions of ApacheDS.</p>

<h2><a name="SchemaSubsystemRedesign-LoadingOpenLDAPSchemas"></a>Loading
OpenLDAP Schemas</h2>

<p>What happens, you may ask, to the process of loading an OpenLDAP schema file?  With
this dynamic system, which preserves changes to the schema across restarts, we lose the
ability to load a schema directly from OpenLDAP schema files.</p>

<p>We wanted to avoid ending up with two copies of this data, which is what would happen if
the server could load schema from both the schema partition and OpenLDAP
schema files.  The best approach is to have just one authoritative copy, even if you have multiple
views on that same data.</p>

<p>This does not however mean that we have to abandon support for these OpenLDAP formatted
schema files.  It's a good thing to be able to use those files interchangeably.  The best thing
we can do is add a tool to the ApacheDS tools module to load an OpenLDAP schema file into
ApacheDS for you.</p>

<p>The same commandline tool that loads the schema directly into the server should also
be able to generate 2 different kinds of LDIF files to be manually applied to the directory
if that's desired.  An LDIF file can be generated to apply the LDIF to the ou=schema area
as add operations or to apply it to the global schema subentry as a modify operation with
attribute additions.  The choice should be yours.</p>
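<p>The two kinds of LDIF output mentioned above might look like what the following sketch produces. The function names, the DN layout under <tt>ou=schema</tt> (an <tt>ou=attributeTypes</tt> container with an <tt>oid</tt> RDN), the attribute names and the placeholder OID <tt>1.1.1</tt> are all assumptions for illustration. Line folding of long LDIF values is not handled.</p>

```python
def ldif_add_entries(schema_name, attribute_types):
    """LDIF add records creating one entry per attributeType under ou=schema.

    attribute_types is a list of dicts with 'oid' and 'name' keys.
    """
    records = []
    for at in attribute_types:
        dn = "oid=" + at["oid"] + ",ou=attributeTypes,cn=" + schema_name + ",ou=schema"
        records.append("\n".join([
            "dn: " + dn,
            "changetype: add",
            "objectClass: top",
            "objectClass: attributeType",
            "oid: " + at["oid"],
            "name: " + at["name"],
        ]))
    return "\n\n".join(records) + "\n"


def ldif_modify_global_subentry(descriptions, subentry_dn="cn=schema"):
    """LDIF modify record adding attributeTypes values to the schema subentry."""
    lines = ["dn: " + subentry_dn, "changetype: modify", "add: attributeTypes"]
    lines += ["attributeTypes: " + d for d in descriptions]
    lines.append("-")
    return "\n".join(lines) + "\n"


as_entries = ldif_add_entries("samba", [{"oid": "1.1.1", "name": "exampleAttr"}])
as_modify = ldif_modify_global_subentry(["( 1.1.1 NAME 'exampleAttr' )"])
```

<p>Both outputs describe the same schema element; the first targets the entry-based view and the second targets the attribute-based view of the global schema subentry.</p>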

<h2><a name="SchemaSubsystemRedesign-AdditionalFeatures"></a>Additional
Features</h2>

<h4><a name="SchemaSubsystemRedesign-ExtensionforLDAPSyntaxes"></a>Extension
for LDAP Syntaxes</h4>

<p>We will use X-IS-BINARY extension for LDAP Syntax descriptions. This will help clients
to determine whether an attribute is binary or not.</p>

<h2><a name="SchemaSubsystemRedesign-PointsIForgottoMake"></a>Points I Forgot
to Make</h2>

<ol>
	<li>The schema interceptor makes sure the global registries are always in sync with
new additions, deletions or modifications that occur to schema entries in the schema partition.
 This is one of its responsibilities.  It may also keep other SAA-specific registries in
sync if we decide we need to maintain separate registries for SAAs.</li>
</ol>


<h2><a name="SchemaSubsystemRedesign-UpdateOnProgress"></a>Update On Progress</h2>

<p>Currently (as of Feb 26th 2007) in the 1.5 branch we've implemented the dynamic schema
subsystem as outlined in this document except for the separate SAAs.  There is one global
schema in effect for now and until this additional feature is requested or someone has an
interest in implementing it we're going to keep it that way.</p>

<h2><a name="SchemaSubsystemRedesign-HandlingModificationAttributesonSchemaSubentry"></a>Handling
Modification Attributes on Schema Subentry</h2>

<p>From an email posted to the mailing list:</p>

<p>Presently the schemaSubentry located at cn=schema is completely virtual (generated
on the fly from the schema registries in the server) and it contains attributes which store
the schema entity descriptions for the server.  The problem we have is to accurately publish
the following attributes so that they reflect schema changes:</p>

<ul>
	<li>creatorsName</li>
	<li>createTimestamp</li>
	<li>modifiersName</li>
	<li>modifyTimestamp</li>
</ul>


<p>The first two are really easy.  The creatorsName will always be the administrator's
DN: uid=admin,ou=system.  The createTimestamp should be the createTimestamp of the ou=schema
entry.  The rationale is that this virtual entry is valid from the point the schema subsystem was
created.  This timestamp will reflect the time when the server was last built, as it should,
since this is when the default schema is created.  It's natural to use the admin user for
the creatorsName attribute.</p>

<p>The modifiersName and modifyTimestamp are not that easy.  Any time there is a change
under ou=schema these fields need to be modified and persisted.  Storing them in the virtual
entry is not an option, since these values must persist across server restarts.  I think the
best way to store this information would be to use a special entry under the ou=schema namingContext
holding the following attributes:</p>

<ul>
	<li>schemaModifiersName</li>
	<li>schemaModifyTimestamp</li>
	<li>subschemaSubentryName</li>
</ul>


<p>It's tempting to store more information in this entry, like the schema entity
or the schema that is modified. However, a schema operation may modify more than one schema
entity, perhaps in multiple schemas, and a modify operation may perform different kinds of
operations on each of the modified schema entities. This is far too much to track in
a single entry.  So it's not worthwhile tracking this information here; it belongs in a change log
implemented for this purpose at a later date. </p>

<p>So let's keep it simple and do just what we have to do with this special entry. 
The entry can have its own objectClass and a simple cn for its RDN attribute.  Here's what
I propose for the schema of this entry:</p>

<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent
panelContent">
<pre>attributetype ( TBD NAME 'schemaModifyTimestamp'
    DESC 'time which schema was modified'
    SUP modifyTimestamp )

attributetype ( TBD NAME 'schemaModifiersName'
    DESC 'the DN of the modifier of the schema'
    SUP modifiersName )

attributetype ( TBD NAME 'subschemaSubentryName'
    DESC 'the DN of the schema subentry this modification information corresponds to'
    EQUALITY distinguishedNameMatch
    SYNTAX 1.3.6.1.4.1.1466.115.121.1.12{32768} )

objectclass ( TBD NAME 'schemaModificationAttributes'
        DESC 'a special entry tracking schema modification attributes'
        SUP top STRUCTURAL
        MUST ( cn $ subschemaSubentryName $ schemaModifyTimestamp $ schemaModifiersName )
)
</pre>
</div></div>

<p>The reason I use attributes other than modifiersName and modifyTimestamp is to prevent
collisions with the attributes of the same name injected for the entry itself.  Secondly, because
the new attributes extend modifiersName and modifyTimestamp respectively, a search for
modifiersName or modifyTimestamp will also return these schema-related attributes.</p>

<p>Any change to the schema entity entries under the ou=schema namingContext will update
these schema specific operational attributes as well.  When the schema subentry is read these
values will be read and populated into the virtual schema subentry dynamically by the schema
service.  This will lead to the desired effect of correctly informing clients of changes to
the global schema.</p>
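<p>The bookkeeping described above could be sketched like this. The entry is modeled as a plain dictionary, the function names and the <tt>cn</tt> value of the tracking entry are hypothetical, and timestamps are written in LDAP generalized time (UTC, second precision).</p>

```python
from datetime import datetime, timezone


def record_schema_change(tracking_entry, modifier_dn, now=None):
    """Update the persistent tracking entry after any change under ou=schema."""
    now = now or datetime.now(timezone.utc)
    tracking_entry["schemaModifiersName"] = modifier_dn
    tracking_entry["schemaModifyTimestamp"] = now.strftime("%Y%m%d%H%M%SZ")
    return tracking_entry


def populate_virtual_subentry(subentry, tracking_entry):
    """Copy the persisted values into the virtual subentry when it is read."""
    populated = dict(subentry)
    populated["modifiersName"] = tracking_entry["schemaModifiersName"]
    populated["modifyTimestamp"] = tracking_entry["schemaModifyTimestamp"]
    return populated


tracking = {
    "cn": "schemaModifications",  # hypothetical RDN value
    "subschemaSubentryName": "cn=schema",
}
record_schema_change(
    tracking,
    "uid=admin,ou=system",
    now=datetime(2009, 10, 6, 4, 39, 0, tzinfo=timezone.utc),
)
virtual = populate_virtual_subentry({"cn": "schema"}, tracking)
```

<p>Since only the tracking entry is persisted, the values survive a restart even though the subentry itself is regenerated on every read.</p>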

<p>Note that the subschemaSubentryName holds the DN of the subentry that these modification
attributes correspond to.  For our present purposes this will be cn=schema until we
introduce multiple SAAs.  More is discussed about this attribute in the drawback section below.
 Basically this attribute is here for extension purposes when more than one SAA exists.</p>
     </div>
</div>
</div>
</div>
</div>
</body>
</html>
