directory-dev mailing list archives

From Emmanuel Lecharny <>
Subject Re: [ApacheDS] [Schema] New schema subsystem specification
Date Fri, 24 Nov 2006 08:53:05 GMT
Norval Hope wrote:

> Sorry this thread is getting so long (I seem to have that effect)...

Yeah, a little bit :) Ersin smartly suggested at this point that you 
should write up the specification for the VD you are planning. That seems 
the best solution.

> <snip/>
> My point is that in VD cases like mine, AD is merely a custodian
> of a schema for a custom partition and is in no sense managing it:

Right. That's a good use-case.

>    a. It should be treated as read-only by AD; there is no point in
> changing it anywhere other than at the target system with which the custom
> partition communicates. The authoritative source is the target system.
> AD is just acting as a pass-through.

In this case, and if we consider that ADS will be able to store the 
schema's elements as if they were entries, we could well imagine a 
mechanism where you control the schema loading. You would need 
to manage more than one schema. That is something we must consider seriously.

>    b. For the same reason it doesn't make sense for AD to persist the
> schema information in this case, the custom partition may be
> explicitly removed while AD is running or its deployment bundle
> removed and AD restarted, in which case I'd want all trace of the
> schema info to disappear from AD when its associated partition
> disappeared.


> Even in non-VD cases, I imagine the bulk of the schemas currently
> imported into AD are best considered static in the sense the end-user
> modification of them at runtime could easily destabilise the server.
> When a schema is governed by an RFC or a spec authored by a third
> party, it would seem that end-user modifications of it (except
> perhaps additions) would be generally outlawed. Where such schemas are
> used internally by the server, then updating them implies needing to
> update the server's code at the same time, no?

Not necessarily. We have at least 3 kinds of elements in our vision:
1) Bootstrap elements: the minimum set of OC and AT needed to load the 
schema elements as if they were LDAP AT and OC.
2) Immutable elements: the RFC-described elements, for instance. They 
can't be changed, except by an admin, and even then the admin must have 
the possibility to recover them ('restore default schemas', for instance).
3) User-defined elements: all the other elements.

Anyway, the last two must be available as files (LDIF?) and loadable 
without changing the code. If compiled Java classes are loaded into 
LDAP as entry attribute values (a SyntaxChecker, for instance), then there 
is no need to change the server code.
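
A minimal Java sketch of the three categories above (all names are invented for illustration; this is not the ApacheDS API):

```java
// Illustrative only: enum and class names are invented, not ApacheDS code.
enum SchemaOrigin { BOOTSTRAP, IMMUTABLE, USER_DEFINED }

class SchemaElement {
    final String oid;
    final SchemaOrigin origin;

    SchemaElement(String oid, SchemaOrigin origin) {
        this.oid = oid;
        this.origin = origin;
    }

    // Bootstrap elements never change; immutable (RFC) elements may only be
    // touched by an admin (and must stay restorable); user-defined ones are free.
    boolean modifiableBy(boolean isAdmin) {
        switch (origin) {
            case USER_DEFINED: return true;
            case IMMUTABLE:    return isAdmin;
            default:           return false;
        }
    }
}
```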

> Ok, if you're against reading .schema files (or "schema+ " files that
> contain the extra information you mention) then it sounds like I'll
> need to keep my support as a custom patch to AD instead.

Well, this is not exactly our position :) In my mind, your VD scenario 
is a very valid one. We should find a way to build this VD without 
having to support patches, which are complicated to deal with. The ADS 
architecture should support your need :)

> On normalizers, syntaxCheckers etc., am I right in thinking that
> regardless of the syntax of the text file you're going to use as
> your initial source, there is the problem that ultimately you need to
> bind code / behaviour to their definitions: other than name(s), OID,
> etc., a normalizer is basically the code that implements the
> normalization, right? If so then allowing people to add their own ones
> (not included in the AD release) is going to involve classloading
> issues etc., as well as dealing with the textual descriptive file.

You don't have that many normalizers and syntax checkers. But, 
basically, yes, if a user wants to define their own syntax checker and 
normalizer, you have to deal with all this Java protection (including a 
Java policy to avoid injection of malicious code into the server).
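
To make concrete why this implies classloading work: a normalizer is essentially code bound to an OID. A minimal sketch, with an invented interface (not the real ApacheDS one):

```java
// Illustrative sketch, not the ApacheDS API: a normalizer is just code,
// so a user-supplied one means loading a user-supplied class at runtime.
interface Normalizer {
    String normalize(String value);
}

// A typical case-ignoring normalizer: trim and lower-case the value so
// that "  Foo " and "foo" compare as equal under a caseIgnoreMatch rule.
class CaseIgnoreNormalizer implements Normalizer {
    public String normalize(String value) {
        return value.trim().toLowerCase();
    }
}
```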

> I apologize if I'm talking crap, just trying to understand these other
> objects a bit better.

Alex started a page on the cwiki where he sums up a discussion we 
had two days ago about schema initialization and everything related:

Maybe it can help you better understand what those special 
objects are good for.
Feel free to add comments to it.


> With meta information like schema isn't the problem a bit worse
> though? What I'm thinking about is this sort of case (given MINA
> worker threads are executing concurrently):
>    a. user1 submits modify of attr "a" of object o1 of objectclass c1
> (MINA thread 1)
>    b. user2 submits delete of attr "a" from schema for c1 (MINA thread 2)
> where b. implies a lock on any attempts to change attr "a" in any
> instance of c1, and a. implies a lock on changing the schema for c1
> (or at least modifying the type of / deleting attr "a" anyway).
> So isn't it a bit different because locks need to flow forward to /
> back from meta information?

Well, theoretically, we may have a problem. But you must first consider that 
user 2 will be an admin, not an ordinary user. We may grant special 
privileges to this user.
Second, access to the schema is done through a single object (let's call it 
"registries"), so it's possible to put a lock on it to manage the 
described issue.
This would be a special lock, where an admin blocks any users coming 
after him, but has to wait until all the previous users have 
finished their work:

T1: U1, U2, U3 threads are running
T2: Adm comes in, and all the following users will be queued until Adm's 
modification is done
T3: when U1, U2, U3 have finished their requests, Adm can modify the schema
T4: Adm has modified the schema: we can now unqueue the queued users 
and execute their requests.
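
The timeline above matches the semantics of a fair read/write lock: user requests hold the read lock, the admin's schema change takes the write lock. A sketch of how the "registries" object might guard itself (names invented; this is not the actual ADS code):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch only: with a *fair* ReadWriteLock, the admin writer waits for
// in-flight user requests (T3), while users arriving after the admin
// queue up behind him (T2) and resume once he is done (T4).
class Registries {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

    // Ordinary LDAP operations: many may run concurrently.
    void userRequest(Runnable work) {
        lock.readLock().lock();
        try { work.run(); } finally { lock.readLock().unlock(); }
    }

    // Admin schema modification: exclusive access.
    void adminModifySchema(Runnable modification) {
        lock.writeLock().lock();
        try { modification.run(); } finally { lock.writeLock().unlock(); }
    }
}
```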

>> As for the import utility, it can just generate an LDIF that you can
>> load on startup.  You can provide schemas in LDIF format for your users.
>>   The good thing with AD is that if you load an LDIF on startup AD marks
>> that LDIF file as already having been loaded and will not load it again.
>> It keeps a record of what was loaded when under the ou=system area.
> Understood.
> My problem is that one of my design goals is to keep the work required by
> my client custom partition writers to an absolute minimum. Currently
> they deploy a bundle which can optionally include a .schema file and
> that's it. I need to maintain that simplicity, so whether it's in the
> core AD code (looking very unlikely I gather) or via a custom patch to
> AD that I maintain, I have to hide any steps required to incorporate a
> new schema into the server.
> Also the fact that the LDIF information is persisted and guarded
> from reloading is actually a minus in my case, because:
>    a) I want to reload the schema information each time, because it is
> maintained by author of a custom partition bundle who may have updated
> it in line with an updated version of their bundle code
>    b) If the schema information for a custom partition is persisted
> then I have a problem getting rid of it when AD starts up next time
> and this custom partition is no longer deployed.
> I planned to deal with the extra info (normalizers etc you mention
> above) by looking for code associated with .schema files that defined
> the required extra java classes. The need for custom additions to the
> existing schema files in this space seems very much a boundary case to
> me anyway, these are the stats on such extensions that exist today:
> Apache.schema:
>    comparators: 3, matching rule: 3, normalizer: 3, syntax checker:
> 0, syntax producer: 0
> NIS.schema:
>    comparators: 1, matching rule: 1, normalizer: 1, syntax checker:
> 2, syntax producer: 2
> Inetorgperson:
>    comparators: 4, matching rule: 4, normalizer: 4, syntax checker:
> 0, syntax producer: 0
> System:
>    comparators: 27, matching rule: 28, normalizer: 27, syntax
> checker: 59, syntax producer: 59
> where a fair number of the implementations of these various extensions
> look like stubs. As I raised earlier in this diatribe, isn't it very
> likely that any such extensions required for a third-party schema will
> require their own custom code?

At this point, it would be best to have a clear description of the VD 
specifications :) Wiki ?

> <Snip/>
> At any rate, it seems like my requirements are completely disjoint
> from what you want to achieve in the schema subsystem redesign.

Not necessarily. We *need* a dynamic schema mechanism (a 
maven-plugin, whatever Maven's quality, is obviously overkill). The 
main difference between our vision and your need is persistence (AFAIK): you 
want to be able to discard a schema that has been loaded, and to allow 
the user to reload it. There are two problems I see with your need, 
regarding the actual code base:
1) we consider that modifying the schema will have an impact on data, thus 
we thought that there must be some kind of control over it
2) ADS currently stores only one schema set, so you can't have many of 
them loaded, whereas you need a kind of multi-instance schema in ADS with 
only one ADS running (like a web-app setup where you have one container and 
many apps)

There is nothing forbidding us from including extensions in ADS to support 
your need, except time and workforce.

For 1), we can have a special admin mode where if you discard a schema, 
then you also remove the data (or, to be a little bit more radical: you 
don't give a damn about the data, only the schema is meaningful). That's 
not a big deal; it may just be a flag to set.
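
The flag could be as simple as this sketch suggests (all names invented, not ADS code): in controlled mode a discard is refused while data still depends on the schema; in "radical" mode the data goes too.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of option 1) above: per-schema tracking of the
// entries that depend on it, and a dropData flag controlling discard.
class SchemaStore {
    private final Set<String> schemas = new HashSet<>();
    private final Map<String, Set<String>> entriesBySchema = new HashMap<>();

    void load(String schema) {
        schemas.add(schema);
        entriesBySchema.putIfAbsent(schema, new HashSet<>());
    }

    void addEntry(String schema, String dn) {
        entriesBySchema.get(schema).add(dn);
    }

    boolean discard(String schema, boolean dropData) {
        Set<String> entries = entriesBySchema.getOrDefault(schema, new HashSet<>());
        if (!entries.isEmpty() && !dropData) {
            return false; // controlled mode: data still depends on this schema
        }
        entries.clear();              // radical mode: the data goes with it
        entriesBySchema.remove(schema);
        return schemas.remove(schema);
    }
}
```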

For 2), this is a little bit more tricky, but definitely worth the 
work, and it's been somewhere in the back of our minds for at least 2 years 
for me, and maybe 5 years in Alex's mind :)

> I already have a solution meeting my requirements by removing the need
> for the existing Maven schema plugin, and instead allowing schema
> content to be imported to an in-memory representation at start-up.
> This solution is only a stepping stone to a more dynamic one, which
> requires doing away with the BootstrapRegistries stuff amongst other
> things.
> I can help implement your plan and then rejig my current scheme on top
> of the new code, but I can't pretend that I'm not a little
> disappointed that there isn't a solution addressing both the core
> directory's and my "pass-through" type requirements at the same time.

There are always solutions. But you have to consider that we don't have 
a clear vision of what you want to do, and that you don't have a 
clear vision of the internals of ADS and what we want to do. At this 
point, we are just discussing our options, and how to merge them in a 
sustainable timeframe for both of us.

As usual, it's all about sharing information, and then making decisions. 
It's easier to make the right decisions if we have all the elements put on 
the table ;)

