directory-dev mailing list archives

From "Alex Karasulu" <akaras...@apache.org>
Subject Re: [ServerEntry new API] Q about BasicServerAttribute
Date Fri, 14 Dec 2007 17:59:04 GMT
Hi Emmanuel,

On Dec 14, 2007 10:53 AM, Emmanuel Lecharny <elecharny@gmail.com> wrote:

> Very valid points, Alex. We have had the same discussion a while back
> about DN parsing...
>

Yeah, I think we talked about this a while back too, while annotating this
experimental code with ideas in the Javadocs.


>
> My personal guess is that you are almost fully right, but there might be
> cases where we may want to check some parts of the values. The H/R
> aspect, for instance, directly drives the type of value we will create.


Yeah, we're not completely free of having to do something, I agree.  We just
want to minimize how much schema checking we enforce.  We do what we have to
do to remove some headaches, but it's not our primary objective in this
region of the code.


>
> We can let the Schema interceptor deal with normalization and syntax
> checking, instead of asking the EntryAttribute to do the checking. That
> means we _must_ put this interceptor very high in the chain.
>

Right now I think this is split into two interceptors.  The first one, which
is executed immediately, is the Normalization interceptor.  It's really an
extension of the schema subsystem.  Normalization cannot occur without
schema information, and the process of normalization automatically enforces
value syntax.  This is because, to normalize a value, the parser embedded in
a normalizer must validate its syntax in order to transform the value to a
canonical representation using string prep rules.
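
To illustrate why normalization implies syntax checking, here's a toy
normalizer for a telephone-number-like syntax (names and rules invented,
and much simpler than real string prep):

    // Toy sketch: producing the canonical form requires parsing the
    // value, so a syntactically invalid value is rejected as a side
    // effect of normalization.
    public class TelephoneNumberNormalizer {
        public String normalize(String value) {
            StringBuilder canonical = new StringBuilder();
            for (char c : value.toCharArray()) {
                if (Character.isDigit(c) || c == '+') {
                    canonical.append(c);      // keep significant characters
                } else if (c == ' ' || c == '-') {
                    continue;                 // drop insignificant separators
                } else {
                    // the parse failed, so the value violates the syntax:
                    // normalization doubles as syntax validation
                    throw new IllegalArgumentException(
                        "Invalid telephone number: " + value);
                }
            }
            return canonical.toString();
        }
    }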

The big difference that has evolved between the Normalization interceptor
and the Schema interceptor is that the Normalization interceptor is not
designed to fully check schema.  It does *ONLY* what it needs to do to
evaluate the validity of a request against the DIT.  For example, the DN and
the filter expression are normalized early to determine if we can
short-circuit this process with a rapid return.  This reduces latency and
weeds out most incorrect requests.  Now, with normalized parameters, the
Exception interceptor can more accurately do its work to determine whether
or not the request makes sense: i.e. does the entry that is being deleted
actually exist?  Then the request goes deeper into the interceptor chain for
further processing.  The key concept in terms of normalization and schema
checking is lazy execution.
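
To make the short-circuit idea concrete, a delete request roughly flows
like this (a sketch with invented names, not our actual interceptor
signatures):

    import java.util.NoSuchElementException;

    // Sketch of the early-return idea: normalize just enough of the
    // request to decide quickly whether it can possibly succeed.
    class DeleteFlow {
        void delete(String dn, Normalizer normalizer, Store store) {
            // Normalization interceptor: normalize the DN up front; a DN
            // that fails to normalize is rejected before any real work
            String normalizedDn = normalizer.normalize(dn);

            // Exception interceptor: with a normalized DN we can ask a
            // precise question, e.g. does the entry actually exist?
            if (!store.exists(normalizedDn)) {
                throw new NoSuchElementException("No entry at " + dn);
            }

            // only now does the request go deeper into the chain
            store.delete(normalizedDn);
        }
    }

    interface Normalizer { String normalize(String dn); }
    interface Store { boolean exists(String dn); void delete(String dn); }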

Lazy execution makes sense most of the time, but from the many conversations
we've had it seems this might actually be harming us, since we're doing many
of the same computations over and over again while discarding the results,
especially where normalization is concerned.
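
One way out of that repeated work, purely as a sketch and not something
we have today, is to compute the normalized form once and cache it
alongside the user-provided value:

    import java.util.function.Function;

    // Sketch: a value wrapper that computes its normalized form at most
    // once, so later interceptors don't redo (and discard) the work.
    public class CachedValue {
        private final String userProvided;
        private String normalized;  // computed lazily, then reused

        public CachedValue(String userProvided) {
            this.userProvided = userProvided;
        }

        public String getNormalized(Function<String, String> normalizer) {
            if (normalized == null) {
                normalized = normalizer.apply(userProvided);
            }
            return normalized;
        }
    }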


>
> Here are the possible checks we can have on a value for an attribute:



>
> H/R : could be done when creating the attribute or adding some value into
> it


Yes, this will have to happen very early, within the codec I guess, right?


>
> Syntax checking : SchemaInterceptor
> Normalization : SchemaInterceptor


Right now, request parameters are normalized within the Normalization
interceptor, and these other aspects (items) are handled in the Schema
interceptor.


>
> Single value : SchemaInterceptor
>
> So I would say we should simply test the H/R flag in EntryAttribute.


Yes, this sounds like something we must do to create the correct entry
composition in the codec.  Otherwise we would need an intermediate
representation, which is a waste of memory and cycles.
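
Something along these lines is what I picture for the H/R test inside
the attribute (illustrative names only, not the actual ServerEntry API):

    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.List;

    // Sketch: pick the value representation from the attribute type's
    // human-readable flag at the moment a value is added.
    public class BasicServerAttribute {
        private final boolean humanReadable;
        private final List<Object> values = new ArrayList<Object>();

        public BasicServerAttribute(boolean humanReadable) {
            this.humanReadable = humanReadable;
        }

        public void add(byte[] raw) {
            if (humanReadable) {
                // H/R syntax: store a String value decoded from UTF-8
                values.add(new String(raw, StandardCharsets.UTF_8));
            } else {
                // binary syntax: keep the raw bytes as-is
                values.add(raw);
            }
        }
    }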


>
> It brings to my mind another concern :
> let's think about what could happen if we change the schema : we will
> have to update all the existing Attributes, which is simply not
> possible. Thus, storing the AttributeType within the EntryAttribute does
> not sound good anymore. (unless we kill all the current requests before
> we change the schema). It would be better to store an accessor to the
> schema sub-system, no ?


This is a big concern.  For this reason I prefer holding references to
high-level service objects which can swap out things like registries when
the schema changes.  This is especially important within services and
interceptors that depend in particular on the schema service.  I would
rather spend an extra cycle doing more lookups with lazy resolution, which
leads to a more dynamic architecture.  Changes to components are reflected
immediately this way, with little risk of leaving stale objects around which
may present problems and need to be cleaned up.
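
Concretely, instead of capturing the AttributeType itself, the attribute
would hold something like this (all names invented; AttributeType stands
in for the real schema object):

    // Sketch: hold an accessor to the schema subsystem rather than the
    // AttributeType itself, so a registry swap is picked up on the next
    // lookup instead of leaving stale references behind.
    interface AttributeType {}           // placeholder for the real thing

    interface SchemaAccessor {
        AttributeType lookupAttributeType(String oid);
    }

    class EntryAttribute {
        private final SchemaAccessor schema;  // stable handle on the service
        private final String attributeTypeOid;

        EntryAttribute(SchemaAccessor schema, String oid) {
            this.schema = schema;
            this.attributeTypeOid = oid;
        }

        AttributeType getAttributeType() {
            // resolved on every call: registry swaps are seen immediately
            return schema.lookupAttributeType(attributeTypeOid);
        }
    }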

However, on the flip side, there's a line we need to draw.  Where we draw
this line will determine the level of isolation we want.  Let me lay out a
couple of specific scenarios to clarify.

Scenario 1
========

A client binds to the server and pulls the schema at version 1.  Then,
before it issues an add operation for a specific objectClass, the schema
changes and one of the objectClasses in the entry to be added is no longer
present.  The request will fail, and should, since the schema changed.
Incidentally, a smart client should check the subschemaSubentry timestamps
before issuing write operations, to see if it needs to check for schema
changes that make the request invalid.
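
A smart client could do that check with plain JNDI, something like this
(the subschema location is read from the root DSE; error handling
omitted):

    import javax.naming.NamingException;
    import javax.naming.directory.Attribute;
    import javax.naming.directory.Attributes;
    import javax.naming.directory.DirContext;

    // Sketch: read the subschema subentry's modifyTimestamp so a client
    // can detect schema changes before issuing a write operation.
    public class SchemaFreshnessCheck {
        public static String schemaTimestamp(DirContext ctx)
                throws NamingException {
            // the root DSE advertises where the subschema subentry lives
            Attributes rootDse =
                ctx.getAttributes("", new String[] { "subschemaSubentry" });
            String subschemaDn =
                (String) rootDse.get("subschemaSubentry").get();

            // operational attributes must be requested explicitly
            Attributes attrs = ctx.getAttributes(
                subschemaDn, new String[] { "modifyTimestamp" });
            Attribute ts = attrs.get("modifyTimestamp");
            return ts == null ? null : (String) ts.get();
        }
    }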

Scenario 2
========

A client binds to the server and pulls the schema at version 1, then issues
an add request.  As the add request is being processed by the server, the
schema changes and one of the objectClasses in the entry to be added is no
longer present.

Scenario 1 is pretty clear and easy to handle.  It will be handled
automatically for us anyway, without having to explicitly code the correct
behavior.  Scenario 2 is a bit tricky.  First of all we have to determine
the correct behavior that needs to be exhibited.  Before confirming with the
specifications (which we need to do), my suspicions would incline me to
think that this add request should be allowed, since it was issued and
received before the schema change was committed.  In this case it's OK for
the add request to contain handles on schema data which might be old but
consistent with the time at which that request was issued.

So to conclude, I think it's OK, preferred, and efficient for request
parameters and intermediate derived data structures used to evaluate
requests to have and leverage schema information that is not necessarily up
to date with the last schema change.  This brings up a slew of other
problems we have to tackle, btw, but we can talk about those in another
thread.
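
If we settle on that behavior, the mechanics could be as simple as
pinning the registries in the operation context when the request is
received (sketch, invented names):

    // Sketch: capture the schema view once, at request arrival, so the
    // whole request is evaluated against one consistent snapshot even if
    // the live registries are swapped out mid-flight.
    interface Registries {}                      // placeholder
    interface SchemaService { Registries getRegistries(); }

    class OperationContext {
        private final Registries schemaSnapshot;

        OperationContext(SchemaService schemaService) {
            // resolved once, at receipt time
            this.schemaSnapshot = schemaService.getRegistries();
        }

        Registries getSchema() {
            return schemaSnapshot;  // never re-resolved during the request
        }
    }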

SNIP ...


>
> > If the answer is apply all schema checks then how do we deal with
> > situations where the entry is inconsistent during composition but will
> > be consistent at the end?  For example you have an inetOrgPerson that
> > requires sn and cn attributes.  The user adds the objectClass
> > attribute with the inetOrgPerson value into the Entry.  If we have
> > schema checks enabled then this user action will trigger a violation
> > error.   Likewise if they add sn or cn before they add the objectClass
> > attribute since these attributes will not be in the must may list yet.
> That's not exactly what we want to introduce into the Entry class. This
> is clearly done by the Schema interceptor system. But it was not my
> initial concern, too, as I was specifically mentioning the
> EntryAttribute alone, not the Entry as a whole. So we are on the same
> page here.


I was just trying to say: if we start doing schema checks, where do we stop?
However, we may want to do these early too for very specific attributes,
like, for example, the objectClass attribute.

Another mode of thinking may suggest performing all schema checks
immediately, in one place, since circumstances force us to deal with a part
of the problem anyway.  This line of thinking favors keeping the code
associated with a specific function together.

I don't know what the correct answer is here, but I was expressing to you
the different ways we can approach this problem.  I know you were talking
about attribute values, but soon you'll find this pulls us into the
conversation about schema checks at the entry level.
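
For the entry-level problem, the usual escape hatch is to let the entry
be inconsistent while it is being composed and run the must/may check
only at the end, for instance (sketch only):

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    // Sketch: no schema check during composition, so sn/cn may legally
    // arrive before objectClass; the must list is enforced at the end.
    class ComposedEntry {
        private final Map<String, List<String>> attributes =
            new HashMap<String, List<String>>();

        void put(String attributeId, String value) {
            List<String> values = attributes.get(attributeId);
            if (values == null) {
                values = new ArrayList<String>();
                attributes.put(attributeId, values);
            }
            values.add(value);
        }

        void validate(Set<String> mustAttributes) {
            // only once the caller says composition is done do we require
            // the full must list to be present
            for (String must : mustAttributes) {
                if (!attributes.containsKey(must)) {
                    throw new IllegalStateException(
                        "Missing required attribute: " + must);
                }
            }
        }
    }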


>
> > So I think we open a Pandora's box if we try to overload too much
> > functionality into this Entry/Attribute API whose primary purpose is
> > with respect to managing entry composition.
> yeah. We need some balance. This is the reason I asked before doing
> stupid things :) At the end, this will be less work  for me ;)


Oh, you're more right than you can imagine.  This is why I'm being overly
analytical myself.  Slipping up here will have repercussions all over.
There's no single right answer, though, even though some answers would be
very, very detrimental.  The top few best answers will also have tradeoffs
associated with them, and evaluating these and coming to a conclusion on how
best to proceed is what makes this such a difficult design problem.

Alex
