directory-dev mailing list archives

From Emmanuel Lécharny <elecha...@apache.org>
Subject Re: Partition and Backend confusion
Date Fri, 08 Jul 2011 06:23:30 GMT
On 7/8/11 12:10 AM, Alex Karasulu wrote:
> On Wed, Jun 29, 2011 at 3:10 PM, Emmanuel Lecharny<elecharny@gmail.com>  wrote:
>
> SNIP ...
>
>> We currently have a common Partition interface, which is the base on which
>> all the backend implementations are built. It's also used as an interface
>> for the Nexus.
> Yes.
>
>> In fact, we can split the Partition implementations in two categories :
>> 1) those which are manipulating an operation context (AddContext,
>> DeleteOperationContext, etc)
>> 2) those which are interacting with the underlying store
> This does not make any sense to me at all. I can't see these as being
> two distinct categories. I must not be understanding you, can you
> elaborate?
Sure. What I'm saying is that we have one layer which takes methods with 
OperationContext parameters, and transforms them into what is expected by 
the layer underneath. To me, those two layers are two different things.
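
A minimal sketch of the two layers described above, in Java. All the
names here (AddContextSketch, StoreSketch, ContextLayerSketch) are
hypothetical and are not the ApacheDS classes; the sketch only shows
an upper layer that accepts an operation context and a lower layer
that deals with the store alone.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an operation context such as AddContext.
final class AddContextSketch {
    final String dn;
    final Map<String, String> attributes;

    AddContextSketch(String dn, Map<String, String> attributes) {
        this.dn = dn;
        this.attributes = attributes;
    }
}

// Lower layer (assumed shape): talks to the underlying store and knows
// nothing about operation contexts.
interface StoreSketch {
    void put(String dn, Map<String, String> entry);

    Map<String, String> get(String dn);
}

final class InMemoryStoreSketch implements StoreSketch {
    private final Map<String, Map<String, String>> entries = new HashMap<>();

    @Override
    public void put(String dn, Map<String, String> entry) {
        entries.put(dn, new HashMap<>(entry)); // defensive copy
    }

    @Override
    public Map<String, String> get(String dn) {
        return entries.get(dn);
    }
}

// Upper layer: takes the operation context and turns it into the call
// that the lower layer expects.
final class ContextLayerSketch {
    private final StoreSketch store;

    ContextLayerSketch(StoreSketch store) {
        this.store = store;
    }

    void add(AddContextSketch ctx) {
        store.put(ctx.dn, ctx.attributes);
    }
}
```

With this split, the store can be swapped without touching the code
that handles operation contexts.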
>> The current hierarchy is (<XXX>  : interface, [YYY] : abstract class) :
>> <Partition>
>>   [AbstractPartition]
>>     [BTreePartition<ID>]
>>       [AbstractLdifPartition]
>>         LdifPartition
>>         ReadOnlyConfigurationPartition
>>         SingleFileLdifPartition
>>       [AbstractXdbmPartition<ID>]
>>         AvlPartition
>>         JdbmPartition
>>     DefaultPartitionNexus (also implements <PartitionNexus>)
>>     NullPartition
>>     SchemaPartition
>>
>> A few remarks :
>> - the BTreePartition<ID>  should be renamed AbstractBTreePartition
>> - we should have a BTreePartition interface
> Why?
All the abstract classes we have in ADS are prefixed by Abstract. For 
consistency reasons, I do think that we should rename BTreePartition to 
AbstractBTreePartition.

Also as it exposes methods which are specific to BTrees, an interface 
would be a good way to isolate the BTree behaviors.

Nothing big here, just clarification.
>> I'm also wondering if we should not make a better distinction between what
>> is backed by a store (ie, BTreePartition and SchemaPartition) and what is
>> not (ie PartitionNexus). Moreover, why should the PartitionNexus extend the
>> Partition interface ? Does it make sense?
> The PartitionNexus is a proxy to partitions so it implements the
> interface. It's a single point to apply operations and have the route
> to the appropriate partition.
Makes sense.
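
The proxy role described in the quoted paragraph can be sketched as
follows. This is an illustration under assumptions, not the
DefaultPartitionNexus code: the interface, the class names and the
"longest matching suffix wins" rule are all hypothetical, and DN
normalization is ignored.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, reduced partition interface.
interface RoutedPartition {
    String suffix();

    void add(String dn, String entry);

    String lookup(String dn);
}

// A partition that actually holds entries.
final class LeafPartition implements RoutedPartition {
    private final String suffix;
    private final Map<String, String> entries = new HashMap<>();

    LeafPartition(String suffix) {
        this.suffix = suffix;
    }

    @Override
    public String suffix() {
        return suffix;
    }

    @Override
    public void add(String dn, String entry) {
        entries.put(dn, entry);
    }

    @Override
    public String lookup(String dn) {
        return entries.get(dn);
    }
}

// The nexus implements the same interface but stores nothing itself:
// it forwards each operation to the partition owning the DN.
final class NexusSketch implements RoutedPartition {
    private final List<RoutedPartition> partitions = new ArrayList<>();

    void register(RoutedPartition partition) {
        partitions.add(partition);
    }

    @Override
    public String suffix() {
        return ""; // assumed: the nexus sits at the root
    }

    @Override
    public void add(String dn, String entry) {
        route(dn).add(dn, entry);
    }

    @Override
    public String lookup(String dn) {
        return route(dn).lookup(dn);
    }

    // Pick the registered partition with the longest matching suffix.
    private RoutedPartition route(String dn) {
        RoutedPartition best = null;
        for (RoutedPartition p : partitions) {
            boolean matches = dn.equals(p.suffix()) || dn.endsWith("," + p.suffix());
            if (matches && (best == null || p.suffix().length() > best.suffix().length())) {
                best = p;
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("No partition for " + dn);
        }
        return best;
    }
}
```

Callers only see the RoutedPartition interface, so they do not need
to know whether they hold a real partition or the router.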
> There's work to be done in this area for sure. First off I'd like to
> see partitions that hash entries across other partitions and some that
> contain entries and still can nest other partitions: acting both as
> entry stores and routers of operations. For example I've wanted a root
> partition that could also mount (nest) other partitions while still
> storing entries so the root DSE can be mastered in it and we can
> manage other subentries for the server in it instead of at the
> namingContext level.
With you.
> Incidentally the store interface might be able to be gotten rid of.
Hmmm, can you elaborate ?
> The key to several things we're going to do down the line around
> partitions rests around having entry ID be globally unique rather than
> unique within just the partition. After this is done it opens the door
> to several solutions ... including solutions to a couple recent
> problems:
>
>    (1) aliases referring to entry targets across partitions
>    (2) moddn operations across partitions
>    (3) virtualization, via views, and other constructs need it
Just wondering how badly we need to get rid of those IDs. They are not 
unique, each partition has its own, but AFAICT, if we move an entry 
from one partition to another (moddn), we don't care too much about 
the ID.

Regarding Aliases, I'm not sure (yet) we have to deal with them at this 
layer. Still have to think about it.

Moddn ops can be handled across partitions even if we keep the ID 
around. One partition does not have to know anything about the other 
partition's IDs. We are just moving full entries (and all the associated 
index data) from one partition to the other, as if it was a delete on one 
side, and an add on the other.
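
To illustrate the delete-plus-add idea, here is a small sketch in
Java. IdPartition and CrossPartitionMove are hypothetical names, the
entry is reduced to a string, and the only index is a DN-to-ID map.
It assumes each partition hands out its own long IDs, so the moved
entry simply receives a fresh ID on the target side.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical partition with its own, partition-local ID sequence.
final class IdPartition {
    private long nextId = 1;
    private final Map<Long, String> master = new HashMap<>();  // ID -> entry
    private final Map<String, Long> dnIndex = new HashMap<>(); // DN -> ID

    long add(String dn, String entry) {
        long id = nextId++;
        master.put(id, entry);
        dnIndex.put(dn, id);
        return id;
    }

    // Removes the entry and its index data; returns the entry or null.
    String delete(String dn) {
        Long id = dnIndex.remove(dn);
        return id == null ? null : master.remove(id);
    }

    String lookup(String dn) {
        Long id = dnIndex.get(dn);
        return id == null ? null : master.get(id);
    }

    Long idOf(String dn) {
        return dnIndex.get(dn);
    }
}

final class CrossPartitionMove {
    // Moves an entry as an add on the target followed by a delete on
    // the source. Neither partition sees the other's IDs.
    static void move(IdPartition source, String oldDn,
                     IdPartition target, String newDn) {
        String entry = source.lookup(oldDn);
        if (entry == null) {
            throw new IllegalArgumentException("No such entry: " + oldDn);
        }
        target.add(newDn, entry); // add first: a failure leaves the source intact
        source.delete(oldDn);
    }
}
```

The sketch leaves out what makes the real operation hard: child
entries, transactions, and anything that refers to the old ID.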

Virtualization is most certainly handled at an upper layer, and probably 
should not have to know anything about the storage.
> ....
>
> There's more. But first we need a globally unique UUID for entries as
> the PK and we need to get rid of using long partition specific entry
> IDs as the PK.
Ok. I'm not sure we need to get rid of IDs right now, but I may be 
missing some elements in the big picture atm. It needs some serious 
consideration anyway. This is not something we should do lightly, and 
certainly not for 2.0. However, if we need to make this move, and we may 
well have to, then we need a stabilized base to work on.
> I would not change around interfaces right now. It's just going to shift
> things without a clear direction and as you said yourself you're new
> to this code. Class renames and a few interface changes just to get
> familiar and comfortable with the code base is not going to help down
> the line.
I don't think we need to change a hell of a lot of things ATM either. Far 
too dangerous, and probably overkill. As you said, this is a part of 
the code I don't know well, and I'm just pushing some ideas around to 
see where it takes me. I already paid the price once by losing a 
week on a reverse table removal for nothing, I certainly would like to 
avoid such a waste of time again.
> Let's go global on the UUID and look at the big partition picture. We
> can redesign things to best suit small steps to get to our ultimate
> destination.
Sure. Right now, I'm pushing ideas. I don't want them to be pushed into 
the server, they are way too far-fetched, and I may be missing the target 
by a wide margin. In any case, I don't want to jeopardize 2.0, when what 
we need to make it solid is just a couple of features (namely, replication 
and DSR).

Atm, I'm just trying to get aliases to work smoothly, but if it requires 
some huge refactoring, then I'll drop it for 2.0. We don't need 
aliases for 2.0, we just need replication.

If we have to heavily refactor the backend to get aliases working fine, 
then I'm fine with doing it in a 2.1 or a 3.0. In any case, no urgency.

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

