directory-dev mailing list archives

From Stefan Seelmann <seelm...@apache.org>
Subject Re: [ApacheDS] [XDBM Partition] Using a global UUID instead of partition specific Long ID PK
Date Sun, 09 May 2010 20:54:42 GMT
No objection at all.

I updated XDBM to use an <ID> type parameter so that it is flexible
about ID types. The reason was that I wanted to use UUIDs for the
HBase partition. If we use UUIDs for all partitions in general, we
can remove that type parameter again.
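
To make the trade-off concrete, here is a minimal sketch of what an ID type parameter buys. The names MasterTable, InMemoryMasterTable and IdTypeDemo are hypothetical and heavily simplified; they are not the actual XDBM interfaces. The sketch only shows that the same table code can be keyed by Long or by UUID.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical, simplified stand-in for an XDBM-style master table.
// This is an illustration only, not the real ApacheDS interface.
interface MasterTable<ID, E> {
    void put(ID id, E entry);
    E get(ID id);
}

// In-memory implementation that works for any ID type.
class InMemoryMasterTable<ID, E> implements MasterTable<ID, E> {
    private final Map<ID, E> entries = new HashMap<>();

    @Override
    public void put(ID id, E entry) {
        entries.put(id, entry);
    }

    @Override
    public E get(ID id) {
        return entries.get(id);
    }
}

class IdTypeDemo {
    public static void main(String[] args) {
        // Partition-specific Long primary key.
        MasterTable<Long, String> longKeyed = new InMemoryMasterTable<>();
        longKeyed.put(1L, "ou=system");

        // Globally unique UUID primary key.
        MasterTable<UUID, String> uuidKeyed = new InMemoryMasterTable<>();
        UUID id = UUID.randomUUID();
        uuidKeyed.put(id, "ou=system");

        System.out.println(longKeyed.get(1L));
        System.out.println(uuidKeyed.get(id));
    }
}
```

If every partition used UUID, the ID parameter in a design like this could be replaced by UUID everywhere, at the cost of no longer supporting other key types.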

Kind Regards,
Stefan


Alex Karasulu wrote:
> Hi all,
> 
> Any thoughts about using the globally visible UUID in the XDBM partition
> design as the primary key for Entries instead of using a
> partition-specific Long ID?
> 
> I'm thinking that one day we will need to implement certain features.
> Let me list them and also point out why using the globally unique UUID
> might be advantageous:
> 
> (1) System wide DN and Entry Cache 
> 
>       Rather than having each partition manage its own cache, a central
> DN and Entry cache makes sense. In this case a global identifier for an
> entry might come in handy for hashing cached values.
> 
> (2) Nested Partitions, Default Root Partition, Hash Partitioning and
> Range Partitioning 
> 
>       At some point we will want to have nestable partitions. This means
> we can have one ADS Partition mounted under another ADS Partition with
> operation routing taking place properly to the nested partition where
> appropriate.  
> 
>       Nested partitions will also allow us to have a default root
> partition from which we can mount other partitions.  The default root
> partition is nice to have since it allows us to add administrative areas
> and their administrative points with subentries onto the root empty
> string DN.  It also makes it so the RootDSE is now stored in this
> partition properly with persistence.  Right now the RootDSE is generated
> and not mutable.
> 
>       Hash partitioning and range partitioning entail distributing
> entries across partitions under some container entry based on some
> value. Hash partitioning uses the value's hash to distribute entries
> whereas range partitioning uses ranges of values to distribute the
> entries.  So it's not really the DN that determines which partition the
> entry is pushed into but this hash or range value. This lets us
> scale to very large numbers of entries in the DIT while also
> distributing the disk access load across several disk spindles, as
> Oracle's RDBMS does in these kinds of configurations.
> 
> (3) Global Indices
> 
>       If we use a globally unique UUID instead of a partition-specific
> Long ID then we can expose index segments managed by partitions to
> higher layers to construct global indices.  These global indices can
> then be used to conduct searches one level above the partition.
> This makes it possible for us to implement certain virtual
> directory strategies regardless of the partition implementations used
> in a server's configuration.  The XDBM search algorithm can leverage
> these global indices or delegate sub-partition search to a partition if
> a partition uses its own search mechanism.  There's a lot to be said
> here but this is neither the time nor the place to expand on this topic.
> But global indices are a key factor for several things including
> virtualization.
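
As an illustration of the hash and range partitioning described under point (2), here is a rough sketch of how an entry could be routed by a value rather than by its DN. Everything in it is an assumption made for the example: the class PartitionRouting, its method names, the partition labels and the choice of a string attribute value as the routing key. None of it is existing ApacheDS code.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical routing helper; not part of ApacheDS.
class PartitionRouting {

    // Hash partitioning: the value's hash picks one of N partitions.
    // floorMod keeps the result non-negative for negative hash codes.
    static int hashPartition(String value, int partitionCount) {
        return Math.floorMod(value.hashCode(), partitionCount);
    }

    // Range partitioning: each map key is the inclusive lower bound of a
    // range, and the partition with the greatest lower bound that is less
    // than or equal to the value receives the entry.
    static String rangePartition(TreeMap<String, String> lowerBounds, String value) {
        Map.Entry<String, String> match = lowerBounds.floorEntry(value);
        if (match == null) {
            throw new IllegalArgumentException("No range covers value: " + value);
        }
        return match.getValue();
    }

    public static void main(String[] args) {
        // Assumed layout: values below "m" go to p0, the rest to p1.
        TreeMap<String, String> bounds = new TreeMap<>();
        bounds.put("", "p0");
        bounds.put("m", "p1");

        System.out.println(rangePartition(bounds, "alice"));
        System.out.println(rangePartition(bounds, "zoe"));
        System.out.println(hashPartition("alice", 4));
    }
}
```

In both cases the DN plays no part in choosing the partition, which is the point made above: only the hash or range value decides where the entry lands.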
> 
> Thoughts?
> 
> -- 
> Alex Karasulu
> My Blog :: http://www.jroller.com/akarasulu/
> Apache Directory Server :: http://directory.apache.org
> Apache MINA :: http://mina.apache.org
> To set up a meeting with me: http://tungle.me/AlexKarasulu

