From: Alex Karasulu
To: Apache Directory Developers List
Date: Sat, 8 May 2010 10:43:23 +0300
Subject: [ApacheDS] [XDBM Partition] Using a global UUID instead of partition specific Long ID PK

Hi all,

Any thoughts about using the globally visible UUID in the XDBM partition design as the primary key for entries, instead of a partition-specific Long ID? I'm thinking we will one day need to implement certain features. Let me list them and also point out why using the globally unique UUID might be advantageous:

(1) System-wide DN and Entry Cache

Rather than having each partition manage its own cache, a central DN and entry cache makes sense. In this case a global identifier for an entry might come in handy for hashing cached values.

(2) Nested Partitions, Default Root Partition, Hash Partitioning and Range Partitioning

At some point we will want to have nestable partitions. This means we can have one ADS Partition mounted under another ADS Partition, with operations routed properly to the nested partition where appropriate.

Nested partitions will also allow us to have a default root partition from which we can mount other partitions. The default root partition is nice to have since it allows us to add administrative areas and their administrative points, with subentries, onto the root empty-string DN. It also means the RootDSE would be stored in this partition properly, with persistence. Right now the RootDSE is generated and not mutable.

Hash partitioning and range partitioning entail distributing entries across partitions under some container entry based on some value. Hash partitioning uses the value's hash to distribute entries, whereas range partitioning uses ranges of values to distribute the entries. So it's not really the DN that determines which partition the entry is pushed into, but this hash or range value. This lets us scale to very large numbers of entries in the DIT while also distributing the disk access load across several disk spindles, as Oracle's RDBMS does in these kinds of configurations.

(3) Global Indices

If we use a globally unique UUID instead of a partition-specific Long ID, then we can expose index segments managed by partitions to higher layers to construct global indices. These global indices can then be used to conduct searches outside of the partition, one step higher. This makes it possible for us to implement certain virtual directory strategies regardless of the partition implementations used in a server's configuration. The XDBM search algorithm can leverage these global indices, or delegate sub-partition search to a partition if that partition uses its own search mechanism. There's a lot to be said here, but this is neither the time nor the place to expand on this topic. But global indices are a key factor for several things, including virtualization.

Thoughts?

--
Alex Karasulu
My Blog :: http://www.jroller.com/akarasulu/
Apache Directory Server :: http://directory.apache.org
Apache MINA :: http://mina.apache.org
To set up a meeting with me: http://tungle.me/AlexKarasulu
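
To make points (2) and (3) above a little more concrete, here is a minimal Java sketch. It is not ApacheDS or XDBM code: the names `GlobalUuidKeySketch`, `Partition`, `hashPartition` and `merge`, the employee-number values and the DNs are all hypothetical, and plain `HashMap`s stand in for real index segments. It only shows that when entries are hash-partitioned and each partition hands out its own Long IDs, those IDs clash once the segments are merged one level up, while UUID keys can be merged as they are.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

final class GlobalUuidKeySketch {

    /** Hypothetical stand-in for a partition that assigns its own Long IDs, starting at 1. */
    static final class Partition {
        final String name;
        private long nextLocalId = 1;
        // Two toy "index segments" over the same entries, keyed two different ways.
        final Map<Long, String> dnByLocalId = new HashMap<>();
        final Map<UUID, String> dnByUuid = new HashMap<>();

        Partition(String name) {
            this.name = name;
        }

        void add(UUID entryUuid, String dn) {
            dnByLocalId.put(nextLocalId++, dn); // only unique inside this partition
            dnByUuid.put(entryUuid, dn);        // unique across partitions
        }
    }

    /** Hash partitioning: the hash of the chosen value, not the DN, picks the partition. */
    static int hashPartition(String value, int partitionCount) {
        // floorMod keeps the result in [0, partitionCount) even for negative hash codes.
        return Math.floorMod(value.hashCode(), partitionCount);
    }

    /** Merges per-partition segments into one global map and counts overwritten keys. */
    static <K> int merge(List<Map<K, String>> segments, Map<K, String> global) {
        int collisions = 0;
        for (Map<K, String> segment : segments) {
            for (Map.Entry<K, String> e : segment.entrySet()) {
                if (global.put(e.getKey(), e.getValue()) != null) {
                    collisions++; // a key from another segment was already present
                }
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        Partition[] partitions = { new Partition("p0"), new Partition("p1") };
        String[] employeeNumbers = { "1001", "1002", "1003", "1004" }; // made-up values

        for (String number : employeeNumbers) {
            String dn = "employeeNumber=" + number + ",ou=users,ou=system"; // made-up DN
            partitions[hashPartition(number, partitions.length)].add(UUID.randomUUID(), dn);
        }

        List<Map<Long, String>> longSegments = new ArrayList<>();
        List<Map<UUID, String>> uuidSegments = new ArrayList<>();
        for (Partition p : partitions) {
            longSegments.add(p.dnByLocalId);
            uuidSegments.add(p.dnByUuid);
        }

        System.out.println("Long ID collisions: " + merge(longSegments, new HashMap<Long, String>()));
        System.out.println("UUID collisions:    " + merge(uuidSegments, new HashMap<UUID, String>()));
    }
}
```

With Long IDs, a global index or cache would have to carry a (partition, ID) pair or remap the IDs. With the UUID as primary key, the per-partition segments can be combined unchanged, which is the property the proposal relies on.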