Return-Path: X-Original-To: apmail-directory-dev-archive@www.apache.org Delivered-To: apmail-directory-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id E036F6155 for ; Fri, 24 Jun 2011 08:10:46 +0000 (UTC) Received: (qmail 65312 invoked by uid 500); 24 Jun 2011 08:10:46 -0000 Delivered-To: apmail-directory-dev-archive@directory.apache.org Received: (qmail 65235 invoked by uid 500); 24 Jun 2011 08:10:46 -0000 Mailing-List: contact dev-help@directory.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: "Apache Directory Developers List" Delivered-To: mailing list dev@directory.apache.org Received: (qmail 65214 invoked by uid 99); 24 Jun 2011 08:10:46 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 24 Jun 2011 08:10:46 +0000 X-ASF-Spam-Status: No, hits=-0.7 required=5.0 tests=FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,RFC_ABUSE_POST,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of akarasulu@gmail.com designates 74.125.82.44 as permitted sender) Received: from [74.125.82.44] (HELO mail-ww0-f44.google.com) (74.125.82.44) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 24 Jun 2011 08:10:39 +0000 Received: by wwe5 with SMTP id 5so2640427wwe.1 for ; Fri, 24 Jun 2011 01:10:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:content-type :content-transfer-encoding; bh=ytXLcEhgdUrVYaojruxWLi0nzQK7vcz851fydcJ2qCM=; b=vIbiLTXG9cOCmhMy2KwUzQFzSVSJG6LaiP+e2sJaulfb+YbSe4EO+miCDJrIbddTCC rX+dGBULlSDvh6SUYFO7UiMtvF7CtD7r0NI2u5amf+4BlWGNqPFQPhTDKRsxss0zW2/7 qcw4Nv+Zk2ZbARdjisZaqNPvKk4X3vE8+oNkM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:content-type :content-transfer-encoding; b=gvGN+Slc/BSKPk9+Dgt5NnjtnAKLB1ozbRbHVB5Ppdlo8Pr7LD7Gczywz6OUrVyiaS SoDPn8C9gk34MteUhf34JCCtKyTNoC6iutHZDirao6Jj5o5K54DzfpilHLrs/JXWiR7/ wGs5Lb9FHJkaN1Ggt1bBtZK9iG8lwFDuz5v5o= MIME-Version: 1.0 Received: by 10.216.63.17 with SMTP id z17mr347858wec.98.1308903018554; Fri, 24 Jun 2011 01:10:18 -0700 (PDT) Sender: akarasulu@gmail.com Received: by 10.216.13.74 with HTTP; Fri, 24 Jun 2011 01:10:18 -0700 (PDT) In-Reply-To: <4E0445C3.90201@apache.org> References: <4E0367C5.6040705@gmail.com> <4E04411C.70302@apache.org> <4E04443B.8060506@apache.org> <4E0445C3.90201@apache.org> Date: Fri, 24 Jun 2011 11:10:18 +0300 X-Google-Sender-Auth: QjO6M_hQd8dBDJ995lMk3uGl1H8 Message-ID: Subject: Re: Index reverse tables : are they useful ? From: Alex Karasulu To: Apache Directory Developers List , elecharny@apache.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable On Fri, Jun 24, 2011 at 11:07 AM, Emmanuel L=E9charny wrote: > On 6/24/11 10:04 AM, Alex Karasulu wrote: >> >> On Fri, Jun 24, 2011 at 11:00 AM, Emmanuel L=E9charny >> =A0wrote: >>> >>> On 6/24/11 9:51 AM, Alex Karasulu wrote: >>>>>> >>>>>> The reverse index has no duplicate keys. The only way to get a >>>>>> duplicate key in the reverse index is if the same entry (i.e. 37) >>>>>> contained the same value ('foo') for the same (sn) attribute. And th= is >>>>>> we know is not possible. So the lookups against the reverse table wi= ll >>>>>> be faster. >>>>> >>>>> I was thinking about something a bit different : as soon as you have >>>>> grabbed >>>>> the list of entry's ID from the first index, looking into the other >>>>> indexes >>>>> will also return a list of Entry's ID. Checking if those IDs are vali= d >>>>> candidate can then be done in one shot : do the intersection of the t= wo >>>>> sets >>>>> (they are ordered, so it's a O(n) operation) and just get the matchin= g >>>>> entries. >>>>> >>>>> Compared to the current processing (ie, accessing the reverse index f= or >>>>> *each* candidate), this will be way faster, IMO. >>>> >>>> This is a VERY interesting idea. Maybe we should create a separate >>>> thread for this and drive deeper into it. You got something I think >>>> here. >>>> >>> have a look at >>> >>> https://cwiki.apache.org/confluence/display/DIRxSRVx11/Index+and+IndexE= ntry, >>> where I added some paragraphs explaining this idea. We can comment on >>> this >>> page. >> >> Nice pictures - what did you use for that? Reading further ... > > YeD >> >> Also if you're doing this in a branch, hence we're not yet committed >> on the approach, can you please do this on a separate page so you >> don't alter the existing documentation? > > Right now, I just updated the existing doco with some extended explanatio= n > on the existing code. I can move suggestions to another page, sure. Nah you did good - I spoke too soon. Thanks, Alex