From: Alex Karasulu
To: Apache Directory Developers List, elecharny@apache.org
Date: Fri, 24 Jun 2011 11:04:22 +0300
Subject: Re: Index reverse tables : are they useful ?

On Fri, Jun 24, 2011 at 11:00 AM, Emmanuel Lécharny wrote:
> On 6/24/11 9:51 AM, Alex Karasulu wrote:
>>
>>>> The reverse index has no duplicate keys. The only way to get a
>>>> duplicate key in the reverse index is if the same entry (i.e. 37)
>>>> contained the same value ('foo') for the same (sn) attribute. And this
>>>> we know is not possible. So the lookups against the reverse table will
>>>> be faster.
>>>
>>> I was thinking about something a bit different : as soon as you have
>>> grabbed the list of entry's ID from the first index, looking into the
>>> other indexes will also return a list of Entry's ID. Checking if those
>>> IDs are valid candidate can then be done in one shot : do the
>>> intersection of the two sets (they are ordered, so it's a O(n)
>>> operation) and just get the matching entries.
>>>
>>> Compared to the current processing (ie, accessing the reverse index for
>>> *each* candidate), this will be way faster, IMO.
>>
>> This is a VERY interesting idea. Maybe we should create a separate
>> thread for this and drive deeper into it. You got something I think
>> here.
>>
> have a look at
> https://cwiki.apache.org/confluence/display/DIRxSRVx11/Index+and+IndexEntry,
> where I added some paragraphs explaining this idea. We can comment on this
> page.

Nice pictures - what did you use for that? Reading further ...

Also if you're doing this in a branch, hence we're not yet committed
on the approach, can you please do this on a separate page so you
don't alter the existing documentation?

Thanks,
Alex
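
For reference, the ordered-set intersection described in the quoted text could look roughly like the sketch below. This is only an illustration in Python, not ApacheDS code: the function name `intersect_sorted` and the sample ID lists are hypothetical, and it assumes each index lookup yields entry IDs as an ascending list with no duplicates.

```python
def intersect_sorted(ids_a, ids_b):
    """Return the IDs present in both lists.

    Assumption: both inputs are sorted ascending and contain no duplicates,
    as the candidate ID lists returned by two index lookups might be.
    """
    result = []
    i = j = 0
    # Walk both lists in step; each loop iteration advances at least one cursor.
    while i < len(ids_a) and j < len(ids_b):
        if ids_a[i] == ids_b[j]:
            result.append(ids_a[i])  # ID is a candidate in both indexes
            i += 1
            j += 1
        elif ids_a[i] < ids_b[j]:
            i += 1  # ids_a[i] cannot appear later in ids_b
        else:
            j += 1  # ids_b[j] cannot appear later in ids_a
    return result


# Hypothetical candidate IDs from two different attribute indexes.
sn_candidates = [3, 12, 37, 41, 58]
cn_candidates = [5, 12, 37, 60]
print(intersect_sorted(sn_candidates, cn_candidates))  # prints [12, 37]
```

Each list is traversed once, so the cost is linear in the combined length of the two lists, in contrast to one reverse-index lookup per candidate.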