Return-Path: X-Original-To: apmail-directory-dev-archive@www.apache.org Delivered-To: apmail-directory-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 435EB6001 for ; Mon, 4 Jul 2011 20:49:23 +0000 (UTC) Received: (qmail 55616 invoked by uid 500); 4 Jul 2011 20:49:23 -0000 Delivered-To: apmail-directory-dev-archive@directory.apache.org Received: (qmail 55515 invoked by uid 500); 4 Jul 2011 20:49:22 -0000 Mailing-List: contact dev-help@directory.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: "Apache Directory Developers List" Delivered-To: mailing list dev@directory.apache.org Received: (qmail 55503 invoked by uid 99); 4 Jul 2011 20:49:22 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 04 Jul 2011 20:49:22 +0000 X-ASF-Spam-Status: No, hits=-0.7 required=5.0 tests=FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of akarasulu@gmail.com designates 74.125.82.178 as permitted sender) Received: from [74.125.82.178] (HELO mail-wy0-f178.google.com) (74.125.82.178) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 04 Jul 2011 20:49:14 +0000 Received: by wyf19 with SMTP id 19so4458999wyf.37 for ; Mon, 04 Jul 2011 13:48:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:content-type; bh=eMnZICH1P9/ygMOxKJiweivRQHfQyHSlOGCPIyC8fZI=; b=U0I5kPR33hfn6cEDM5g9u6O/whUSrZ3Pml1kmHr/a4cf34ZXIZeOV5nDGtss4ETqY4 7/2ZgFUd9TE9e4te+Wi3EqJvMfyjN9w61TclHln1wIyVUHdtnT8EdHjXnYBejSNT725d +o4RffA7zqXfter8yGDgzg/lyr2O0IXapJngw= MIME-Version: 1.0 Received: by 10.216.231.89 with SMTP id k67mr5405031weq.113.1309812534250; Mon, 04 Jul 2011 13:48:54 -0700 (PDT) Sender: akarasulu@gmail.com Received: by 10.216.13.74 with HTTP; Mon, 4 Jul 2011 13:48:54 -0700 (PDT) In-Reply-To: <4E10CF11.2060308@gmail.com> References: <4E10CF11.2060308@gmail.com> Date: Mon, 4 Jul 2011 23:48:54 +0300 X-Google-Sender-Auth: 8N4B8R3gcwSQbE5G0fMS9cZ0dJE Message-ID: Subject: Re: About the reverse table removal. From: Alex Karasulu To: Apache Directory Developers List , elecharny@apache.org Content-Type: text/plain; charset=ISO-8859-1 X-Virus-Checked: Checked by ClamAV on apache.org On Sun, Jul 3, 2011 at 11:20 PM, Emmanuel Lecharny wrote: > Hi guys, > > I had time yesterday and today to think a bit more about the reverse table > we use in index. I thought we were able to remove them and saving the time > it costs to generate them. > > However, one day and a half without coding make me realize that those > reverse table are a critical part of an efficient evaluation engine. I > initially thought that using some Sets computation could allow us to remove > those table, and I was right, put to the point we don't have to manipulate > big sets of IDs. For instance, with a huge database with millions of > entries, a OneLevel or SubLevel index will potentially contains millions of > IDs associated with one ID. In this case, doing an intersection operation on > this huge set will simply be overkilling in term of memory (as we wlll have > to keep the resulting set in memory). > > OTOH, the reverse table on a OneLevel or SubLevel index will only contain a > few IDs associated with one entry, unless the tree is extremely deep. > > This is an issue not only for the OneLevel or SubLevel index, but also for > the ObjectClass index. > > As of now, I don't really think that continuing any further is worthily. The > best is probably to keep going with trunk, and eventually re-inject the few > fixes I have applied i my experimental branch. > > In any case, this was an interesting exercise : it helped me to get deep > into this part of the code I was not familiar with, and at the end, validate > the initial approach. Good to have your feedback on this matter. I was hoping that this simplification could work but did not have the time to actually think it through. Thanks, Alex