Subject: Re: Is it faster/better to include one objectclass or all in query?
From: Alex Karasulu
To: users@directory.apache.org
Date: Wed, 14 Mar 2012 18:08:16 +0200

On Wed, Mar 14, 2012 at 4:51 PM, wrote:

> Hi, when searching for a user having this objectclass hierarchy
>
> top
> |_person
>   |_organizationalPerson
>     |_inetOrgPerson
>
> and uid = 'jsmith'
>
> Which query would be less expensive or better/faster? Thanks!
>
> (&
>   (objectclass=inetOrgPerson)
>   (uid=jsmith)
> )
>

This would be faster and more efficient, since the evaluation is on a more specific objectClass, which reduces the search space from the get-go.

To understand this you need to know how the optimizer works with the scan counts that are returned. LDAP search filters are expanded into an AST (abstract syntax tree), with the leaves of the tree being assertions and the branch nodes being operators. The optimizer then annotates this AST with scan counts, which basically means asking each index, "Hey, how many results would you return for this assertion?" So the more specific inetOrgPerson is more likely to return a smaller scan count. Now if you have an index on uid, the scan count for it will be 1, since uid should be unique (our DSA does not enforce this, though).
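The annotation step can be sketched roughly as follows. This is a minimal illustration in Python, not the server's actual optimizer code: the `Leaf` and `And` classes, the `INDEX_COUNTS` table and its numbers, the fallback to `TOTAL_ENTRIES` for a value the table does not know, and the choice to give an AND node the smallest count among its children are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

# Hypothetical index statistics: (attribute, value) -> number of matching entries.
INDEX_COUNTS = {
    ("objectclass", "top"): 10000,
    ("objectclass", "person"): 6000,
    ("objectclass", "inetorgperson"): 4000,
    ("uid", "jsmith"): 1,
}
TOTAL_ENTRIES = 10000  # assumed size of the DIB


@dataclass
class Leaf:
    """An assertion such as (uid=jsmith)."""
    attr: str
    value: str
    scan_count: Optional[int] = None


@dataclass
class And:
    """A branch node for the & operator."""
    children: List[Union["Leaf", "And"]]
    scan_count: Optional[int] = None


def annotate(node: Union[Leaf, And]) -> int:
    """Ask the (toy) indices for a scan count and store it on every node."""
    if isinstance(node, Leaf):
        key = (node.attr.lower(), node.value.lower())
        # Assumption: an unknown assertion costs a full scan.
        node.scan_count = INDEX_COUNTS.get(key, TOTAL_ENTRIES)
    else:
        # Assumption: an AND can return no more than its most selective child.
        node.scan_count = min(annotate(child) for child in node.children)
    return node.scan_count


filter_ast = And([Leaf("objectClass", "inetOrgPerson"), Leaf("uid", "jsmith")])
annotate(filter_ast)
for leaf in filter_ast.children:
    print(f"({leaf.attr}={leaf.value}) scan count: {leaf.scan_count}")
```

With these made-up numbers the uid leaf ends up with the smallest scan count.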
Once the optimizer is done annotating, one leaf node in the entire AST is selected to act as the candidate generator and is used for iteration. The leaf node with the smallest scan count is selected for this. The driving reason is that it is cheaper to iterate over and look up fewer candidates than more. The rest of the leaf assertion nodes are handled by lookup-based assertion evaluators. So in this case, with a uid index, the server will use uid=jsmith to return one candidate and then do a lookup to see whether that candidate is also matched by objectClass=inetOrgPerson.

In this case I would just use (uid=jsmith), assuming you have the uid index. It avoids the extra lookup to check whether the entry is an inetOrgPerson. If UIDs are unique and your peeps are inetOrgPersons, then this is the best filter for you.

If you do not have an index on uid, I suggest you index it. If you don't, the candidates will be generated off the objectClass index, which always exists since it is a system index. The server will then iterate through the entire set of inetOrgPersons in your DIB, deserialize each entry from the master table, and check (after normalizing the uid attribute) whether it is in fact equal to jsmith. That set could be huge.

So index your uids, and don't bother with the objectClass stuff if you don't vary the OC of the people in your DIB.

Cheers,
Alex

>
> OR
>
> (&
>   (&(objectclass=top)
>     (objectclass=person)
>     (objectclass=organizationalPerson)
>     (objectclass=inetOrgPerson))
>   (uid=jsmith)
> )
>

--
Best Regards,
-- Alex
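As a recap of the candidate selection described in the reply, here is a small Python sketch of the idea. It is a toy model under stated assumptions, not the server's implementation: `ENTRIES`, `build_index`, `scan_count`, `matches` and `search_and` are hypothetical names, and treating an unindexed attribute as costing a full scan is an assumption of the sketch.

```python
from typing import Dict, List, Set, Tuple

Assertion = Tuple[str, str]  # (attribute, value), both lower-cased

# Toy DIB standing in for the master table: entry id -> attributes.
ENTRIES: Dict[int, Dict[str, Set[str]]] = {
    1: {"objectclass": {"top", "person", "organizationalperson", "inetorgperson"},
        "uid": {"jsmith"}},
    2: {"objectclass": {"top", "person", "organizationalperson", "inetorgperson"},
        "uid": {"adoe"}},
    3: {"objectclass": {"top", "organizationalunit"}, "ou": {"people"}},
}


def build_index(attr: str) -> Dict[str, Set[int]]:
    """Map each value of attr to the ids of the entries holding it."""
    index: Dict[str, Set[int]] = {}
    for entry_id, attrs in ENTRIES.items():
        for value in attrs.get(attr, ()):
            index.setdefault(value, set()).add(entry_id)
    return index


def scan_count(assertion: Assertion, indexes: Dict[str, Dict[str, Set[int]]]) -> int:
    attr, value = assertion
    if attr not in indexes:
        return len(ENTRIES)  # assumption: no index means a full scan
    return len(indexes[attr].get(value, ()))


def matches(entry_id: int, assertion: Assertion) -> bool:
    """Lookup-based evaluation of one assertion against one entry."""
    attr, value = assertion
    return value in ENTRIES[entry_id].get(attr, ())


def search_and(assertions: List[Assertion],
               indexes: Dict[str, Dict[str, Set[int]]]) -> Tuple[Assertion, List[int]]:
    # The assertion with the smallest scan count generates the candidates.
    generator = min(assertions, key=lambda a: scan_count(a, indexes))
    rest = [a for a in assertions if a is not generator]
    attr, value = generator
    candidates = indexes[attr].get(value, set()) if attr in indexes else set(ENTRIES)
    # Every other assertion is checked per candidate by lookup.
    hits = [eid for eid in sorted(candidates) if all(matches(eid, a) for a in rest)]
    return generator, hits


query = [("objectclass", "inetorgperson"), ("uid", "jsmith")]
with_uid = {"objectclass": build_index("objectclass"), "uid": build_index("uid")}
without_uid = {"objectclass": build_index("objectclass")}
print(search_and(query, with_uid))
print(search_and(query, without_uid))
```

With the uid index present, uid=jsmith drives the search and only one candidate is checked. Without it, the objectClass index drives the search and every inetOrgPerson in the toy DIB has to be examined.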