Date: Thu, 25 Jun 2015 01:48:06 +0000 (UTC)
From: "ASF subversion and git services (JIRA)"
To: dev@lucene.apache.org
Reply-To: dev@lucene.apache.org
Subject: [jira] [Commented] (LUCENE-6585) Make ConjunctionDISI flatten sub ConjunctionDISI instances

    [ https://issues.apache.org/jira/browse/LUCENE-6585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600523#comment-14600523 ]

ASF subversion and git services commented on LUCENE-6585:
---------------------------------------------------------

Commit 1687406 from [~rcmuir] in branch 'dev/branches/branch_5x'
[ https://svn.apache.org/r1687406 ]

LUCENE-6585: add some tests that coord is applied properly for nested conjunctions

> Make ConjunctionDISI flatten sub ConjunctionDISI instances
> ----------------------------------------------------------
>
>                 Key: LUCENE-6585
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6585
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-6585.patch
>
>
> Today ConjunctionDISI wraps some sub (two-phase) iterators. I would like to improve it by flattening sub iterators when they implement ConjunctionDISI. In practice, this would make "+A +(+B +C)" be executed more like "+A +B +C" (only in terms of matching, scoring would not change).
> My motivation for this is that if we don't flatten and are unlucky, we can sometimes hit some worst cases. For instance consider the 3 following postings lists (sorted by increasing cost):
> A: 1, 1001, 2001, 3001, ...
> C: 0, 2, 4, 6, 8, 10, 12, 14, ...
> B: 1, 3, 5, 7, 9, 11, 13, 15, ...
> If we run "+A +B +C", then everything works fine, we use A as a lead, and advance B 1000 by 1000 to find the next match (if any).
> However if we run "+A +(+B +C)", then we would iterate B and C 2 by 2 over the entire doc ID space when trying to find the first match which occurs on or after A:1.
> This is an extreme example which is unlikely to happen in practice, but flattening would also help a bit on some more common cases. For instance imagine that A, B and C have respective costs of 100, 10 and 1000. If you search for "+A +(+B +C)", then we will use the most costly iterator (C) to confirm matches of B (the least costly iterator, used as a lead) while it would have been more efficient to confirm matches of B with A first, since A is less costly than C.
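
The worst case from the issue description can be reproduced with a small standalone sketch. This is not Lucene's ConjunctionDISI and not the attached patch: DocIterator, StepIterator, Conjunction, run and the maxDoc of 1,000,000 are all hypothetical names and values chosen for illustration, and two-phase iteration and scoring are left out. It only counts how many advance() calls the leaf iterators receive for "+A +(+B +C)" with and without flattening.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ConjunctionFlattenSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical minimal iterator API; an assumption, not Lucene's DocIdSetIterator. */
    interface DocIterator {
        int advance(int target); // first doc >= target, or NO_MORE_DOCS
        long cost();             // estimated number of docs
    }

    /** Postings list start, start+step, ... below maxDoc. Counts advance() calls. */
    static final class StepIterator implements DocIterator {
        final int start, step, maxDoc;
        long calls;

        StepIterator(int start, int step, int maxDoc) {
            this.start = start; this.step = step; this.maxDoc = maxDoc;
        }

        @Override public int advance(int target) {
            calls++;
            long doc = start;
            if (target > start) {
                long k = ((long) target - start + step - 1) / step; // ceiling division
                doc = start + k * step;
            }
            return doc < maxDoc ? (int) doc : NO_MORE_DOCS;
        }

        @Override public long cost() {
            return start >= maxDoc ? 0 : (maxDoc - 1 - start) / step + 1;
        }
    }

    /** Leapfrog conjunction: the cheapest sub leads, the others confirm its candidates. */
    static final class Conjunction implements DocIterator {
        final List<DocIterator> subs = new ArrayList<>();

        Conjunction(List<? extends DocIterator> clauses, boolean flatten) {
            for (DocIterator sub : clauses) {
                if (flatten && sub instanceof Conjunction) {
                    subs.addAll(((Conjunction) sub).subs); // pull nested clauses up
                } else {
                    subs.add(sub);
                }
            }
            subs.sort(Comparator.comparingLong(DocIterator::cost));
        }

        @Override public int advance(int target) {
            int doc = subs.get(0).advance(target);
            outer:
            while (doc != NO_MORE_DOCS) {
                for (int i = 1; i < subs.size(); i++) {
                    int next = subs.get(i).advance(doc);
                    if (next != doc) {               // mismatch: move the lead forward
                        doc = subs.get(0).advance(next);
                        continue outer;
                    }
                }
                return doc;                          // every sub agrees on doc
            }
            return NO_MORE_DOCS;
        }

        @Override public long cost() {
            return subs.get(0).cost();               // cost of the lead
        }
    }

    /** Returns {first match on or after doc 0, total advance() calls on A, B and C}. */
    static long[] run(boolean flatten) {
        int maxDoc = 1_000_000;                              // assumed doc ID space
        StepIterator a = new StepIterator(1, 1000, maxDoc);  // 1, 1001, 2001, ...
        StepIterator b = new StepIterator(1, 2, maxDoc);     // 1, 3, 5, ...
        StepIterator c = new StepIterator(0, 2, maxDoc);     // 0, 2, 4, ...
        DocIterator inner = new Conjunction(Arrays.asList(b, c), false);       // +B +C
        DocIterator query = new Conjunction(Arrays.asList(a, inner), flatten); // +A +(+B +C)
        int doc = query.advance(0);
        return new long[] {doc, a.calls + b.calls + c.calls};
    }

    public static void main(String[] args) {
        System.out.println("nested    advance() calls: " + run(false)[1]);
        System.out.println("flattened advance() calls: " + run(true)[1]);
    }
}
```

In this sketch neither variant finds a match, since B and C share no document, but the nested form steps B and C through the whole doc ID space before giving up, while the flattened form lets A lead and skips 1000 docs at a time. Sorting the flattened clauses by cost also covers the second example in the description: with costs of 100, 10 and 1000 for A, B and C, the order becomes B, A, C, so A confirms B's candidates before C does.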