From: "Shai Erera (JIRA)"
To: java-dev@lucene.apache.org
Date: Tue, 26 May 2009 08:17:46 -0700 (PDT)
Subject: [jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean
Message-ID: <1922046838.1243351066523.JavaMail.jira@brutus>
In-Reply-To: <848873297.1240634970628.JavaMail.jira@brutus>

[
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713048#action_12713048 ]

Shai Erera commented on LUCENE-1614:
------------------------------------

About ConjunctionScorer.doNext() (this also applies to FilteredQuery.advanceToCommon()): following Mike's proposal, I've changed it to this:

{code}
private boolean doNext() throws IOException {
  int first = 0;
  lastDoc = scorers[scorers.length - 1].docID();
  Scorer firstScorer;
  while ((firstScorer = scorers[first]).docID() < lastDoc) {
    lastDoc = firstScorer.advance(lastDoc);
    first = first == scorers.length - 1 ? 0 : first + 1;
  }
  return lastDoc != NO_MORE_DOCS;
}
{code}

This indeed gets rid of 'more': both the check for 'more' in the while condition and the assignment to 'more'. But now I think it may introduce a different inefficiency. Let's say that firstScorer.advance() returns NO_MORE_DOCS. The next scorer's docID is obviously smaller, and therefore the following call will be (first line in the 'while' body): *lastDoc = firstScorer.advance(Integer.MAX_VALUE);*. There are Scorers which cannot implement that efficiently. With 'more' this would not have happened, since the while condition would terminate before that. Are we sure that this is a worthwhile enhancement?

BTW, the code for FilteredQuery looks like this:

{code}
while (scorerDoc != disiDoc) {
  if (scorerDoc < disiDoc) {
    if ((scorerDoc = scorer.advance(disiDoc)) == NO_MORE_DOCS) {
      return NO_MORE_DOCS;
    }
  } else {
    if ((disiDoc = docIdSetIterator.advance(scorerDoc)) == NO_MORE_DOCS) {
      return NO_MORE_DOCS;
    }
  }
}
return scorerDoc;
{code}

And I thought to change it to this:

{code}
while (scorerDoc != disiDoc) {
  if (scorerDoc < disiDoc) {
    scorerDoc = scorer.advance(disiDoc);
  } else {
    disiDoc = docIdSetIterator.advance(scorerDoc);
  }
}
return scorerDoc;
{code}

What do you think?
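To make the concern concrete, here is a minimal, self-contained sketch of both loops running against a toy iterator. It is not the Lucene code: ArrayIterator, positioned() and the recorded 'targets' list are hypothetical names invented for the illustration, NO_MORE_DOCS is assumed to be Integer.MAX_VALUE as in the comment above, and doNext() is adapted to return the last doc instead of a boolean. The only point is to show which targets each iterator is asked to advance to once one of them is exhausted.

```java
import java.util.ArrayList;
import java.util.List;

class ConjunctionSketch {
    // Assumption: same sentinel value as discussed in the comment.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical array-backed iterator that records every advance() target. */
    static final class ArrayIterator {
        private final int[] docs;   // sorted doc ids
        private int pos = -1;       // -1 means "not positioned yet"
        final List<Integer> targets = new ArrayList<>();

        ArrayIterator(int... docs) { this.docs = docs; }

        int docID() {
            if (pos < 0) return -1;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }

        int nextDoc() { pos++; return docID(); }

        /** Moves past the current doc to the first doc >= target. */
        int advance(int target) {
            targets.add(target);
            do { pos++; } while (pos < docs.length && docs[pos] < target);
            return docID();
        }
    }

    /** Creates an iterator already positioned on its first doc. */
    static ArrayIterator positioned(int... docs) {
        ArrayIterator it = new ArrayIterator(docs);
        it.nextDoc();
        return it;
    }

    /** The loop without 'more', returning lastDoc rather than a boolean. */
    static int doNext(ArrayIterator[] scorers) {
        int first = 0;
        int lastDoc = scorers[scorers.length - 1].docID();
        ArrayIterator firstScorer;
        while ((firstScorer = scorers[first]).docID() < lastDoc) {
            lastDoc = firstScorer.advance(lastDoc);
            first = first == scorers.length - 1 ? 0 : first + 1;
        }
        return lastDoc;
    }

    /** The simplified FilteredQuery-style loop without the early returns. */
    static int advanceToCommon(ArrayIterator scorer, ArrayIterator disi) {
        int scorerDoc = scorer.docID();
        int disiDoc = disi.docID();
        while (scorerDoc != disiDoc) {
            if (scorerDoc < disiDoc) {
                scorerDoc = scorer.advance(disiDoc);
            } else {
                disiDoc = disi.advance(scorerDoc);
            }
        }
        return scorerDoc;
    }

    public static void main(String[] args) {
        ArrayIterator a = positioned(1, 5);
        ArrayIterator b = positioned(2, 7, 9);
        ArrayIterator c = positioned(3, 9);
        int last = doNext(new ArrayIterator[] {a, b, c});
        System.out.println("doNext exhausted: " + (last == NO_MORE_DOCS));
        System.out.println("targets seen by b: " + b.targets);
        System.out.println("targets seen by c: " + c.targets);

        ArrayIterator scorer = positioned(1, 4);
        ArrayIterator disi = positioned(2, 6);
        int common = advanceToCommon(scorer, disi);
        System.out.println("advanceToCommon exhausted: " + (common == NO_MORE_DOCS));
        System.out.println("targets seen by disi: " + disi.targets);
    }
}
```

In this toy setup both loops still terminate and report exhaustion correctly, but once one iterator runs out, every remaining iterator is asked to advance to Integer.MAX_VALUE before the loop exits. Whether that extra call matters depends on how expensive advance() is for the Scorers involved, which this sketch does not measure.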
> Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean
> ----------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1614
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1614
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Shai Erera
>             Fix For: 2.9
>
>         Attachments: LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch
>
>
> See http://www.nabble.com/Another-possible-optimization---now-in-DocIdSetIterator-p23223319.html for the full discussion. The basic idea is to add variants to those two methods that return the current doc they are at, to save successive calls to doc(). If there are no more docs, return -1. A summary of what was discussed so far:
> # Deprecate those two methods.
> # Add nextDoc() and skipToDoc(int) that return doc, with default impl in DISI (calls next() and skipTo() respectively, and will be changed to abstract in 3.0).
> #* I actually would like to propose an alternative to the names: advance() and advance(int) - the first advances by one, the second advances to target.
> # Wherever these are used, do something like '(doc = advance()) >= 0' instead of comparing to -1 for improved performance.
> I will post a patch shortly.