lucene-dev mailing list archives

From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean
Date Wed, 20 May 2009 11:07:45 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12711097#action_12711097 ]

Shai Erera commented on LUCENE-1614:
------------------------------------

Thanks, Mike, for the clarification. One thing, though, is the comment I added to BS about not
being able to call scorer.doc() since we may hit an NPE. I hit it when BS used ReqExclScorer.
According to ReqExclScorer, doc() will hit an NPE once next() or skipTo() has returned false
(or, in our case, once nextDoc() or advance() has returned MAX_VAL).

{code}
  public int doc() {
    // reqScorer may be null when next() or skipTo() already returned false
    return reqScorer.doc();
  }
{code}

If we drop sub.done in BS and 'more' in ConjScorer, calling doc() will hit an NPE. I.e., in BS
the first line of the loop below cannot stay as it is. However, I'm not sure I can start with
nextDoc() before checking that the current doc() < end, and that check may itself hit an NPE.

{code}
int doc = scorer.doc();
while (doc < end) {
  sub.collector.collect(doc);
  doc = scorer.nextDoc();
}
{code}

Similarly, dropping 'more' in ConjunctionScorer may hit an NPE if ReqExclScorer is used there.
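
To illustrate why the loop above can work without sub.done once doc() is safe to call after
exhaustion, here is a minimal, self-contained sketch. Everything in it is hypothetical:
ArrayScorer and collectUpTo are stand-ins written for this example, not Lucene classes, and
NO_MORE_DOCS is assumed to be Integer.MAX_VALUE (the MAX_VAL mentioned above).

```java
import java.util.ArrayList;
import java.util.List;

public class BucketLoopSketch {
  // Assumed value of the MAX_VAL sentinel discussed above.
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Hypothetical scorer over a sorted array; doc() never throws after exhaustion. */
  static final class ArrayScorer {
    private final int[] docs;
    private int pos = 0; // already positioned on the first doc, as the loop assumes

    ArrayScorer(int... docs) {
      this.docs = docs;
    }

    int doc() {
      return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    int nextDoc() {
      if (pos < docs.length) {
        pos++;
      }
      return doc();
    }
  }

  /** Collects every doc below 'end'; the sentinel ends the loop, no 'done' flag needed. */
  static List<Integer> collectUpTo(ArrayScorer scorer, int end) {
    List<Integer> collected = new ArrayList<>();
    int doc = scorer.doc();
    while (doc < end) {
      collected.add(doc);
      doc = scorer.nextDoc();
    }
    return collected;
  }
}
```

Because the sentinel is never smaller than any 'end', calling collectUpTo again on an
exhausted scorer simply collects nothing. This only holds if doc() itself is safe at that
point, which is exactly the problem with ReqExclScorer described above.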

There are a couple of ways to handle it:

* In ReqExclScorer.doc(), check for reqScorer == null and return MAX_VAL if it is. Here I'm
counting on doc() not being called very frequently after this patch, since nextDoc() and
advance() return the document. However, that may not be the case (see ConjunctionScorer, which
uses doc() in doNext).

* In ReqExclScorer, make sure that reqScorer is never nullified. But then we'll need to figure
out a different way to mark that there are no more docs (perhaps replace the comparison of
reqScorer to null with a boolean moreDocs?)
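
The two options above can be sketched side by side. This is only an illustration under stated
assumptions, not the actual ReqExclScorer code: DocIterator, ArrayIterator, NullCheckingWrapper
and FlagWrapper are hypothetical names, the exclusion logic is left out entirely, and
NO_MORE_DOCS is assumed to be Integer.MAX_VALUE (the MAX_VAL mentioned above).

```java
public class ReqExclSketch {
  // Assumed value of the MAX_VAL sentinel discussed above.
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Hypothetical minimal iterator; not Lucene's DocIdSetIterator. */
  interface DocIterator {
    int doc();

    int nextDoc();
  }

  /** Iterates over a sorted array of doc ids; returns -1 before the first nextDoc(). */
  static final class ArrayIterator implements DocIterator {
    private final int[] docs;
    private int pos = -1;

    ArrayIterator(int... docs) {
      this.docs = docs;
    }

    public int doc() {
      if (pos < 0) {
        return -1;
      }
      return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    public int nextDoc() {
      if (pos < docs.length) {
        pos++;
      }
      return doc();
    }
  }

  /** Option 1: the required scorer is nulled on exhaustion and doc() checks for null. */
  static final class NullCheckingWrapper implements DocIterator {
    private DocIterator req;

    NullCheckingWrapper(DocIterator req) {
      this.req = req;
    }

    public int nextDoc() {
      if (req == null) {
        return NO_MORE_DOCS;
      }
      int doc = req.nextDoc();
      if (doc == NO_MORE_DOCS) {
        req = null; // exhausted: drop the reference
      }
      return doc;
    }

    public int doc() {
      return req == null ? NO_MORE_DOCS : req.doc(); // extra null check on every call
    }
  }

  /** Option 2: the required scorer is never nulled; a boolean marks exhaustion. */
  static final class FlagWrapper implements DocIterator {
    private final DocIterator req;
    private boolean moreDocs = true;

    FlagWrapper(DocIterator req) {
      this.req = req;
    }

    public int nextDoc() {
      if (!moreDocs) {
        return NO_MORE_DOCS;
      }
      int doc = req.nextDoc();
      if (doc == NO_MORE_DOCS) {
        moreDocs = false;
      }
      return doc;
    }

    public int doc() {
      return moreDocs ? req.doc() : NO_MORE_DOCS;
    }
  }
}
```

In this sketch both variants pay one extra check per doc() call, so the difference is mostly
about which state marks exhaustion; whether that cost matters depends on how often callers
such as ConjunctionScorer still call doc().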


> Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean
> ----------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1614
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1614
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Shai Erera
>             Fix For: 2.9
>
>         Attachments: LUCENE-1614.patch
>
>
> See http://www.nabble.com/Another-possible-optimization---now-in-DocIdSetIterator-p23223319.html
> for the full discussion. The basic idea is to add variants to those two methods that return
> the current doc they are at, to save successive calls to doc(). If there are no more docs,
> return -1. A summary of what was discussed so far:
> # Deprecate those two methods.
> # Add nextDoc() and skipToDoc(int) that return doc, with default impl in DISI (calls
> next() and skipTo() respectively, and will be changed to abstract in 3.0).
> #* I actually would like to propose an alternative to the names: advance() and advance(int)
> - the first advances by one, the second advances to target.
> # Wherever these are used, do something like '(doc = advance()) >= 0' instead of comparing
> to -1 for improved performance.
> I will post a patch shortly

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

