lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1614) Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead of boolean
Date Mon, 01 Jun 2009 12:20:07 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715072#action_12715072
] 

Michael McCandless commented on LUCENE-1614:
--------------------------------------------

bq. So you mean add something like this to the javadocs "after NO_MORE_DOCS was returned,
you should not call this method again, or it may result in unpredicted behavior"?

Right.  And, optimizing the core Scorers based on this (ie, to stop
checking if they had already returned NO_MORE_DOCS, in
nextDoc/advance).

bq. You mean that this will remove the d == -1 check?

No, I think that check must remain, even once we switch to the new
semantics.  What I meant was, currently, if I call nextDoc() on
DocIdBitSet's DISI after nextDoc had already returned NO_MORE_DOCS, it
looks like I'll get into trouble because that will then call
BitSet.nextSetBit(NO_MORE_DOCS+1), ie,
BitSet.nextSetBit(Integer.MIN_VALUE), which'll throw an exception.
Ie, this DISI already requires that you don't call nextDoc() again
after it's returned NO_MORE_DOCS.

bq. So you mean to change the last line here:

Actually, I meant in the case where filterDoc == scorerDoc, you ought
to advance both filter & scorer, not just filter (you know at that
point that both must advance).

I think the new while loop is buggy: if scorerDoc is not == filterDoc,
and you ask filterDocIdIter to advance to scorerDoc, it may advance to
become ==, yet you then illegally advance scorerDoc to filterDoc?

bq. Also Mike - the patch you posted is 152KB, while my last patch is 201KB. It's hard to
compare our patches since the order of the classes is different, so until I apply the patch
and check it, I wanted to make sure you included all the changes in the patch. 

I'm pretty sure this is OK: it's because I changed the eol-style to
native (in my local checkout), which then caused "svn diff" to produce
more reasonable diffs.  In your patch, there are several files where
the entire file is deleted, and then the entire new file is added,
because the eol-style wasn't set.  If you "grep Index: patch.txt | wc"
of yours & mine, they should the same.


> Add next() and skipTo() variants to DocIdSetIterator that return the current doc, instead
of boolean
> ----------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1614
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1614
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Shai Erera
>             Fix For: 2.9
>
>         Attachments: LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch,
LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch,
LUCENE-1614.patch
>
>
> See http://www.nabble.com/Another-possible-optimization---now-in-DocIdSetIterator-p23223319.html
for the full discussion. The basic idea is to add variants to those two methods that return
the current doc they are at, to save successive calls to doc(). If there are no more docs,
return -1. A summary of what was discussed so far:
> # Deprecate those two methods.
> # Add nextDoc() and skipToDoc(int) that return doc, with default impl in DISI (calls
next() and skipTo() respectively, and will be changed to abstract in 3.0).
> #* I actually would like to propose an alternative to the names: advance() and advance(int)
- the first advances by one, the second advances to target.
> # Wherever these are used, do something like '(doc = advance()) >= 0' instead of comparing
to -1 for improved performance.
> I will post a patch shortly

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message