lucene-dev mailing list archives

From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1652) Enhancements to Scorers following the changes to DocIdSetIterator
Date Mon, 25 May 2009 14:44:45 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712733#action_12712733 ]

Shai Erera commented on LUCENE-1652:
------------------------------------

bq. If we did these, you could upgrade to 2.9, fix all deprecations, then upgrade to 3.0,
recompile just fine ...

I'm not sure about it. In 3.0, we'll make nextDoc() abstract (for sure, since the default
impl calls next()) and probably advance() also. So when you upgrade to 2.9, you can switch
to calling nextDoc() and advance(), but if you implemented DISI, you won't be required to
implement nextDoc() and/or advance(), so when you upgrade to 3.0 your code won't compile.

When upgrading, I think we should assume (or even require) that users read CHANGES. When they
notice that DISI has changed and that they need to implement two new methods, they should
also notice the change in semantics of doc().

I take it that by "catastrophic" you mean that you're OK with people upgrading to 3.0 and
finding that their code doesn't compile, since that will force them to read CHANGES or the
javadocs and understand what they are now supposed to implement. Therefore, if document()
documents the new semantics, it is OK for us to rely on that, and if something fails, it's
the user's problem.

bq. maybe we add DISI.document(), with the new semantics

If we add document() (note the longer method name, compared to doc()) we can implement it
following the new semantics and take advantage of that in 2.9 already (I think?). For example:

{code}
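public abstract class DocIdSetIterator {

  // Existing members (next(), doc(), NO_MORE_DOCS, ...) are omitted here.

  // -1 until iteration starts, NO_MORE_DOCS once the iterator is exhausted.
  private int doc = -1;

  // New accessor that follows the new semantics.
  public int document() { return doc; }

  // Default impl bridges to the old next()/doc() pair.
  public int nextDoc() throws IOException {
    if (next()) {
      doc = doc();
    } else {
      doc = NO_MORE_DOCS;
    }
    return doc;
  }

  // Default impl scans forward; terminates because NO_MORE_DOCS >= any target.
  public int advance(int target) throws IOException {
    while ((doc = nextDoc()) < target) {}
    return doc;
  }
}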
{code}
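
To make the bridging idea concrete, here is a minimal, self-contained sketch. It does not use Lucene's classes: `SketchIterator` is a hypothetical stand-in for DISI, and `ArrayIterator` is a hypothetical legacy subclass that implements only the old next()/doc() pair. Under those assumptions, a caller can already rely on document() returning -1 before iteration and NO_MORE_DOCS after exhaustion:

```java
import java.io.IOException;

public class DisiBridgeSketch {

  /** Hypothetical stand-in for DocIdSetIterator; not the real Lucene class. */
  static abstract class SketchIterator {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // -1 until iteration starts, NO_MORE_DOCS once exhausted.
    private int doc = -1;

    // Old-style API that legacy subclasses already implement.
    abstract boolean next() throws IOException;
    abstract int doc();

    // New-style accessor with the new semantics.
    int document() { return doc; }

    // Default impl bridges to the old next()/doc() pair.
    int nextDoc() throws IOException {
      doc = next() ? doc() : NO_MORE_DOCS;
      return doc;
    }

    // Linear scan; stops at the sentinel because it is >= any target.
    int advance(int target) throws IOException {
      while ((doc = nextDoc()) < target) {}
      return doc;
    }
  }

  /** Hypothetical legacy implementation over a sorted array of doc IDs. */
  static class ArrayIterator extends SketchIterator {
    private final int[] docs;
    private int pos = -1;

    ArrayIterator(int... docs) { this.docs = docs; }

    @Override
    boolean next() {
      if (pos < docs.length) pos++;   // never move past the end
      return pos < docs.length;
    }

    @Override
    int doc() { return docs[pos]; }   // only valid after next() returned true
  }

  public static void main(String[] args) throws IOException {
    SketchIterator it = new ArrayIterator(2, 5, 9);
    System.out.println(it.document());    // -1
    System.out.println(it.nextDoc());     // 2
    System.out.println(it.advance(4));    // 5
    System.out.println(it.advance(100));  // 2147483647 (NO_MORE_DOCS)
    System.out.println(it.document());    // 2147483647
  }
}
```

Note that the legacy subclass never has to know about the sentinel; only the base class tracks it, which is why the old doc() can keep its old behavior.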

We would also switch to calling document() internally. I think this should work?

If this indeed should work, where should I do it - in this issue (I need 1614 to be committed
first) or in 1614?

> Enhancements to Scorers following the changes to DocIdSetIterator
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1652
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1652
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Shai Erera
>             Fix For: 3.0
>
>
> In LUCENE-1614, we changed the semantics of DocIdSetIterator's methods to return a sentinel
> NO_MORE_DOCS (= Integer.MAX_VALUE) when the iterator is exhausted. Due to backward-compatibility
> issues, we couldn't implement those semantics in doc(). Therefore this issue, which can be
> introduced in 3.0 only, will:
> # Implement the new semantics in all extending classes, such that doc() will return NO_MORE_DOCS
> when the iterator is exhausted.
> # Change BooleanScorer to take advantage of that by removing sub.done from SubScorer
> and operating under the assumption that NO_MORE_DOCS is larger than any doc ID (Integer.MAX_VALUE).
> # Change ConjunctionScorer to operate under the same assumptions and remove 'more'.
> # Change ReqExclScorer to not rely on reqScorer in doc(), since the latter may be null.
> # Make more changes to ConjunctionScorer's init() and remove 'firstTime' to improve the
> performance of nextDoc(), score(), advance().
> # Add start()/finish() to DISI?
> A snippet from LUCENE-1614 regarding the change in BooleanScorer. Change this:
> {code}
> int doc = sub.done ? -1 : scorer.doc();
> while (!sub.done && doc < end) {
>   sub.collector.collect(doc);
>   doc = scorer.nextDoc();
>   sub.done = doc < 0;
> }
> {code}
> To this:
> {code}
> int doc = scorer.doc();
> while (doc < end) {
>   sub.collector.collect(doc);
>   doc = scorer.nextDoc();
> }
> {code}
> And in ConjunctionScorer, change this:
> {code}
> while (more && (firstScorer=scorers[first]).doc() < (lastDoc=lastScorer.doc())) {
>   more = firstScorer.advance(lastDoc) >= 0;
>   lastScorer = firstScorer;
>   first = (first == (scorers.length-1)) ? 0 : first+1;
> }
> return more;
> {code}
> To this:
> {code}
> while ((firstScorer=scorers[first]).doc() < (lastDoc=lastScorer.doc())) {
>   firstScorer.advance(lastDoc);
>   lastScorer = firstScorer;
>   first = (first == (scorers.length-1)) ? 0 : first+1;
> }
> return lastDoc != NO_MORE_DOCS;
> {code}
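
The simplified ConjunctionScorer loop quoted above can be tried out in isolation. The following is only a sketch under stated assumptions: `ArrayIterator` and `intersect` are hypothetical names, the iterator is a plain array-backed class whose doc() already returns NO_MORE_DOCS once exhausted, and at least one iterator is passed in. It shows that, because the sentinel compares greater than every real doc ID, the leapfrog loop needs no separate 'more' flag:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ConjunctionSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Hypothetical array-backed iterator that follows the new doc() semantics. */
  static class ArrayIterator {
    private final int[] docs;   // sorted ascending, all values < NO_MORE_DOCS
    private int pos = -1;

    ArrayIterator(int... docs) { this.docs = docs; }

    int doc() {
      if (pos < 0) return -1;
      return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    int nextDoc() {
      if (pos < docs.length) pos++;
      return doc();
    }

    int advance(int target) {
      while (nextDoc() < target) {}
      return doc();
    }
  }

  /** Returns the doc IDs common to all iterators (assumes at least one). */
  static List<Integer> intersect(ArrayIterator... scorers) {
    // Position every iterator on its first doc, then order them by that doc
    // so that scorers[first] is the smallest and lastScorer the largest.
    for (ArrayIterator s : scorers) s.nextDoc();
    Arrays.sort(scorers, (a, b) -> Integer.compare(a.doc(), b.doc()));
    int first = 0;
    ArrayIterator lastScorer = scorers[scorers.length - 1];

    List<Integer> hits = new ArrayList<>();
    while (true) {
      ArrayIterator firstScorer;
      int lastDoc;
      // Same loop as in the issue description: no 'more' flag is needed.
      while ((firstScorer = scorers[first]).doc() < (lastDoc = lastScorer.doc())) {
        firstScorer.advance(lastDoc);
        lastScorer = firstScorer;
        first = (first == scorers.length - 1) ? 0 : first + 1;
      }
      if (lastDoc == NO_MORE_DOCS) return hits;  // every iterator is exhausted
      hits.add(lastDoc);                         // all iterators agree on lastDoc
      lastScorer.nextDoc();                      // move on to look for the next match
    }
  }

  public static void main(String[] args) {
    List<Integer> hits = intersect(
        new ArrayIterator(1, 3, 5, 8, 10),
        new ArrayIterator(3, 4, 8, 10, 12),
        new ArrayIterator(2, 3, 8, 9, 10));
    System.out.println(hits);  // [3, 8, 10]
  }
}
```

The same ordering property (NO_MORE_DOCS is larger than any `end`) is what allows the BooleanScorer loop above to drop sub.done.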

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



