lucene-dev mailing list archives

From "Mark Miller (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (LUCENE-1821) Weight.scorer() not passed doc offset for "sub reader"
Date Sat, 22 Aug 2009 21:47:14 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746513#action_12746513 ]

Mark Miller edited comment on LUCENE-1821 at 8/22/09 2:46 PM:
--------------------------------------------------------------

{quote}Looks like Filter should have another method added getDocIdSet(IndexSearcher searcher,
IndexReader reader) (deprecating getDocIdSet(IndexReader))
new method would call old method by default (with little harm done in general) {quote}

It's the corner cases. Suppose someone's class calls the deprecated method directly, and someone
is using that class together with a new class that overrides only the non-deprecated method. That
override never gets called, because the other code is calling the deprecated method directly.
Technically, everything has to be covered (depending on how consensus goes anyway ... always
depending ...). It's a pain in the butt just thinking about it.

*edit*

In your example deprecation, it is actually the opposite: someone calls the new code directly,
but other code you are using overrides the deprecated code. That override is now not called.
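
Both corner cases can be reproduced with stand-in classes. The sketch below is only an
illustration: Filter, Reader and Searcher here are hypothetical simplified types (not the real
Lucene classes), the methods return a String just to show which body ran, and MigratedFilter is
an assumed intermediate class that overrides the new method without delegating to the old one,
which is one way the second case can arise.

```java
// Hypothetical stand-ins for illustration; these are not the real Lucene classes.
class Reader {}
class Searcher {}

class Filter {
    /** Old entry point, deprecated in favour of the two-argument form. */
    @Deprecated
    String getDocIdSet(Reader reader) {
        return "Filter.old";
    }

    /** New entry point; by default it falls back to the old method. */
    String getDocIdSet(Searcher searcher, Reader reader) {
        return getDocIdSet(reader);
    }
}

/** New-style filter: overrides only the non-deprecated method. */
class NewStyleFilter extends Filter {
    @Override
    String getDocIdSet(Searcher searcher, Reader reader) {
        return "NewStyleFilter.new";
    }
}

/** Assumed library class that was migrated to the new method and no longer calls the old one. */
class MigratedFilter extends Filter {
    @Override
    String getDocIdSet(Searcher searcher, Reader reader) {
        return "MigratedFilter.new";
    }
}

/** Old user code: extends the library class and overrides only the deprecated method. */
class OldStyleFilter extends MigratedFilter {
    @Override
    @Deprecated
    String getDocIdSet(Reader reader) {
        return "OldStyleFilter.old";
    }
}

class DeprecationCornerCases {
    /** Legacy caller that still invokes the deprecated method directly. */
    @SuppressWarnings("deprecation")
    static String legacyCall(Filter f) {
        return f.getDocIdSet(new Reader());
    }

    /** Updated caller that invokes the new method. */
    static String updatedCall(Filter f) {
        return f.getDocIdSet(new Searcher(), new Reader());
    }

    public static void main(String[] args) {
        // Case 1: a legacy caller skips the override of the new method.
        System.out.println(legacyCall(new NewStyleFilter()));   // Filter.old
        // Case 2: an updated caller skips the override of the deprecated method.
        System.out.println(updatedCall(new OldStyleFilter()));  // MigratedFilter.new
    }
}
```

In both cases the code compiles and runs without complaint, and a subclass override is silently
bypassed.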

> Weight.scorer() not passed doc offset for "sub reader"
> ------------------------------------------------------
>
>                 Key: LUCENE-1821
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1821
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Tim Smith
>             Fix For: 2.9
>
>         Attachments: LUCENE-1821.patch
>
>
> Now that searching is done on a per-segment basis, there is no way for a Scorer to know
> the "actual" doc id for the documents it matches (only the relative doc offset into the segment).
> If you use caches in your Scorer that are based on the "entire" index (all segments), there
> is now no way to index into them properly from inside a Scorer, because the Scorer is not passed
> the offset needed to calculate the "real" docid.
> Suggest having the Weight.scorer() method also take an integer for the doc offset.
> The abstract Weight class should have a constructor that takes this offset, as well as a method
> to get the offset.
> All Weights that have "sub" weights must pass this offset down to the created "sub" weights.
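
A rough sketch of what the proposed offset would buy a Scorer, using hypothetical stand-in
classes rather than the real Lucene API: Segment, CachedValueWeight, CachedValueScorer and the
scorer(segment, docBase) signature are all assumptions made for illustration. The Scorer only
sees segment-relative doc ids and adds the offset to index into a cache built over the whole
index.

```java
/** Hypothetical stand-in for one index segment; only its document count matters here. */
class Segment {
    final int maxDoc;

    Segment(int maxDoc) {
        this.maxDoc = maxDoc;
    }
}

/** Hypothetical scorer that reads a cache indexed by index-wide doc ids. */
class CachedValueScorer {
    private final float[] globalCache;
    private final int docBase;

    CachedValueScorer(float[] globalCache, int docBase) {
        this.globalCache = globalCache;
        this.docBase = docBase;
    }

    /** segmentDoc is relative to the segment; adding docBase gives the index-wide id. */
    float score(int segmentDoc) {
        return globalCache[docBase + segmentDoc];
    }
}

/** Hypothetical weight whose scorer() receives the segment's doc offset. */
class CachedValueWeight {
    private final float[] globalCache;

    CachedValueWeight(float[] globalCache) {
        this.globalCache = globalCache;
    }

    CachedValueScorer scorer(Segment segment, int docBase) {
        return new CachedValueScorer(globalCache, docBase);
    }
}

class DocBaseDemo {
    /** Scores every document segment by segment; results are laid out in index-wide order. */
    static float[] scoreAll(CachedValueWeight weight, Segment[] segments) {
        int totalDocs = 0;
        for (Segment segment : segments) {
            totalDocs += segment.maxDoc;
        }
        float[] scores = new float[totalDocs];
        int docBase = 0;
        for (Segment segment : segments) {
            CachedValueScorer scorer = weight.scorer(segment, docBase);
            for (int doc = 0; doc < segment.maxDoc; doc++) {
                scores[docBase + doc] = scorer.score(doc);
            }
            docBase += segment.maxDoc; // the next segment starts where this one ends
        }
        return scores;
    }

    public static void main(String[] args) {
        float[] cache = {1f, 2f, 3f, 4f, 5f};
        Segment[] segments = {new Segment(2), new Segment(3)};
        CachedValueWeight weight = new CachedValueWeight(cache);
        // Doc 0 of the second segment is index-wide doc 2.
        System.out.println(weight.scorer(segments[1], 2).score(0)); // 3.0
    }
}
```

Without the docBase argument, score(0) on the second segment would read cache slot 0 instead
of slot 2.
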
> Details on workaround:
> In order to work around this, you must do the following:
> * Subclass IndexSearcher
> * Add "int getIndexReaderBase(IndexReader)" method to your subclass
> * during Weight creation, the Weight must hold onto a reference to the passed-in Searcher
> (cast to your subclass)
> * during Scorer creation, the Scorer must be passed the result of YourSearcher.getIndexReaderBase(reader)
> * Scorer can now rebase any collected docids using this offset
> Example implementation of getIndexReaderBase():
> {code}
> // NOTE: a more efficient implementation is possible if you cache the result of
> // gatherSubReaders in your constructor
> public int getIndexReaderBase(IndexReader reader) {
>   if (reader == getReader()) {
>     return 0;
>   } else {
>     List readers = new ArrayList();
>     gatherSubReaders(readers);
>     Iterator iter = readers.iterator();
>     int maxDoc = 0;
>     while (iter.hasNext()) {
>       IndexReader r = (IndexReader)iter.next();
>       if (r == reader) {
>         return maxDoc;
>       } 
>       maxDoc += r.maxDoc();
>     } 
>   }
>   return -1; // reader not in searcher
> }
> {code}
> Notes:
> * This workaround makes it so you cannot serialize your custom Weight implementation
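
The caching mentioned in the NOTE above could look roughly like the following. This is a
minimal sketch under stated assumptions: SubReader is a hypothetical stand-in for IndexReader
that exposes only maxDoc(), ReaderBases is an invented helper name, and the list of sub-readers
is assumed to be gathered once (for example in the searcher's constructor). The special case
for the top-level reader in the example above is left out.

```java
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for IndexReader; only maxDoc() is needed here. */
class SubReader {
    private final int maxDoc;

    SubReader(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    int maxDoc() {
        return maxDoc;
    }
}

/** Precomputes the doc offset of every sub-reader once, so each lookup is a map access. */
class ReaderBases {
    // Identity-based map, mirroring the `r == reader` comparison in the original example.
    private final Map<SubReader, Integer> bases = new IdentityHashMap<>();

    ReaderBases(List<? extends SubReader> subReaders) {
        int base = 0;
        for (SubReader r : subReaders) {
            bases.put(r, base);
            base += r.maxDoc(); // the next reader starts after this reader's documents
        }
    }

    /** Returns the doc offset of the given sub-reader, or -1 if it is not part of the searcher. */
    int getIndexReaderBase(SubReader reader) {
        Integer base = bases.get(reader);
        return base == null ? -1 : base;
    }
}

class ReaderBasesDemo {
    public static void main(String[] args) {
        SubReader a = new SubReader(10);
        SubReader b = new SubReader(5);
        SubReader c = new SubReader(7);
        ReaderBases bases = new ReaderBases(List.of(a, b, c));
        System.out.println(bases.getIndexReaderBase(a));                // 0
        System.out.println(bases.getIndexReaderBase(b));                // 10
        System.out.println(bases.getIndexReaderBase(c));                // 15
        System.out.println(bases.getIndexReaderBase(new SubReader(3))); // -1
    }
}
```

The offsets are only valid for the set of sub-readers they were computed from, so the helper
would have to be rebuilt whenever the searcher is opened on a new reader.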

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

