lucene-dev mailing list archives

From "Mark Miller (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1821) Weight.scorer() not passed doc offset for "sub reader"
Date Wed, 19 Aug 2009 03:09:14 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744853#action_12744853 ]

Mark Miller commented on LUCENE-1821:
-------------------------------------

bq. he's not using external ids, he's using the internal lucene docIds

Let me try a response to this one once more:

Suppose you want a filter that always matches docs 0-10. You could have written one that
simply sets bits 0-10. Technically, you are using 'internal' Lucene doc ids.

With the new per-segment search, though, you will find that you match docs 0-10 in
every segment, not just docs 0-10 in the multi-reader virtual id space. This is what
I call using the internal doc ids externally: you are counting on a single id space covering
the whole index for the reader. That was never promised. So, just as this type of
filter was never *really* supported and no longer works, relying on the IndexReader
to provide one id space across the whole index no longer works either. The Searcher covers
the whole index, but a given IndexReader was never promised to do so. We could have passed
base doc ids to the filters so that they could reconstruct the multi-reader virtual ids and
then actually match only docs 0-10, but that's exactly the opposite of what we are trying
to achieve. We switched to per-segment search to get away from that.
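
A rough illustration of the difference, as a sketch only: it uses plain java.util.BitSet and
made-up segment sizes, not any Lucene class. The names PerSegmentFilterSketch, firstElevenBits,
matchesOverWholeIndex, and matchesPerSegment are hypothetical and exist only for this example.

```java
import java.util.BitSet;

public class PerSegmentFilterSketch {

    // Hypothetical filter: sets bits 0-10 in whatever id space it is handed.
    static BitSet firstElevenBits(int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        bits.set(0, Math.min(11, maxDoc)); // end index is exclusive
        return bits;
    }

    // The filter is evaluated once against a single id space covering all segments.
    static int matchesOverWholeIndex(int[] segmentSizes) {
        int total = 0;
        for (int size : segmentSizes) {
            total += size;
        }
        return firstElevenBits(total).cardinality();
    }

    // The filter is evaluated once per segment, each with ids starting at 0.
    static int matchesPerSegment(int[] segmentSizes) {
        int matches = 0;
        for (int size : segmentSizes) {
            matches += firstElevenBits(size).cardinality();
        }
        return matches;
    }

    public static void main(String[] args) {
        int[] segmentSizes = {100, 50, 25}; // assumed sizes, for illustration only
        System.out.println(matchesOverWholeIndex(segmentSizes)); // 11
        System.out.println(matchesPerSegment(segmentSizes));     // 33
    }
}
```

The same "bits 0-10" logic matches 11 docs in one shared id space but 11 docs in each segment
when it is applied per segment.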

> Weight.scorer() not passed doc offset for "sub reader"
> ------------------------------------------------------
>
>                 Key: LUCENE-1821
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1821
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Tim Smith
>
> Now that searching is done on a per-segment basis, there is no way for a Scorer to know
> the "actual" doc id for the documents it matches (only the relative doc offset into the segment).
> If your scorer uses caches that are based on the "entire" index (all segments), there
> is now no way to index into them properly from inside a Scorer, because the Scorer is not passed
> the offset needed to calculate the "real" doc id.
> Suggest having the Weight.scorer() method also take an integer for the doc offset.
> The abstract Weight class should have a constructor that takes this offset, as well as a method
> to get the offset.
> All Weights that have "sub" weights must pass this offset down to the created "sub" weights.
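
For readers following the quoted request, the arithmetic it relies on can be sketched as below.
This is an assumption-laden illustration in plain Java, not Lucene's API: DocBaseSketch,
docBases, and toGlobalDocId are hypothetical names, and the segment sizes are made up.

```java
public class DocBaseSketch {

    // Starting offset of each segment in a combined, whole-index id space.
    static int[] docBases(int[] segmentSizes) {
        int[] bases = new int[segmentSizes.length];
        int next = 0;
        for (int i = 0; i < segmentSizes.length; i++) {
            bases[i] = next;
            next += segmentSizes[i];
        }
        return bases;
    }

    // The "real" doc id the issue refers to: segment offset plus segment-relative id.
    static int toGlobalDocId(int docBase, int segmentDocId) {
        return docBase + segmentDocId;
    }

    public static void main(String[] args) {
        int[] segmentSizes = {100, 50, 25};      // assumed sizes, for illustration only
        int[] bases = docBases(segmentSizes);    // {0, 100, 150}
        float[] wholeIndexCache = new float[175]; // hypothetical cache keyed by global id

        // Doc 7 of the second segment maps to slot 107 of the whole-index cache.
        int global = toGlobalDocId(bases[1], 7);
        wholeIndexCache[global] = 1.0f;
        System.out.println(global); // 107
    }
}
```

Without the per-segment offset, code that sees only the segment-relative id 7 cannot tell which
of the three cache slots (7, 107, or 157) it should read.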

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.