lucene-dev mailing list archives

From "Mark Miller (JIRA)" <>
Subject [jira] Commented: (LUCENE-1821) Weight.scorer() not passed doc offset for "sub reader"
Date Wed, 19 Aug 2009 00:55:14 GMT


Mark Miller commented on LUCENE-1821:

Sorting is internal. To allow the switch to per-segment searching, we implemented a new
HitCollector that can collect from multiple readers - sorting across multiple segments still
needed to be supported, and so did custom comparators. All of the ids are managed internally
though - when I say internally, I mean within Lucene. If you implement a custom
FieldComparator, you are still respecting Lucene's internal id usage. We map priority queue
values so that they can be compared with the values in a different Reader, but again, all
of the ids are managed internally. All caching is still done per segment, and all
FieldCaches are per Reader and per segment.
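
To make the id arithmetic concrete, here is a minimal sketch of a collector that is told each
segment's starting offset before that segment's hits arrive. It is a toy model in plain Java,
not Lucene code: GlobalIdCollector, setNextSegment and docBases are hypothetical names made up
for illustration, and the segment sizes are invented.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of per-segment collection. None of these types are Lucene classes. */
final class PerSegmentCollectDemo {

    /** Hypothetical collector: receives the segment offset first, then segment-relative ids. */
    static final class GlobalIdCollector {
        private int docBase; // offset of the current segment within the whole index
        private final List<Integer> globalIds = new ArrayList<>();

        /** Called once before the hits of each segment are delivered. */
        void setNextSegment(int docBase) {
            this.docBase = docBase;
        }

        /** Called with a segment-relative id; records the index-wide id. */
        void collect(int segmentDoc) {
            globalIds.add(docBase + segmentDoc);
        }

        List<Integer> globalIds() {
            return globalIds;
        }
    }

    /** Derives each segment's offset from the number of documents in the segments before it. */
    static int[] docBases(int[] segmentSizes) {
        int[] bases = new int[segmentSizes.length];
        int next = 0;
        for (int i = 0; i < segmentSizes.length; i++) {
            bases[i] = next;
            next += segmentSizes[i];
        }
        return bases;
    }

    public static void main(String[] args) {
        int[] bases = docBases(new int[] {5, 3, 4}); // offsets 0, 5, 8
        GlobalIdCollector collector = new GlobalIdCollector();

        collector.setNextSegment(bases[1]);
        collector.collect(2); // segment-relative id 2 in the second segment

        collector.setNextSegment(bases[2]);
        collector.collect(0); // first document of the third segment

        System.out.println(collector.globalIds()); // prints [7, 8]
    }
}
```

The point of the sketch is only that the offset is known at collection time, so the
translation from segment-relative to index-wide ids can stay inside the search machinery.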

The goal is to move all caches to the segment level in Lucene - we don't want to encourage
users to cache per multi-reader by providing API help to do so.

If you need index-wide stats, you use the Weight.

You are trying to use the internal ids externally - you are caching from external id to ord
- it's really not something I think we intend to support. The fact that we don't support it
is why we were able to make this change. The FieldCache is the caching mechanism that Lucene
supports with internal ids - and it supports it per segment.
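
As a rough illustration of what caching per segment means in practice, the sketch below keys
a cache on the segment itself and indexes it with segment-relative ids, so no doc offset is
ever needed. This is plain Java under stated assumptions, not the FieldCache implementation:
Segment and SegmentCache are hypothetical stand-ins, and the values are invented.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a per-segment cache. None of these types are Lucene classes. */
final class PerSegmentCacheDemo {

    /** Hypothetical stand-in for a segment reader: a name plus one value per document. */
    static final class Segment {
        final String name;
        final int[] values;

        Segment(String name, int... values) {
            this.name = name;
            this.values = values;
        }
    }

    /** Hypothetical cache keyed by segment identity and indexed by segment-relative id. */
    static final class SegmentCache {
        private final Map<Segment, int[]> cache = new HashMap<>();
        int loads = 0; // counts how many times a segment's values were actually loaded

        int[] get(Segment segment) {
            return cache.computeIfAbsent(segment, s -> {
                loads++;
                return s.values.clone(); // stands in for an expensive un-inversion step
            });
        }
    }

    public static void main(String[] args) {
        Segment first = new Segment("seg_a", 10, 20, 30);
        Segment second = new Segment("seg_b", 40, 50);
        SegmentCache cache = new SegmentCache();

        // A scorer working on `second` only needs the segment and a relative id.
        System.out.println(cache.get(second)[1]); // prints 50
        System.out.println(cache.get(first)[2]);  // prints 30

        // A repeated lookup on the same segment reuses the cached array.
        cache.get(second);
        System.out.println(cache.loads); // prints 2
    }
}
```

A cache built over the "entire" index would instead need the index-wide id for every lookup,
which is exactly the offset a Scorer is not given.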

> Weight.scorer() not passed doc offset for "sub reader"
> ------------------------------------------------------
>                 Key: LUCENE-1821
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Tim Smith
> Now that searching is done on a per-segment basis, there is no way for a Scorer to know
> the "actual" doc id for the documents it matches (only the relative doc offset into the segment).
> If using caches in your scorer that are based on the "entire" index (all segments), there
> is now no way to index into them properly from inside a Scorer, because the scorer is not passed
> the needed offset to calculate the "real" docid.
> Suggest having the Weight.scorer() method also take an integer for the doc offset.
> The abstract Weight class should have a constructor that takes this offset, as well as a method
> to get the offset.
> All Weights that have "sub" weights must pass this offset down to created "sub" weights.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
