lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] Commented: (LUCENE-1999) Match spotter for all query types
Date Wed, 21 Oct 2009 12:14:59 GMT


Michael McCandless commented on LUCENE-1999:

I see, it sounds like your use case is different from the typical
highlighting use case in that 1) you don't need the positions of the
matches (just whether a given clause matched the doc or not), and 2)
you need it for every single doc visited by the query, not just for
the handful of docs that are being presented to the user on the
current "page".

> This would suggest that you might need 2 query expressions - one for execution and one for adding highlighter instrumentation.

I'm thinking it's the same query, but we fix the Scorer API for all
queries (= big change!!) to be able to produce match details on
demand, where those match details look something like what getSpans
now returns.  But for the normal case (only highlighting the docs
being shown on current page), we'd only get the match details for that
small set of docs.

Then we ideally would not need a separate mirrored set of span
queries.  I.e., SpanTermQuery would be absorbed into TermQuery, etc.

But I could easily be too naive here :) Maybe there is some
serious performance cost to even adding the optional API.
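
To make the idea above concrete, here is a minimal, self-contained sketch of a scorer that can produce match details on demand. Everything in it is hypothetical and only for illustration: MatchSpan, MatchDetailsScorer, ToyTermScorer and MatchDetailsDemo are made-up names, not Lucene classes, and the in-memory postings stand in for a real index. The point is only that iteration and scoring do no extra work unless matchDetails() is actually called for a doc.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical: one matched position range, loosely modeled on the
// (start, end) pairs that span queries expose today.
final class MatchSpan {
    final int start; // first matching position (inclusive)
    final int end;   // one past the last matching position
    MatchSpan(int start, int end) { this.start = start; this.end = end; }
}

// Hypothetical optional API on a scorer; not part of Lucene.
interface MatchDetailsScorer {
    int NO_MORE_DOCS = Integer.MAX_VALUE;
    int nextDoc();                  // advance to the next matching doc
    float score();                  // score of the current doc
    List<MatchSpan> matchDetails(); // computed only when the caller asks
}

// Toy term scorer over in-memory postings: docs[i] matches at positions[i].
final class ToyTermScorer implements MatchDetailsScorer {
    private final int[] docs;
    private final int[][] positions;
    private int idx = -1;

    ToyTermScorer(int[] docs, int[][] positions) {
        this.docs = docs;
        this.positions = positions;
    }

    public int nextDoc() {
        idx++;
        return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    // Toy score: the term frequency in the current doc.
    public float score() { return positions[idx].length; }

    // Positions are turned into spans only here, so the normal
    // search loop never pays for it.
    public List<MatchSpan> matchDetails() {
        List<MatchSpan> spans = new ArrayList<>();
        for (int p : positions[idx]) {
            spans.add(new MatchSpan(p, p + 1));
        }
        return spans;
    }
}

final class MatchDetailsDemo {
    // Ask for match details only for one doc on the current "page".
    static List<MatchSpan> spansFor(MatchDetailsScorer scorer, int wantedDoc) {
        int doc;
        while ((doc = scorer.nextDoc()) != MatchDetailsScorer.NO_MORE_DOCS) {
            if (doc == wantedDoc) {
                return scorer.matchDetails();
            }
        }
        return new ArrayList<>();
    }
}
```

With an API shaped like this, a single term scorer could serve both plain scoring and span-style position reporting, which is what would let a separate span query hierarchy go away. Whether the optional method costs anything in the hot loop is exactly the open question.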

> Match spotter for all query types
> ---------------------------------
>                 Key: LUCENE-1999
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>    Affects Versions: 2.9
>            Reporter: Mark Harwood
>         Attachments: matchflagger.patch
> Related to LUCENE-1929 and the current inability to highlight NumericRangeQuery, spatial, cached term filters and other exotica.
> This patch provides the ability to wrap *any* Query objects and record match info as flags encoded in the overall document score.
> Using this approach it would be possible to understand (and therefore highlight) which fields matched clauses in a query.
> The match encoding approach loses some precision in scores as noted here:
> Avoiding these precision issues would require a change to Lucene core to record docId, score AND a matchFlag byte in ScoreDoc objects and collector APIs.
> This may be something we should consider.
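
The precision trade-off mentioned in the issue description can be illustrated with a small sketch. This message does not show how matchflagger.patch actually encodes its flags, so the scheme below is purely an assumption: it overwrites the low 8 mantissa bits of a float score with a flag byte. ScoreFlagCodec and its methods are hypothetical names, and the sketch only makes sense for positive, finite, normal scores.

```java
// Hypothetical codec, not taken from the patch: stores one byte of
// match flags in the low mantissa bits of a float score.
final class ScoreFlagCodec {
    static final int FLAG_BITS = 8;
    static final int FLAG_MASK = (1 << FLAG_BITS) - 1;

    // Replace the low 8 mantissa bits of the score with the flags.
    static float encode(float score, int flags) {
        int bits = Float.floatToIntBits(score);
        return Float.intBitsToFloat((bits & ~FLAG_MASK) | (flags & FLAG_MASK));
    }

    // Read the flag byte back out of an encoded score.
    static int decodeFlags(float encoded) {
        return Float.floatToIntBits(encoded) & FLAG_MASK;
    }

    // Recover the score with the flag bits zeroed; the original low
    // bits are gone, which is the precision loss in question.
    static float decodeScore(float encoded) {
        return Float.intBitsToFloat(Float.floatToIntBits(encoded) & ~FLAG_MASK);
    }
}
```

Under this assumed layout only 15 of the 23 mantissa bits remain for the score itself, so two docs whose scores differ only in the low bits can no longer be told apart. Carrying a separate matchFlag byte alongside docId and score would avoid that.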

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.