lucene-dev mailing list archives

From "Mark Harwood (JIRA)" <>
Subject [jira] Commented: (LUCENE-1999) Match spotter for all query types
Date Wed, 21 Oct 2009 10:57:59 GMT


Mark Harwood commented on LUCENE-1999:

bq. couldn't the wrapper make a separate data structure for tracking which clause matched

I was trying to keep the processing cost super-low with no object allocations because this
is in a very tight loop. We don't really want to be generating a lot of state/processing while
we're still evaluating potentially millions of candidate matches.
That seems to be the challenge of doing this instrumentation in-line with the query execution.

bq. Also: doesn't highlighter run, separately, on each doc? And so it's OK if the scores are

The use case I'm tackling right now involves search forms with lots of optional fields (spatial,
numeric, "choice" etc) and I only needed a yes/no match flag for each field. This approach
should give me these answers back immediately without impacting query processing speeds significantly.
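
To illustrate the kind of allocation-free yes/no flag per field described above, here is a minimal sketch. It is not taken from matchflagger.patch: the class name MatchFlags, the field constants and the helper methods are all hypothetical, and it only shows the general idea of keeping one bit per field in a primitive.

```java
// Hypothetical sketch: one bit per search-form field, held in a primitive int
// so that recording a match inside a tight scoring loop allocates no objects.
final class MatchFlags {
    // Assumed field-to-bit assignment, for illustration only.
    static final int SPATIAL = 1 << 0;
    static final int NUMERIC = 1 << 1;
    static final int CHOICE  = 1 << 2;

    private MatchFlags() {}

    // Returns the flag set with the given field marked as matched.
    static int record(int flags, int fieldFlag) {
        return flags | fieldFlag;
    }

    // Answers the yes/no question for a single field.
    static boolean matched(int flags, int fieldFlag) {
        return (flags & fieldFlag) != 0;
    }

    public static void main(String[] args) {
        int flags = 0;
        flags = record(flags, SPATIAL);
        flags = record(flags, CHOICE);
        System.out.println("spatial matched: " + matched(flags, SPATIAL));
        System.out.println("numeric matched: " + matched(flags, NUMERIC));
        System.out.println("choice matched:  " + matched(flags, CHOICE));
    }
}
```

Because the flag set is a plain int, up to 32 fields can be tracked per document without creating any per-candidate state.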

However, I can see the value in core Lucene capturing a richer data structure than a simple
boolean where you choose to do a separate "highlight" pass on the top N documents. This would
suggest that you might need 2 query expressions - one for execution and one for adding highlighter
instrumentation. I suppose the client could add the instrumentation requests to the initial
query which are passive during a Lucene "results-selection" mode and become active in "highlight

> Match spotter for all query types
> ---------------------------------
>                 Key: LUCENE-1999
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>    Affects Versions: 2.9
>            Reporter: Mark Harwood
>         Attachments: matchflagger.patch
> Related to LUCENE-1929 and the current inability to highlight NumericRangeQuery, spatial,
> cached term filters and other exotica.
> This patch provides the ability to wrap *any* Query objects and record match info as
> flags encoded in the overall document score.
> Using this approach it would be possible to understand (and therefore highlight) which
> fields matched clauses in a query.
> The match encoding approach loses some precision in scores as noted here:
> Avoiding these precision issues would require a change to Lucene core to record docId,
> score AND a matchFlag byte in ScoreDoc objects and collector APIs.
> This may be something we should consider.

