lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1999) Match spotter for all query types
Date Wed, 21 Oct 2009 10:25:59 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12768163#action_12768163 ]

Michael McCandless commented on LUCENE-1999:
--------------------------------------------

Very clever!

Since you are wrapping arbitrary query objs, couldn't the wrapper make a separate data structure
for tracking which clause matched (instead of encoding it into the score)?
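
One way to read this suggestion, as a rough sketch and not something taken from the patch: the wrapper records clause hits in a side table keyed by doc ID and leaves the score alone. The names MatchRecorder, record and matchedClauses are hypothetical, and the sketch uses plain Java collections, not any Lucene API.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical side table: docId -> set of clause indexes that matched. */
public class MatchRecorder {
    private final Map<Integer, BitSet> matchesByDoc = new HashMap<>();

    /** Would be called by a (hypothetical) wrapping scorer when clause #clauseIndex matches docId. */
    public void record(int docId, int clauseIndex) {
        matchesByDoc.computeIfAbsent(docId, id -> new BitSet()).set(clauseIndex);
    }

    /** Returns a copy of the matched-clause bits; empty if nothing was recorded for docId. */
    public BitSet matchedClauses(int docId) {
        BitSet bits = matchesByDoc.get(docId);
        return bits == null ? new BitSet() : (BitSet) bits.clone();
    }
}
```

Because the flags live outside the score, ranking is unaffected and the number of trackable clauses is not limited by the bits available in a float.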

Also: doesn't the highlighter run, separately, on each doc?  And so it's OK if the scores are
affected?  I.e., I would run my main search with a normal query, get the 10 results for the
current page, then step through each of those 10 doc IDs, make a single-doc IndexSearcher
for each, and run this wrapper against it?

{quote}
Avoiding these precision issues would require a change to Lucene core to record docId, score
AND a matchFlag byte in ScoreDoc objects and collector APIs.
This may be something we should consider.
{quote}
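
To illustrate why packing flags into the score costs precision, here is a minimal sketch. The exact encoding used by matchflagger.patch is not shown in this message, so the scheme below (overwriting the 8 lowest mantissa bits of the float) is purely an assumption for illustration, and ScoreFlagPacking and its methods are hypothetical names.

```java
/** Hypothetical encoding: flags overwrite the 8 lowest mantissa bits of a finite float score. */
public class ScoreFlagPacking {
    // Assumption for illustration only: one byte of flags in the low mantissa bits.
    public static final int FLAG_MASK = 0xFF;

    /** Replaces the low 8 bits of the score's bit pattern with the flags. */
    public static float pack(float score, int flags) {
        int bits = Float.floatToIntBits(score);
        return Float.intBitsToFloat((bits & ~FLAG_MASK) | (flags & FLAG_MASK));
    }

    /** Reads the flag byte back out of a packed score. */
    public static int flagsOf(float packed) {
        return Float.floatToIntBits(packed) & FLAG_MASK;
    }

    /** Recovers the score with the flag bits zeroed; the original low bits are gone. */
    public static float scoreOf(float packed) {
        return Float.intBitsToFloat(Float.floatToIntBits(packed) & ~FLAG_MASK);
    }
}
```

Under this assumed scheme, two scores that differ only in their low 8 mantissa bits become indistinguishable after packing, leaving roughly 16 bits of significand for ranking. A separate matchFlag byte alongside docId and score would avoid that trade-off.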

+1  I would love to see the Scorer API extended to optionally provide details on matches:
not just which clause matched which docs/fields, but the positions within the field where
the match occurred.  I think we could do this by absorbing *SpanQuery into their normal Query
counterparts, making the getSpans API [somehow] optional so that if you don't invoke it you
don't pay a performance price.
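
A rough sketch of the "optional, pay only if you ask" idea, in plain Java: position data sits behind a loader that runs only on first request. LazyMatchDetails and its methods are hypothetical names and not part of Lucene; how a real Scorer would expose this is left open here.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical per-document match detail whose positions are computed on demand. */
public class LazyMatchDetails {
    private final int docId;
    private final Supplier<List<Integer>> positionLoader;
    private List<Integer> positions; // stays null until positions() is first called

    public LazyMatchDetails(int docId, Supplier<List<Integer>> positionLoader) {
        this.docId = docId;
        this.positionLoader = positionLoader;
    }

    public int docId() {
        return docId;
    }

    /** Runs the loader once, on first call, and caches the result. */
    public List<Integer> positions() {
        if (positions == null) {
            positions = positionLoader.get();
        }
        return positions;
    }
}
```

A caller that only needs doc IDs and scores never triggers the loader, so the cost of enumerating positions is paid only by consumers such as a highlighter.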

> Match spotter for all query types
> ---------------------------------
>
>                 Key: LUCENE-1999
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1999
>             Project: Lucene - Java
>          Issue Type: New Feature
>    Affects Versions: 2.9
>            Reporter: Mark Harwood
>         Attachments: matchflagger.patch
>
>
> Related to LUCENE-1929 and the current inability to highlight NumericRangeQuery, spatial, cached term filters and other exotica.
> This patch provides the ability to wrap *any* Query objects and record match info as flags encoded in the overall document score.
> Using this approach it would be possible to understand (and therefore highlight) which fields matched clauses in a query.
> The match encoding approach loses some precision in scores as noted here: http://tinyurl.com/ykt8nx7
> Avoiding these precision issues would require a change to Lucene core to record docId, score AND a matchFlag byte in ScoreDoc objects and collector APIs.
> This may be something we should consider.
