lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Created: (LUCENE-2684) it's not possible to access sub-query's freq information if BooleanScorer is used
Date Mon, 04 Oct 2010 09:42:32 GMT
it's not possible to access sub-query's freq information if BooleanScorer is used
---------------------------------------------------------------------------------

                 Key: LUCENE-2684
                 URL: https://issues.apache.org/jira/browse/LUCENE-2684
             Project: Lucene - Java
          Issue Type: Bug
          Components: Search
            Reporter: Michael McCandless
             Fix For: 3.1, 4.0


LUCENE-2590 added an advanced feature, allowing an app to gather all sub-scorers for any Query.

This is powerful because then, during collection, the app can get some details about how each
sub-query "participated" in the overall match for the given document.

However, I think this is completely broken if the BooleanQuery uses BooleanScorer, because
that scorer is not doc-at-once.  Instead, it batch-processes chunks of 2048 sequential docIDs
per scorer.  This is a big performance gain, but it means that the sub-scorers will all be
positioned at the end of the 2048-doc chunk by the time the docs that matched within that
chunk are collected.
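
To make the positioning problem concrete, here is a minimal, self-contained sketch.  It is a
toy model, not Lucene code: ToySubScorer, ChunkedScoringDemo and their methods are hypothetical
names invented for this illustration, and only the 2048 window size comes from the description
above.  It records where the sub-scorer is positioned at the moment each hit is collected.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a sub-scorer: a forward-only cursor over sorted docIDs. Not Lucene code. */
final class ToySubScorer {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    private final int[] docs;
    private int pos = -1;

    ToySubScorer(int... docs) { this.docs = docs; }

    /** Current position: -1 before the first nextDoc(), NO_MORE_DOCS once exhausted. */
    int docID() {
        if (pos < 0) return -1;
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    /** Advances the cursor by one doc and returns the new position. */
    int nextDoc() {
        if (pos < docs.length) pos++;
        return docID();
    }
}

final class ChunkedScoringDemo {
    static final int CHUNK = 2048; // window size quoted in the issue text

    /** Batch style: drain a whole window, then collect. Entries are {hit, sub.docID() at collect time}. */
    static List<int[]> collectChunked(ToySubScorer sub, int maxDoc) {
        List<int[]> seen = new ArrayList<>();
        int doc = sub.nextDoc();
        for (int start = 0; start < maxDoc; start += CHUNK) {
            int end = start + CHUNK;
            List<Integer> bucket = new ArrayList<>();
            // Phase 1: the sub-scorer is advanced past every doc in this window.
            while (doc < end) {
                bucket.add(doc);
                doc = sub.nextDoc();
            }
            // Phase 2: hits are collected only now, after the sub-scorer has moved on.
            for (int hit : bucket) {
                seen.add(new int[] {hit, sub.docID()});
            }
        }
        return seen;
    }

    /** Doc-at-once style: each hit is collected while the sub-scorer still sits on it. */
    static List<int[]> collectDocAtOnce(ToySubScorer sub) {
        List<int[]> seen = new ArrayList<>();
        for (int doc = sub.nextDoc(); doc != ToySubScorer.NO_MORE_DOCS; doc = sub.nextDoc()) {
            seen.add(new int[] {doc, sub.docID()});
        }
        return seen;
    }
}
```

With matching docs 5, 100 and 3000, the chunked variant collects docs 5 and 100 while the toy
sub-scorer already reports 3000, so any per-document detail read from it at that point (such
as a freq) would describe the wrong document.  The doc-at-once variant always sees the
sub-scorer on the doc being collected.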

I don't think we can easily fix this... likely the "fix" is to make it easier to force
BooleanQuery to use BooleanScorer2 (which is doc-at-once)?  It is actually possible to force
this today by having your collector return false from acceptsDocsOutOfOrder...
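
The workaround can be sketched as a wrapper that always asks for in-order delivery.  This is
again a toy model rather than Lucene's API: ToyCollector, InOrderOnly and chooseScorer are
hypothetical names, and the selection rule is a simplification that assumes the collector's
answer is the only input (the real decision may depend on other conditions that this sketch
leaves out).

```java
/** Hypothetical stand-in for the collector callback discussed above; not Lucene's class. */
interface ToyCollector {
    void collect(int doc);
    boolean acceptsDocsOutOfOrder();
}

/** Wraps any collector and always demands in-order (doc-at-once) delivery. */
final class InOrderOnly implements ToyCollector {
    private final ToyCollector delegate;

    InOrderOnly(ToyCollector delegate) { this.delegate = delegate; }

    @Override public void collect(int doc) { delegate.collect(doc); }

    // Returning false is the workaround described in the issue text.
    @Override public boolean acceptsDocsOutOfOrder() { return false; }
}

final class ScorerChoiceDemo {
    /** Assumed, simplified rule: the batch scorer is eligible only if out-of-order docs are accepted. */
    static String chooseScorer(ToyCollector collector) {
        return collector.acceptsDocsOutOfOrder() ? "BooleanScorer" : "BooleanScorer2";
    }
}
```

Wrapping an existing collector this way keeps its collection logic untouched and trades the
batch scorer's speed for sub-scorers that stay positioned on the doc being collected.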

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.