After reading through the IndexSearcher code, it seems I have to do the following:
- implement a custom Collector to collect not just the doc IDs and scores, but the fields I care about as well
- extend ScoreDoc to hold the extra fields
- when I get back a TopDocs from a search() call, walk through its hits in score order and apply the constraints I need
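To make the last step concrete, here is a rough sketch of the constraint pass I have in mind. It is plain Java and does not touch the Lucene API: Hit is a hypothetical holder standing in for the extended ScoreDoc, and select(), caps and window are names I made up for illustration. It assumes the hits are already sorted by descending score and that the color value has been collected for each one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ConstrainedTopHits {

    /** Hypothetical stand-in for an extended ScoreDoc: doc ID, score, and the extra field. */
    public static final class Hit {
        public final int doc;
        public final float score;
        public final String color;

        public Hit(int doc, float score, String color) {
            this.doc = doc;
            this.score = score;
            this.color = color;
        }
    }

    /**
     * Greedy pass over hits that are already sorted by descending score.
     * A hit is skipped if its color has a cap and that cap is already reached;
     * colors without an entry in caps are unconstrained.
     * Returns at most `window` hits, still in score order.
     */
    public static List<Hit> select(List<Hit> sortedHits, Map<String, Integer> caps, int window) {
        List<Hit> out = new ArrayList<>();
        Map<String, Integer> seen = new HashMap<>();
        for (Hit h : sortedHits) {
            if (out.size() >= window) {
                break; // the first page is full
            }
            int count = seen.getOrDefault(h.color, 0);
            Integer cap = caps.get(h.color);
            if (cap != null && count >= cap) {
                continue; // cap reached for this color, drop the hit from the window
            }
            seen.put(h.color, count + 1);
            out.add(h);
        }
        return out;
    }
}
```

For my example this would be called with caps of red=3 and blue=4 and a window of 16. The pass itself is linear in the number of hits collected, but since some hits get skipped, the search has to return more than 16 candidates for the window to fill up.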
I think this will work, but I have some concerns about performance. What do you think?
On Apr 06, 2012, at 10:06 AM, Tri Cao <email@example.com> wrote:
What would be the best approach for custom scoring that requires a "global" view of the result set? For example, I have a field called "color" and I would like to enforce constraints such as at most 3 docs with color:red and at most 4 docs with color:blue in the first 16 hits. The items should still be sorted by their relevance scores after the constraints are applied.