lucene-dev mailing list archives

From "Hoss Man (JIRA)" <>
Subject [jira] [Updated] (LUCENE-3395) FreqFilteringScorerWrapper and min/max freq options on TermQuery
Date Mon, 22 Aug 2011 18:07:29 GMT


Hoss Man updated LUCENE-3395:

    Attachment: LUCENE-3395.patch

patch containing FreqFilteringScorerWrapper and a test.  I haven't yet done the work on TermQuery
to add options for this -- wanted to see what people thought of it first and get some code
review ... been a while since i touched code this deep in the stack.

a few things to note:

* entire class is marked experimental since its whole existence depends on an experimental
method of the Scorer API (Scorer.freq).  that said: even if we rip out Scorer.freq, i think
we can still support this as a TermQuery feature since freq info will always be available
from TermScorer.
* test currently has some nocommits related to an NPE when trying to check the edge case
of wrapping a Scorer that matches nothing.  i think the problem relates to some code i cut/pasted
from TestTermScorer for getting a Scorer from a Query+Searcher to use in the test: that code
seems to optimize the Scorer to null when it matches nothing.  (even if i didn't have this
NPE, that getScorer method would be marked nocommit until someone verified it was in fact
a "valid" way for a test to get direct access to a Scorer)
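
For readers without the patch at hand, the wrapper idea can be sketched roughly as below. This is not the code in LUCENE-3395.patch and it does not use the real Lucene Scorer class: SketchScorer, ArrayScorer and FreqFilteringSketch are hypothetical stand-ins, and the inclusive min/max bounds are an assumption. The sketch also treats a null inner scorer as "matches nothing", which is one possible way to handle the edge case described above.

```java
/** Hypothetical stand-in for a scorer: iterates doc IDs and reports a frequency per doc. */
interface SketchScorer {
    int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Advances to the next matching doc; returns its ID or NO_MORE_DOCS. */
    int nextDoc();

    /** Frequency of the current doc; only valid after nextDoc() returned a real doc. */
    float freq();
}

/** Simple in-memory scorer over parallel arrays, used only to exercise the wrapper. */
final class ArrayScorer implements SketchScorer {
    private final int[] docs;
    private final float[] freqs;
    private int pos = -1;

    ArrayScorer(int[] docs, float[] freqs) {
        this.docs = docs;
        this.freqs = freqs;
    }

    @Override
    public int nextDoc() {
        pos++;
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    @Override
    public float freq() {
        return freqs[pos];
    }
}

/** Wraps a scorer and skips docs whose freq falls outside [minFreq, maxFreq] (assumed inclusive). */
final class FreqFilteringSketch implements SketchScorer {
    private final SketchScorer in; // may be null when the wrapped query matches nothing
    private final float minFreq;
    private final float maxFreq;

    FreqFilteringSketch(SketchScorer in, float minFreq, float maxFreq) {
        this.in = in;
        this.minFreq = minFreq;
        this.maxFreq = maxFreq;
    }

    @Override
    public int nextDoc() {
        if (in == null) {
            return NO_MORE_DOCS; // a null inner scorer is treated as an empty match set
        }
        int doc;
        while ((doc = in.nextDoc()) != NO_MORE_DOCS) {
            float f = in.freq();
            if (f >= minFreq && f <= maxFreq) {
                return doc;
            }
        }
        return NO_MORE_DOCS;
    }

    @Override
    public float freq() {
        if (in == null) {
            throw new IllegalStateException("no current doc");
        }
        return in.freq();
    }
}
```

With docs {1, 4, 7, 9} at freqs {1, 3, 5, 2} and a minimum of 3, the wrapper would surface only docs 4 and 7.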

> FreqFilteringScorerWrapper and min/max freq options on TermQuery
> ----------------------------------------------------------------
>                 Key: LUCENE-3395
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>            Reporter: Hoss Man
>         Attachments: LUCENE-3395.patch
> A Solr User was asking about how to specify a minimum tf when searching for a term (ie:
> documents matching "dog" at least 3 times).
> Based on a conversation with rmuir on IRC, that led me to realize that we now explicitly
> expose a general "freq()" method on Scorer, and that min/max freq constraints could be implemented
> as a general Scorer Wrapper.
> I propose that we add such a wrapper, and add setMinFreq(float)/setMaxFreq(float) methods
> to TermQuery (similar to the minNumShouldMatches and disableCoord type setters in BooleanQuery)
> that cause it to be used automatically.
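
A rough illustration of the proposed setters, under stated assumptions: FreqBoundsSketch is a hypothetical holder class, not TermQuery, and the defaults (0 and Float.MAX_VALUE), the inclusive bounds, and the needsWrapper() check are guesses at how "used automatically" might be decided. Only the setter names and their float parameters come from the issue text.

```java
/** Hypothetical holder for the min/max freq options proposed for TermQuery. */
final class FreqBoundsSketch {
    private float minFreq = 0f;              // assumed default: no lower bound
    private float maxFreq = Float.MAX_VALUE; // assumed default: no upper bound

    void setMinFreq(float minFreq) {
        this.minFreq = minFreq;
    }

    void setMaxFreq(float maxFreq) {
        this.maxFreq = maxFreq;
    }

    /** True when a bound was tightened, i.e. when a filtering wrapper would be worth installing. */
    boolean needsWrapper() {
        return minFreq > 0f || maxFreq < Float.MAX_VALUE;
    }

    /** True when a doc with this term frequency satisfies both bounds (assumed inclusive). */
    boolean accepts(float freq) {
        return freq >= minFreq && freq <= maxFreq;
    }
}
```

For the "dog at least 3 times" case, calling setMinFreq(3f) would make needsWrapper() return true and accepts(2f) return false, while a query with untouched defaults would skip the wrapper entirely.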

This message is automatically generated by JIRA.