lucene-dev mailing list archives

From Mikhail Khludnev <>
Subject Re: using solr to do a 'match'
Date Wed, 11 Apr 2012 08:32:01 GMT

This use case is similar to the problem of matching boolean expressions;
there is a recent thread about it. My idea is to introduce a disjunction
query with a dynamic mm (minShouldMatch) parameter, i.e. 'match these
clauses disjunctively, but for every document take the minShouldMatch value
from the field cache of field xxxCount'. Norms could also be used as a
source of dynamic mm values.
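
To make the dynamic mm idea concrete, here is a minimal sketch in plain
Python. It is not Lucene code and does not use any Lucene API; the function
names, the `term_count` value (standing in for the xxxCount field) and the
0.75 fraction are all assumptions chosen for illustration.

```python
import math

# Sketch only: plain Python, not the Lucene API. All names are hypothetical.

def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()

def build_index(titles):
    """Store each title's term set plus its term count.

    term_count plays the role of the per-document 'xxxCount' value that
    would be read from the field cache.
    """
    docs = []
    for title in titles:
        terms = set(tokenize(title))
        docs.append({"title": title, "terms": terms, "term_count": len(terms)})
    return docs

def dynamic_mm_match(docs, query, fraction=0.75):
    """Disjunctive match where every document supplies its own minShouldMatch.

    A document is a hit when the number of query terms it contains is at
    least ceil(fraction * term_count) for that document. The fraction is an
    assumed tuning knob, not something taken from the thread.
    """
    query_terms = set(tokenize(query))
    hits = []
    for doc in docs:
        matched = len(query_terms & doc["terms"])
        required = math.ceil(fraction * doc["term_count"])
        if matched > 0 and matched >= required:
            hits.append(doc["title"])
    return hits

docs = build_index(["dust in the wind", "dust"])
print(dynamic_mm_match(docs, "dust in wind"))
print(dynamic_mm_match(docs, "dust"))
```

With these assumptions the query "dust in wind" hits the title "dust in the
wind" (3 of its 4 terms are present), while the query "dust" does not,
because that document requires 3 matching terms. The one-term title "dust"
is hit by both queries, since its own threshold is 1.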


On Wed, Apr 11, 2012 at 10:08 AM, Li Li <> wrote:

> It's not possible now because Lucene doesn't support this.
> When running a disjunction query, it only records how many terms match
> each document.
> I think this is a common requirement for many users.
> I suggest Lucene should split the scorer into a matcher and a scorer.
> The matcher just returns which docs matched and why/how each doc
> matched.
> Especially for a disjunction query, it should tell which terms match and
> possibly other information such as tf/idf and the distance between terms
> (to support proximity search).
> That's the matcher's job. Then the scorer (a ranking algorithm) uses a
> flexible algorithm to score the document, and the collector can collect it.
> On Wed, Apr 11, 2012 at 10:28 AM, Chris Book <> wrote:
> > Hello, I have a solr index running that is working very well as a
> > search. But I want to add the ability (if possible) to use it to do
> > matching. The problem is that by default it is only looking for all the
> > input terms to be present, and it doesn't give me any indication as to
> > how many terms in the target field were not specified by the input.
> >
> > For example, if I'm trying to match to the song title "dust in the
> > wind", I'm correctly getting a match if the input query is "dust in
> > wind". But I don't want to get a match if the input is just "dust".
> > Although as a search "dust" should return this result, I'm looking for
> > some way to filter this out based on some indication that the input
> > isn't close enough to the output. Perhaps if I could get information
> > that the number of input terms is much less than the number of terms in
> > the field. Or something else along those lines?
> >
> > I realize that this isn't the typical use case for a search, but I'm
> > just looking for some suggestions as to how I could improve the above
> > example a bit.
> >
> > Thanks,
> > Chris
> >

Sincerely yours
Mikhail Khludnev

