lucene-solr-user mailing list archives

From Emir Arnautovic <emir.arnauto...@sematext.com>
Subject Re: Phrase field matches not counting towards minimum match
Date Fri, 24 Feb 2017 09:43:14 GMT
Hi,

mm applies only to the fields in qf; pf2/pf3 only boost documents that 
have already matched. What you can do is add further fields to qf and/or 
try to get close to your requirement with the mm.autoRelax parameter. 
Note that autoRelax can give unexpected results when one field uses stop 
words and another does not, or when they use different stop word lists. 
In your case, manufacturer_stop would get a single token, so any document 
matching that token would become acceptable.
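
To see why your three-term query ends up needing all three terms, here is a rough Python sketch of how an mm spec such as yours is evaluated. It follows the documented meaning of the mm syntax and is not Solr's actual code; the function name min_should_match is made up for this example, and mm.autoRelax is not modelled.

```python
def min_should_match(clause_count, spec):
    """Sketch: how many optional clauses must match for a given mm spec."""

    def simple(expr):
        # "90%" / "-25%" are percentages of the clause count (truncated);
        # "2" / "-1" are absolute counts. Negative means "all but this many".
        if expr.endswith("%"):
            raw = clause_count * int(expr[:-1]) / 100
        else:
            raw = int(expr)
        required = clause_count + int(raw) if raw < 0 else int(raw)
        return max(0, min(clause_count, required))

    result = clause_count  # with few clauses, all of them are required
    for part in spec.split():
        if "<" not in part:
            return simple(part)
        bound, expr = part.split("<", 1)
        if clause_count <= int(bound):
            return result
        result = simple(expr)
    return result


spec = "3<-1 5<-2 6<90%"
# q=livex lighting 8193 gives three clauses: livex, lighting, 8193
print(min_should_match(3, spec))  # 3
```

With three clauses every one of them has to match in some qf field. "lighting" is removed by the stop words in manufacturer_stop, and presumably does not occur in productid, so that clause fails, and the pf2 match on manufacturer2 does not make up for it.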

You can also run a stricter query first and follow it up with a query 
that uses more relaxed fields/parameters. You can run both as a single 
OR-ed query as well; just make sure the boost factors keep the scores of 
the first query much higher than those of the second.
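
As a minimal sketch of the single OR-ed query, the Python below builds request parameters that combine a strict and a relaxed edismax query through nested _query_ clauses. Treat it as an illustration only: the helper name, the parameter names uq, qf_strict, mm_strict, qf_relaxed and mm_relaxed, the relaxed settings and the boost of 100 are all assumptions to adapt, and the outer query has to be parsed by the standard lucene parser for _query_ to work.

```python
from urllib.parse import urlencode


def build_two_tier_params(user_query, strict_boost=100):
    """Build Solr params that OR a strict and a relaxed edismax query."""
    # The strict clause carries a large boost so its hits sort first.
    q = (
        f'_query_:"{{!edismax qf=$qf_strict mm=$mm_strict v=$uq}}"^{strict_boost}'
        ' OR _query_:"{!edismax qf=$qf_relaxed mm=$mm_relaxed v=$uq}"'
    )
    return {
        "defType": "lucene",  # outer parser that understands _query_
        "q": q,
        "uq": user_query,  # the raw user input, shared by both clauses
        "qf_strict": "productid manufacturer_stop",
        "mm_strict": "3<-1 5<-2 6<90%",
        # Assumed relaxed settings: one more field and a lower mm.
        "qf_relaxed": "productid manufacturer_stop manufacturer2",
        "mm_relaxed": "2",
    }


params = build_two_tier_params("livex lighting 8193")
print(urlencode(params))  # query string to append to the /select URL
```

The followup-query variant is the same idea split in two requests: send the strict parameters first and only send the relaxed ones when the first request returns nothing.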

HTH,
Emir


On 23.02.2017 22:14, dboychuck wrote:
> Ok, let me explain what I am trying to do first, since there may be a better
> approach. Recently I have been trying to increase Solr's matching precision
> by requiring that all of the words in a field match before allowing a match
> on that field. I am using edismax as my query parser, and since it tokenizes
> on white space there's no way to make sure that, if my query is q=foo bar and
> a text field named somefield is indexed with foo bar, neither foo nor bar
> matches on its own but the phrase "foo bar" does.
>
> I feel like I'm not explaining this very well, but basically what I want to
> do has already been done by Lucidworks:
> https://lucidworks.com/2014/07/02/automatic-phrase-tokenization-improving-lucene-search-precision-by-more-precise-linguistic-analysis/
>
> However their solution requires that you use a pluggable query parser which
> is not an extension of edismax. Now I haven't done a deep comparison but I'm
> assuming I would lose access to all of edismax's parameters if I used their
> pluggable query parser.
>
> So instead I tried to replicate this functionality using edismax's pf2 and
> pf3 parameters. It all works beautifully the way I have it set up, except that
> phrase field matches don't count towards my mm count.
>
> Ok, so now I will go into detail about how I have my index set up for this
> specific example.
>
> I am using solr's default text field to index a field named manufacturer2
>
> here are the relevant parameters of my search
>
> q=livex lighting 8193
> qf=productid, manufacturer_stop
> pf2=manufacturer2
> mm=3<-1 5<-2 6<90%
>
> Now, I am removing the word lighting from my manufacturer_stop field using
> stopwords, so only livex matches in the manufacturer_stop field.
>
> However, "livex lighting" matches in the manufacturer2 field through the
> phrase field matching of the pf2 parameter.
>
> so my matches are the following:
> MATCH livex in manufacturer_stop field
> MATCH 8193 in productid field
> MATCH "livex lighting" in manufacturer2 field as a phrase field match
>
> So I have three matches. However, the phrase field match doesn't seem to be
> counting towards my mm requirement that when 3 tokens are passed, 3 must
> match. If I change my mm to require that only 2 tokens match, I get the
> expected result. But I want my phrase field match to count towards my mm
> requirement, since lighting is matching in my phrase field.
>
> Any assistance would be appreciated.... Or if someone could suggest a better
> approach that would also be appreciated.
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Phrase-field-matches-not-counting-towards-minimum-match-tp4322066.html
> Sent from the Solr - User mailing list archive at Nabble.com.

-- 
Monitoring * Alerting * Anomaly Detection * Centralized Log Management
Solr & Elasticsearch Support * http://sematext.com/

