lucene-solr-user mailing list archives

From Alessandro Benedetti <abenede...@apache.org>
Subject Re: SOLR ranking
Date Tue, 16 Feb 2016 12:50:04 GMT
You can think of the pf field as an exact phrase query: "..."~0 .
But you can specify the slop with:

The ps Parameter

Default amount of slop on phrase queries built with pf, pf2 and/or pf3 fields
(affects boosting).

Just take a look at the edismax page in the wiki; it seems well described there:

https://cwiki.apache.org/confluence/display/solr/The+Extended+DisMax+Query+Parser
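
As a rough illustration of where ps fits, here is a minimal sketch that only builds the request URL. The host and core name ("tgl") and the field names are taken from the original post; using topic_title and content in pf, the boosts, and ps=2 are assumptions for the example, not recommended values.

```python
from urllib.parse import urlencode

# Assumed values: host, core ("tgl") and field names come from the original
# post; the pf fields, the boosts and ps=2 are placeholders to tune.
params = {
    "defType": "edismax",
    "q": "eating disorders",
    "mm": "100%",                        # every query term is mandatory
    "qf": "topic_title^100 content^3",   # fields searched term by term
    "pf": "topic_title^200 content^6",   # extra boost when the terms match as a phrase
    "ps": "2",                           # slop allowed on the pf phrase queries
}

# urlencode percent-escapes "%" and "^" and turns spaces into "+".
url = "http://localhost:8983/solr/tgl/select?" + urlencode(params)
print(url)
```

Adding debugQuery=true, as in the original request, should show the phrase queries built from pf with the ~2 slop in the parsed query.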

But if this is what you want :

Query : A B

Results :

1) A B

2) A C B

3) A C C B

...

4) A C C C C C C C C C B


It's not going to be simple.
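
To make the slop idea concrete for the documents above, here is a toy model in Python. It is not Lucene's phrase scorer: it only handles a two-term phrase whose terms appear in query order, where (as far as I understand) the slop needed equals the number of tokens sitting between the two terms. The function names are made up for the example.

```python
def min_in_order_gap(tokens, first, second):
    """Smallest number of tokens between `first` and a later `second`, or None."""
    best = None
    last_first = None  # position of the most recent `first` seen so far
    for pos, tok in enumerate(tokens):
        if tok == second and last_first is not None:
            gap = pos - last_first - 1
            if best is None or gap < best:
                best = gap
        if tok == first:
            last_first = pos
    return best


def phrase_matches(tokens, first, second, slop):
    """True if the in-order pair occurs with at most `slop` tokens in between."""
    gap = min_in_order_gap(tokens, first, second)
    return gap is not None and gap <= slop


docs = ["A B", "A C B", "A C C B", "A C C C C C C C C C B"]
for doc in docs:
    tokens = doc.split()
    gap = min_in_order_gap(tokens, "A", "B")
    print(doc, "-> gap", gap, "| within slop 2:", phrase_matches(tokens, "A", "B", slop=2))
```

Under this model, with ps=2 the first three documents would qualify for the pf boost and the last one would not, although all four still match the main query because they contain both terms.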

On 16 February 2016 at 12:33, Binoy Dalal <binoydalal93@gmail.com> wrote:

> By my understanding, it will depend on whether you're explicitly running
> the phrase query or whether you're also searching for the terms
> individually.
> In the first case, it will not match.
> In the second case, it will match just as long as your field contains all
> the terms.
>
> On Tue, 16 Feb 2016, 17:52 Modassar Ather <modather1981@gmail.com> wrote:
>
> > In that case, with a pf field and mm=100%, will a phrase with a given slop
> > match a document in which the terms of the phrase are separated by more
> > than the given slop? As I understand it, as a phrase it will certainly not
> > match.
> >
> > Best,
> > Modassar
> >
> >
> > On Tue, Feb 16, 2016 at 5:26 PM, Alessandro Benedetti <
> > abenedetti@apache.org
> > > wrote:
> >
> > > > If I remember correctly, it is treated as a phrase query (as when you
> > > > use the "quotes").
> > > > So close proximity means a match of the phrase with 0 tolerance (the
> > > > terms must respect the position distance in the query).
> > > > If I remember correctly, I debugged that recently.
> > >
> > > Cheers
> > >
> > > On 16 February 2016 at 11:42, Modassar Ather <modather1981@gmail.com>
> > > wrote:
> > >
> > > > Actually you can get it with the edismax.
> > > > > Just set mm to 100% and then configure a pf field (or more).
> > > > > All the search terms become mandatory, and phrase matches are boosted.
> > > >
> > > > @Alessandro Thanks for your insight.
> > > > > I thought that setting pf boosts the document if all of the terms
> > > > > appear in close proximity. I am not sure how close "close proximity"
> > > > > is meant to be. I checked the dismax query parser wiki too.
> > > >
> > > > Best,
> > > > Modassar
> > > >
> > > > On Tue, Feb 16, 2016 at 3:36 PM, Alessandro Benedetti <
> > > > abenedetti@apache.org
> > > > > wrote:
> > > >
> > > > > > Binoy, omitTermFreqAndPositions is set only for text_ws, which is
> > > > > > used only on the "index_term" and "drug" fields.
> > > > > > The text_general fields seem fine to me.
> > > > > >
> > > > > > Are you omitting norms on purpose? To be fair, norms could be relevant
> > > > > > in title or short topic searches, to boost short field values that
> > > > > > contain a lot of terms from the searched query.
> > > > >
> > > > > > To respond to Modassar:
> > > > > >
> > > > > > > I don't think the phrase will be searched as individual ANDed terms
> > > > > > > unless the query has it like below.
> > > > > > > "Eating Disorders" OR (Eating AND Disorders).
> > > > > > >
> > > > >
> > > > > Actually you can get it with the edismax.
> > > > > > Just set mm to 100% and then configure a pf field (or more).
> > > > > > All the search terms become mandatory, and phrase matches are boosted.
> > > > >
> > > > > Cheers
> > > > >
> > > > > On 16 February 2016 at 07:57, Emir Arnautovic <
> > > > > emir.arnautovic@sematext.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Nitin,
> > > > > > > You can use the pf parameter to boost results that match the exact
> > > > > > > phrase. You can also use pf2 and pf3 to boost results that match
> > > > > > > bigrams or trigrams (phrase matches with 2 or 3 words, in case the
> > > > > > > input has more than 3 words).
> > > > > >
> > > > > > Regards,
> > > > > > Emir
> > > > > >
> > > > > >
> > > > > > On 16.02.2016 06:18, Nitin.K wrote:
> > > > > >
> > > > > >> I am using edismax parser with the following query:
> > > > > >>
> > > > > >>
> > > > > >>
> > > > >
> > > >
> > >
> >
> localhost:8983/solr/tgl/select?q=eating%20disorders&wt=xml&tie=1.0&rows=200&q.op=AND&indent=true&defType=edismax&stopwords=true&lowercaseOperators=true&debugQuery=true&qf=topic_title%5E100+subtopic_title%5E40+index_term%5E20+drug%5E15+content%5E3&pf2=topTitle%5E200+subTopTitle%5E80+indTerm%5E40+drugString%5E30+content%5E6
> > > > > >>
> > > > > >> Configuration of schema.xml
> > > > > >>
> > > > > > >> <field name="topic_title" type="text_general" indexed="true" stored="true"/>
> > > > > > >> <field name="topTitle" type="string" indexed="true" stored="false"/>
> > > > > > >>
> > > > > > >> <field name="subtopic_title" type="text_general" indexed="true" stored="true"/>
> > > > > > >> <field name="subTopTitle" type="string" indexed="true" stored="false"/>
> > > > > > >>
> > > > > > >> <field name="index_term" type="text_ws" indexed="true" stored="true" multiValued="true"/>
> > > > > > >> <field name="indTerm" type="string" indexed="true" stored="false" multiValued="true"/>
> > > > > > >>
> > > > > > >> <field name="drug" type="text_ws" indexed="true" stored="true" multiValued="true"/>
> > > > > > >> <field name="drugString" type="string" indexed="true" stored="false" multiValued="true"/>
> > > > > > >>
> > > > > > >> <field name="content" type="text_general" indexed="true" stored="true"/>
> > > > > > >>
> > > > > > >> <copyField source="topic_title" dest="topTitle"/>
> > > > > > >> <copyField source="subtopic_title" dest="subTopTitle"/>
> > > > > > >> <copyField source="index_term" dest="indTerm"/>
> > > > > > >> <copyField source="drug" dest="drugString"/>
> > > > > > >>
> > > > > > >> <fieldType name="text_general" class="solr.TextField"
> > > > > > >>            positionIncrementGap="100" omitNorms="true">
> > > > > > >>   <analyzer type="index">
> > > > > > >>     <tokenizer class="solr.StandardTokenizerFactory"/>
> > > > > > >>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
> > > > > > >>     <filter class="solr.LowerCaseFilterFactory"/>
> > > > > > >>   </analyzer>
> > > > > > >>   <analyzer type="query">
> > > > > > >>     <tokenizer class="solr.StandardTokenizerFactory"/>
> > > > > > >>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
> > > > > > >>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> > > > > > >>             ignoreCase="true" expand="true"/>
> > > > > > >>     <filter class="solr.LowerCaseFilterFactory"/>
> > > > > > >>   </analyzer>
> > > > > > >> </fieldType>
> > > > > > >> <fieldType name="text_ws" class="solr.TextField"
> > > > > > >>            positionIncrementGap="100"
> > > > > > >>            omitTermFreqAndPositions="true" omitNorms="true">
> > > > > > >>   <analyzer>
> > > > > > >>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> > > > > > >>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
> > > > > > >>     <filter class="solr.LowerCaseFilterFactory"/>
> > > > > > >>   </analyzer>
> > > > > > >> </fieldType>
> > > > > >>
> > > > > >>
> > > > > > >> I want a phrase, when the user searches for one, to always take
> > > > > > >> priority over the individual words;
> > > > > >>
> > > > > >> Example: "Eating Disorders"
> > > > > >>
> > > > > > >> First it should search for "Eating Disorders" together and then for
> > > > > > >> the individual words "Eating" and "Disorders". Even when searching
> > > > > > >> for the individual words, it should only return documents in which
> > > > > > >> both words exist, which is why I am already using q.op="AND" in my
> > > > > > >> query.
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Nitin
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> --
> > > > > > >> View this message in context:
> > > > > > >> http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257510.html
> > > > > >> Sent from the Solr - User mailing list archive at Nabble.com.
> > > > > >>
> > > > > >
> > > > > > --
> > > > > > Monitoring * Alerting * Anomaly Detection * Centralized Log
> > > Management
> > > > > > Solr & Elasticsearch Support * http://sematext.com/
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > --------------------------
> > > > >
> > > > > Benedetti Alessandro
> > > > > Visiting card : http://about.me/alessandro_benedetti
> > > > >
> > > > > "Tyger, tyger burning bright
> > > > > In the forests of the night,
> > > > > What immortal hand or eye
> > > > > Could frame thy fearful symmetry?"
> > > > >
> > > > > William Blake - Songs of Experience -1794 England
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > --------------------------
> > >
> > > Benedetti Alessandro
> > > Visiting card : http://about.me/alessandro_benedetti
> > >
> > > "Tyger, tyger burning bright
> > > In the forests of the night,
> > > What immortal hand or eye
> > > Could frame thy fearful symmetry?"
> > >
> > > William Blake - Songs of Experience -1794 England
> > >
> >
> --
> Regards,
> Binoy Dalal
>



