lucene-solr-user mailing list archives

From Alessandro Benedetti <abenede...@apache.org>
Subject Re: SOLR ranking
Date Fri, 19 Feb 2016 10:25:49 GMT
Ok Binoy, now it is clearer :)
Yes, if we add sorting and faceting as additional optional requirements, doing
two queries could be a perilous path!

Cheers

On 19 February 2016 at 09:24, Ere Maijala <ere.maijala@helsinki.fi> wrote:

> If he needs faceting or something (I didn't see that specified), doing two
> queries won't do, of course..
>
> --Ere
>
>
> On 19.2.2016 at 2.22, Binoy Dalal wrote:
>
>> Hi Alessandro,
>> Don't get me wrong. Using mm, ps and pf can and absolutely will solve his
>> problem.
>>
>> Like I said above, my solution is meant to be a quick and dirty fix. It's
>> really not that complex and shouldn't take more than an hour to set up at
>> the app level. Moreover, I suggested it because he said it was urgent for
>> him and setting up a proper config with mm, pf and ps might take him much
>> longer.
>>
>> Hope this clears things up :)
>>
>> On Fri, 19 Feb 2016, 05:31 Alessandro Benedetti <abenedetti@apache.org> wrote:
>>
>>> Hey Binoy,
>>> I can't understand why such complexity is needed, to be honest :/
>>> Can you explain to me why playing with:
>>>
>>> edismax
>>> mm (the percentage of query terms a result must match)
>>> pf (the fields you want to be boosted if the phrase matches)
>>> ps (the slop to allow)
>>>
>>> should not solve the problem instead of the two-phase query?
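
To make the suggestion above concrete, here is a minimal sketch of an edismax request that sets mm, pf and ps together. The qf and pf boosts are the ones quoted further down in this thread (with the schema spelling subTopTitle); the mm value of 100%, the ps value of 2, and the helper name build_edismax_params are assumptions for illustration only, not values anyone in the thread confirmed.

```python
from urllib.parse import urlencode


def build_edismax_params(user_query, mm="100%", ps=2):
    """Assemble edismax request parameters (hypothetical helper).

    mm and ps defaults are illustrative assumptions, not tuned values.
    """
    return {
        "q": user_query,
        "defType": "edismax",
        # Term matching fields and boosts, as quoted later in the thread.
        "qf": "topic_title^100 subtopic_title^40 index_term^20 drug^15 content^3",
        # Phrase boost fields; names follow the schema spelling (subTopTitle).
        "pf": "topTitle^200 subTopTitle^80 indTerm^40 drugString^30 tglData^6",
        # Minimum share of query terms a document must match.
        "mm": mm,
        # Slop allowed between terms when applying the pf phrase boost.
        "ps": str(ps),
    }


params = build_edismax_params("rheumatoid arthritis")
query_string = urlencode(params)
# Host, port and core name are the ones used in the thread.
print("http://localhost:8983/solr/tgl/select?" + query_string)
```

With a single request like this, sorting and faceting keep working as usual, which is the main difference from the two-query approach.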
>>>
>>> Cheers
>>>
>>> On 18 February 2016 at 18:09, Binoy Dalal <binoydalal93@gmail.com> wrote:
>>>
>>>> Here's an alternative solution that may be of some help.
>>>> Here I'm assuming that you are not directly outputting the search results
>>>> to the user and have some sort of layer between the results from Solr and
>>>> presentation to the user where some additional processing can be performed.
>>>>
>>>> 1) You already know that you want phrase matches to show up higher than
>>>> single matches. In this case, why not do an explicit phrase match first,
>>>> with some slop or as is, based on how close you want the phrase terms to be
>>>> to each other.
>>>> 2) Once you have the results from the first query, fire an OR query with
>>>> your terms and get those results.
>>>> 3) Put the results from (2) after (1) and present them to the user. This
>>>> happens in the app layer.
>>>>
>>>> This is essentially the same as running a query such as: "Rheumatoid
>>>> Arthritis"~slop OR (Rheumatoid AND Arthritis) but you don't need to worry
>>>> about the ordering because you're sorting your results.
>>>>
>>>> Now, this will obviously take more time since you're querying twice and
>>>> then doing the additional processing in the app layer, but provided your
>>>> architecture is balanced enough and can cope with a little extra load, I do
>>>> not think that your performance will take that bad a hit. Moreover, since
>>>> you're in a hurry, you could implement this as a quick and dirty solution
>>>> to meet the project goals, provided it fits the acceptance parameters, and
>>>> then later play around with the scoring/sorting and figure out the best
>>>> possible setup to suit your needs.
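
The three steps above could be sketched in the app layer roughly as follows. This only shows how the two query strings might be built and how the two result lists might be merged; it does not call Solr. The function names, the assumption that each document carries a unique "id" field, and the sample hits are all made up for illustration.

```python
def build_queries(terms, slop=0):
    """Return (phrase_query, any_term_query) for the two requests.

    Hypothetical helper: step 1 is an explicit phrase query with optional
    slop, step 2 is an OR query over the individual terms.
    """
    phrase = '"' + " ".join(terms) + '"'
    if slop > 0:
        phrase += "~" + str(slop)
    any_term = " OR ".join(terms)
    return phrase, any_term


def merge_phrase_first(phrase_docs, or_docs, id_field="id"):
    """Step 3: phrase matches first, then the remaining OR matches.

    Documents already returned by the phrase query are skipped when they
    show up again in the OR results. Assumes id_field is a unique key.
    """
    seen = set()
    merged = []
    for doc in list(phrase_docs) + list(or_docs):
        doc_id = doc[id_field]
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc)
    return merged


# Made-up sample hits standing in for the two Solr responses.
phrase_hits = [{"id": "d2"}, {"id": "d5"}]
or_hits = [{"id": "d1"}, {"id": "d2"}, {"id": "d5"}, {"id": "d9"}]

merged = merge_phrase_first(phrase_hits, or_hits)
print(build_queries(["rheumatoid", "arthritis"], slop=2))
print([doc["id"] for doc in merged])
```

Note that paging, sorting and facet counts would also have to be reconciled across the two responses, which this sketch does not attempt.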
>>>>
>>>> On Thu, Feb 18, 2016 at 4:22 PM Emir Arnautovic <emir.arnautovic@sematext.com> wrote:
>>>>
>>>>> Hi Nitin,
>>>>> Can you send us what your parsed query looks like (from the debug output)?
>>>>>
>>>>> Thanks,
>>>>> Emir
>>>>>
>>>>> On 17.02.2016 08:38, Nitin.K wrote:
>>>>>
>>>>>> Hi Binoy,
>>>>>>
>>>>>> We are searching for both phrases and individual words,
>>>>>> but we want the documents that contain the phrase to come
>>>>>> first in the order, and then the documents with only individual word matches.
>>>>>>
>>>>>> termPositions = true is also not working in my case.
>>>>>>
>>>>>> I have also removed the string type from the copy fields. Kindly look into the
>>>>>> changed configuration below:
>>>>>>
>>>>>> Hi Emir,
>>>>>>
>>>>>> I have changed the configuration as per your suggestion and added pf2 / pf3.
>>>>>> Yes, I saw the difference, but the ranking is still not being followed
>>>>>> correctly in the case of phrases.
>>>>>>
>>>>>> Changed configuration:
>>>>>>
>>>>>> <field name="topic_title" type="text_general" indexed="true" stored="true"/>
>>>>>> <field name="topTitle" type="text_phrase" indexed="true" stored="false"/>
>>>>>>
>>>>>> <field name="subtopic_title" type="text_general" indexed="true" stored="true"/>
>>>>>> <field name="subTopTitle" type="text_phrase" indexed="true" stored="false"/>
>>>>>>
>>>>>> <field name="index_term" type="text_ws" indexed="true" stored="true" multiValued="true"/>
>>>>>> <field name="indTerm" type="text_phrase" indexed="true" stored="false" multiValued="true"/>
>>>>>>
>>>>>> <field name="drug" type="text_ws" indexed="true" stored="true" multiValued="true"/>
>>>>>> <field name="drugString" type="text_phrase" indexed="true" stored="false" multiValued="true"/>
>>>>>>
>>>>>> <field name="tglData" type="text_phrase" indexed="true" stored="false"/>
>>>>>
>>>>>> Copy fields again for the reference :
>>>>>>
>>>>>> <copyField source="topic_title" dest="topTitle"/>
>>>>>> <copyField source="subtopic_title" dest="subTopTitle"/>
>>>>>> <copyField source="index_term" dest="indTerm"/>
>>>>>> <copyField source="drug" dest="drugString"/>
>>>>>> <copyField source="content" dest="tglData"/>
>>>>>>
>>>>>> Added following field type:
>>>>>>
>>>>>> <fieldType name="text_phrase" class="solr.TextField"
>>>>>>            positionIncrementGap="100" omitNorms="true">
>>>>>>        <analyzer>
>>>>>>                <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>>>>                <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>>>>>>                <filter class="solr.LowerCaseFilterFactory"/>
>>>>>>        </analyzer>
>>>>>> </fieldType>
>>>>>>
>>>>>> Removed the string type from the copy fields.
>>>>>>
>>>>>> Changed query:
>>>>>>
>>>>>> http://localhost:8983/solr/tgl/select?q=rheumatoid%20arthritis&wt=xml&tie=1.0&rows=200&q.op=AND&indent=true&defType=edismax&stopwords=true&lowercaseOperators=true&debugQuery=true&
>>>>>> pf=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
>>>>>> pf2=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
>>>>>> pf3=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
>>>>>> qf=topic_title^100 subtopic_title^40 index_term^20 drug^15 content^3
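
One thing that may be worth checking in the request above: Solr field names are case-sensitive, and the pf / pf2 / pf3 lists name subtopTitle while the schema quoted earlier declares subTopTitle. Whether this affects the ranking would need to be confirmed from the debug output. The sketch below is a small, hypothetical sanity check that compares the field names in a boost list against the fields declared in the quoted schema excerpt; it only covers pf, since content (used in qf) is not part of that excerpt.

```python
# Field names copied from the schema excerpt quoted in this thread.
SCHEMA_FIELDS = {
    "topic_title", "topTitle",
    "subtopic_title", "subTopTitle",
    "index_term", "indTerm",
    "drug", "drugString",
    "tglData",
}


def boosted_field_names(param_value):
    """Split a 'field^boost field^boost' list into bare field names."""
    return [token.split("^", 1)[0] for token in param_value.split()]


def unknown_fields(param_value, schema_fields):
    """Return the names in the boost list that the schema does not declare.

    The comparison is exact, so a difference in letter case counts.
    """
    return [
        name
        for name in boosted_field_names(param_value)
        if name not in schema_fields
    ]


# The pf value exactly as it appears in the request above.
pf = "topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6"
print(unknown_fields(pf, SCHEMA_FIELDS))  # prints ['subtopTitle']
```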
>>>>>>
>>>>>> After making these changes, I am able to get my search results correctly for
>>>>>> a single term, but in the case of a phrase search, I am still not able to get the
>>>>>> results in the correct order.
>>>>>>
>>>>>> Hi Modassar,
>>>>>>
>>>>>> I tried using mm=100, but the order is still the same.
>>>>>>
>>>>>> Hi Alessandro,
>>>>>>
>>>>>> I have not yet tried the slop parameter. By default it is taken as 1.0,
>>>>>> from what I saw in debug mode. I will definitely get back to you. So, let me try
>>>>>> this option too.
>>>>>>
>>>>>> All,
>>>>>>
>>>>>> Please let me know if anyone has any other suggestion on this. I have to
>>>>>> implement it on an urgent basis and I think I am very close to it. Thanks, all
>>>>>> of you. I have reached this level just because of you guys.
>>>>>>
>>>>>> Thanks and Regards,
>>>>>> Nitin
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257782.html
>>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>>>
>>>>>
>>>>> --
>>>>> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>>>>> Solr & Elasticsearch Support * http://sematext.com/
>>>>>
>>>>> --
>>>>>
>>>> Regards,
>>>> Binoy Dalal
>>>>
>>>>
>>>
>>>
>>> --
>>> --------------------------
>>>
>>> Benedetti Alessandro
>>> Visiting card : http://about.me/alessandro_benedetti
>>>
>>> "Tyger, tyger burning bright
>>> In the forests of the night,
>>> What immortal hand or eye
>>> Could frame thy fearful symmetry?"
>>>
>>> William Blake - Songs of Experience -1794 England
>>>
>>>
> --
> Ere Maijala
> Kansalliskirjasto / The National Library of Finland
>



