Date: Fri, 19 Feb 2016 10:25:49 +0000
Subject: Re: SOLR ranking
From: Alessandro Benedetti
To: "solr-user@lucene.apache.org"

Ok Binoy, now it is clearer :)
Yes, if we add sorting and faceting as additional optional requirements,
doing 2 queries could be a perilous path!

Cheers

On 19 February 2016 at 09:24, Ere Maijala wrote:

> If he needs faceting or something (I didn't see that specified), doing two
> queries won't do, of course.
>
> --Ere
>
> 19.2.2016, 2.22, Binoy Dalal wrote:
>
>> Hi Alessandro,
>> Don't get me wrong. Using mm, ps and pf can and absolutely will solve his
>> problem.
>>
>> Like I said above, my solution is meant to be a quick and dirty fix. It's
>> really not that complex and shouldn't take more than an hour to set up at
>> the app level. Moreover, I suggested it because he said it was urgent for
>> him and setting up a proper config with mm, pf and ps might take him much
>> longer.
>>
>> Hope this clears things up :)
>>
>> On Fri, 19 Feb 2016, 05:31 Alessandro Benedetti wrote:
>>
>>> Hey Binoy,
>>> I can't understand why such complexity, to be honest :/
>>> Can you explain to me why playing with:
>>>
>>> edismax
>>> mm (percentage of query terms you want to be in the results)
>>> pf (the fields you want to be boosted if the phrase matches)
>>> ps (slop to allow)
>>>
>>> should not solve the problem instead of the two-phase query?
>>>
>>> Cheers
>>>
>>> On 18 February 2016 at 18:09, Binoy Dalal wrote:
>>>
>>>> Here's an alternative solution that may be of some help.
>>>> Here I'm assuming that you are not directly outputting the search results
>>>> to the user and have some sort of layer between the results from Solr and
>>>> presentation to the user where some additional processing can be performed.
>>>>
>>>> 1) You already know that you want phrase matches to show up higher than
>>>> single matches.
>>>> In this case, why not do an explicit phrase match first,
>>>> with some slop or as is, based on how close you want the phrase terms to be
>>>> to each other.
>>>> 2) Once you have the results from the first query, fire an OR query with
>>>> your terms and get those results.
>>>> 3) Put results from (2) after (1) and present to the user. This happens in
>>>> the app layer.
>>>>
>>>> This is essentially the same as running a query such as: "Rheumatoid
>>>> Arthritis"~slop OR (Rheumatoid AND Arthritis), but you don't need to worry
>>>> about the ordering because you're sorting your results.
>>>>
>>>> Now, this will obviously take more time since you're querying twice and
>>>> then doing the additional processing in the app layer, but provided your
>>>> architecture is balanced enough and can cope with a little extra load, I do
>>>> not think that your performance will take that bad a hit. Moreover, since
>>>> you're in a hurry, you could implement this as a quick and dirty solution
>>>> to meet the project goals, provided it fits the acceptance parameters, and
>>>> then later play around with the scoring/sorting and figure out the best
>>>> possible setup to suit your needs.
>>>>
>>>> On Thu, Feb 18, 2016 at 4:22 PM Emir Arnautovic <
>>>> emir.arnautovic@sematext.com> wrote:
>>>>
>>>>> Hi Nitin,
>>>>> Can you send us what your parsed query looks like (from the debug output)?
>>>>>
>>>>> Thanks,
>>>>> Emir
>>>>>
>>>>> On 17.02.2016 08:38, Nitin.K wrote:
>>>>>
>>>>>> Hi Binoy,
>>>>>>
>>>>>> We are searching for both phrases and individual words,
>>>>>> but we want only those documents which contain the phrases to come
>>>>>> first in the order and then the individual app.
>>>>>>
>>>>>> termPositions = true is also not working in my case.
>>>>>>
>>>>>> I have also removed the string type from copy fields.
>>>>>> Kindly look into the changed configuration below:
>>>>>>
>>>>>> Hi Emir,
>>>>>>
>>>>>> I have changed the configuration as per your suggestion and added pf2 /
>>>>>> pf3. Yes, I saw the difference, but the ranking is still not
>>>>>> followed correctly in the case of phrases.
>>>>>>
>>>>>> Changed configuration:
>>>>>>
>>>>>> stored="true" />
>>>>>> stored="false" />
>>>>>> stored="true"/>
>>>>>> stored="false"/>
>>>>>> multiValued="true"/>
>>>>>> stored="false" multiValued="true"/>
>>>>>> multiValued="true"/>
>>>>>> stored="false" multiValued="true"/>
>>>>>> stored="false"/>
>>>>>>
>>>>>> Copy fields again for reference:
>>>>>>
>>>>>> Added the following field type:
>>>>>>
>>>>>> positionIncrementGap="100" omitNorms="true">
>>>>>> ignoreCase="true" words="stopwords.txt" />
>>>>>>
>>>>>> Removed the string type from the copy fields.
>>>>>>
>>>>>> Changed query:
>>>>>>
>>>>>> http://localhost:8983/solr/tgl/select?q=rheumatoid%20arthritis&wt=xml&tie=1.0&rows=200&q.op=AND&indent=true&defType=edismax&stopwords=true&lowercaseOperators=true&debugQuery=true&
>>>>>> pf=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
>>>>>> pf2=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
>>>>>> pf3=topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6&
>>>>>> qf=topic_title^100 subtopic_title^40 index_term^20 drug^15 content^3
>>>>>>
>>>>>> After making these changes, I am able to get my search results correctly
>>>>>> for a single term, but in the case of a phrase search I am still not able
>>>>>> to get the results in the correct order.
>>>>>>
>>>>>> Hi Modassar,
>>>>>>
>>>>>> I tried using mm=100, but the order is still the same.
>>>>>>
>>>>>> Hi Alessandro,
>>>>>>
>>>>>> I have not yet tried the slop parameter. By default it is taken as 1.0
>>>>>> when I looked at it in debug mode. I will definitely get back to you. So,
>>>>>> let me try this option too.
>>>>>>
>>>>>> All,
>>>>>>
>>>>>> Please let me know if anyone has any other suggestion on this. I have to
>>>>>> implement it on an urgent basis and I think I am very close to it. Thanks,
>>>>>> all of you. I have reached this level just because of you guys.
>>>>>>
>>>>>> Thanks and Regards,
>>>>>> Nitin
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://lucene.472066.n3.nabble.com/SOLR-ranking-tp4257367p4257782.html
>>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>>
>>>>> --
>>>>> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
>>>>> Solr & Elasticsearch Support * http://sematext.com/
>>>>
>>>> --
>>>> Regards,
>>>> Binoy Dalal
>>>
>>> --
>>> Benedetti Alessandro
>>> Visiting card : http://about.me/alessandro_benedetti
>>>
>>> "Tyger, tyger burning bright
>>> In the forests of the night,
>>> What immortal hand or eye
>>> Could frame thy fearful symmetry?"
>>>
>>> William Blake - Songs of Experience -1794 England
>
> --
> Ere Maijala
> Kansalliskirjasto / The National Library of Finland

--
Benedetti Alessandro
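
For reference, here is a rough Python sketch of the two options discussed in this thread: a single edismax request using pf, ps and mm, and the two-query approach merged in the app layer. It is only an illustration under stated assumptions. The function names are hypothetical, the qf/pf field names and boosts are copied from Nitin's query, the ps and mm defaults are placeholders to tune, and the "id" key is assumed to be the collection's unique key.

```python
from typing import Dict, Iterable, List, Tuple


def edismax_params(user_query: str, phrase_slop: int = 2, mm: str = "100%") -> Dict[str, str]:
    """Parameters for one edismax request that boosts phrase matches.

    qf/pf values come from the query quoted in the thread; phrase_slop and
    mm are assumed defaults to experiment with, not recommended values.
    """
    return {
        "defType": "edismax",
        "q": user_query,
        "qf": "topic_title^100 subtopic_title^40 index_term^20 drug^15 content^3",
        "pf": "topTitle^200 subtopTitle^80 indTerm^40 drugString^30 tglData^6",
        "ps": str(phrase_slop),
        "mm": mm,
    }


def two_phase_queries(terms: List[str], slop: int = 0) -> Tuple[str, str]:
    """Build the explicit phrase query (step 1) and the OR query (step 2)."""
    phrase = '"{}"'.format(" ".join(terms))
    if slop > 0:
        phrase = "{}~{}".format(phrase, slop)
    return phrase, " OR ".join(terms)


def merge_phrase_first(
    phrase_docs: Iterable[dict],
    or_docs: Iterable[dict],
    id_field: str = "id",  # assumed unique key field
) -> List[dict]:
    """Step 3: phrase matches first, then remaining OR matches, no duplicates."""
    seen = set()
    merged = []
    for doc in list(phrase_docs) + list(or_docs):
        doc_id = doc[id_field]
        if doc_id not in seen:
            seen.add(doc_id)
            merged.append(doc)
    return merged
```

The merge keeps each result list in the order Solr returned it, so any sorting or faceting across the combined list would still have to be handled separately, which is the limitation raised above.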