Subject: Re: Trouble boosting a field
To: solr-user@lucene.apache.org
From: Tom Chiverton
Date: Mon, 16 Jan 2017 09:47:43 +0000

Ooh, that's handy! But does it need Solr/Elasticsearch to be publicly accessible?

On 14/01/17 09:23, Alan Woodward wrote:
> http://splainer.io/ from the gents at OpenSourceConnections is pretty good for this sort of thing, I find…
>
> Alan Woodward
> www.flax.co.uk
>
>> On 13 Jan 2017, at 16:35, Tom Chiverton wrote:
>>
>> Well, I've tried much larger values than 8, and it still doesn't seem to do the job?
>>
>> For now, assume my users are searching for exact substrings of a real title.
>>
>> Tom
>>
>> On 13/01/17 16:22, Walter Underwood wrote:
>>> I use a boost of 8 for title with no boost on the content. Both Infoseek and Inktomi settled on the 8X boost, getting there with completely different methodologies.
>>>
>>> You might not want the title to completely trump the content. That causes some odd anomalies. If someone searches for “ice age 2”, do you really want every title with “2” to come before “ice age two”? Or a search for “steve jobs” to return every article with “job” or “jobs” in the title first?
>>>
>>> Also, use “edismax”, not “dismax”. Dismax was obsolete in Solr 3.x, five years ago.
>>>
>>> wunder
>>> Walter Underwood
>>> wunder@wunderwood.org
>>> http://observer.wunderwood.org/ (my blog)
>>>
>>>> On Jan 13, 2017, at 7:10 AM, Tom Chiverton wrote:
>>>>
>>>> I have a few hundred documents with title and content fields.
>>>>
>>>> I want a match in title to trump matches in content. Essentially, if I search for "connected vehicle", a news article that has that phrase in the content shouldn't rank higher than the page that has it in the title.
>>>>
>>>> I have tried dismax with qf=title^2 as well as several other variants with the standard query parser (like q=title:"foo"^2 OR content:"foo"), but documents without the search term in the title still come out before those with the term in the title when ordered by score.
>>>>
>>>> Is there something I am missing?
>>>>
>>>> From the docs, something like q=title:"connected vehicle"^2 OR content:"connected vehicle" should have worked? Even using ^100 didn't help.
>>>>
>>>> I tried with the dismax parser using
>>>>
>>>> "q": "Connected Vehicle",
>>>> "defType": "dismax",
>>>> "indent": "true",
>>>> "qf": "title^2000 content",
>>>> "pf": "pf=title^4000 content^2",
>>>> "sort": "score desc",
>>>> "wt": "json",
>>>>
>>>> but that was not better. If I remove content from pf/qf then documents seem to rank correctly.
>>>> Example query and results (content omitted): http://pastebin.com/5EhrRJP8 with managed-schema http://pastebin.com/mdraWQWE
>>>>
>>>> --
>>>> Tom Chiverton
>>>> Lead Developer
>>>>
>>>> e: tc@extravision.com
>>>> p: 0161 817 2922
>>>> t: @extravision
>>>> w: www.extravision.com
>>>>
>>>> Registered in the UK at: 107 Timber Wharf, 33 Worsley Street, Manchester, M15 4LD.
>>>> Company Reg No: 05017214 VAT: GB 824 5386 19
>>>>
>>>> This e-mail is intended solely for the person to whom it is addressed and may contain confidential or privileged information.
>>>> Any views or opinions presented in this e-mail are solely of the author and do not necessarily represent those of Extravision Ltd.
>>>>
>>> ______________________________________________________________________
>>> This email has been scanned by the Symantec Email Security.cloud service.
>>> For more information please visit http://www.symanteccloud.com
>>> ______________________________________________________________________
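
For anyone following the thread, here is a minimal sketch of the kind of request being discussed, switched to edismax as Walter suggests and using his boost of 8 on title. It only builds the URL and sends nothing. The function names, the base URL (http://localhost:8983/solr/mycollection), the `id` field and the pf boost value are assumptions made for illustration, not taken from the thread. Note also that the pf value in the quoted request starts with a literal "pf=", which may be a typo; the sketch leaves that prefix out. debugQuery=true asks Solr to include score explanations in the response, which is one way to see why a content match outranks a title match.

```python
from urllib.parse import urlencode


def build_edismax_params(user_query, title_boost=8):
    """Build request parameters for an edismax query that favours title matches.

    title_boost defaults to 8, the value Walter mentions; the phrase boost
    below (twice that) is only an illustrative choice.
    """
    return {
        "q": user_query,
        "defType": "edismax",                      # edismax instead of dismax
        "qf": f"title^{title_boost} content",      # per-field boosts for term matches
        "pf": f"title^{title_boost * 2} content",  # phrase boosts; no "pf=" inside the value
        "fl": "id,title,score",                    # "id" is an assumed field name
        "debugQuery": "true",                      # include score explanations in the response
        "wt": "json",
    }


def build_select_url(base_url, params):
    """Return the /select URL for the given parameters (nothing is sent)."""
    return f"{base_url}/select?{urlencode(params)}"


params = build_edismax_params("connected vehicle")
# Hypothetical host and collection name; substitute your own.
url = build_select_url("http://localhost:8983/solr/mycollection", params)
print(url)
```

Comparing the explain output for a title hit and a content hit from a request like this should show which part of the score is outweighing the title boost.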