From: peter360
To: solr-user@lucene.apache.org
Date: Wed, 21 May 2008 11:35:28 -0700 (PDT)
Subject: dismax handler and WordDelimiterFilterFactory

Hi,

Let's say I have an index with two fields, f1 and f2, and queries to both are analyzed using
WhitespaceTokenizerFactory and WordDelimiterFilterFactory. I use the dismax handler for queries and have observed the following anomaly.

Suppose I have a document with f1="american" and f2="idol". Then the search

    q=american+idol&qt=dismax&qf=f1+f2

matches. However, the search

    q=american-idol&qt=dismax&qf=f1+f2

does not, even though the analyzer (WordDelimiterFilterFactory) turns "american-idol" into "american idol".

On closer inspection, the dismax handler parses the first query as something like

    +(f1:american f2:american) +(f1:idol f2:idol)

while it parses the second as something like

    f1:"american idol" f2:"american idol"

Since neither field contains both words, neither phrase query can match. I feel this is an anomaly because, from the end user's point of view, american-idol should be treated the same as american idol. How do I achieve this? One possible solution is to index f1 and f2 as a single field, but I want to be able to give them separate boosts, such as "qf=f1^2+f2".

Any ideas? Do people feel this is a bug in the dismax handler?
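
The two parses described above can be reproduced with a toy model. This is only a sketch under stated assumptions, not Solr's actual code: it assumes the handler splits the query string on whitespace first and only then runs each chunk through the field analyzer, turning a chunk that yields several tokens into a phrase query. The names `word_delimiter`, `field_clause` and `toy_dismax` are hypothetical, and `word_delimiter` is a rough stand-in for WordDelimiterFilterFactory that simply splits on non-alphanumeric characters.

```python
import re


def word_delimiter(chunk):
    """Rough stand-in for WordDelimiterFilterFactory (assumption):
    split a chunk on any run of non-alphanumeric characters."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", chunk) if t]


def field_clause(field, tokens):
    """One token becomes a term query; several tokens that came from
    a single chunk become a phrase query on that field."""
    if len(tokens) == 1:
        return f"{field}:{tokens[0]}"
    return f'{field}:"{" ".join(tokens)}"'


def toy_dismax(q, fields):
    """Split on whitespace first, then analyze each chunk per field."""
    clauses = []
    for chunk in q.split():
        tokens = word_delimiter(chunk)
        if not tokens:
            continue
        clauses.append(" ".join(field_clause(f, tokens) for f in fields))
    if len(clauses) == 1:
        return clauses[0]
    # Several whitespace-separated chunks: each one is a required clause.
    return " ".join(f"+({c})" for c in clauses)


print(toy_dismax("american idol", ["f1", "f2"]))
print(toy_dismax("american-idol", ["f1", "f2"]))
```

In this model the hyphenated query never reaches the whitespace split as two words, so it ends up as one phrase per field. Replacing the hyphen with a space on the client side before sending the query yields the first parse in the toy model; whether that is acceptable in a real setup is a separate question.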