Date: Wed, 08 Nov 2006 10:38:33 -0800
From: Walter Underwood
Reply-To: solr-dev@lucene.apache.org
Subject: Re: Adding Phonetic Search to Solr

On 11/8/06 10:30 AM, "Chris Hostetter" wrote:

> : Also, the phonetic matches are ranked a bit high, so I'm trying a
> : sub-1.0 boost. I was expecting the lower idf to fix that automatically.
> : The metaphone will almost always have a lower idf because multiple
> : words are mapped to one metaphone, so the encoded term occurs in more
> : documents than the surface terms.
>
> That all makes sense, and yet it's not what you are observing ... which
> leads me to believe you (and I, since I want to agree with you) are missing
> something subtle .... what does the Explanation look like for two
> documents where you feel like one should score higher than the other but
> they don't?

That is my next step. Maybe create some test documents in my corpus and
spend some quality time with Explain and grokking DisMax. I need to
customize Similarity anyway.

wunder
-- 
Walter Underwood
Search Guru, Netflix
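
The idf argument in the quoted text (several surface words collapse to one phonetic code, so the code appears in more documents and gets a lower idf) can be checked with a toy calculation. This is only a sketch under stated assumptions: the four-document corpus and the `TOY_CODE` mapping are made up for illustration and stand in for a real metaphone encoder, the helper names are hypothetical, and the formula is the classic Lucene-style idf, 1 + ln(numDocs / (docFreq + 1)). It does not reproduce Solr's actual scoring, which also involves tf, norms, and boosts.

```python
import math

# Toy corpus: each document is a list of surface terms (made-up data).
DOCS = [
    ["smith", "john"],
    ["smyth", "jon"],
    ["smith", "mary"],
    ["jones", "john"],
]

# Hypothetical stand-in for a metaphone encoder: several surface
# spellings map to one code. These codes are illustrative only.
TOY_CODE = {
    "smith": "SM0", "smyth": "SM0",
    "john": "JN", "jon": "JN",
    "mary": "MR", "jones": "JNS",
}


def idf(doc_freq, num_docs):
    """Classic Lucene-style idf: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def doc_freq(term, docs):
    """Number of documents containing the term at least once."""
    return sum(1 for doc in docs if term in doc)


# Index the same corpus a second time with every term phonetically encoded.
encoded_docs = [[TOY_CODE[t] for t in doc] for doc in DOCS]

surface_df = doc_freq("smith", DOCS)
phonetic_df = doc_freq(TOY_CODE["smith"], encoded_docs)

surface_idf = idf(surface_df, len(DOCS))
phonetic_idf = idf(phonetic_df, len(DOCS))

print("surface  df=%d idf=%.4f" % (surface_df, surface_idf))
print("phonetic df=%d idf=%.4f" % (phonetic_df, phonetic_idf))
```

In this toy corpus the encoded term matches one more document than the surface term, so its idf is lower, as the quoted reasoning predicts. If phonetic matches still rank too high in practice, the difference is presumably coming from another factor in the score, which is what the Explain output should show.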