Date: Wed, 27 Sep 2006 11:41:12 -0700 (PDT)
From: Chris Hostetter
To: java-user@lucene.apache.org
Subject: Re: Very high fieldNorm for a field resulting in bad results

: 1. Can I do away with index-time boosting for fields & tweak
: query-time boosting for them ? I understand that doc level boosting is
: very useful while indexing.
: But for fields, both index-boost & query-boost are multiples which lead
: to the score, so would it be safe to say that I can replace the
: index-time boost with query-time boosting. This allows me a lot of
: freedom to test different values without re-indexing which takes me
: about 6 hours.

it depends on your goal. index time field boosts are a way to express
things like "this document's title is worth twice as much as the title of
most documents". query time boosts are a way to express "i care about
matches on this clause of my query twice as much as i do about matches on
other clauses of my query".

: 2. When searching through the archive I had read a post by you, saying
: it's possible to give exact matches much higher weightage by indexing
: the START & END
: from
: http://www.nabble.com/What-are-norms--tf1919250.html#a5335856

the context was that even if you turn off field norms you can still get
some of the score benefits/restrictions of matches on shorter fields vs
longer fields by indexing marker tokens (things you wouldn't expect to be
regular tokens; i used START and END just for convenience) at the
beginning and end of the field, and then including them in your phrase or
span near query with lots of slop ... so values like...

   Duke Ellington
   The Duke Ellington Band

...get indexed as the tokens...

   {START} {duke} {ellington} {END}
   {START} {the} {duke} {ellington} {band} {END}

...when doing a sloppy phrase or span near search for
[ START, duke, ellington, END ] both of those values will match, but the
first will have a higher score, since its match spans fewer extra tokens.
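To make the marker-token idea concrete, here is a minimal sketch in plain Java that does not use Lucene at all. Everything in it is an assumption made for illustration: the class name MarkerTokenSketch, the whitespace/lowercase tokenizer, and the extraTokens and sloppyWeight helpers are hypothetical. The 1/(distance+1) weight only mimics the general shape of how a sloppy match is discounted; it is not the full score Lucene computes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class MarkerTokenSketch {
    // Marker tokens are upper case so they cannot collide with the
    // lowercased regular tokens produced below.
    static final String START = "START";
    static final String END = "END";

    // Toy analyzer (assumption): lowercase, split on whitespace,
    // then wrap the field value in the START/END markers.
    static List<String> tokenize(String value) {
        List<String> tokens = new ArrayList<>();
        tokens.add(START);
        tokens.addAll(Arrays.asList(value.trim().toLowerCase(Locale.ROOT).split("\\s+")));
        tokens.add(END);
        return tokens;
    }

    // Greedy, leftmost, in-order match of the query terms against the field.
    // Returns how many non-query tokens fall inside the matched span,
    // or -1 if the terms do not all occur in order. With START and END in
    // the query, the span is pinned to the whole field, so the result is
    // simply the number of extra tokens in the field.
    static int extraTokens(List<String> field, List<String> query) {
        if (query.isEmpty()) {
            return -1;
        }
        int first = -1;
        int pos = -1;
        for (String term : query) {
            int next = -1;
            for (int i = pos + 1; i < field.size(); i++) {
                if (field.get(i).equals(term)) {
                    next = i;
                    break;
                }
            }
            if (next < 0) {
                return -1;
            }
            if (first < 0) {
                first = next;
            }
            pos = next;
        }
        return (pos - first + 1) - query.size();
    }

    // Illustrative discount only (assumption): sloppier matches weigh less.
    static double sloppyWeight(int distance) {
        return 1.0 / (distance + 1);
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList(START, "duke", "ellington", END);
        for (String value : new String[] {"Duke Ellington", "The Duke Ellington Band"}) {
            int extra = extraTokens(tokenize(value), query);
            System.out.printf("%s: extra tokens = %d, weight = %.3f%n",
                    value, extra, sloppyWeight(extra));
        }
    }
}
```

In this sketch "Duke Ellington" matches with 0 extra tokens and "The Duke Ellington Band" with 2, so the shorter value gets the larger weight even though no field norms are involved. In a real index the same effect would come from the slop handling of the phrase or span near query, with the slop set large enough to cover the longest field you want to match.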

-Hoss