Return-Path: Delivered-To: apmail-jakarta-lucene-user-archive@www.apache.org Received: (qmail 96283 invoked from network); 25 Nov 2003 19:08:41 -0000 Received: from daedalus.apache.org (HELO mail.apache.org) (208.185.179.12) by minotaur-2.apache.org with SMTP; 25 Nov 2003 19:08:41 -0000 Received: (qmail 53155 invoked by uid 500); 25 Nov 2003 18:50:49 -0000 Delivered-To: apmail-jakarta-lucene-user-archive@jakarta.apache.org Received: (qmail 53131 invoked by uid 500); 25 Nov 2003 18:50:49 -0000 Mailing-List: contact lucene-user-help@jakarta.apache.org; run by ezmlm Precedence: bulk List-Unsubscribe: List-Subscribe: List-Help: List-Post: List-Id: "Lucene Users List" Reply-To: "Lucene Users List" Delivered-To: mailing list lucene-user@jakarta.apache.org Received: (qmail 53100 invoked from network); 25 Nov 2003 18:50:48 -0000 Received: from unknown (HELO ghoul.scms.waikato.ac.nz) (130.217.241.35) by daedalus.apache.org with SMTP; 25 Nov 2003 18:50:48 -0000 Received: from theta.cs.waikato.ac.nz ([130.217.244.196] helo=cs.waikato.ac.nz) by ghoul.scms.waikato.ac.nz with esmtp (Exim 4.23) id 1AOiGy-00042F-FS for lucene-user@jakarta.apache.org; Wed, 26 Nov 2003 07:50:52 +1300 Message-ID: <3FC3A48C.1060807@cs.waikato.ac.nz> Date: Wed, 26 Nov 2003 07:50:52 +1300 From: Gerret Apelt User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031023 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Lucene Users List Subject: Re: Score References: <1B76D3F17F4BE44283AD3C1C92AB590B01D1E689@EMSS04M14.us.lmco.com> In-Reply-To: <1B76D3F17F4BE44283AD3C1C92AB590B01D1E689@EMSS04M14.us.lmco.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N X-Spam-Rating: minotaur-2.apache.org 1.6.2 0/1000/N Pleasant, Tracy wrote: >I tried using Boost but that did absolutely nothing. > >The documents I am using: >Plain text >PDF Documents >(I have two indexes) > > I'm not sure what's causing your scores to be off -- unless, of course, your scores just look wrong to you but they're in fact just what you should be getting :) One bug in my code was that for an unrelated reason, terms in one field would never be matched. But since other fields contained the same term, the document was still being reported as a hit -- with a lower-than-expected score. Maybe you want to double check that the content of each field is getting tokenized properly.. when you have a term t in the title field that is unique to a particular document (i.e. not contained in any of the other fields of that document) do you still get a hit on the document when searching for t?Boost factors don't help of course if there's no hit in the first place. >When you say you use different analyzers for different fields in your >index, how would you accomplish that? When I create the index it has a >parameter for analyzer.. unless you create different indexes , how do >you use two different ones? > > Use PerFieldAnalyzerWrapper: http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/analysis/PerFieldAnalyzerWrapper.html cheers Gerret > > >-----Original Message----- >From: Gerret Apelt [mailto:ga11@cs.waikato.ac.nz] >Sent: Monday, November 24, 2003 3:25 PM >To: Lucene Users List >Subject: Re: Score > > >Tracey -- > >it would help if you could give more detail on the types of documents, >fields and analyzers you're using. Also what do you mean by "Multi Field > >Search"? I presume you're using the MultiFieldQueryParser to have query >terms in a user-submitted query be searched for in each field in your >index. > >If I am understanding your problem, then it might be the same one I had >a few weeks ago -- highly relevant matches would not receive a high >ranking. (This paragraph will apply to you only if you use more than >just one Analyzer for the set of your fields). I had six fields in my >index, most of which were populated with a standard analyzer. I used >self-made Analyzers for two of the fields. This turned out to be my >problem when using MultiFieldQueryParser: I told my >MultiFieldQueryParser instance to use only the standard analyzer. >Instead I discovered that I needed to make use of >org.apache.lucene.analysis.PerFieldAnalyzerWrapper and feed that to the >MultiFieldQueryParser. Unless you do this, your problem is whats >described here: >http://lucene.sourceforge.net/cgi-bin/faq/faqmanager.cgi?file=chapter.in >dexing&toc=faq#q15. > >Most likely, if your scoring is off, you're "doing something wrong" in >the way you use the Lucene API -- at least, thats what I've discovered >to be the case when my ranking is off. > >If you're interested in the nitty-gritty of how scoring is done, check >this FAQ entry: >http://lucene.sourceforge.net/cgi-bin/faq/faqmanager.cgi?file=chapter.se >arch&toc=faq#q31 > >cheers, >Gerret > >Pleasant, Tracy wrote: > > > >>Hi, >> >>I'm using the Multi Field Search to search all the fields of my >>documents during the search. >> >>When it returns results the scores are numerically low - .06, .17, etc. >>I would think if I searched for "Dog" and there was a doc with "Dog" in >>the title and several times in the contents of a document that it would >>receive a score more like 1.0 or close to it. >> >>Is there a way that I can tweak the score? >> >>I tried using Boost but that did absolutely nothing. >> >>--------------------------------------------------------------------- >>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org >>For additional commands, e-mail: lucene-user-help@jakarta.apache.org >> >> >> >> >> >> > > > >--------------------------------------------------------------------- >To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org >For additional commands, e-mail: lucene-user-help@jakarta.apache.org > > >--------------------------------------------------------------------- >To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org >For additional commands, e-mail: lucene-user-help@jakarta.apache.org > > > > --------------------------------------------------------------------- To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org For additional commands, e-mail: lucene-user-help@jakarta.apache.org