From: "Chuck Williams"
To: "Lucene Developers List"
Subject: RE: Normalized Scoring -- was RE: idf and explain(), was Re: Search and Scoring
Date: Thu, 21 Oct 2004 15:11:24 -0700

The idf's are indeed computed locally, but I believe it is a simple bug
in MultiSearcher. The attached version of the test adds explain()'s to verify the problem is the idf's (and changes the Field construction to something that works in my 1.4.2 sources).

MultiSearcher.search() calls the separate searchers for each index. That makes the IndexSearcher the current searcher when Similarity.idf() is reached. Thus IndexSearcher.docFreq() is used instead of MultiSearcher.docFreq(), yielding the index-local idf's.

The best fix is not obvious to me, but it is just a code-structure issue.

Chuck

> -----Original Message-----
> From: Daniel Naber [mailto:daniel.naber@t-online.de]
> Sent: Thursday, October 21, 2004 2:35 PM
> To: Lucene Developers List
> Subject: Re: Normalized Scoring -- was RE: idf and explain(), was Re: Search and Scoring
>
> On Thursday 21 October 2004 23:03, Doug Cutting wrote:
>
> > Idf's are already computed globally across all indexes. Tf's are local
> > to the document. In short, scores from a MultiSearcher are the same as
> > when searching an IndexReader with the same documents.
>
> That doesn't seem to be the case in the attached test -- am I using
> MultiSearcher in the wrong way or what might be the problem?
> The output of the attached test is:
>
> 1+2 searched with Multisearcher:
> two blah three score=0.70273256
> one blah three score=0.35615897
> one foo three score=0.35615897
> one foobar three score=0.35615897
>
> 1+2 indexed together:
> one blah three score=0.5911608
> one foo three score=0.5911608
> one foobar three score=0.5911608
> two blah three score=0.5911608
>
> --
> http://www.danielnaber.de
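
To make the index-local versus global idf difference discussed in this message concrete, here is a minimal sketch in plain Java with no Lucene dependency. It rests on assumptions that are not stated in the thread: that idf follows the DefaultSimilarity formula ln(numDocs / (docFreq + 1)) + 1, that the remaining score factors reduce to a constant 0.5 (a stand-in for the field norm), and that each index holds 3 documents with 3 and 1 matches respectively. The class and method names are hypothetical, and the document counts were chosen only because they appear to be consistent with the reported scores.

```java
// Sketch only: illustrates index-local vs. global idf, not the actual Lucene code path.
final class IdfSketch {

    // Assumed idf formula (DefaultSimilarity of that era): ln(numDocs / (docFreq + 1)) + 1
    static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Hypothetical stand-in for all other score factors (assumed field norm of 0.5).
    static final float ASSUMED_NORM = 0.5f;

    // Score of a single-term match given the statistics the searcher sees.
    static float score(int docFreq, int numDocs) {
        return idf(docFreq, numDocs) * ASSUMED_NORM;
    }

    public static void main(String[] args) {
        // Assumed counts: index 1 has 3 docs, all matching; index 2 has 3 docs, 1 matching.
        System.out.println("index 1, local stats: " + score(3, 3));
        System.out.println("index 2, local stats: " + score(1, 3));
        // Global statistics: docFreq and numDocs summed over both indexes.
        System.out.println("both, global stats:   " + score(3 + 1, 3 + 3));
    }
}
```

Under these assumed counts the three values come out near 0.356, 0.703 and 0.591, which lines up with the MultiSearcher and single-index scores in the quoted test output. That would be consistent with each sub-searcher using its own docFreq() rather than the summed one.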