From: HAIDUC SONIA
To: java-user@lucene.apache.org
Date: Mon, 19 Nov 2007 10:25:04 -0800 (PST)
Subject: Re: Scoring for all the documents in the index relative to a query
Message-ID: <870644.85955.qm@web60413.mail.yahoo.com>

I am trying to order all the documents in the index according to their
similarity to a given query. I am interested in having a complete list of
*all* the documents in the index with their scores.

From what I understood by reading some documentation, Lucene internally
assigns scores to all the documents in the index according to their
similarity to the query, but when returning the hits, all the scores that
are less than 0 are rounded to 0 and only the documents with a score > 0
are returned as hits. What I would like to get is the list before this
intermediate processing, that is, the list of all the documents with their
raw scores.

I am trying to compare Lucene with LSI, and for the comparison I want to
do, I need the entire list of documents. Is there a way that I can get that
with Lucene?

I hope I explained it clearly this time. If you need more details, let me
know.

Thank you,
Sonia

----- Original Message ----
From: Erick Erickson
To: java-user@lucene.apache.org
Sent: Monday, November 19, 2007 11:55:00 AM
Subject: Re: Scoring for all the documents in the index relative to a query

Could you explain a bit more what problem you're trying to solve?

The reason I ask is that your question doesn't make sense to me, since I
have no idea what you expect by the term "negative score". My simplistic
view has been that all the docs returned via Hits or HitCollector have
scores > 0, and all the rest have scores of 0, and this view is supported
by the explanation of HitCollector.collect:

"Called once for every non-zero scoring document, with the document number
and its score."

You might also get value from this page:
http://lucene.apache.org/java/docs/scoring.html#Scoring

Best
Erick

On Nov 19, 2007 11:05 AM, HAIDUC SONIA wrote:

> Hi everyone,
>
> I am trying to obtain the score for each document in the index relative to
> a given query. For example, if I have the query "search file", I am trying
> to get the list of all documents in the index and their scores relative to
> the given query. I tried first using Hits, which gave me the normalized
> score. I thought that I don't see the whole list of documents and their
> scores because of the normalization, so I tried using HitCollector. But
> even after using HitCollector, I get the same number of matching documents,
> so the normalization didn't exclude documents because of negative scoring.
> Does Lucene actually compute the score for all the documents in the index or
> just for matching documents? I really need to have the scores for all the
> documents in the index relative to the query (even if negative), not just
> the ones that contain the query terms (this is what Lucene considers
> "matching documents", right?). Is this possible using Lucene?
>
> I really appreciate your time and effort!
> Thanks,
> Sonia
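
To make the point about zero scores concrete, here is a minimal sketch in plain Java. It is not Lucene code: `ScoreCallback`, `search`, and `scoresForAllDocs` are hypothetical names, and a toy "count the matching query terms" scorer stands in for Lucene's real similarity. It only illustrates the pattern discussed in this thread: the callback fires for non-zero scoring documents only, so a complete per-document list can be built by starting from an array of zeros and overwriting the entries that are collected.

```java
import java.util.Arrays;
import java.util.List;

class AllDocScores {

    /** Hypothetical stand-in for a collector callback; NOT the Lucene class. */
    interface ScoreCallback {
        void collect(int doc, float score);
    }

    /**
     * Toy scorer (an assumption for illustration): the score is the number of
     * query terms that occur in the document. The callback is invoked only
     * for documents whose score is non-zero.
     */
    static void search(List<String> docs, String query, ScoreCallback callback) {
        String[] terms = query.toLowerCase().split("\\s+");
        for (int doc = 0; doc < docs.size(); doc++) {
            List<String> words = Arrays.asList(docs.get(doc).toLowerCase().split("\\s+"));
            float score = 0f;
            for (String term : terms) {
                if (words.contains(term)) {
                    score += 1f;
                }
            }
            if (score > 0f) {
                callback.collect(doc, score);
            }
        }
    }

    /** Returns one score per document; documents never collected keep 0. */
    static float[] scoresForAllDocs(List<String> docs, String query) {
        final float[] scores = new float[docs.size()]; // Java initializes to 0.0f
        search(docs, query, (doc, score) -> scores[doc] = score);
        return scores;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "search file by name", "open a file", "unrelated text");
        float[] scores = scoresForAllDocs(docs, "search file");
        for (int doc = 0; doc < scores.length; doc++) {
            System.out.println(doc + "\t" + scores[doc]);
        }
    }
}
```

With Lucene 2.x itself, the analogous approach would presumably be to size the array with IndexReader.maxDoc() and fill it from HitCollector.collect. That mapping is an assumption and has not been run against a real index.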