From: Ken McCracken
To: Lucene Users List
Date: Fri, 12 Nov 2004 11:48:04 -0800
Subject: lucene Scorers

Hi,

I am looking at the Similarity class overview and wondering whether I can replace the SUM operator with a MAX operator, or any other operator, across the terms in a query.

For example, if I search for "car OR automobile", a BooleanScorer is used to add together the values from each subexpression. In the BooleanScorer from lucene_1_4_final, the collect(...) method of the inner class Collector contains the line

    bucket.score += score; // increment score

which I may want to replace with a MAX operator such as

    if (score > bucket.score) bucket.score = score; // take the max

I may also want to keep track of both the max and the sum by extending the inner class Bucket.

Do you have any suggestions on how to implement such a change? Ideally, I would like to be able to choose the scoring algorithm at search time (at run time), using the Lucene SUM scorer for some searches and the MAX scorer for others.

Thanks for your help.

-Ken

PS. The code I'm talking about falls in the following area, for my example search "car OR automobile". If I walk the code during a search, I see that the BooleanScorer$Collector is created by the Weight that was just created, in BooleanQuery$BooleanWeight.scorer(...), as it adds the subscorers for each of the terms to the BooleanScorer. When that collector is asked to collect(...), its bucketTable is filled in. Since the collectors for each of the terms share the same bucketTable, if the document already appears in the bucketTable, its score is added to the existing value, which implements the SUM operator.
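
The run-time choice between SUM and MAX asked about above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Lucene code: ScoreCombiner, BucketTable, and ScoreCombinerDemo are hypothetical names, and a plain map keyed by document number stands in for the shared bucketTable. The idea is simply that the hard-coded "+=" in collect(...) becomes a call to a combiner chosen when the search is set up.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical combiner: how an incoming sub-scorer score merges into a bucket. */
interface ScoreCombiner {
    float combine(float current, float incoming);

    // Equivalent of "bucket.score += score".
    ScoreCombiner SUM = (current, incoming) -> current + incoming;
    // Equivalent of "if (score > bucket.score) bucket.score = score".
    ScoreCombiner MAX = (current, incoming) -> Math.max(current, incoming);
}

/** Hypothetical stand-in for the shared bucket table; not the Lucene class. */
final class BucketTable {
    private final Map<Integer, Float> scores = new HashMap<>();
    private final ScoreCombiner combiner;

    BucketTable(ScoreCombiner combiner) {
        this.combiner = combiner;
    }

    /** Called once per (document, sub-scorer) match, in the role of collect(...). */
    void collect(int doc, float score) {
        // The first hit stores the score as-is; later hits go through the combiner.
        scores.merge(doc, score, combiner::combine);
    }

    /** Combined score for a document, or 0 if no sub-scorer matched it. */
    float score(int doc) {
        return scores.getOrDefault(doc, 0.0f);
    }
}

final class ScoreCombinerDemo {
    public static void main(String[] args) {
        BucketTable sum = new BucketTable(ScoreCombiner.SUM);
        BucketTable max = new BucketTable(ScoreCombiner.MAX);
        // Made-up scores: document 7 matches "car" (0.5) and "automobile" (0.25).
        for (BucketTable table : new BucketTable[] {sum, max}) {
            table.collect(7, 0.5f);
            table.collect(7, 0.25f);
        }
        System.out.println("SUM: " + sum.score(7)); // prints "SUM: 0.75"
        System.out.println("MAX: " + max.score(7)); // prints "MAX: 0.5"
    }
}
```

Keeping both the sum and the max per document would amount to storing two fields in each bucket instead of one float and picking which to report at the end; the combiner shown here covers only the single-value case.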