From: Chuck Williams
Date: Tue, 27 Jun 2006 10:56:09 -0700
To: java-dev@lucene.apache.org
Subject: Re: Combining Hits and HitCollector
Message-ID: <44A17139.50701@manawiz.com>
In-Reply-To: <20060627160823.GA27705@fermat.math.technion.ac.il>

IMHO, Hits is the worst class in Lucene.  Its atrocities are numerous,
including the hardwired "50" and the strange normalization of dividing
all scores by the top score if the top score happens to be greater than
1.0 (which destroys any notion of score values having any absolute
meaning, although many apps erroneously assume they do).

It is quite easy to use a TopDocCollector or a TopFieldDocCollector and
do a better job than Hits does.

For faceted search I use a SamplingHitCollector to gather the
facet-determination sample.  One of its constructor parameters,
rankingCollector, is an arbitrary HitCollector that gathers the
top-scoring or top-sorted results.  Combining the two collectors then
takes only one line of code: a call to
rankingCollector.collect(doc, score) within
SamplingHitCollector.collect().

This all notwithstanding, a built-in class that combined Hits with a
second HitCollector probably would be used by many people, although I
would recommend the approach above as a better alternative.

Chuck

Nadav Har'El wrote on 06/27/2006 09:08 AM:
> Hi,
>
> Searcher.search(Query) returns a Hits object, useful for the display of top
> results. Searcher.search(Query, HitCollector) runs a HitCollector for doing
> some sort of processing over all results.
> Unfortunately, there is currently no method to do both at the same time.
>
> For some uses, for example faceted search (that was discussed on this list
> a few times in the past), you need to do both: go over all results (and,
> for example, count how many results belong to each value), and at the same
> time build a Hits object (for displaying the top search results).
>
> Changing Searcher, and/or Hits to allow for doing both things at once should
> not be too hard, but before I go and do it (and submit the change as a patch),
> I was wondering if I'm not reinventing the wheel, and if perhaps someone has
> already done this, or there were already discussions on how or how not to do
> it.
>
> Thanks,
> Nadav.
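
The collector-wrapping approach described in the reply (a SamplingHitCollector that forwards each hit to a rankingCollector) might look roughly like the sketch below. It is an illustration only, not the code from the message. To keep it compilable without Lucene, HitCollector here is a local stand-in that mirrors the collect(int doc, float score) callback. TopNCollector, the Hit holder, the sampleSize parameter, and the "keep the first sampleSize doc ids" sampling policy are all assumptions made for the example; the message does not say how the sample is chosen.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Stand-in for Lucene's HitCollector so the sketch compiles on its own.
abstract class HitCollector {
    public abstract void collect(int doc, float score);
}

// Hypothetical value holder for one hit.
class Hit {
    final int doc;
    final float score;
    Hit(int doc, float score) { this.doc = doc; this.score = score; }
}

// Hypothetical ranking collector: retains the n highest-scoring hits.
class TopNCollector extends HitCollector {
    private final int n;
    // Min-heap on score, so the weakest retained hit is evicted first.
    private final PriorityQueue<Hit> heap =
        new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));

    TopNCollector(int n) { this.n = n; }

    @Override
    public void collect(int doc, float score) {
        heap.add(new Hit(doc, score));
        if (heap.size() > n) {
            heap.poll(); // drop the current lowest score
        }
    }

    // Doc ids of the retained hits, best score first.
    List<Integer> topDocs() {
        List<Hit> hits = new ArrayList<>(heap);
        hits.sort((a, b) -> Float.compare(b.score, a.score));
        List<Integer> ids = new ArrayList<>();
        for (Hit h : hits) ids.add(h.doc);
        return ids;
    }
}

// Sees every hit, keeps a sample for facet counting, and delegates ranking.
class SamplingHitCollector extends HitCollector {
    private final HitCollector rankingCollector;
    private final int sampleSize; // assumption: a simple cap on the sample
    private final List<Integer> sample = new ArrayList<>();
    private int totalHits = 0;

    SamplingHitCollector(HitCollector rankingCollector, int sampleSize) {
        this.rankingCollector = rankingCollector;
        this.sampleSize = sampleSize;
    }

    @Override
    public void collect(int doc, float score) {
        totalHits++;
        if (sample.size() < sampleSize) {
            sample.add(doc); // assumed policy: first sampleSize docs seen
        }
        // The one line that combines the two collectors.
        rankingCollector.collect(doc, score);
    }

    List<Integer> sample() { return sample; }
    int totalHits() { return totalHits; }
}
```

With a real index, the outer collector would be handed to Searcher.search(Query, HitCollector), the method mentioned in the quoted message; afterwards the sample and the ranked results are both available from a single pass over the hits.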