lucene-dev mailing list archives

From Chuck Williams
Subject Re: Combining Hits and HitCollector
Date Tue, 27 Jun 2006 17:56:09 GMT
IMHO, Hits is the worst class in Lucene.  Its atrocities are numerous,
including the hardwired "50" and the strange normalization of dividing
all scores by the top score if the top score happens to be greater than
1.0 (which destroys any notion of score values having any absolute
meaning, although many apps erroneously assume they do).  It is quite
easy to use a TopDocCollector or a TopFieldDocCollector and do a better
job than Hits does.
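
To make the normalization problem concrete, here is a minimal sketch of
the rule described above.  It is not the Hits source: the class and
method names are made up for illustration, and the scores are arbitrary.

```java
// Illustration only: mimics the normalization rule described above.
// HitsNormalizationDemo and normalizeLikeHits are hypothetical names,
// not part of Lucene.
class HitsNormalizationDemo {

    // Divide every score by the top score, but only when the top score
    // is greater than 1.0; otherwise leave the scores untouched.
    static float[] normalizeLikeHits(float[] rawScores) {
        float top = 0f;
        for (float s : rawScores) {
            top = Math.max(top, s);
        }
        float norm = (top > 1.0f) ? 1.0f / top : 1.0f;

        float[] out = new float[rawScores.length];
        for (int i = 0; i < rawScores.length; i++) {
            out[i] = rawScores[i] * norm;
        }
        return out;
    }

    public static void main(String[] args) {
        float[] strong = normalizeLikeHits(new float[] {4.0f, 2.0f});
        float[] weak = normalizeLikeHits(new float[] {0.8f, 0.4f});
        System.out.println(strong[0] + " " + strong[1]);
        System.out.println(weak[0] + " " + weak[1]);
    }
}
```

Raw scores of 4.0 and 2.0 come back as 1.0 and 0.5, while 0.8 and 0.4
come back unchanged, so a displayed score of 1.0 may stand for a raw 4.0
in one query and a raw 1.0 in another.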

For faceted search I use a SamplingHitCollector to gather the
facet-determination sample.  One of its constructor parameters,
rankingCollector, is an arbitrary HitCollector that gathers the
top-scoring or top-sorted results.  Combining the two collectors then
takes only one line of code:  a call to rankingCollector.collect(doc, score)
within SamplingHitCollector.collect().
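
A rough sketch of that delegation pattern follows.  It is not my actual
class: the every-Nth-hit sampling policy, the sampleEvery parameter, the
accessor names, and BestHitCollector are assumptions made for
illustration, and HitCollector is redeclared locally as a stand-in for
org.apache.lucene.search.HitCollector so the example compiles without
Lucene on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for Lucene's HitCollector, declared here only so the sketch
// is self-contained.
abstract class HitCollector {
    public abstract void collect(int doc, float score);
}

// Hypothetical sampling collector: records every Nth matching document
// for facet counting and forwards every hit to the wrapped collector.
class SamplingHitCollector extends HitCollector {
    private final HitCollector rankingCollector;
    private final int sampleEvery; // assumed policy: keep every Nth hit
    private final List<Integer> sample = new ArrayList<Integer>();
    private int seen = 0;

    SamplingHitCollector(HitCollector rankingCollector, int sampleEvery) {
        if (sampleEvery < 1) {
            throw new IllegalArgumentException("sampleEvery must be >= 1");
        }
        this.rankingCollector = rankingCollector;
        this.sampleEvery = sampleEvery;
    }

    public void collect(int doc, float score) {
        if (seen % sampleEvery == 0) {
            sample.add(doc);
        }
        seen++;
        // The one line that combines the two collectors.
        rankingCollector.collect(doc, score);
    }

    List<Integer> getSample() { return sample; }
    int getTotalHits() { return seen; }
}

// Toy ranking collector that remembers only the best-scoring document.
class BestHitCollector extends HitCollector {
    int bestDoc = -1;
    float bestScore = Float.NEGATIVE_INFINITY;

    public void collect(int doc, float score) {
        if (score > bestScore) {
            bestScore = score;
            bestDoc = doc;
        }
    }
}
```

In real use the wrapped collector would be one of the top-N collectors
mentioned earlier, and the combined collector would be passed to the
search method that accepts a HitCollector.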

All this notwithstanding, a built-in class that combined Hits with a
second HitCollector would probably be used by many people, although I
would still recommend the approach above as the better alternative.


Nadav Har'El wrote on 06/27/2006 09:08 AM:
> Hi,
> [...] returns a Hits object, useful for the display of top
> results. [...], HitCollector) runs a HitCollector for doing
> some sort of processing over all results.
> Unfortunately, there is currently no method to do both at the same time.
> For some uses, for example faceted search (that was discussed on this list
> a few times in the past), you need to do both: go over all results (and,
> for example, count how many results belong to each value), and at the same
> time build a Hits object (for displaying the top search results).
> Changing Searcher, and/or Hits to allow for doing both things at once should
> not be too hard, but before I go and do it (and submit the change as a patch),
> I was wondering if I'm not reinventing the wheel, and if perhaps someone has
> already done this, or there were already discussions on how or how not to do
> it.
> Thanks,
> Nadav.
