lucene-dev mailing list archives

From "Nadav Har'El" <...@math.technion.ac.il>
Subject Re: Combining Hits and HitCollector
Date Tue, 27 Jun 2006 20:37:26 GMT
On Tue, Jun 27, 2006, Chuck Williams wrote about "Re: Combining Hits and HitCollector":
> IMHO, Hits is the worst class in Lucene.  Its atrocities are numerous,
> including the hardwired "50" and the strange normalization of dividing
> all scores by the top score if the top score happens to be greater than
> 1.0 (which destroys any notion of score values having any absolute
> meaning, although many apps erroneously assume they do).  It is quite
> easy to use a TopDocCollector or a TopFieldDocCollector and do a better
> job than Hits does.

Thanks for the suggestion.

You've made a very good point, and indeed I'm beginning to question the
value in my idea of combining Hits and a HitCollector, when for almost
any application I can think of, a TopDocs would be just as good as Hits,
and when (as you said) it's much easier to combine the collector building
a TopDocs (TopDocCollector or TopFieldDocCollector) with another collector.

Perhaps a "MultiHitCollector" combining several other collectors could be
useful, although you're right that it's very easy to write one when needed,
so it doesn't really need to be part of Lucene's core.
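
For illustration, a minimal sketch of what such a "MultiHitCollector" might
look like. Nothing here is existing Lucene code: MultiHitCollector,
CountingCollector and DocIdCollector are hypothetical names, and the
HitCollector class below is only a stand-in with the same shape as Lucene's
abstract HitCollector (a single collect(int doc, float score) method), so
that the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in with the same shape as org.apache.lucene.search.HitCollector,
// declared here only so the sketch compiles without Lucene on the classpath.
abstract class HitCollector {
    public abstract void collect(int doc, float score);
}

// Hypothetical class (not part of Lucene): forwards every hit to each delegate
// in the order the delegates were given.
class MultiHitCollector extends HitCollector {
    private final HitCollector[] delegates;

    MultiHitCollector(HitCollector... delegates) {
        this.delegates = delegates.clone();
    }

    @Override
    public void collect(int doc, float score) {
        for (HitCollector delegate : delegates) {
            delegate.collect(doc, score);
        }
    }
}

// Hypothetical example delegate: counts how many hits were collected.
class CountingCollector extends HitCollector {
    int count = 0;

    @Override
    public void collect(int doc, float score) {
        count++;
    }
}

// Hypothetical example delegate: remembers the document ids in collection order.
class DocIdCollector extends HitCollector {
    final List<Integer> docs = new ArrayList<>();

    @Override
    public void collect(int doc, float score) {
        docs.add(doc);
    }
}
```

With real Lucene classes, one of the delegates would presumably be the
collector that builds the TopDocs, and the combined collector would be passed
to the existing search method that takes a HitCollector.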

> This all notwithstanding, a built-in class that combined Hits with a
> second HitCollector probably would be used by many people, although I
> would recommend the approach above as a better alternative.

I wonder: if Hits is considered a problematic class, should we really go
ahead and expand its capabilities, like I proposed initially? Perhaps not...
Perhaps it's better to recommend other approaches in javadoc, FAQs, or in
the form of new code, say, two new simple methods in Searcher:

	TopDocs search(Query, Filter, int, HitCollector)
	TopFieldDocs search(Query, Filter, int, Sort, HitCollector)
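
A rough sketch of what the first of these methods might do internally: keep
the n best hits while handing every hit to the caller's collector. This is an
assumption about one possible implementation, not existing Lucene code.
TopNForwardingCollector and ScoredDoc are hypothetical names, HitCollector is
again a stand-in for Lucene's class, and the tie-breaking rule (lower document
id wins on equal scores) is my own choice for the sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Stand-in with the same shape as org.apache.lucene.search.HitCollector.
abstract class HitCollector {
    public abstract void collect(int doc, float score);
}

// Hypothetical (doc id, score) pair, standing in for an entry of a TopDocs.
final class ScoredDoc {
    final int doc;
    final float score;

    ScoredDoc(int doc, float score) {
        this.doc = doc;
        this.score = score;
    }
}

// Hypothetical collector: retains the n highest-scoring hits and forwards
// every hit, retained or not, to a second collector supplied by the caller.
class TopNForwardingCollector extends HitCollector {
    // Orders the weakest hit first: lower score, then higher doc id on ties.
    private static final Comparator<ScoredDoc> WEAKEST_FIRST = (a, b) -> {
        int byScore = Float.compare(a.score, b.score);
        return byScore != 0 ? byScore : Integer.compare(b.doc, a.doc);
    };

    private final int n;
    private final HitCollector extra;
    private final PriorityQueue<ScoredDoc> heap = new PriorityQueue<>(WEAKEST_FIRST);
    private int totalHits = 0;

    TopNForwardingCollector(int n, HitCollector extra) {
        this.n = n;
        this.extra = extra;
    }

    @Override
    public void collect(int doc, float score) {
        totalHits++;
        extra.collect(doc, score);   // the second collector sees every hit
        if (n <= 0) {
            return;
        }
        heap.add(new ScoredDoc(doc, score));
        if (heap.size() > n) {
            heap.poll();             // evict the weakest retained hit
        }
    }

    // Number of hits seen, including those not retained.
    int totalHits() {
        return totalHits;
    }

    // Retained hits, best first.
    List<ScoredDoc> topDocs() {
        List<ScoredDoc> result = new ArrayList<>(heap);
        result.sort(WEAKEST_FIRST.reversed());
        return result;
    }
}
```

The sorted variant taking a Sort would need a different ordering for the
retained hits, but the forwarding to the second collector would be the same.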

In the long run, perhaps we need to give some thought as to whether we
should continue demonstrating the use of Hits (rather than TopDocs) in most
Lucene examples, and whether, perhaps, the Hits API should be deprecated.


Nadav.

-- 
Nadav Har'El                        |      Tuesday, Jun 27 2006, 2 Tammuz 5766
IBM Haifa Research Lab              |-----------------------------------------
                                    |"Never be afraid to tell the world who
http://nadav.harel.org.il           |you are." -- Anonymous
