lucene-dev mailing list archives

From Michael McCandless
Subject Re: Is TopDocCollector's collect() implementation correct?
Date Mon, 23 Mar 2009 14:43:10 GMT
> If we're already creating a new TopScoreDocCollector (when was it
> added?  I must have been dozing off while this happened...)

This was LUCENE-1483.

> How about if we introduce an abstract ScoringCollector (about the
> name later) which implements topDocs() and getTotalHits() and there
> will be several implementations of it, such as:
> TopScoreDocCollector, which sorts the documents by their score, in
> descending order only, TopFieldDocCollector - for sorting by fields,
> and additional sort-by collectors.

This sounds good... but the challenge is we also need to get both
HitCollector and MultiReaderHitCollector in there.

HitCollector is the simplest way to create a custom collector.
MultiReaderHitCollector (added with LUCENE-1483) is the more
performant way, since it lets your collector operate per-segment.  All
non-deprecated core collectors in Lucene now subclass

So would we make separate subclasses for each of them to add
getTotalHits() / topDocs()?  EG TopDocsHitCollector and
TopDocsMultiReaderHitCollector?  It's getting confusing.

Or maybe we just add getTotalHits() and topDocs() to HitCollector, even
though for the advanced case (non-top-N collection) the methods would
not be used?

Or... maybe this is a time when an interface is the lesser evil: we
could make a TopDocs interface that the necessary classes implement?
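
To make the interface option concrete, here is a minimal sketch, not
code from the Lucene tree. Every name in it is a hypothetical stand-in:
Hit stands in for a scored document, SimpleHitCollector and
PerSegmentHitCollector stand in for HitCollector and
MultiReaderHitCollector, and the interface is called TopDocsProvider
only so the name does not collide with the existing TopDocs class. The
collector keeps the top N hits by score and exposes them through the
interface:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for a scored document (not Lucene's ScoreDoc).
final class Hit {
    final int doc;
    final float score;
    Hit(int doc, float score) { this.doc = doc; this.score = score; }
}

// Hypothetical interface playing the role of the proposed "TopDocs interface".
interface TopDocsProvider {
    int getTotalHits();
    List<Hit> topDocs();
}

// Simplified stand-in for the simple collector base class.
abstract class SimpleHitCollector {
    abstract void collect(int doc, float score);
}

// Simplified stand-in for the per-segment collector base class.
abstract class PerSegmentHitCollector extends SimpleHitCollector {
    abstract void setNextReader(int docBase);
}

// A top-N-by-score collector that opts in to the interface.
// Assumes numHits > 0.
final class TopScoreCollector extends PerSegmentHitCollector
        implements TopDocsProvider {
    private final int numHits;
    // Min-heap on score: the weakest retained hit sits at the head.
    private final PriorityQueue<Hit> queue;
    private int totalHits;
    private int docBase;

    TopScoreCollector(int numHits) {
        this.numHits = numHits;
        this.queue = new PriorityQueue<>(numHits,
                (a, b) -> Float.compare(a.score, b.score));
    }

    @Override
    void setNextReader(int docBase) { this.docBase = docBase; }

    @Override
    void collect(int doc, float score) {
        totalHits++;
        if (queue.size() < numHits) {
            queue.add(new Hit(docBase + doc, score));
        } else if (score > queue.peek().score) {
            queue.poll();                       // evict the weakest hit
            queue.add(new Hit(docBase + doc, score));
        }
    }

    @Override
    public int getTotalHits() { return totalHits; }

    @Override
    public List<Hit> topDocs() {
        List<Hit> out = new ArrayList<>(queue);
        out.sort((a, b) -> Float.compare(b.score, a.score)); // best first
        return out;
    }
}
```

With this shape, a custom collector that does not do top-N collection
extends either base class and simply does not implement the interface,
so neither base class carries methods it cannot honor.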

