lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: IndexSearcher.search(query, filter, collector) considered less efficient
Date Fri, 08 Jun 2012 17:40:26 GMT
I think that javadoc is stale; my guess is it was written back when
the collect method took a score, but we changed that so the collector
calls .score() if it really needs the score... so I can't think of why
that search method is inherently inefficient.
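To illustrate the point about scoring on demand, here is a minimal, self-contained sketch. It does not use the Lucene classes: CountingScorer, HitCollector, DocIdCollector, MaxScoreCollector and search are hypothetical stand-ins, and the scoring formula is a placeholder. It only models the idea that collect() receives a doc ID and the score is computed only when the collector asks for it.

```java
import java.util.ArrayList;
import java.util.List;

public class LazyScoreDemo {
    /** Hypothetical stand-in for a scorer: computes a score only on request. */
    static class CountingScorer {
        int scoreCalls = 0;

        float score(int doc) {
            scoreCalls++;
            return 1.0f / (doc + 1); // placeholder formula, not Lucene's
        }
    }

    /** Hypothetical stand-in for a collector: collect() gets only the doc ID. */
    interface HitCollector {
        void setScorer(CountingScorer scorer);
        void collect(int doc);
    }

    /** Gathers matching doc IDs and never asks for a score. */
    static class DocIdCollector implements HitCollector {
        final List<Integer> docs = new ArrayList<>();

        public void setScorer(CountingScorer scorer) { /* score not needed */ }

        public void collect(int doc) { docs.add(doc); }
    }

    /** Tracks the best score seen, so it calls score() once per hit. */
    static class MaxScoreCollector implements HitCollector {
        private CountingScorer scorer;
        float max = Float.NEGATIVE_INFINITY;

        public void setScorer(CountingScorer scorer) { this.scorer = scorer; }

        public void collect(int doc) { max = Math.max(max, scorer.score(doc)); }
    }

    /** Feeds every matching doc to the collector; returns the number of scores computed. */
    static int search(int[] matchingDocs, HitCollector collector) {
        CountingScorer scorer = new CountingScorer();
        collector.setScorer(scorer);
        for (int doc : matchingDocs) {
            collector.collect(doc);
        }
        return scorer.scoreCalls;
    }

    public static void main(String[] args) {
        int[] hits = {0, 3, 7};
        System.out.println(search(hits, new DocIdCollector()));    // prints 0
        System.out.println(search(hits, new MaxScoreCollector())); // prints 3
    }
}
```

In this model a collector that ignores scores pays nothing for them, which is why the collector-based search method need not be slower by itself.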

I'll fix the javadocs (remove that warning).

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jun 8, 2012 at 1:32 PM, Paul Hill <paul@metajure.com> wrote:
> I noticed today that my code calls
> IndexSearcher.search (Query query, Filter filter, Collector collector)
> But I also noticed that the docs say
>
> "Applications should only use this if they need all of the matching documents.
> The high-level search API (Searcher.search(Query, Filter, int)) is usually more
> efficient, as it skips non-high-scoring hits."
>   http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/api/core/org/apache/lucene/search/IndexSearcher.html#searchAfter%28org.apache.lucene.search.ScoreDoc,%20org.apache.lucene.search.Query,%20int%29
> Which makes complete sense since I didn't provide it with any count limit.
> My original, but apparently inefficient call is:
>            searcher.search(userQuery, securityFilter, dedupingCollector);
> The userQuery is really an enhanced query based on what the user entered, not
> literally the user's query.
> The dedupingCollector uses one field cache
> (FieldCache.DEFAULT.getStrings(reader, deDupField)) to work out which documents
> to collect and which to reject, saving a list of first occurrences.
> I don't think I can use the contrib DuplicateFilter, because my duplicates are
> not guaranteed to be in the same index segment.
>
> So am I being misled by my interpretation of the Javadoc comment, even though
> I really DON'T "need all matching documents", or is there some way to work a
> count limit and filtering into the whole chain of filters and collectors?
>
> -Paul
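One possible way to combine deduplication with a count limit, sketched under stated assumptions: the collector keeps the first occurrence of each key and aborts the search by throwing an exception once it has enough hits. This is a plain-Java model, not the poster's code and not the Lucene API. DedupLimitDemo, DedupingCollector, LimitReachedException and search are hypothetical names, and the String[] stands in for a per-document field cache over a single reader (segment handling is ignored).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DedupLimitDemo {
    /** Hypothetical signal used to abandon the search once enough hits are kept. */
    static class LimitReachedException extends RuntimeException {}

    /** Keeps the first occurrence of each key, up to a fixed number of hits. */
    static class DedupingCollector {
        private final String[] keys; // per-document key, indexed by doc ID
        private final int limit;
        private final Set<String> seen = new HashSet<>();
        final List<Integer> firstOccurrences = new ArrayList<>();

        DedupingCollector(String[] keys, int limit) {
            if (limit < 1) {
                throw new IllegalArgumentException("limit must be >= 1");
            }
            this.keys = keys;
            this.limit = limit;
        }

        void collect(int doc) {
            // Set.add returns false when the key was already seen (a duplicate).
            if (seen.add(keys[doc])) {
                firstOccurrences.add(doc);
                if (firstOccurrences.size() >= limit) {
                    throw new LimitReachedException();
                }
            }
        }
    }

    /** Feeds matching docs to the collector and stops when it signals the limit. */
    static List<Integer> search(int[] matchingDocs, String[] keys, int limit) {
        DedupingCollector collector = new DedupingCollector(keys, limit);
        try {
            for (int doc : matchingDocs) {
                collector.collect(doc);
            }
        } catch (LimitReachedException e) {
            // Expected: enough unique hits were collected.
        }
        return collector.firstOccurrences;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "a", "c", "b", "d"};
        int[] hits = {0, 1, 2, 3, 4, 5};
        System.out.println(search(hits, keys, 3)); // prints [0, 1, 3]
    }
}
```

Note that stopping early this way returns the first unique hits in document order, not the highest-scoring ones, so it only fits when any N unique matches will do.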
