lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Collector is collecting more than the specified hits
Date Fri, 14 Feb 2014 11:17:41 GMT
This is how Collector works: it is called for every document matching
the query, and then its job is to choose which of those hits to keep.

This is because in general the hits to keep can come at any time, not
just the first N hits you see; e.g. the best scoring hit may be the
very last one.
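
A minimal sketch of why every hit has to be offered to the collector, written in plain Java rather than against Lucene's API. The names TopNSketch and topN are made up for illustration; the min-heap only loosely mirrors what a top-N collector does internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in, not Lucene code: keeps the n highest scores seen.
class TopNSketch {

    // Every score is offered; the heap decides which ones survive.
    static List<Float> topN(float[] scoresInDocOrder, int n) {
        // Min-heap: the weakest hit kept so far sits on top.
        PriorityQueue<Float> heap = new PriorityQueue<>();
        for (float score : scoresInDocOrder) {
            heap.offer(score);
            if (heap.size() > n) {
                heap.poll(); // evict the current weakest hit
            }
        }
        List<Float> best = new ArrayList<>(heap);
        best.sort(Collections.reverseOrder()); // highest score first
        return best;
    }
}
```

With scores {0.2, 0.1, 0.3, 9.0} and n = 2, stopping after the first two documents would miss the best hit, which arrives last.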

But if you have prior knowledge, e.g. that your index is already
pre-sorted by the criteria that you sort by at query time, then indeed
after seeing the first N hits you can stop; to do this you must throw
your own exception from collect(), and catch it in the code that calls
IndexSearcher.search().  See Lucene's TimeLimitingCollector for a
similar example ...
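
The throw-and-catch pattern can be sketched roughly as follows. This is plain Java under stated assumptions, not Lucene code: HitCollector and runSearch are hypothetical stand-ins for Lucene's Collector and the search loop, and EarlyTerminationException is a name invented here. It is only valid when the first N hits really are the ones you want, as with a pre-sorted index.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch; HitCollector and runSearch are hypothetical stand-ins,
// not part of Lucene's API.
class EarlyTerminationSketch {

    interface HitCollector {
        void collect(int doc);
    }

    // Unchecked, so it can escape a collect() that declares no exceptions.
    static final class EarlyTerminationException extends RuntimeException {
        EarlyTerminationException() {
            super(null, null, false, false); // control flow only: skip the stack trace
        }
    }

    // Keeps the first numHits docs, then aborts the search (assumes numHits >= 1).
    static final class FirstNCollector implements HitCollector {
        private final int numHits;
        final List<Integer> docs = new ArrayList<>();

        FirstNCollector(int numHits) {
            this.numHits = numHits;
        }

        @Override
        public void collect(int doc) {
            docs.add(doc);
            if (docs.size() >= numHits) {
                throw new EarlyTerminationException();
            }
        }
    }

    // Stand-in for the search loop: calls collect() for every matching doc.
    static void runSearch(int[] matchingDocs, HitCollector collector) {
        for (int doc : matchingDocs) {
            collector.collect(doc);
        }
    }

    static List<Integer> firstN(int[] matchingDocs, int numHits) {
        FirstNCollector collector = new FirstNCollector(numHits);
        try {
            runSearch(matchingDocs, collector);
        } catch (EarlyTerminationException expected) {
            // Normal exit: the collector already has everything it asked for.
        }
        return collector.docs;
    }
}
```

The exception has to propagate all the way out of the search call, so a blanket catch (Exception e) inside collect() would swallow the signal and the search would keep going.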

Mike McCandless

http://blog.mikemccandless.com


On Fri, Feb 14, 2014 at 2:47 AM, saisantoshi <saisantoshi76@gmail.com> wrote:
> The problem with the collector below is that its collect method does not
> stop once numHits documents have been collected. Is there a way to stop
> the collector from collecting docs after it has reached the specified
> numHits?
>
> For example:
> TopScoreDocCollector topScore = TopScoreDocCollector.create(numHits, true);
> // TopScoreDocCollector topScore = TopScoreDocCollector.create(30, true);
>
> I would expect the collector below to pause/exit after it has collected
> the specified numHits (in this case 30). But what's happening is that
> the collector is collecting all the docs, thereby delaying searches.
> Can we configure the collect method below to stop after it has reached
> the specified numHits? Please let me know if there is any issue with
> the collector below.
>
> import org.apache.lucene.index.IndexReader;
> import org.apache.lucene.search.PositiveScoresOnlyCollector;
>
> public class MyCollector extends PositiveScoresOnlyCollector {
>
>     private IndexReader indexReader;
>
>     public MyCollector(IndexReader indexReader,
>                        PositiveScoresOnlyCollector topScore) {
>         super(topScore);
>         this.indexReader = indexReader;
>     }
>
>     @Override
>     public void collect(int doc) {
>         try {
>             // Custom logic
>             super.collect(doc);
>         } catch (Exception e) {
>             // NOTE: every exception from super.collect is silently swallowed here
>         }
>     }
> }
>
> // Usage:
>
> MyCollector collector;
> TopScoreDocCollector topScore = TopScoreDocCollector.create(numHits, true);
> IndexSearcher searcher = new IndexSearcher(reader);
> collector = new MyCollector(indexReader,
>         new PositiveScoresOnlyCollector(topScore));
> searcher.search(query, (Filter) null, collector);
>
> Thanks,
> Sai.
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Collector-is-collecting-more-than-the-specified-hits-tp4117329.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
