From: Michael McCandless
Date: Fri, 14 Feb 2014 06:17:41 -0500
Subject: Re: Collector is collecting more than the specified hits
To: Lucene Users

This is how Collector works: it is called for every document matching
the query, and its job is to choose which of those hits to keep. This
is because, in general, the hits to keep can arrive at any time, not
just among the first N hits you see; e.g. the best-scoring hit may be
the very last one.

But if you have prior knowledge, e.g. that your index is already
pre-sorted by the criteria you sort by at query time, then you can
indeed stop after seeing the first N hits. To do this you must throw
your own exception from the collector and catch it up above, in the
code that calls the search. See Lucene's TimeLimitingCollector for a
similar example ...

Mike McCandless

http://blog.mikemccandless.com

On Fri, Feb 14, 2014 at 2:47 AM, saisantoshi wrote:
> The problem with the collector below is that the collect method does not
> stop after the numHits count has been reached. Is there a way to stop the
> collector from collecting docs once it has reached the specified numHits?
>
> For example:
> TopScoreDocCollector topScore = TopScoreDocCollector.create(numHits, true);
> // TopScoreDocCollector topScore = TopScoreDocCollector.create(30, true);
>
> I would expect the collector below to pause/exit after it has collected
> the specified numHits (in this case, 30).
> But what's happening here is that the collector collects all the docs,
> which delays searches. Can we configure the collect method below to stop
> after it has reached the specified numHits? Please let me know if there is
> any issue with the collector below.
>
> public class MyCollector extends PositiveScoresOnlyCollector {
>
>     private IndexReader indexReader;
>
>     public MyCollector(IndexReader indexReader, PositiveScoresOnlyCollector topScore) {
>         super(topScore);
>         this.indexReader = indexReader;
>     }
>
>     @Override
>     public void collect(int doc) {
>         try {
>             // Custom logic
>             super.collect(doc);
>         } catch (Exception e) {
>             // Any exception is silently swallowed here
>         }
>     }
> }
>
> // Usage:
>
> MyCollector collector;
> TopScoreDocCollector topScore = TopScoreDocCollector.create(numHits, true);
> IndexSearcher searcher = new IndexSearcher(reader);
> try {
>     collector = new MyCollector(indexReader, new PositiveScoresOnlyCollector(topScore));
>     searcher.search(query, (Filter) null, collector);
> } finally {
>
> }
>
> Thanks,
> Sai.
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Collector-is-collecting-more-than-the-specified-hits-tp4117329.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
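
For illustration, here is a rough sketch of the throw-and-catch approach described in the reply at the top of this message. It is written in plain Java so that it runs without Lucene on the classpath. HitCollector, EnoughHitsException, FirstNCollector, search and firstHits are all hypothetical names made up for this sketch; none of them is Lucene API, and the search loop only imitates a searcher calling collect once per matching doc. It assumes, as the reply says, that docs already arrive in the order you want, so the first N are the ones to keep.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for the pattern. All types here are hypothetical, NOT Lucene's API.
class EarlyTerminationSketch {

    /** Hypothetical stand-in for a collector: called once per matching document. */
    interface HitCollector {
        void collect(int doc);
    }

    /** Unchecked, so it can escape collect(), which declares no checked exceptions. */
    static final class EnoughHitsException extends RuntimeException {
        EnoughHitsException() {
            // Used for control flow only: no message, no cause, no stack trace.
            super(null, null, false, false);
        }
    }

    /** Keeps the first numHits docs, then throws to tell the caller to stop. */
    static final class FirstNCollector implements HitCollector {
        private final int numHits;
        private final List<Integer> docs = new ArrayList<>();

        FirstNCollector(int numHits) {
            this.numHits = numHits;
        }

        @Override
        public void collect(int doc) {
            if (docs.size() < numHits) {
                docs.add(doc);
            }
            if (docs.size() >= numHits) {
                throw new EnoughHitsException(); // nothing more to keep
            }
        }

        List<Integer> docs() {
            return docs;
        }
    }

    /** Imitates a search: hands every matching doc to the collector. */
    static void search(int[] matchingDocs, HitCollector collector) {
        for (int doc : matchingDocs) {
            collector.collect(doc);
        }
    }

    /** The "catch it up above" part: the caller treats the exception as a normal stop. */
    static List<Integer> firstHits(int[] matchingDocs, int numHits) {
        FirstNCollector collector = new FirstNCollector(numHits);
        try {
            search(matchingDocs, collector);
        } catch (EnoughHitsException e) {
            // Expected: the collector has all the hits it needs.
        }
        return collector.docs();
    }

    public static void main(String[] args) {
        // Five docs match, but only the first two are visited and kept.
        System.out.println(firstHits(new int[] {3, 7, 9, 12, 20}, 2)); // prints [3, 7]
    }
}
```

With real Lucene code the same shape would presumably apply to a Collector subclass, with the try/catch placed around the searcher.search(...) call. Note that a catch (Exception e) inside collect, as in the quoted MyCollector, would likely swallow such an exception before it reaches the caller.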