lucene-java-user mailing list archives

From Rafis <ira...@yahoo.com>
Subject Re: Using HitCollector to Collect First N Hits
Date Sat, 22 Aug 2009 12:12:46 GMT


Len Takeuchi-2 wrote:
> 
> I’m using Lucene 2.4.1 and I’m trying to use a custom HitCollector to
> collect
> only the first N hits (not the best hits) for performance.  I saw another
> e-mail in this group where they mentioned writing a HitCollector which
> throws
> an exception after N hits to do this.  So I tried this approach and it
> seems
> as if my HitCollector isn’t called until the hits have been determined,
> i.e.
> the time until my HitCollector is called is dependent on the number of
> hits
> and my performance is no better than when I was not using a custom
> HitCollector.
> 
> 
In my case the total number of matching hits was required, so I did not
throw an exception; the collector keeps counting after its capacity is reached.

import org.apache.lucene.search.HitCollector;

/**
 * Hit collector that keeps only the first "capacity" docs, ignoring score
 * @author Rafis
 */
public class LimitedHitCollector extends HitCollector {
	public final int capacity;
	public final int[] docs;
	private int count;
	
	public LimitedHitCollector(int capacity) {
		this.capacity = capacity;
		docs = new int[capacity];
	}
	
	@Override
	public void collect(int doc, float score) {
		if (count < capacity)
			docs[count] = doc;
//		else
//			throw custom exception here to break search, if you do not need real count
		count++;
	}
	
	/**
	 * @return Number of collected docs
	 */
	public int getSize() {
		return Math.min(count, capacity);
	}

	/**
	 * @return Number of found docs
	 */
	public int getCount() {
		return count;
	}
	
}


To break the search once the limit is reached, throw an exception from
collect(int doc, float score):

	@Override
	public void collect(int doc, float score) {
		if (count < capacity)
			docs[count++] = doc;
		else
			throw new SomeRuntimeException();
	}

Then catch that exception around the search call.
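
For illustration, here is a minimal self-contained sketch of the throw-and-catch pattern. It does not use Lucene itself, so treat it as an assumption-laden outline rather than tested Lucene code: the local HitCollector class is only a stand-in for org.apache.lucene.search.HitCollector, fakeSearch is a hypothetical substitute for Searcher.search(Query, HitCollector), and CollectionLimitReachedException and FirstNHitCollector are invented names.

```java
import java.util.Arrays;

// Stand-in for org.apache.lucene.search.HitCollector (assumption: lets the sketch compile without Lucene).
abstract class HitCollector {
	public abstract void collect(int doc, float score);
}

// Hypothetical unchecked exception, used only to abort the search early.
class CollectionLimitReachedException extends RuntimeException {
}

// Stores the first "capacity" docs, then throws on the next hit.
class FirstNHitCollector extends HitCollector {
	private final int[] docs;
	private int count;

	FirstNHitCollector(int capacity) {
		docs = new int[capacity];
	}

	@Override
	public void collect(int doc, float score) {
		if (count < docs.length)
			docs[count++] = doc;
		else
			throw new CollectionLimitReachedException();
	}

	// Returns a copy holding only the docs actually collected.
	int[] getDocs() {
		return Arrays.copyOf(docs, count);
	}
}

class FirstNHitsDemo {
	// Simulated search (hypothetical): calls collect() once per matching doc,
	// here simply doc ids 0..matchingDocs-1 with a dummy score.
	static void fakeSearch(int matchingDocs, HitCollector collector) {
		for (int doc = 0; doc < matchingDocs; doc++)
			collector.collect(doc, 1.0f);
	}

	static int[] firstN(int matchingDocs, int n) {
		FirstNHitCollector collector = new FirstNHitCollector(n);
		try {
			fakeSearch(matchingDocs, collector);
		} catch (CollectionLimitReachedException e) {
			// Expected: the collector is full, so the search was aborted on purpose.
		}
		return collector.getDocs();
	}

	public static void main(String[] args) {
		System.out.println(Arrays.toString(firstN(1000, 3))); // prints [0, 1, 2]
	}
}
```

With real Lucene 2.4 the try block would wrap the call to searcher.search(query, collector) instead of fakeSearch. Catching only the dedicated exception type keeps unrelated runtime errors from being swallowed.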

Regards,
Rafis

