lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Bamford <chris.bamf...@talktalk.net>
Subject Short circuiting Collector
Date Wed, 20 Jul 2011 16:42:37 GMT
Hi there,

I have my own Collector implementation which I use for searching, something like this skeleton:

public class LightweightHitCollector extends Collector {

    private int maxHits;
    private int numHits;
    private int docBase;
    private boolean collecting;
    private Scorer scorer;
    private int[] hits;

    public LightweightHitCollector(int maxHits) {

        this.numHits = 0;
        this.maxHits = maxHits;
        this.collecting = true;
        hits = new int[maxHits];
        for (int i=0; i < maxHits; i++) { hits[i] = -1; }
    }

    public boolean acceptsDocsOutOfOrder() {
        return true;
    }

    public void setScorer(Scorer scorer) {
        this.scorer = scorer;
    }

    public void setNextReader(IndexReader reader, int docBase) {
        this.docBase = docBase;
    }

    public void collect(int docID) {

        if (! collecting) {
            return;
        }

        hits[numHits] = docBase + docID;

        if (++numHits == maxHits) {
            collecting = false;
        }
    }

    public int[] getHits() {
        return hits;
    }
}

Question: is there a way to prevent collect() being called after it has collected its quota
 (i.e. when collecting becomes false)?  On large datasets this would save a lot of time.
In this scenario I have no need for sort / ordering etc.
Thanks.

- Chris

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message