lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Lamprecht <clampre...@gmail.com>
Subject Re: list all terms in a field
Date Fri, 08 Apr 2005 06:38:16 GMT
Mark, Here's a small piece of code that outputs a list of all terms
for a given field, in order of decreasing term frequency:

--- Requires Java 1.5 for PriorityQueue, or you can use Doug Lea's version ---


String field = "myfield".intern();                       // intern
required for != below
IndexReader reader = IndexReader.open(indexPath);
Term startTerm = new Term(field, "");         // set your field here
TermEnum termEnum = reader.terms(startTerm);
PriorityQueue queue = new PriorityQueue();     // using java 1.5's PriorityQueue

while (termEnum.next()) {
   Term term = termEnum.term();
   if (term.field() != field) break;       // lucene interns fields so != works
   int freq = reader.docFreq(term);
   queue.add(new TermFreq(term, freq));
}
reader.close();

while (!queue.isEmpty()) {
   TermFreq termFreq = (TermFreq) queue.remove();
   System.out.println(termFreq.term.text()+": "+termFreq.freq);
}

/* inner class for priority queue */
class TermFreq implements Comparable {
    Term term;
    int freq;
    public TermFreq(Term term, int freq) { this.term = term; this.freq = freq; }
    public int compareTo(Object o) {
        if (this == o) return 0;
        TermFreq other = (TermFreq) o;
        return other.freq - this.freq;
    }
}


On Apr 7, 2005 9:00 PM, Mark Gunnels <markgunnels@gmail.com> wrote:
> Is there a simple way to list all terms in a field?
> The only approach that I see is to use the IndexReader.terms()  method
> and then iterate over all the results and build my list by manually
> filtering. This seems inefficient and there must be a better way that
> my newbie eyes don't see.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message