lucene-java-user mailing list archives

From Alf Eaton <li...@hubmed.org>
Subject Re: Stemmed terms/common terms
Date Thu, 16 Aug 2007 16:51:47 GMT
On 16 Aug 2007, at 15:17, Alf Eaton wrote:
>
> - Is there a way to get a list of all the terms in the index (or  
> maybe just the top n) ordered by descending frequency of usage? I  
> imagine it's related to docFreq, but can't see how to get a list of  
> terms in all documents.

Thanks to http://tinyurl.com/2gndww I worked out how to list the terms
in a field, along with their document frequencies, using PyLucene:

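# 'reader' is assumed to be an already-open IndexReader.
terms = reader.terms()
while terms.next():
    term = terms.term()
    # Only report terms from the 'title' field.
    if term.field() == 'title':
        # docFreq is the number of documents containing the term.
        print('%s - %d' % (term.text(), reader.docFreq(term)))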
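
The term enumeration walks the index in term order rather than by frequency, so getting the "top n" from the original question still needs a sort step. A minimal sketch of that step, assuming the (text, docFreq) pairs have already been collected from the loop; the helper name top_terms and the sample data are hypothetical, not part of PyLucene:

```python
import heapq

def top_terms(term_freqs, n=10):
    """Return the n (text, doc_freq) pairs with the highest doc_freq."""
    # Negating the frequency puts the most frequent terms first;
    # ties are broken alphabetically by term text.
    return heapq.nsmallest(n, term_freqs, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical sample standing in for (term.text(), reader.docFreq(term)) pairs.
sample = [('gene', 42), ('protein', 57), ('cell', 42), ('assay', 3)]
for text, freq in top_terms(sample, n=3):
    print('%s - %d' % (text, freq))
```

Using a heap keeps only n candidates in play at a time, which may matter when a field has a very large number of distinct terms; for a small index, a plain sorted() call would probably do just as well.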


alf.


