lucene-java-user mailing list archives

From Bernhard Messer <Bernhard.Mes...@intrafind.de>
Subject Re: term frequency data of terms of all documents
Date Tue, 24 Aug 2004 15:11:30 GMT
Serkan,

It's easier to use the IndexReader class to get the information you need.
If you just need the document frequency of each term (the number of
documents that contain it), you could use the sample below.

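// needs: java.io.IOException and
// org.apache.lucene.index.{IndexReader, Term, TermEnum}
String indexDir = "/tmp/index"; // same path for the check and for open()
IndexReader ir = null;
try {
    if (!IndexReader.indexExists(indexDir))
        return;
    ir = IndexReader.open(indexDir);
    // enumerate every term in the index, across all fields
    TermEnum termEnum = ir.terms();
    while (termEnum.next()) {
        Term t = termEnum.term();
        // docFreq = number of documents containing this term
        System.out.println(t.text() + " --> " + ir.docFreq(t));
    }
    termEnum.close();
}
catch (IOException e) {
    System.out.println(e.toString());
}
finally {
    if (ir != null) {
        try {
            ir.close();
        } catch (IOException e) {
            System.err.println("IOException, opened IndexReader can't be closed: "
                + e.toString());
        }
    }
}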
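
Note that docFreq counts documents, not occurrences. If you need the
occurrence counts instead, IndexReader.termDocs(Term) together with
TermDocs.freq() should give them per document, as far as I recall. To make
the difference concrete, here is a small sketch in plain Java that does not
use Lucene at all. The class FrequencySketch and its methods are made up
for illustration, and it assumes a Java 8 or later runtime.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration class, not part of Lucene.
public class FrequencySketch {

    /** Number of documents that contain each term at least once. */
    public static Map<String, Integer> docFreq(List<String[]> docs) {
        Map<String, Integer> df = new TreeMap<>();
        for (String[] doc : docs) {
            // count each term only once per document
            Set<String> seen = new HashSet<>();
            for (String term : doc) {
                if (seen.add(term)) {
                    df.merge(term, 1, Integer::sum);
                }
            }
        }
        return df;
    }

    /** Total number of occurrences of each term over all documents. */
    public static Map<String, Integer> totalTermFreq(List<String[]> docs) {
        Map<String, Integer> tf = new TreeMap<>();
        for (String[] doc : docs) {
            for (String term : doc) {
                tf.merge(term, 1, Integer::sum);
            }
        }
        return tf;
    }

    public static void main(String[] args) {
        // two toy "documents", already split into terms
        List<String[]> docs = Arrays.asList(
            "lucene index lucene".split(" "),
            "index reader".split(" "));
        Map<String, Integer> df = docFreq(docs);
        Map<String, Integer> tf = totalTermFreq(docs);
        for (String term : df.keySet()) {
            System.out.println(term + " --> docFreq=" + df.get(term)
                + ", totalFreq=" + tf.get(term));
        }
    }
}
```

In the toy data, "lucene" occurs twice but only in the first document, so
its document frequency and its total frequency differ.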

hope this helps,
Bernhard

Serkan Oktar wrote:

>I want to build a list of terms of all documents and their frequency data. 
>It seems the information I need is in the "tis" and "tii" files. However, I haven't
>found a way to handle them so far.
>
>How can I get the term frequency data?
>
>Thanks,
>Serkan

