lucene-java-user mailing list archives

From wal...@Cyveillance.com
Subject TopTerms on query results
Date Wed, 22 Sep 2004 20:28:50 GMT
Can anyone help me with code to get the top terms of a given field for a
query result set?

Here is code, modified from Luke, that gets the top terms for a field
across the whole index:

    public TermInfo[] mostCommonTerms( String fieldName, int numberOfTerms )
    {
        // Treat a non-positive request as "no limit", then cap at 50 terms.
        if ( numberOfTerms < 1 )
        {
            numberOfTerms = Integer.MAX_VALUE;
        }
        numberOfTerms = Math.min( numberOfTerms, 50 );

        IndexReader reader = null;
        try
        {
            reader = IndexReader.open( indexPath );
            // One spare slot, so the queue can briefly hold
            // numberOfTerms + 1 entries before the lowest is popped.
            TermInfoQueue tiq = new TermInfoQueue( numberOfTerms + 1 );
            TermEnum terms = reader.terms();

            int minFreq = 0;
            while ( terms.next() )
            {
                if ( fieldName.equalsIgnoreCase( terms.term().field() ) )
                {
                    if ( terms.docFreq() > minFreq )
                    {
                        tiq.put( new TermInfo( terms.term(),
                                               terms.docFreq() ) );
                        if ( tiq.size() > numberOfTerms ) // tiq overfull
                        {
                            tiq.pop(); // remove lowest in tiq
                            // reset minFreq to the new lowest entry
                            minFreq = ( (TermInfo) tiq.top() ).docFreq;
                        }
                    }
                }
            }
            terms.close();

            // pop() returns the lowest first, so fill the array from
            // the end to get the most frequent term at index 0.
            TermInfo[] res = new TermInfo[ tiq.size() ];
            for ( int i = 0; i < res.length; i++ )
            {
                res[ res.length - i - 1 ] = (TermInfo) tiq.pop();
            }
            return res;
        }
        catch ( IOException ioe )
        {
            logger.error( "IOException: " + ioe.getMessage() );
        }
        finally
        {
            // Close the reader even when enumeration fails.
            if ( reader != null )
            {
                try
                {
                    reader.close();
                }
                catch ( IOException ioe )
                {
                    logger.error( "IOException: " + ioe.getMessage() );
                }
            }
        }
        return null;
    }
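
For anyone reading without Luke's classes at hand, the bounded-queue idea in the method above can be sketched in plain Java. This is only an illustration under assumptions: TopTermsSketch, TermCount and topN are hypothetical names, the input map stands in for the TermEnum walk, and java.util.PriorityQueue replaces Luke's TermInfoQueue. It does not restrict the counts to a query result set.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class TopTermsSketch
{
    /** Hypothetical stand-in for Luke's TermInfo: term text plus docFreq. */
    static final class TermCount
    {
        final String term;
        final int docFreq;

        TermCount( String term, int docFreq )
        {
            this.term = term;
            this.docFreq = docFreq;
        }
    }

    /** Returns at most n entries with the highest docFreq, highest first. */
    static List<TermCount> topN( Map<String, Integer> docFreqs, int n )
    {
        List<TermCount> result = new ArrayList<>();
        if ( n < 1 )
        {
            return result;
        }
        // Min-heap on docFreq: the head is the weakest candidate kept so far.
        PriorityQueue<TermCount> heap =
            new PriorityQueue<>( ( a, b ) -> Integer.compare( a.docFreq, b.docFreq ) );
        for ( Map.Entry<String, Integer> e : docFreqs.entrySet() )
        {
            heap.add( new TermCount( e.getKey(), e.getValue() ) );
            if ( heap.size() > n )
            {
                heap.poll(); // drop the lowest-frequency entry
            }
        }
        // The heap drains in ascending order, so reverse for the caller.
        while ( !heap.isEmpty() )
        {
            result.add( heap.poll() );
        }
        Collections.reverse( result );
        return result;
    }

    public static void main( String[] args )
    {
        // Made-up frequencies, for illustration only.
        Map<String, Integer> freqs = new LinkedHashMap<>();
        freqs.put( "alpha", 5 );
        freqs.put( "beta", 9 );
        freqs.put( "gamma", 1 );
        freqs.put( "delta", 7 );
        for ( TermCount tc : topN( freqs, 2 ) )
        {
            System.out.println( tc.term + " " + tc.docFreq );
        }
    }
}
```

Checking the size only after the add, and polling when it exceeds n, keeps exactly n entries and never leaves the heap empty, which is the case that goes wrong when the check is ">= n" and n is 1.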



