lucene-java-user mailing list archives

From Grant Ingersoll <gsing...@syr.edu>
Subject Re: Max Frequency and Tf/Idf
Date Fri, 14 Apr 2006 11:30:31 GMT
The Term Vector code can be used to get the term frequencies from a 
specific document.  Search this list, see the Lucene In Action book, or 
look at http://www.cnlp.org/apachecon2005 for examples of how to use 
Term Vectors.
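
As a rough illustration of the calculation discussed in this thread, the
sketch below computes a max-frequency-normalized Tf/Idf weight for each term
of one document. It is plain Java and does not call Lucene. The class name
NormalizedTfIdf, its method names, and the formula (tf / maxTf) * ln(numDocs / df)
are assumptions chosen for illustration; neither poster specified them. With
Lucene, the per-document frequencies would most likely come from a term
vector (for a field indexed with term vectors) and the document frequencies
from the IndexReader, but that wiring is left out here.

```java
public class NormalizedTfIdf {

    /** Returns the largest value in termFreqs, or 0 for an empty array. */
    static int maxFrequency(int[] termFreqs) {
        int max = 0;
        for (int tf : termFreqs) {
            if (tf > max) {
                max = tf;
            }
        }
        return max;
    }

    /**
     * Computes one weight per term for a single document.
     *
     * termFreqs[i] is how often term i occurs in the document,
     * docFreqs[i]  is how many documents in the index contain term i,
     * numDocs      is the total number of documents in the index.
     *
     * Assumed formula (not taken from the thread):
     *   weight = (tf / maxTf) * ln(numDocs / df)
     */
    static double[] weights(int[] termFreqs, int[] docFreqs, int numDocs) {
        if (termFreqs.length != docFreqs.length) {
            throw new IllegalArgumentException("arrays must have equal length");
        }
        double[] result = new double[termFreqs.length];
        int maxTf = maxFrequency(termFreqs);
        if (maxTf == 0) {
            return result; // empty document: all weights stay 0
        }
        for (int i = 0; i < termFreqs.length; i++) {
            if (docFreqs[i] <= 0) {
                continue; // unknown term: leave its weight at 0
            }
            double normalizedTf = (double) termFreqs[i] / maxTf;
            double idf = Math.log((double) numDocs / docFreqs[i]);
            result[i] = normalizedTf * idf;
        }
        return result;
    }
}
```

In this sketch the maximum term frequency is found in one pass over the
document's own frequency array, so this part of the calculation should not
require any change to the Lucene source.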

Danilo Cicognani wrote:
> Hello everybody.
> We are building a complex automatic classification system using Lucene.
> We need to manage normalized Tf/Idf (Term Frequency / Inverse Document
> Frequency).
> We understood that Lucene can give us Tf and Df and we are using these
> values to calculate the normalized Tf/Idf but we would like to optimize this
> calculation for better performance.
> Is there any way to expose the maximum term frequency in a document from
> Lucene, and maybe to obtain the normalized Tf/Idf from Lucene?
> There are no public methods to get these values, but maybe Lucene holds
> this information privately, and with a modification to the Lucene source we
> could have the work done there to speed up the system.
>
> P.S. Sorry for my English: I hope I explained my question clearly.
>
> **** 1000 KBye ****
>
>  [) /\ |\| | |_ ()
>
> web: www.ciconet.it
> Web Portal Now: www.webportalnow.com
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>   

-- 

Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
335 Hinds Hall 
Syracuse, NY 13244 

http://www.cnlp.org 
Voice:  315-443-5484 
Fax: 315-443-6886 

