lucene-java-user mailing list archives

From PlusPlus <r.shahidine...@gmail.com>
Subject Re: Why is frequency a float number
Date Thu, 04 Mar 2010 19:49:00 GMT

Thanks for the reply. 
What I'm actually looking for is a kind of fuzzy membership for the terms
of a document. That is, each term of a document will have a membership
value, and each term will occur in a given document at most once.

For that, I will need float TF and IDF values. It seems that Lucene does not
support what I need, so I would have to change Lucene's code, which is not an
easy task. Do you have any suggestions for me?

Best,
Reza



hossman wrote:
> 
> 
> :    I was wondering why the tf method takes a float parameter. Isn't
> : frequency always considered to be an integer?
> : 
> :    public abstract float tf(float freq)
> 
> Take a look at how PhraseQuery and SpanNearQuery use tf(float).
> 
> For simple terms (and TermQuery) tf is always an integer, but when dealing 
> with phrases the concept of a "sloppy match" (i.e. a phrase with a gap in 
> the middle) results in a fractional "frequency" value because it is not as 
> good as an "exact" match on the phrase (which does result in an integer tf 
> value).
> 
> 
> 
> 
> -Hoss
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 
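
To make Hoss's point about sloppy matches concrete, here is a minimal standalone sketch of how a fractional frequency can arise. It does not call Lucene. The class name and the phraseFreq helper are made up for illustration, and the two formulas (1 / (distance + 1) for a sloppy match, sqrt(freq) for tf) are written from memory of DefaultSimilarity, so please check them against the version you are using.

```java
public class SloppyFreqSketch {

    // Assumed to mirror DefaultSimilarity.sloppyFreq(int):
    // an exact match (distance 0) counts as 1, a sloppier match counts as less.
    public static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    // Assumed to mirror DefaultSimilarity.tf(float): square root of the frequency.
    public static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Hypothetical helper: sums the weight of every phrase match in one document,
    // given the edit distance of each match from an exact phrase occurrence.
    public static float phraseFreq(int[] matchDistances) {
        float freq = 0.0f;
        for (int distance : matchDistances) {
            freq += sloppyFreq(distance);
        }
        return freq;
    }

    public static void main(String[] args) {
        // Two exact matches: the frequency is a whole number.
        float exact = phraseFreq(new int[] {0, 0});
        // One exact match plus one match with distance 2: the frequency is fractional.
        float mixed = phraseFreq(new int[] {0, 2});

        System.out.println("exact freq = " + exact + ", tf = " + tf(exact));
        System.out.println("mixed freq = " + mixed + ", tf = " + tf(mixed));
    }
}
```

With two exact matches the frequency is 2.0, while one exact match plus one match at distance 2 gives roughly 1.33, which is why tf has to accept a float.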
