lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Robert Muir <rcm...@gmail.com>
Subject Re: Fuzzy membership of a term to the document
Date Fri, 26 Feb 2010 00:48:17 GMT
Hello Reza,

I've seen some similar stuff to what you mention, such as
http://ece.ut.ac.ir/dbrg/Hamshahri/Papers/FuFaIR.ppt
In that experiment, the membership was calculated with tf/idf parameters (it
looks like that gave best results).

I am scratching my head as to how this model could be easily implemented in
Lucene, but please report back if you figure something out... its
interesting!

On Wed, Feb 24, 2010 at 11:14 PM, PlusPlus <r.shahidinejad@gmail.com> wrote:

>
> Hi,
>
>   I want to change the Lucene's similarity in a way that I can add Fuzzy
> memberships to the terms of a document. Thus, TF value of a term in one
> document is not always 1, it can add 0.7 to the value of the TF ( (In my
> application, each term is contained in a document at most once). This
> membership value is available before index time.
>
>   On the other hand, each occurrence of a word will not be considered as 1
> documentfrequency for the IDF formula.
>
>   I was wondering if I can change the TF and IDF values of the terms like
> this. So far, I know that I can change the impact of TF values on the
> scoring, but not this thing that I'm looking for.
>
> Best,
> Reza
>
>
> --
> View this message in context:
> http://old.nabble.com/Fuzzy-membership-of-a-term-to-the-document-tp27714347p27714347.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


-- 
Robert Muir
rcmuir@gmail.com

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message