lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Karl Koch" <TheRan...@gmx.net>
Subject Re: difference in javadoc and faq similarity expression
Date Sun, 18 Jan 2004 22:54:14 GMT
I would rely on the JavaDoc since this one is up to date. The latest version
1.3 final is just a few weeks old. Some entries in the FAQ however are still
from 2001...

Cheers,
Karl

> hy,
> i have troubles in find the correspondance betwwen the javadoc and faq
> similarity expression
> 
> in the Similarity Javadoc
> 
> score(q,d) =Sum [tf(t in d) * idf(t) * getBoost(t.field in d) *
> lengthNorm(t.field in d)  * coord(q,d) * queryNorm(q) ]
> 
> in the FAQ
> 
> score_d = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t * boost_t)
> *
> coord_q_d
> 
> In FAQ | In Javadoc
> 1 / norm_q = queryNorm(q)
> 1 / norm_d_t=lengthNorm(t.field in d)
> coord_q_d=coord(q,d)
> boost_t=getBoost(t.field in d)
> idf_t=idf(t)
> tf_d=tf(t in d)
> 
> but
> where is the javadoc expression for "tf_q" faq expression
> 
> nicolas
> 
> ----- Original Message ----- 
> From: "Nicolas Maisonneuve" <n.maisonneuve@hotPop.com>
> To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> Sent: Sunday, January 18, 2004 9:33 PM
> Subject: Re: theorical informations
> 
> 
> > thanks Karl !
> >
> > ----- Original Message ----- 
> > From: "Karl Koch" <TheRanger@gmx.net>
> > To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> > Sent: Sunday, January 18, 2004 9:22 PM
> > Subject: Re: theorical informations
> >
> >
> > > Actually, finding an answer to this question is not really important.
> More
> > > important is if you can do what you want with it. If you result comes
> from
> > a
> > > prob. model or a vector space model, who cares if you just want to
> give
> a
> > > query and back a hit list of results?
> > >
> > > Possibliy some people here will strongly disagree... ;-) (?)
> > >
> > > Karl
> > >
> > > > Hello Nicolas,
> > > >
> > > > I am sure you mean IR (Information Retrieval) Model. Lucene
> implements
> a
> > > > Vector Space Model with integrated Boolean Model. This means the
> Boolean
> > > > model
> > > > is integrated with a Boolean query language but mapped into the
> Vector
> > > > Space.
> > > > Therefore you have ranking even though the traditional Boolean model
> > does
> > > > not
> > > > support this. Cosine similarity is used to measure similarity
> between
> > > > documents and the query. You can find this in a very long dicussion
> here
> > > > when you
> > > > search the archive...
> > > >
> > > > Karl
> > > >
> > > > > hy ,
> > > > > i have 2  theorycal questions :
> > > > >
> > > > > i searched in the mailing list the R.I. model implemented in
> Lucene
> ,
> > > > > but no precise answer.
> > > > >
> > > > > 1) What is the R.I model implemented in Lucene ? (ex: Boolean
> Model,
> > > > > Vector Model,Probabilist Model, etc... )
> > > > >
> > > > > 2) What is the theory Similarity function  implemented in Lucene
> > > > > (Euclidian, Cosine, Jaccard, Dice)
> > > > >
> > > > > (why this important informations is not in the Lucene Web site or
> in
> > the
> > > >
> > > > > faq ? )
> > > > >
> > > >
> > > > -- 
> > > > +++ GMX - die erste Adresse für Mail, Message, More +++
> > > > Bis 31.1.: TopMail + Digicam für nur 29 EUR
> http://www.gmx.net/topmail
> > > >
> > > >
> > > >
> ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > > > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> > > >
> > >
> > > -- 
> > > +++ GMX - die erste Adresse für Mail, Message, More +++
> > > Bis 31.1.: TopMail + Digicam für nur 29 EUR http://www.gmx.net/topmail
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> > >
> > >
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> >
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 

-- 
+++ GMX - die erste Adresse für Mail, Message, More +++
Bis 31.1.: TopMail + Digicam für nur 29 EUR http://www.gmx.net/topmail


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message