lucene-java-user mailing list archives

From "Nicolas Maisonneuve" <n.maisonne...@hotPop.com>
Subject Re: difference in javadoc and faq similarity expression
Date Mon, 19 Jan 2004 10:03:26 GMT
But the Javadoc expression has no TF-IDF weight for the query, only for the
document, whereas the cosine measure uses both. Hmm, strange.
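
For comparison, here is a minimal sketch of the textbook cosine measure, in which the query and the document each contribute their own weight vector. It is not Lucene code: the class name, method name, and array layout are assumptions made for illustration, and the weights are taken as already computed.

```java
/** Illustrative sketch: textbook cosine similarity between two weight vectors. */
final class CosineSketch {

    /**
     * Both arrays are assumed to be indexed by the same vocabulary, so
     * queryWeights[i] and docWeights[i] are the TF-IDF weights of term i.
     */
    static double cosine(double[] queryWeights, double[] docWeights) {
        if (queryWeights.length != docWeights.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double queryNormSq = 0.0;
        double docNormSq = 0.0;
        for (int i = 0; i < queryWeights.length; i++) {
            dot += queryWeights[i] * docWeights[i];          // numerator: q . d
            queryNormSq += queryWeights[i] * queryWeights[i]; // |q|^2
            docNormSq += docWeights[i] * docWeights[i];       // |d|^2
        }
        if (queryNormSq == 0.0 || docNormSq == 0.0) {
            return 0.0; // an all-zero vector has no direction; treat as no similarity
        }
        return dot / (Math.sqrt(queryNormSq) * Math.sqrt(docNormSq));
    }

    public static void main(String[] args) {
        double[] query = {1.0, 2.0, 0.0}; // hypothetical query weights
        double[] doc = {2.0, 4.0, 0.0};   // hypothetical document weights
        System.out.println(cosine(query, doc));
    }
}
```

The query-side weights appear both in the numerator and in the query norm, which is the part that seems to have no counterpart in the Javadoc expression.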

I have a report to write about Lucene, and I don't know which formula to
put in the paper or how to explain it.
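
One way to read the Javadoc expression quoted below is as a plain sum over the query terms that match the document. The following is only a minimal sketch under stated assumptions: the class, the `TermStats` holder, and all parameter names are hypothetical and not part of the Lucene API, and the `tf`, `idf`, and `lengthNorm` bodies are simple stand-ins (square root of the frequency, a log-based inverse document frequency, inverse square root of the field length) that may differ from what a given Lucene release computes. Because `coord(q,d)` and `queryNorm(q)` do not depend on the term, the sketch multiplies them in once, outside the loop, which gives the same value as multiplying inside the sum.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch of the Javadoc-style sum; not Lucene's implementation. */
final class JavadocScoreSketch {

    /** Hypothetical per-term inputs for one matching query term t and document d. */
    static final class TermStats {
        final double freqInDoc;  // raw count of t in d
        final int docFreq;       // number of documents containing t
        final double fieldBoost; // stands in for getBoost(t.field in d)
        final int fieldLength;   // number of terms in t's field in d

        TermStats(double freqInDoc, int docFreq, double fieldBoost, int fieldLength) {
            this.freqInDoc = freqInDoc;
            this.docFreq = docFreq;
            this.fieldBoost = fieldBoost;
            this.fieldLength = fieldLength;
        }
    }

    // Assumed stand-ins for the Similarity factors named in the Javadoc expression.
    static double tf(double freq) {
        return Math.sqrt(freq);
    }

    static double idf(int docFreq, int numDocs) {
        return Math.log((double) numDocs / (docFreq + 1)) + 1.0;
    }

    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    /** Sum over matching terms, then apply the term-independent factors once. */
    static double score(List<TermStats> matchedTerms, int numDocs,
                        double coord, double queryNorm) {
        double sum = 0.0;
        for (TermStats t : matchedTerms) {
            sum += tf(t.freqInDoc) * idf(t.docFreq, numDocs)
                    * t.fieldBoost * lengthNorm(t.fieldLength);
        }
        return sum * coord * queryNorm;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: two matching terms in a 1000-document index.
        List<TermStats> terms = Arrays.asList(
                new TermStats(3, 10, 1.0, 100),
                new TermStats(1, 200, 1.0, 100));
        System.out.println(score(terms, 1000, 1.0, 1.0));
    }
}
```

The sketch contains no query-side term frequency at all, which mirrors the question about where the FAQ's "tf_q" went.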



----- Original Message ----- 
From: "Karl Koch" <TheRanger@gmx.net>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Sunday, January 18, 2004 11:54 PM
Subject: Re: difference in javadoc and faq similarity expression


> I would rely on the JavaDoc since this one is up to date. The latest
> version 1.3 final is just a few weeks old. Some entries in the FAQ however
> are still from 2001...
>
> Cheers,
> Karl
>
> > Hi,
> > I am having trouble finding the correspondence between the Javadoc and FAQ
> > similarity expressions.
> >
> > in the Similarity Javadoc
> >
> > score(q,d) = Sum [ tf(t in d) * idf(t) * getBoost(t.field in d) *
> >   lengthNorm(t.field in d) * coord(q,d) * queryNorm(q) ]
> >
> > in the FAQ
> >
> > score_d = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t * boost_t)
> >           * coord_q_d
> >
> > In FAQ        | In Javadoc
> > 1 / norm_q    | queryNorm(q)
> > 1 / norm_d_t  | lengthNorm(t.field in d)
> > coord_q_d     | coord(q,d)
> > boost_t       | getBoost(t.field in d)
> > idf_t         | idf(t)
> > tf_d          | tf(t in d)
> >
> > But where is the Javadoc counterpart of the FAQ's "tf_q" term?
> >
> > Nicolas
> >
> > ----- Original Message ----- 
> > From: "Nicolas Maisonneuve" <n.maisonneuve@hotPop.com>
> > To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> > Sent: Sunday, January 18, 2004 9:33 PM
> > Subject: Re: theorical informations
> >
> >
> > > Thanks, Karl!
> > >
> > > ----- Original Message ----- 
> > > From: "Karl Koch" <TheRanger@gmx.net>
> > > To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> > > Sent: Sunday, January 18, 2004 9:22 PM
> > > Subject: Re: theorical informations
> > >
> > >
> > > > Actually, finding an answer to this question is not really important.
> > > > More important is whether you can do what you want with it. If your
> > > > result comes from a probabilistic model or a vector space model, who
> > > > cares, if you just want to give a query and get back a hit list of
> > > > results?
> > > >
> > > > Possibly some people here will strongly disagree... ;-) (?)
> > > >
> > > > Karl
> > > >
> > > > > Hello Nicolas,
> > > > >
> > > > > > I am sure you mean IR (Information Retrieval) model. Lucene
> > > > > > implements a Vector Space Model with an integrated Boolean Model.
> > > > > > This means the Boolean model is integrated through a Boolean query
> > > > > > language but mapped into the vector space. Therefore you get
> > > > > > ranking even though the traditional Boolean model does not support
> > > > > > it. Cosine similarity is used to measure similarity between
> > > > > > documents and the query. You can find this in a very long
> > > > > > discussion here when you search the archive...
> > > > >
> > > > > Karl
> > > > >
> > > > > > > Hi,
> > > > > > > I have 2 theoretical questions.
> > > > > > >
> > > > > > > I searched the mailing list for the IR model implemented in
> > > > > > > Lucene, but found no precise answer.
> > > > > > >
> > > > > > > 1) What is the IR model implemented in Lucene? (e.g. Boolean
> > > > > > > Model, Vector Model, Probabilistic Model, etc.)
> > > > > > >
> > > > > > > 2) What is the theoretical similarity function implemented in
> > > > > > > Lucene? (Euclidean, Cosine, Jaccard, Dice)
> > > > > > >
> > > > > > > (Why is this important information not on the Lucene web site
> > > > > > > or in the FAQ?)
> > > > > > >
> > > > >
> > > > > -- 
> > > > > +++ GMX - die erste Adresse für Mail, Message, More +++
> > > > > Bis 31.1.: TopMail + Digicam für nur 29 EUR
> > http://www.gmx.net/topmail
> > > > >
> > > > >
> > > > >
> > > > > > ---------------------------------------------------------------------
> > > > > > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > > > > > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> > > > >
> > > >