lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Jensen–Shannon divergence
Date Mon, 14 Dec 2015 23:02:53 GMT
Hi,

Besides BM25 and TF-IDF, Lucene also provides several more similarity implementations:

https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/LMDirichletSimilarity.html
https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/LMJelinekMercerSimilarity.html
https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/IBSimilarity.html
https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/DFRSimilarity.html

If you want to implement your own, choose the closest one and implement the formula as you
described. I'd start with SimilarityBase, which is an ideal base class for models such as
Dirichlet / DFR / ..., because it has a default implementation for things like phrases.
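
To make "implement the formula" a bit more concrete, here is a minimal sketch of the math only: a Dirichlet-smoothed language model and the Jensen-Shannon divergence between two such models. It is plain Java and does not use any Lucene API. The class name, method names, and the choice of natural logarithm are assumptions for illustration, and it assumes the collection model covers every term and sums to 1. Wiring this into a Similarity subclass is a separate step that is not shown here.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class; not part of Lucene.
public class JsDivergenceSketch {

    /**
     * Dirichlet-smoothed model over the collection vocabulary:
     * p(w|d) = (tf(w,d) + mu * p(w|C)) / (|d| + mu).
     * Assumes collectionModel contains every term of the document.
     */
    public static Map<String, Double> dirichletModel(Map<String, Integer> termFreqs,
                                                     Map<String, Double> collectionModel,
                                                     double mu) {
        long docLen = 0;
        for (int tf : termFreqs.values()) {
            docLen += tf;
        }
        Map<String, Double> model = new HashMap<>();
        for (Map.Entry<String, Double> e : collectionModel.entrySet()) {
            int tf = termFreqs.getOrDefault(e.getKey(), 0);
            model.put(e.getKey(), (tf + mu * e.getValue()) / (docLen + mu));
        }
        return model;
    }

    /**
     * Jensen-Shannon divergence: 0.5 * KL(p || m) + 0.5 * KL(q || m),
     * where m = 0.5 * (p + q). Natural log, so the result lies in [0, ln 2].
     */
    public static double jsDivergence(Map<String, Double> p, Map<String, Double> q) {
        Set<String> vocab = new HashSet<>(p.keySet());
        vocab.addAll(q.keySet());
        double js = 0.0;
        for (String w : vocab) {
            double pw = p.getOrDefault(w, 0.0);
            double qw = q.getOrDefault(w, 0.0);
            double mw = 0.5 * (pw + qw);
            // Terms with zero probability contribute nothing (0 * log 0 = 0).
            if (pw > 0) {
                js += 0.5 * pw * Math.log(pw / mw);
            }
            if (qw > 0) {
                js += 0.5 * qw * Math.log(qw / mw);
            }
        }
        return js;
    }
}
```

Since a smaller divergence means a closer match, a ranking score would be something like the negated divergence between the query model and each document model. Note that the sum runs over the whole vocabulary, not only the query terms, so it does not map directly onto a per-term scoring method.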

> LMDiricletbut its feasibilit

I am not sure what you want to say with this mistyped sentence fragment.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Jack Krupansky [mailto:jack.krupansky@gmail.com]
> Sent: Monday, December 14, 2015 11:21 PM
> To: java-user@lucene.apache.org
> Subject: Re: Jensen–Shannon divergence
> 
> Is there any particular reason that you find Lucene's built-in TF/IDF and
> BM25 similarity models insufficient for your needs? In any case,
> examination of their source code should get you started if you wish to do
> your own:
> 
> https://lucene.apache.org/core/5_3_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
> https://lucene.apache.org/core/5_3_0/core/org/apache/lucene/search/similarities/BM25Similarity.html
> 
> -- Jack Krupansky
> 
> On Sun, Dec 13, 2015 at 8:30 AM, Shay Hummel <shay.hummel@gmail.com>
> wrote:
> 
> > Hi
> >
> > I need help to implement similarity between query model and document
> > model.
> > I would like to use the JS-Divergence
> > <https://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence>
> > for
> > ranking documents. The documents and the query will be represented
> > according to the language models approach - specifically the LMDiriclet.
> > The similarity will be calculated using the JS-Div between the document
> > model and the query model.
> > Is it possible?
> > if so how?
> >
> > Thank you,
> > Shay Hummel
> > --
> > Regards,
> > Shay Hummel
> >


