lucene-java-user mailing list archives

From will martin <wmartin...@gmail.com>
Subject Re: Jensen–Shannon divergence
Date Mon, 14 Dec 2015 23:35:59 GMT
cool list. Thanks Uwe.

Opportunities to gain competitive advantage in selected domains.

> On Dec 14, 2015, at 6:02 PM, Uwe Schindler <uwe@thetaphi.de> wrote:
> 
> Hi,
> 
> Next to BM25 and TF-IDF, Lucene also provides many more similarity implementations:
> 
> https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/LMDirichletSimilarity.html
> https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/LMJelinekMercerSimilarity.html
> https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/IBSimilarity.html
> https://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/DFRSimilarity.html
> 
> If you want to implement your own, choose the closest one and implement the formula as
> you described. I would start with SimilarityBase, which is an ideal base class for types
> such as Dirichlet / DFR /..., because it has a default implementation for things like phrases.
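
[Editor's sketch] The advice above can be tried out without Lucene first. The following is a minimal, plain-Java sketch of the arithmetic only: a Dirichlet-smoothed document model and the Jensen-Shannon divergence against a query model. The class name `JsDivergenceSketch`, the helper names, the toy vocabulary, and the collection probabilities are all hypothetical; nothing here is Lucene API. mu = 2000 is used because it is the default of LMDirichletSimilarity.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class; not part of Lucene.
final class JsDivergenceSketch {

    /** Dirichlet-smoothed p(w|d) = (tf + mu * p(w|C)) / (docLen + mu). */
    static double dirichlet(double tf, double docLen, double collectionProb, double mu) {
        return (tf + mu * collectionProb) / (docLen + mu);
    }

    /** One summand of KL(P || M) in bits; 0 * log(0) is treated as 0. */
    static double klTerm(double p, double m) {
        return p > 0.0 ? p * (Math.log(p / m) / Math.log(2.0)) : 0.0;
    }

    /** JS divergence in bits between two distributions; missing terms count as probability 0. */
    static double jsDivergence(Map<String, Double> p, Map<String, Double> q) {
        Set<String> vocab = new HashSet<>(p.keySet());
        vocab.addAll(q.keySet());
        double sum = 0.0;
        for (String w : vocab) {
            double pw = p.getOrDefault(w, 0.0);
            double qw = q.getOrDefault(w, 0.0);
            double mw = 0.5 * (pw + qw); // mixture distribution M
            sum += 0.5 * klTerm(pw, mw) + 0.5 * klTerm(qw, mw);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Toy collection model p(w|C); the values are made up and sum to 1.
        Map<String, Double> collection = new HashMap<>();
        collection.put("lucene", 0.5);
        collection.put("search", 0.3);
        collection.put("index", 0.2);

        // Toy document: "lucene" x3, "search" x1, so its length is 4.
        Map<String, Double> tf = new HashMap<>();
        tf.put("lucene", 3.0);
        tf.put("search", 1.0);
        double docLen = 4.0;
        double mu = 2000.0;

        // Smoothed document model over the whole vocabulary.
        Map<String, Double> docModel = new HashMap<>();
        for (Map.Entry<String, Double> e : collection.entrySet()) {
            double f = tf.getOrDefault(e.getKey(), 0.0);
            docModel.put(e.getKey(), dirichlet(f, docLen, e.getValue(), mu));
        }

        // Unsmoothed (maximum-likelihood) query model for the query "lucene search".
        Map<String, Double> queryModel = new HashMap<>();
        queryModel.put("lucene", 0.5);
        queryModel.put("search", 0.5);

        System.out.println("JSD(query, doc) = " + jsDivergence(queryModel, docModel));
    }
}
```

Two caveats if this is moved into a SimilarityBase subclass, whose `score(BasicStats stats, float freq, float docLen)` method is called per matching query term: the sum above runs over the whole vocabulary and would have to be rearranged or approximated to fit a per-term score, and a lower divergence means a better match, so the value would have to be negated or otherwise transformed because Lucene ranks higher scores first. Both points are the editor's assumptions, not something stated in the thread.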
> 
>> LMDiricletbut its feasibilit
> 
> I am not sure what you want to say with this mistyped sentence fragment.
> 
> Uwe
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
>> -----Original Message-----
>> From: Jack Krupansky [mailto:jack.krupansky@gmail.com]
>> Sent: Monday, December 14, 2015 11:21 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: Jensen–Shannon divergence
>> 
>> Is there any particular reason that you find Lucene's builtin TF/IDF and
>> BM25 similarity models insufficient for your needs? In any case,
>> examination of their source code should get you started if you wish to do
>> your own:
>> 
>> https://lucene.apache.org/core/5_3_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
>> https://lucene.apache.org/core/5_3_0/core/org/apache/lucene/search/similarities/BM25Similarity.html
>> 
>> -- Jack Krupansky
>> 
>> On Sun, Dec 13, 2015 at 8:30 AM, Shay Hummel <shay.hummel@gmail.com>
>> wrote:
>> 
>>> Hi
>>> 
>>> I need help implementing a similarity between a query model and a
>>> document model. I would like to use the JS-Divergence
>>> <https://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence> for
>>> ranking documents. The documents and the query will be represented
>>> according to the language-model approach, specifically LMDirichlet.
>>> The similarity will be calculated using the JS-Div between the document
>>> model and the query model.
>>> Is it possible? If so, how?
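
[Editor's note] For reference, the quantities in the question can be written out as follows. This uses the standard definitions of Dirichlet smoothing and the Jensen-Shannon divergence; the symbols are the editor's choice and do not come from the thread: P is the query model, Q the document model, C the collection, and mu the smoothing parameter.

```latex
\begin{align*}
p_\mu(w \mid d) &= \frac{\mathrm{tf}(w,d) + \mu\, p(w \mid C)}{|d| + \mu} \\
M(w) &= \tfrac{1}{2}\bigl(P(w) + Q(w)\bigr) \\
\mathrm{JSD}(P \parallel Q) &= \tfrac{1}{2}\sum_{w} P(w)\log_2\frac{P(w)}{M(w)}
  + \tfrac{1}{2}\sum_{w} Q(w)\log_2\frac{Q(w)}{M(w)}
\end{align*}
```

With base-2 logarithms the divergence lies between 0 (identical models) and 1 (models with disjoint support), so a smaller value indicates a closer match between query and document.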
>>> 
>>> Thank you,
>>> Shay Hummel
>>> --
>>> Regards,
>>> Shay Hummel
>>> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 

